
Blatant Academic Fraud, OpenAI's New Sibling, a Killer Drone?!

2021/6/4

Last Week in AI

People
Daniel Bashir
Sharon Zhou
Topics
Andrey Kurenkov: Part of OpenAI's team, owing to differences over the company's future direction, chose to found Anthropic, a company focused on AI safety and interpretability research. This reflects an ongoing discussion within the AI field about the balance between AI safety and commercialization. Sharon Zhou: Anthropic aims to improve the controllability, interpretability, and robustness of large generative models so they can be applied to the real world more safely and reliably. This contrasts with OpenAI's increasingly commercial direction and also highlights the importance of AI safety research.


Chapters
The episode discusses the formation of Anthropic, a new AI company led by former OpenAI members, raising $124 million with a focus on making large generative models safe and usable.

Transcript


Hello and welcome to Skynet Today's Let's Talk AI podcast, where you can hear from AI researchers about what's going on with AI. This is our latest Last Week in AI episode, in which you get summaries and discussion of some of last week's most interesting AI news.

For regular listeners, we'll note that there's a bit of a change of format this week. We won't have our usual summary segment until the end of the episode, but otherwise it'll be just about the same. So let us introduce ourselves. I am Andrey Kurenkov, a third-year PhD student at the Stanford Vision and Learning Lab. I focus mostly on learning algorithms for robotics. And with me is my co-host...

I'm Dr. Sharon Zhou, a graduating fourth-year PhD student in the machine learning group working with Andrew Ng. I do research on generative models and applying machine learning to medicine and climate. And this week, Andrey is pushing hard for the CoRL deadline in three weeks. Starting to, starting to. Yeah.

But yeah, you know, we'll see. Always the last two weeks is when things get hectic, but also when kind of hopefully everything magically comes together and then somehow the paper emerges from the ether. We'll see. Yes, that is exactly what happens. And we'll talk a bit about how the process might be a little bit not so great later in this episode.

Exactly. But before that, let's dive into our first story on AI research, with a couple of news sources covering how a rebel AI group raised record cash after a machine learning schism.

So this is all about this new company, Anthropic, led by Dario Amodei, a former head of AI safety at OpenAI, which has raised $124 million in its first funding round.

So this is basically a bunch of people from OpenAI, including Daniela Amodei and her sibling, Dario Amodei, and a bunch of other big names who worked on GPT-3, interpretability, scaling laws, etc. Basically, various people from OpenAI starting this new company called Anthropic,

which appears to have maybe kind of a similar focus as OpenAI when it started. So it's very similar in terms of being this kind of public benefit corporation that is researching how to make AI beneficial for everyone. Yeah, so I know, Sharon, you said you've been kind of hearing things on this front for a while, so I wonder if you have any thoughts on this one.

Yeah, I'm really excited for Anthropic. They have been kind of in stealth for a while coming out of OpenAI and have, you know, raised and announced this huge round, which is super exciting. And their goal is to essentially help make

AI, you know, especially these large generative models that we're seeing, safer: to help put guardrails on them, make sure that we can safely use them, and actually just make them usable in general. Because as we know, GPT-3 can spew, you know, nonsense, and sometimes pretty toxic nonsense, that can hurt people or make people feel like this probably can't be used in this application, whether from an enterprise perspective or generally from a user perspective.

So that's what Anthropic is kind of trying to get at: you know, can we add steerability to these models? Can we try to understand these models a bit more, and can these models have a bit more interpretability? Can we make them robust so that they could actually be deployed out in the real world? And I think we also need to be able to incorporate evaluation up front, you know, and not just at the very end as post-hoc evaluation,

or something like that. We need to also integrate it into training and into the whole pipeline, you know, all of that data infrastructure in general. So I think that's what they're getting at, and that's really exciting. It'll take a decent amount of research as well as general engineering,

and also compute. So I imagine a decent amount of that money will go to compute, and of course to efforts on working on and creating kind of the new GPT-3, or a GPT-3 that can be usable and steerable by the world. And so that's really exciting. I'm really excited for them. Yeah. Yeah. I think

this seems to be a trend that probably won't go away, of bigger and bigger models. And as you said, obviously one of the limitations of machine learning in general right now is exactly these things. Like, we have really good

benchmark results, really good functionality, except for reliability, interpretability, making sure that there's no bias. All of these things are very much still active research problems, especially for these giant models. So it seems like a very nice goal. And given the team and the background they bring, it seems like definitely something to look forward to.

And I think it's connected to the fact that OpenAI is moving, leaning more into their partnership with Microsoft and going, you know, more the product route, while folks on the more research and safety side want to still focus on those elements. And so they still want to carve out that space, and they still think it's useful, of course, given

that we're not ready yet to necessarily go directly to product with some of the models that are out right now. And so it makes sense that there is a little bit of this. And I think the article calls it a schism, but maybe it's not exactly that; maybe that's a little bit strong of a word. But there certainly is some kind of divide in terms of direction and vision. And so OpenAI

has actually launched a startup fund of their own, and it's a $100 million fund, and they want to help support other companies in this space make a positive impact on the world with AI. Yeah, exactly. This whole notion of schism seems a bit dramatic. It doesn't seem like there is anything maybe negative so much as maybe just

different people wanting to work on different things. And yeah, it was interesting to see both of these announcements come out right next to each other: OpenAI having the startup fund and Anthropic coming out.

It definitely seems like OpenAI is now more focused on commercializing and making use of AI and making advancements toward AGI than on safety and interpretability and these other kinds of aspects that they've sort of had for a while, but that haven't necessarily been what most people associate with OpenAI.

I also have seen on Reddit and different discussion boards that, I think, OpenAI is a little divisive. Some people have gotten pretty cynical about it due to some of the PR they do, due to them changing from non-profit to for-profit. I've seen some very cynical takes on Anthropic, which is a bit sad. I do hope that people

see it similar to us; we think this is pretty positive, and I'm pretty excited to see what Anthropic does. And I hope that's going to be the case. Maybe with some distance from OpenAI, people will also come to appreciate that this is a cool platform.

Yeah, the whole setup of companies is really interesting, especially now as I'm taking a step back and thinking about this, you know, with Signal taking off. It's a nonprofit, a 501(c)(3), not a for-profit company. And it's really trying to, you know, replace WhatsApp and

Facebook Messenger, et cetera, all those messaging apps that don't really preserve user privacy, and taking this other approach of, hey, we're just going to go blatantly nonprofit here. And it makes me wonder about these AI companies where, you know, there is this for-profit aspect and angle. And I wonder, at the end of the day, what kind of situation, what kind of setup of a company might make the most sense?

And it might be something more like Signal, depending on how the public takes it. And it might, because I think people are becoming more and more cynical in general when it comes to technology, to be honest, and what people or companies are doing with that technology.

Yeah, exactly. And one question I've also seen raised is, you know, how will Anthropic make money? Which is also a question that was there for OpenAI. And eventually it seemed like OpenAI decided to start making money by commercializing. And likewise, you know, raising $124 million, that's a lot of money.

And right now it just seems to be a corporation to do research, which is not a great way to make money. So it'll be interesting to see what happens: if it'll be acquired, if they do somehow go the nonprofit route and it'll be more about charity. I don't know, but...

All I know is it's a great team and they seem to have a good goal. And at least for now, I am pretty excited to see where they go and what they do starting out. Yep, likewise.

And so on to our next article, which takes a pretty different turn on things. It's a blog post titled, Please Commit More Blatant Academic Fraud. It's a little bit tongue in cheek right there. And as you can probably guess, it's about...

academic fraud and, you know, how we kind of abuse or try to game the system not necessarily for science, but instead for, I guess, climbing the academic ladder. And sadly, I pretty much agree with everything in that blog post. And I think it's long overdue for someone to kind of holistically, I mean, write this all out, you know.

Yeah. And of course, maybe just to go through a few things: one big thing is cherry-picking examples where your model looks good, or even cherry-picking whole data sets for you to test on so you can confirm that your model has some kind of advantage, which is one thing they bring up. And I just want to say that this was like a huge motivating factor for me to do research on what I had done research on, because I was so obsessed

with how much cherry-picking there was. I was like, this cannot be true. So that definitely is a huge problem in our space that I can confirm. Another one is making up new problem settings, new data sets, new objectives in order to claim victory on an empty playing field.

I will say this is something companies do as well. This one, I don't think, necessarily feels as fraudulent, but of course it's kind of, you know, like, oh, well, you find novelty in some direction. Then another point that I would agree with, and that is, I guess, a little bit more sketchy, is proclaiming that your work is a quote "promising first step"

in your introduction, despite being fully aware that nobody will ever build on it. That is so sad, so sad to me, but I could totally see that that is totally true. And yeah, basically I feel like I've seen that and then tried to contact the authors, you know, either for code or for questions about their method, and they just are really, really kind of dicey and flaky about getting back to me.

Any that ring true for you, Andrey? Yeah, yeah, I mean, definitely these examples that he cites are, you know, something that we're all aware to some extent exists: you know, not tuning your baselines well, kind of

not being overly careful in your evaluation, submitting a paper even if there are some issues with it, just so it can be published. So yeah, some of the things he highlights are definitely quite common. Although I will say that later on in his blog post, there are some things that I found less agreeable.

He says that because everybody is complicit in the subtle fraud, nobody is willing to acknowledge its existence. And the sad result is that, as a community, we have developed a collective blind spot around a depressing reality: that even at top conferences, the median published paper contains no truth or insight. Which is pretty...

Pretty intense. I would say, first of all, people are willing to acknowledge that these issues exist: that there are ways to game the system, that there are issues with reproducibility and with evaluation. But I don't agree that, you know, the median paper, so like, you know, at least half, maybe more, of papers

published at top conferences, are basically worthless. I do think that we should give some credit to researchers wanting to do something of value and not just publish whatever they can. So I will say that I think it's a little too pessimistic, a little too cynical.

And of course, also, the title says, please commit more blatant academic fraud, meaning the later conclusion is, you know, given this issue of subtle fraud, what we should do is commit even more explicit, you know, blatant fraud, so that we can actually address this issue and

not allow it and really remove it, which I think is a satirical kind of thing that obviously isn't a serious suggestion. But at the same time, that does beg the question, well, what is a serious suggestion? How can you root out these subtle, small ways of gaming the system? And to me, it seems like there's always going to be ways to game the system, and that's always going to be common

So, yeah, I think even though it's not ideal, you know, we don't live in an ideal world. And I think it's a little unfair to be so critical while not acknowledging that maybe it's a little inevitable. Although, of course, things could be better.

Yeah. I mean, for what it's worth, this is written by a PhD student. So I would say, in my darkest moments of the PhD, this is how I feel. So I don't blame him for being very, very cynical, and

I do think there are problems. I don't completely follow the collusion ring thing unless it's kind of the implicit collusion rings that do form due to everyone implicitly agreeing and partaking in this kind of fraud.

If that's what that means, which he does mention and cite another piece of work on. Yeah, it is very sad. And this is like a discussion that I've had with a lot of different researchers, so it is something that we discuss pretty out in the open. And it also has been interesting to discuss this with

a non-technical person, someone outside of the AI community for sure, who heard, you know, myself and someone else talking about some of the issues that we have with publications and everything. And he was really, really enlightened; he just didn't realize, he thought, you know, the AI community is perfect, blah, blah, blah. Yeah.

Not companies, but academia. And I was like, oh, this is far from perfect, sir, and here are some things. And I'm glad I can share this article with him. But I think there is much to be improved, especially since, at an extreme, this isn't just the AI community; a lot of other

scientific communities also suffer from some of these things. I feel like this is an academia kind of problem, a research-incentives type of problem. Yeah, I think it's true that when you first enter research as an undergrad or master's or even PhD student, you have this idealistic vision of research and pursuing the truth, and you don't really know how the sausage gets made.

And when you're in there for a while, you start to understand that it's pretty messy and some aspects of it are not very pretty. There's politics, reviewing is a total mess, there's a lot of randomness as to what gets attention and what doesn't. There's a lot of things that could be improved.

And so it is good that this blog post, you know, with a very eye-catching title, brought attention once again to these issues because at some point you might get numb to it and, you know, just accept that this is the way things are. So yeah, hopefully, you know, we'll keep working on it. I think it's hard to address these subtle issues, but it's also something that

I will say I think most researchers want to do, and it's sort of like a community-wide system problem. And so it's possible, but it'll take work. And although this doesn't suggest how to do it, at least it kind of reminds us that the work does need to be done somehow. Yes, somehow. Anyways, on to our next story.

Moving on, this one is more around applications. It's titled "AI Could Soon Write Code Based on Ordinary Language," and this is from Wired. All right. So again, looping back to OpenAI and Microsoft, they just recently shared their plans to bring GPT-3,

the large language model, to not just output natural language, but to output code, you know, based on your natural language description. So if you wanted to, you could just describe something that you want, and GPT-3 could write the code for you. And it could be, you know, it could build a website, for example, or it could,

I don't know, write something a little more powerful than that. And that is one of their big visions that they're working towards. And so that's super exciting, I think. And I think there were a few examples of this when GPT-3 first launched: some people were able to do that and create their own kind of HTML graphs using GPT-3 just out of the box. And so that was really, really cool. And now they're putting a more concerted effort

towards making that happen, and not just making, you know, nice, pretty diagrams and graphs and pulling together, you know, U.S. presidents and their birthdays and their length of stay in office and all this other jazz, but even more complex stuff, and sinking research resources into that. And so I'm pretty excited for this direction. And I know some folks working on that now.

Yeah, there's been a good deal of research on this, basically taking English and converting it to programming. And so it's interesting to see it being commercialized. And here, I guess the idea is to translate natural language into its Power Fx language, which is a simple programming language similar to Excel commands.

And given, I think, yeah, given how many people use the Microsoft Office suite, how many people use Excel, this is a very natural thing to work on and a way to empower people. So, yeah, it'll be interesting to see what else Microsoft does with GPT-3. But this is a nice kind of first

real commercialized large-scale application that really is not even possible without AI and without something as advanced as GPT-3 maybe.
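To give a rough sense of how natural-language-to-formula translation with a large language model can work, here is a minimal Python sketch. This is not Microsoft's actual Power Fx integration; the prompt wording, engine name, and example output are assumptions for illustration only, using the OpenAI Completions API of that era.

```python
# Hedged sketch: asking a GPT-3-style completion model to turn a plain-English
# request into an Excel-style formula. Not Microsoft's actual Power Fx
# integration; the prompt format, engine name, and output are illustrative.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "Translate the request into a single Excel-style formula.\n"
    "Request: total sales for the 'West' region\n"
    "Formula:"
)

response = openai.Completion.create(
    engine="davinci",   # assumed engine name for illustration
    prompt=prompt,
    max_tokens=32,
    temperature=0,
    stop="\n",
)

# Might print something like: =SUMIF(Region, "West", Sales)
print(response.choices[0].text.strip())
```

In Microsoft's case the target language is Power Fx rather than Excel formulas, but the underlying idea of prompting a large language model with a plain-English request and reading back a formula is the same.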

Yep, so that's pretty much all there is to say. Pretty cool to see that being done. And then we can move on to our next application article, which is on modeling COVID-19. And so there's this article that is called, "All together now, the most trustworthy COVID-19 model is an ensemble."

So this article is basically kind of reiterating or reviewing what has happened with COVID-19 forecasting, which is, of course, a major effort to see, you know, kind of what

we can expect as far as infections and deaths with COVID-19, week to week, month to month, over the past year. And this is covering how there was the COVID-19 Forecast Hub, which aggregated and evaluated weekly results from many models and then generated an ensemble model, which basically combines multiple predictions to make a single prediction that is most reliable.

Because making a model to predict anything is hard, and something as complex as the spread of an infectious disease is very hard. So they didn't all match up;

really, each model had its own kind of predictions that may have varied quite a bit. So this ensemble technique, which is not new at all, but which is very important in this context, proved to be essential. So yeah, pretty neat. Do you have any thoughts on this one, Sharon?

Yeah. Well, one thing that's not surprising is that an ensemble model does do better than an individual model. So the ensemble basically puts together, you know, all the different models, kind of averages their results, and it makes it so that if something wacky happens with one of them, it doesn't matter; the rest can take care of it, type of thing. And so this is something that we kind of know, so it's not super surprising, but it is exciting to see, you know, people

aggregating these models and still working towards COVID forecasting. I know that ultimately a lot of these models, especially those using deep learning, were not necessarily used for actual diagnosis, but it is nice to kind of go full circle and have this evaluation and publish results on this.

Yeah, yeah, exactly. This article is pretty nice, maybe, for people not aware of this idea of ensembling, and it's also a nice kind of overview of the effort; probably just working to get all these people to collaborate and kind of combine their predictions is more so what's impressive here. So yeah.
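To make the averaging idea concrete, here is a minimal Python sketch of a simple ensemble. The model names and numbers below are invented for illustration, not taken from the COVID-19 Forecast Hub, whose actual ensemble is more sophisticated than a plain mean or median.

```python
# A minimal, illustrative sketch of ensembling: combine several models'
# forecasts so that no single erratic model dominates. Model names and
# numbers are invented, not from the COVID-19 Forecast Hub.
import numpy as np

# Hypothetical weekly death forecasts from three individual models.
forecasts = {
    "model_a": [710, 650, 600],
    "model_b": [820, 790, 760],
    "model_c": [500, 900, 640],  # this one has a "wacky" outlier in week 2
}

stacked = np.array(list(forecasts.values()))  # shape: (n_models, n_weeks)

# The per-week median is robust to a single outlier model;
# the mean is the simplest alternative.
ensemble_median = np.median(stacked, axis=0)
ensemble_mean = stacked.mean(axis=0)

print("median ensemble:", ensemble_median)  # [710. 790. 640.]
print("mean ensemble:  ", ensemble_mean)
```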

Yeah, not too surprising, but also pretty cool. And then also, looking at this article, apparently earlier this spring a paper studying COVID forecasting appeared on the medRxiv preprint server with an authors list running 256 names long. So you can see how much of a collaboration there really was.

That is a lot of authors. I will say, even for machine learning, AI, even for physics, everything, that's a lot of authors. But that is an ensemble of authors. Exactly. I'm looking at it and the title of the paper is Evaluation of Individual and Ensemble Probabilistic Forecasts of COVID-19 Mortality in the U.S.,

And then after the title, which is at the very top of the first page, it's just like the entire first page is names, just the whole thing. And then in the list of affiliations, there are 69 different organizations. So yeah, I think in that sense, even though maybe the results aren't surprising, the fact that there is such a robust study is pretty cool. Right.

All right. And on to our next article that is more AI embedded in society. It's titled AI can write disinformation now and dupe human readers. And this is an article from Wired. All right. So as you know, AI can write disinformation now. We've been talking about this for a while in this podcast. But basically,

there actually has been some research done over the past six months that looks at using GPT-3 to generate misinformation, and this is by a group at Georgetown University. And so they've been getting GPT-3 to write stories around, you know, kind of false narratives,

different news articles that have been altered by GPT-3 to push some kind of bogus perspective, and also tweets kind of riffing on particular points of disinformation. And the group says that they could show that GPT-3 was especially effective for automatically generating these short messages on social media and

being able to fool humans reading these short little messages on social media. Yeah, it's interesting to see an actual study on this because it has been sort of hypothetical for a little while. I suppose maybe it's not too surprising that AI can generate tweets, because how coherent do you really need to be to make a tweet?

But yeah, I think it's nice to see actual research on this front. Now, we should note that this whole thing has been discussed on and off for a while. And one thing that's often noted is that we really shouldn't freak out too much because...

These countries that use misinformation could pretty easily just hire a bunch of people to be pretty convincing as is. And so these AI models aren't necessarily something to worry too much about, at least for now, when they can't really write articles; they primarily are doing fairly simple work that may not need to be automated.

But onto another story that may be something we should be worried about, I think. We have this story from Business Insider titled "A rogue killer drone hunted down a human target without being instructed to," a UN report says. And the summary here is that a lethal weaponized drone

called Kargu-2, which is a quadcopter, supposedly autonomously attacked a human during a conflict between Libyan government forces and a breakaway military faction. And so this was...

A drone that was directed to detonate on impact and apparently was operating in highly effective autonomous mode without a human operator. And yeah, this is pretty weird because so far...

Really, a lot of people have been concerned, but there haven't been any reported cases of robots being used in the military without human operation. And this may be the very first time that's happened. So kind of maybe something to be worried about more so than GPT-3. I don't know. What do you think, Sharon?

Yeah, that's not great. You know, AI warfare, drone stuff, it's coming, and maybe coming is the wrong word; it's here. Yeah. I mean, it's definitely something to worry about right now, to think about. And I think, you know, it's no longer something necessarily to prevent, but to manage. So as we see a lot of conflict around the world, it's going to be more and more important.

Yeah, exactly. And one thing to note is that in these discussions so far, there's really no regulation around this stuff, around kind of autonomous weaponry where there's no human control and no human decision-making. And the reason people are concerned is that this might enable, you know, scaling up of warfare, might enable kind of more deniability in terms of

casualties that weren't intended. So there's a lot of things to be concerned about. And the fact that this happened, that there was actually an autonomous robot that exploded, means that this is coming. And we knew this was coming, but now there's an actual example, and it could become common very quickly. So definitely something to watch out for, and

maybe be more concerned about than some of these other things we discussed. Right. And on to our last article, which is a bit funny, or rather cringey. It's titled "Terrible News, Everyone! AI Is Learning How to Post Cringe."

And this is an article actually about Replica and what they put out recently. So Replica is an AI startup that we've mentioned on the podcast before that kind of does interesting things with synthetic speech, as in you'd be able to have this kind of avatar that you can interact with. And they recently had this hackathon where

one of the employees worked on capturing a live video of himself rapping and then transferring all that into kind of AI voices and 3D animation.

And it was obviously a little bit uncanny valley and weird, and the lip syncing was slightly off. And they mentioned they can definitely do better than that with AI, which is true. But it is a little bit cringey to watch, since it's kind of a rap about AI by these kind of AI characters that the engineers of Replica put together,

and basically, you know, it was based on their own rapping. Yeah, it's pretty funny. I guess I'm a little curious as to why Replica put this out in the first place. Really? Very true. Yeah, maybe they intended a little bit of cringe-based marketing. But yeah, it's all in good fun, I think. And in fact...

Maybe when Replica emailed this to The Verge, or maybe when they replied, they had this disclaimer. They actually wrote a disclaimer that the video is deep in the heart of the uncanny valley, that it was done for fun and using a new feature that's under development.

So yeah, I think they pretty much intended for this to be a little distracting thing that might get them a bit of attention. And yeah, I don't know. It's fun. But obviously, this wasn't necessarily the highest effort attempt at doing something like this. Yes, it was very, very interesting. I would say I encourage you to go check out maybe part of it, not the whole thing.

Exactly. But sometimes it's nice to have something silly and purely ridiculous being done with AI. In fact, AI is pretty good at really being silly, and maybe we should have more of that to complement all this OpenAI and GPT-3 stuff. True. Very, very true.

And with that, that's it for us on this episode. But be sure to stick around for a few more minutes to get a quick summary of some other cool news stories from our very own newscaster, Daniel Bashir. First off, a few recent advancements in research. According to New Scientist, an AI system developed at Tel Aviv University has disproved five mathematical conjectures despite not having any information about the problems.

As covered on Synced Review, DeepMind has presented something called neural algorithmic reasoning, a fusion of neural networks and algorithmic computation. The system can go from raw inputs to general outputs while internally emulating an algorithm.

Also covered in Synced Review, a research team from the University of Montreal and the Max Planck Institute for Intelligent Systems that includes Yoshua Bengio has developed a new reinforcement learning agent whose knowledge and reward function can be reused across tasks. By using a modular architecture and adopting a meta-learning approach, the agent can adapt to changes in distribution or new tasks.

Next up, we'll look at the business and application side. On May 25th, according to Electrek, Tesla announced a transition of its Autopilot and Full Self-Driving technology to solely using computer vision based on cameras, without relying on its front-facing radar.

Second, as The Verge reports, Chinese autonomous vehicle startup Pony.ai has received a permit from the California DMV to test its driverless cars without human safety drivers in three cities. It becomes the eighth company in California to receive a driverless testing permit, after three other Chinese companies and four US-based companies.

And finally, a few stories on the AI and society side. As VentureBeat reports, global analytics firm FICO and Corinium surveyed 100 C-level analytics and data executives to understand how organizations are deploying AI and ensuring ethical use.

Despite increasing demand for and use of AI tools, the survey found that 65% of companies can't explain how AI model decisions or predictions are made. FICO's State of Responsible AI report also shows that business leaders are putting little effort into ensuring that the AI systems they use are fair and safe for public use.

And finally, as covered in The Conversation, the company behind the Vibraimage AI system claims the system can determine how someone is feeling, their personality, and their future behavior from head vibrations, despite a lack of reliable evidence for the system's effectiveness.

And with that, thank you so much for listening to this week's episode of Skynet Today's Let's Talk AI podcast. You can find the articles we discussed here today and subscribe to our weekly newsletter with similar ones at skynettoday.com. Subscribe to us wherever you get your podcasts. And don't forget to leave us a rating and a review if you like our show. Be sure to tune in next week. All right.