OpenAI’s InstructGPT, Meta’s New AI Supercomputer, China Regulates DeepFakes, AI Plays Tetris

2022/2/6

Last Week in AI

Topics
Andrey Kurenkov and Sharon Zhou discuss Meta's plan to build the world's fastest AI supercomputer, which will support its research in areas such as neural networks and the metaverse. They also explore a project that uses machine learning to improve mental health, collecting biometric data through wearable devices and using machine learning to predict symptoms. In addition, they discuss OpenAI's InstructGPT, an improved text-generation model designed to generate less harmful content and follow instructions better. They also analyze Meta's AI algorithm that can learn equally well from visual, written, or spoken material. Finally, they discuss China's proposal to increase regulation of deepfakes and other AI synthesis systems, as well as the IRS's use of facial recognition technology for identity verification. Throughout, they analyze the technical details and social implications of each story: the scale and computing power of Meta's supercomputer and its potential impact on AI research; the challenges and opportunities of using machine learning for mental health, including data collection, model training, and privacy; InstructGPT's strengths and limitations in reducing harmful content and following instructions, and how it compares with other large language models; the advantages of multimodal learning algorithms in handling different types of data and their potential in future AI applications; and the differing approaches of China and the US to AI regulation, and what those approaches mean for AI technology and society.

Chapters
Meta, formerly Facebook, announces plans to build a supercomputer by mid-2022, aiming to be among the fastest in the world, with 16,000 NVIDIA A100 GPUs and designed specifically for AI research.

Transcript

Hello and welcome to Skynet Today's Last Week in AI podcast, where you can hear AI researchers chat about what's going on with AI. As usual, in this episode we'll provide summaries and discussion of some of last week's most interesting AI news. You can also check out our Last Week in AI newsletter at lastweekin.ai for articles we did not cover in this episode.

Before we start, by the way, if you have any feedback or thoughts about this episode, feel free to email us at contact@lastweekin.ai. We would appreciate hearing from you. And with that out of the way, I'm one of your co-hosts, Andrey Kurenkov.

And I am your other co-host, Dr. Sharon Zhou. And this week, we will discuss applications around how Meta is building the world's fastest AI supercomputer. We'll talk about research that OpenAI is rolling out for their new text-generating models, which are less toxic. On the ethics side of things, we will be talking about increased regulation of deepfakes and other AI synthesis systems.

And finally, on a fun note, we'll be seeing the best game of Tetris you've ever seen by an AI. And with that, let's dive straight into our applications and business section. Our first article is titled Meta aims to build the world's fastest AI supercomputer.

So Meta, which is, again, the parent company of Facebook, just released a statement saying they're going to build a research supercomputer, the AI Research SuperCluster (RSC), that's going to be among the fastest in the world.

And it'll be done by mid-2022. And, you know, it's huge, and they want it to work with neural networks with trillions of parameters, which we are just starting to see in research papers. What do you think of this, Andrey? And also, what do you think of some of the details? Maybe you find those quite interesting, too.

Yeah, yeah. I mean, this is a fun story for sure. Just looking at the details here of, you know, they're getting 16,000 NVIDIA A100 GPUs, which is a lot of GPUs, you know.

Yeah, that is a lot of computation. They're planning to process exabytes of data. I can't even... I don't know what an exabyte is really. Is that like after terabytes? A thousand terabytes? I don't know.
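
For reference, an exabyte is a million terabytes: the byte prefixes run tera, peta, exa, each a factor of a thousand. A quick sanity check in Python:

```python
# SI byte prefixes: each step up is a factor of 1,000.
units = {"TB": 10**12, "PB": 10**15, "EB": 10**18}

print(units["EB"] / units["TB"])  # 1000000.0 -- an exabyte is a million terabytes
print(units["EB"] / units["PB"])  # 1000.0    -- or a thousand petabytes
```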

And yeah, they're saying they'll have like 20x speedups on computer vision and 3x speedups on natural language. So it's a fun thing. And it's not very surprising, I suppose, that Meta is doing this. I think there's a fun detail I know from Google, like a buddy of mine worked there. He said...

Without asking anyone, if you're just working there, you can get 100 GPUs to use as you want. Just for anybody. Well, I don't know if anybody, but anyone in data science or whatever. So, you know, these big companies have a lot of compute infrastructure and it makes a lot of sense with stuff like GPT-3 that they'd invest more in that.

Yeah. To that note, actually, it's not like Facebook will have the biggest supercomputer cluster, but it will be among the fastest with A100s like that. So it's exciting to see that. And it also points to what they're focusing their stuff on. I think it makes sense: they have, you know, technically edge technology with the VR headsets. I don't know if that's quite edge, but yeah, they're going to be pushing towards that and creating the metaverse.

Yeah, it's neat also to note that this is an AI research cluster in the sense that this is not just a general supercomputer. It's designed for AI. So how storage is attached to the GPUs, how they communicate, all of that.

is pretty custom for this. So AI, kind of a big deal for these companies, turns out. I wonder how much this will cost. That would have been a fun number to know. But next up, as far as applications, we have deploying machine learning to improve mental health.

So this article has a sort of overview of this project by Rosalind Picard at MIT and her collaborator Paola Pedrelli, who is an assistant professor in psychology at Harvard Medical School. And these two people have been collaborating for five years. So Pedrelli is this

professor of psychology, and Picard is an MIT professor of media arts and sciences who works on affective computing. And so we've had this five-year study, which started in 2016, so it's already long into it, where they recruited patients with major depressive disorder. And then they had them wear these Empatica E4 wristbands.

And so they collected a lot of data; the wristbands pick up electrodermal activity and other biometric data. The participants can also download apps that collect these logs, and then they train machine learning models to predict these sorts of symptoms.
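
To give a rough sense of what that kind of pipeline could look like, here is a minimal sketch, assuming simulated data; the feature names, labels, and model choice are all hypothetical, not the study's actual code:

```python
# Hypothetical sketch: predicting a symptom-severity score from daily
# wearable-biometric summaries. Feature names and labels are made up.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_days = 500

# Simulated per-day features: electrodermal activity, skin temperature,
# heart rate, sleep hours, phone-log activity count.
X = rng.normal(size=(n_days, 5))
# Simulated clinician-rated symptom score for each day.
y = X @ np.array([0.8, -0.3, 0.5, -0.9, 0.2]) + rng.normal(scale=0.5, size=n_days)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("R^2 on held-out days:", model.score(X_test, y_test))
```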

So I think this is really cool, actually. I think it's often kind of surprising that we don't have better technology to track our well-being with any sort of quantitative measure. So I'd be excited to see if this works. What do you think, Sharon?

Yeah, I'd also be excited to see if this works. And these are, I think, quite the power pair of institutions, MIT and MGH, Massachusetts General Hospital. So, yeah.

I mean, I'm generally excited to see things in healthcare. I think this is a really interesting, you know, take around this kind of continuous monitoring and thinking about affective computing and seeing how that is, you know, emanating and being deployed in real settings here. So yeah, I'm excited to see where this goes. Definitely, yeah. And I guess this is,

almost over, we're on the fourth year of this five-year study, so soon enough hopefully we'll find out. I know I'm a big fan of my Fitbit, tracking my steps and sleep and exercise, so if we got better Fitbits that can tell me my mood, I would be a fan of that.

Yeah. And I mean, this is connected to the kind of stuff Apple has been doing straight from the iPhone. Right. So we're starting to see more and more wearables, and non-wearables, I guess, even just our smartphones, being able to do some of this interesting continuous monitoring of our health. So I think this makes sense to be doing this study, and it is part of this larger trend. For sure. Yeah.

And onto our research and advancement section. Our first article is titled OpenAI Rolls Out New Text Generating Models That It Claims Are Less Toxic.

So we know that GPT-3 has some problems around spitting out toxic content over time. And OpenAI decided to really tackle that by introducing a new variant of it called InstructGPT. And what InstructGPT does is it pretty much tries to...

go from instructions, things you tell the model imperatively to do, and then figure out, you know, the underlying intent of the prompt, to get it to generate whatever desired thing you wanted from the instruction.

So as a result, they found that this produced fewer untrue statements, also known as hallucinations, from the AI, and it was able to follow instructions much better. And I'll say anecdotally, I think people have said that it followed instructions with higher fidelity, but there seemed to be a loss of diversity. And of course, diversity is how you

it's how you view it, I guess, in terms of, like, maybe sometimes that diversity could be really bad because it's spitting out toxic content. But of course, sometimes that means giving freer rein and more creativity to what it is saying. Yeah, yeah. So this is, I think, pretty exciting. I don't know if you would agree, Sharon. Yeah.

As we've said, we've discussed the problems that GPT-3 has in terms of having sexist and racist and other kind of problematic things that it spits out. And that's because it was just trained on all of the internet to do autocomplete. So it wasn't ever trained to do question answering or code completion. It was just trained to predict what comes next.

And so that led to these bad things, where just the internet has bad things in it. And here the key idea is fairly straightforward, but still sort of interesting, and it worked: you do that pretraining, and then you have humans provide these prompts asking it to do stuff, and then other humans can rate different continuations,

you know, different things that GPT-3 spits out. So it's now being trained specifically to align with sort of human feedback, and being trained from human feedback. And the reason GPT-3 wasn't trained that way from the start is that it's not scalable, right? You need to collect a lot of human feedback in order to train anything. But in this approach, where they're fine-tuning it, it seems that it

actually was feasible, and at least on a benchmark, it still works well. I do wonder if some of this loss of diversity is a result of this extra supervised learning on a smaller data set. But I think it's definitely nice to see OpenAI releasing this method and results. And, you know, I'll be curious to see where they go next with this.
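
To make the human-feedback step concrete, here's a minimal toy sketch (numpy only, not OpenAI's actual code): continuations are reduced to feature vectors, humans pick which of two continuations they prefer, and a linear reward model is fit so that preferred continuations score higher, using a Bradley-Terry-style pairwise loss. In the full InstructGPT recipe, the language model is then fine-tuned to generate text that this learned reward model scores highly.

```python
# Toy sketch of preference learning behind InstructGPT-style training:
# fit a linear "reward model" so that the continuation humans preferred
# scores higher than the one they rejected (Bradley-Terry pairwise loss).
import numpy as np

rng = np.random.default_rng(0)
dim, n_pairs = 8, 1000           # pretend each continuation is an 8-d feature vector

w_true = rng.normal(size=dim)                  # hidden "human preference"
a = rng.normal(size=(n_pairs, dim))            # continuation A features
b = rng.normal(size=(n_pairs, dim))            # continuation B features
human_prefers_a = (a @ w_true) > (b @ w_true)  # simulated human labels

w = np.zeros(dim)                              # reward model parameters
lr = 0.1
for _ in range(200):
    # P(A preferred) = sigmoid(r(A) - r(B)); standard logistic-loss gradient
    margin = (a - b) @ w
    p = 1.0 / (1.0 + np.exp(-margin))
    grad = (a - b).T @ (p - human_prefers_a) / n_pairs
    w -= lr * grad

acc = np.mean((((a - b) @ w) > 0) == human_prefers_a)
print(f"reward model agrees with human rankings {acc:.0%} of the time")
```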

I think it's also great to see how a model like GPT-3, which was so generally trained, can be easily modified into something like InstructGPT, because there are other models like T5, a

language model where you do kind of give it a task and it goes and does it. And here, GPT-3 was, you know, so general, it wasn't really given a task except for next-token prediction, so it wasn't explicitly told to do anything multi-task. And here we're kind of letting it align with how humans think about tasks and what we want the language model to do. And that

can be adapted from the base GPT-3. And I think that's exciting, because otherwise you don't want to be retraining models, or training a completely different model from scratch, to help with that alignment. Yeah, exactly. One of the sort of big issues with AI is often we can't predict what the models will do. And sometimes they might do something we would like them not to do.

And yeah, I've often wondered whether we'll wind up with a sort of human-in-the-loop kind of situation where we can actually modify things on the fly to make sure that it's better. And this is at least hinting that that may be possible, this sort of, like, just human evaluation of outputs being enough to get from GPT-3 to InstructGPT.

And on to our next story, we have Meta researchers building an AI that learns equally well from visual, written, or spoken materials. They didn't actually build an AI, they developed an AI algorithm, but okay. So as the title implies, this is all about an algorithm that, unlike what is usually the case, can work with multiple modalities in roughly the same way.

So usually when we do machine learning, deep learning, the architecture is specialized to whatever data format you have. So audio, photos, text, these are all very different forms of data in terms of how big they are, the encoding, et cetera. And so the neural nets typically are different and the training regimes are different.

And that's been changing in recent years with Transformers. And now this paper, the claim is that you have one algorithm that trains on these different modalities equally well, using roughly the same architecture as well. And that's what they call data2vec, which is...

similar to GPT-3 and these other technologies, actually, in that it's trained via this self-supervised paradigm. And they show that you can do kind of the same form of self-supervision, as opposed to the usually very different forms for different modalities, for these different things like vision and speech and text.

So I don't think this is very surprising. Their approach itself is very intuitive, but as part of a strand towards unifying things, I kind of am a big fan. What do you think, Sharon?

Yeah, this is actually quite a generic approach, but it completely makes sense in terms of the direction of things: to learn embeddings that are modality-agnostic, and using masking in a student-teacher model for the student to learn from the teacher.
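
As a rough illustration of that student-teacher masking idea (a toy linear version, not Meta's implementation): the teacher is an exponential-moving-average copy of the student, the student sees a masked input, and it is trained to regress the teacher's representation of the full input.

```python
# Toy sketch of a data2vec-style student-teacher setup (not Meta's code):
# the teacher is an EMA copy of the student; the student sees a masked
# input and regresses the teacher's representation of the full input.
import numpy as np

rng = np.random.default_rng(0)
dim, hidden, tau, lr = 16, 8, 0.99, 0.05

student = rng.normal(scale=0.1, size=(dim, hidden))  # linear "encoder"
teacher = student.copy()                              # EMA copy

for step in range(2000):
    x = rng.normal(size=dim)            # one input (any modality, in spirit)
    mask = rng.random(dim) < 0.5        # hide half the input dimensions
    x_masked = np.where(mask, 0.0, x)

    target = x @ teacher                # teacher sees the full input
    pred = x_masked @ student           # student sees the masked input

    # L2 regression of student prediction onto teacher target
    student -= lr * np.outer(x_masked, pred - target)

    # teacher slowly follows the student (exponential moving average)
    teacher = tau * teacher + (1 - tau) * student

print("final prediction error:", float(np.mean((pred - target) ** 2)))
# Note: the real method adds tricks (normalized targets, averaging over
# layers) to avoid the representation collapse this toy version permits.
```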

So it completely makes sense. And also in light of Meta's other research, you know, kind of the lip-syncing research that we touched on a week or two ago, it makes sense that they're looking into these multimodal-type models. And I think they very much make sense for the products that Meta puts out.

Definitely. Yeah. And it's also, I suppose, part of a strand we've also discussed in general, where over the past year, and I think really at the present, there's more interest in tackling multimodality in machine learning and deep learning, you know, as we've seen with CLIP. And since then, there's been a lot of work. And that is definitely, like, at the frontier of what is actually challenging and what AI cannot do.

So I'm excited that these big companies like OpenAI and Meta are very much focusing on it. And on to our society and ethics section. The first article is titled China Proposes Increased Regulation of Deepfakes and Other AI Synthesis Systems.

So the CAC, the Cyberspace Administration of China, proposed new regulations to kind of govern deepfakes, or, more broadly, AI-aided synthesis systems. And that also includes VR, that includes text generation, that includes audio, and all sorts of subsectors of AI media synthesis. And so...

China does actually produce a lot of academic research in this space, so it is quite relevant. And it also is a dangerous area. And I'm very surprised, actually, maybe I shouldn't be surprised, that the CAC, the Cyberspace Administration of China,

did invite citizens to participate in contributing comments on their proposals. So I find that, you know, quite interesting and un-China-like, but it's good. I think it's good that they did that.

Yeah, this is quite interesting. This proposal is quite broad and has a lot in it; the scope touches on six sectors. There are techniques for generating text content, technologies for editing voice content, for editing music, face generation,

editing images and video content, and even creating virtual scenes. So basically any synthesis of media you can do with AI, this touches on, which is fairly far-reaching.

And yeah, I think it's cool or it's interesting to see this because we've talked a lot about sort of regulating deepfakes on and off over the years. And a lot of people generally have thought about it. But this goes well beyond that in a fairly reasonable way. And yeah, kind of maybe is interesting in that it shows that China in some ways is ahead of the US as far as kind of

predicting the impacts of these technologies and regulating what the private sector could do. Right, exactly. And specifically, the draft proposal stated that they would obligate deep synthesis service providers to register their applications with the state.

So they would have to, you know, specify and comply with all the necessary filing procedures. And so that's quite interesting, to have to register if you're going to be doing any of this technology. That seems to make sense, in a way.

Yeah. I mean, I don't know if something like this would fly in the US. But, you know, given deepfakes, there are draft bills in Congress for it. And there have been some proposals for kind of doing

stamps on data origins. So something like this may happen. I mean, we'll definitely need some sort of legal kind of framework for synthetic content. We've discussed how you can clone the voices of voice actors, or their faces for ads. So these things will need to be addressed under this broad idea of media synthesis.

And yeah, I guess it's interesting to see China, like the Cyberspace Administration of China, already having regulations. And it seems like maybe their tech departments are a bit more advanced or up to date than what we have in the US. Maybe not, I don't know.

And on to our next society and ethics story, we have: IRS Will Require Facial Recognition Scans to Access Your Taxes, from Gizmodo. So according to an IRS spokesperson, users with an IRS.gov account will...

not be able to log in with just a username and password, but will instead need to provide a government identification document, a selfie, and copies of their bills to this verification company, ID.me. So yeah, this is interesting, because you now can't just do username and password. Apparently you need to show your face to really verify it.

And there is an update in this article where it clarifies that most things you would need to do, like pay or file your taxes, you don't need to submit a selfie for. So this is a little bit of a clickbait article, not quite right. But still, it's interesting to see the IRS incorporating this ID.me technology.

It's funny because this clickbait title really made the rounds, I think, and people really reacted heavily to it. So yes, it was successful clickbait, but in delivering potentially inaccurate news. To be fair, the clickbait was because of a misleading quote they got from this IRS spokesperson. And it was later clarified that...

Yeah, that you don't need the selfie for most actions, but to log in to this website, to their account, I believe they still do need to use a selfie. So they can file their taxes online without the account, but to log into the account itself, this is still the case.

That makes sense, yeah. But they didn't change the headline. Of course, not every headline can be the most nuanced headline in the world. So, okay. Okay, okay. I'm just saying, I have a high bar. Yeah, yeah, that's a good point. They did add an editor's note just before it, but they didn't change the headline itself, which still says you'll need a selfie to access your taxes. It's kind of weird. Yeah.

Yeah, that's probably the case. But yeah, I guess more broadly, that aside, I feel like I could see this becoming way more typical for logging in in general. You know, for phones we have our fingerprint scans, and we have facial scans already, right, for iPhones. And I wonder if, like, websites will start just doing this, where you won't need a password. Hopefully not, because that's not as secure, I'm pretty sure, but...

I guess who knows. It'll be a cat-and-mouse game, because it will be secure until, you know, people hold up a picture of someone else, and then you have to do, like, a video, and then people can hold up something else. So I think it'll be an interesting game for sure. Yeah. But yeah, FYI, now that the IRS did it, maybe...

Reddit will do it next or Twitter or whoever. Nah, the IRS is always decades ahead of all the others. Yeah, that's true. They are the first adopters. But let's move on from the IRS and talk about our fun or neat or both stories, starting with...

Google AI tools bring back women in science to the fore. So this is a fun tool that Google has developed for use by curators at the Smithsonian, to uncover and highlight the many roles women have played in science that may have been overlooked.

So this builds on Google Arts and Culture's previous work, where they scanned almost 3 million images from museums and made them available to the public online. And this new thing basically enables a smarter sort of search and research in the Smithsonian archives. So it

basically looks at metadata and identifies the names of women even when they haven't been explicitly pointed out, so, for instance, when there's only a husband's name, and it also looks at image records to cluster and group together similarities. So it basically kind of gives you a richer view of

kind of the places where women have contributed, even though it may not have been apparent from just reading about the work. And yeah, I don't know. This is very cool. It's nice to see Google having this arts and culture project that is doing these applications of AI that probably won't

make for a big business, but are certainly very cool and useful in the context of this museum. I think it can help them with branding a little bit. So maybe that's why. That's true. That's true. I suppose so. And I think it's someone's, like, 20% project or something like that. Yeah, this is definitely, you know, a smaller-scoped project. But, I don't know, if I were to go to the Smithsonian and see this kind of tool,

I would definitely find it pretty cool. So it's nice to see this example that Google developed, which I'm sure is very robust and could hopefully be applicable to other museums and other archives.

Right. I've seen some museums actually start to implement some of these, you know, some AI techniques into their exhibits. And I think that's fantastic. I love seeing that come to the fore. And I also feel like I see a lot of that on Twitter as well, with a lot of AI artists becoming more and more prominent. Yeah, yeah, for sure. I think it'll be interesting.

to see kind of how museums can develop and change with AI. And certainly I believe that from a data analytics approach, they could definitely learn a lot by having cameras and seeing what people are doing.

and maybe figuring out how to best present art. But yeah, changing exhibits with this sort of thing, I think would be very cool and make museums more engaging, which certainly probably is a good idea. Right, right. And onto our last article, watch an AI play the best game of Tetris you've ever seen.

So this is a little bit of a clickbaity title, but someone did train an AI model to play Tetris, specifically towards clearing four lines of Tetris at once as frequently as possible.

And that way, the AI model will do pretty daring things, like waiting for the Tetris board to fill up quite a bit so that it can clear those four lines at once. And it gets quite a few points for that; you get extra bonus points for that. Yeah. Yeah. So it's a neat thing to watch, you know, there's a GIF of it going sped up, and it's, you know, doing fantastic and clearing all these levels.

It feels nice to see this amount of success. But yeah, this is very simple as far as AIs go. It's for the most part not even trained; it's just doing search, sort of looking ahead at what's there, with a little bit of learning for what's good and not good.
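
For what it's worth, bots like this usually boil down to simulating every legal placement of the current piece and scoring the resulting board with a heuristic. Here's a minimal sketch of such an evaluation function; the weights are hypothetical, not the ones this particular AI uses:

```python
# Minimal sketch of the board-evaluation heuristic a search-based Tetris
# bot typically uses (illustrative; weights here are hypothetical).
def evaluate_board(grid):
    """grid: list of rows (top to bottom), each a list of 0/1 cells."""
    height, width = len(grid), len(grid[0])
    col_heights = [0] * width
    holes = 0
    for x in range(width):
        seen_block = False
        for y in range(height):
            if grid[y][x]:
                if not seen_block:
                    col_heights[x] = height - y  # height of this column
                    seen_block = True
            elif seen_block:
                holes += 1  # empty cell buried under a filled one
    aggregate_height = sum(col_heights)
    bumpiness = sum(abs(col_heights[i] - col_heights[i + 1])
                    for i in range(width - 1))
    complete_lines = sum(all(row) for row in grid)
    # Reward cleared lines; punish tall, holey, uneven boards.
    return (0.76 * complete_lines - 0.51 * aggregate_height
            - 0.36 * holes - 0.18 * bumpiness)

# Tiny 4x4 example board (1 = filled); the bot would pick the placement
# whose resulting board maximizes this score.
board = [[0, 0, 0, 0],
         [0, 0, 1, 0],
         [1, 1, 1, 1],
         [1, 0, 1, 1]]
print(evaluate_board(board))
```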

Actually, just for a fun note, there's a channel called CodeBullet on YouTube that does this all the time. There's a lot of videos on his channel. So for instance, one video that he made last year is called "I Created an AI to Destroy Tetris", which does exactly what it says, but better.

So yeah, I don't know. We'll link to it in the description. It's a fun video and this channel Code Bullet in general is quite fun. There's a lot of videos on developing AI for different games and it's informative and also entertaining.

definitely a fun side project if you want to get into AI. This is a good way to start, you know, building a Tetris AI, I suppose. You could definitely try that. Yeah, definitely. It's funny because they are also very good at writing clickbaity headlines. Yeah, apparently, yeah. This week we got clickbaited. We fell for it, but

No, no, no, no, we're here to dispel the clickbait. We will let you know which headlines not to fall for. But yeah, it's a fun thing to watch. This article also has an accompanying YouTube video. It's like 26 minutes of the AI playing the game and getting some ridiculously high scores. Fun little, you know, little thing.

And before we go, please send any thoughts you have directly to our email, contact@lastweekin.ai. And thank you so much for listening to this week's episode of Skynet Today's Last Week in AI podcast. You can find the articles we discussed here today, and subscribe to our weekly newsletter with similar ones, at lastweekin.ai.

And if you like this episode and the podcast, feel free to share it with your friends or on the social medias, or leave us a rating and a review on iTunes, or just write about it on your blog. You know, whatever you feel like, if you want to help us out, we would appreciate it. Help us out. Help us out. You know, if you listen this far in...

What are you doing? You know, anyway, but yes, be sure to tune in to our future episodes and stick around.