
ChatGPT introduces new feature to edit DALL-E images

2024/4/10

AI Education

Topics
Host: Today I want to talk about an exciting new feature update OpenAI recently released that lets users edit DALL-E-generated images directly in ChatGPT. It's a very powerful capability that will substantially change the user experience and creative potential, and it has rolled out simultaneously on web, iOS, and Android, so it will reach a much wider user base.

The core of the feature is that, after generating an image, users can select a specific part of it and modify that part with a simple text prompt. For example, a user can select a dog in the image and prompt the system to add a bow to it. The system precisely modifies only the selected region, without affecting the rest of the image.

I tested the feature myself and found it very powerful. I generated a picture of a pirate ship battle and tried to modify Blackbeard to make him grin. Because his beard covers his mouth, the result wasn't perfect, but the feature still showed strong potential. I also tried turning the entire pirate ship into a pink car, and the result was satisfying.

Compared with Midjourney, ChatGPT's image editing may still fall slightly short on fine detail, but it has a unique advantage other tools lack: ChatGPT can make precise modifications to a specific part of an image based on the user's instructions, without regenerating the whole image, which greatly improves editing efficiency.

At the moment, ChatGPT cannot directly edit user-uploaded images, though that is likely a feature to be added later. Overall, I find this feature very exciting; it takes image generation to a new level. It will not only help users create their ideal images more easily, but could also have a deep impact on fields like graphic design, and may even disrupt traditional editing tools such as Canva and Photoshop.

Beyond that, this technology could be applied to video editing in the future. We can imagine users selecting a specific part of a video and modifying it with a text prompt, just as with images. That would fundamentally change how video is produced and open up endless possibilities for creative content.


Chapters
OpenAI's latest update allows users to edit DALL-E images within ChatGPT across various platforms. This feature enables real-time modifications to generated images, addressing previous concerns about faked demos in other AI platforms. The update showcases integrity by demonstrating the generation process authentically.
  • ChatGPT now allows DALL-E image editing across web, iOS, and Android.
  • OpenAI's transparent demo builds user trust.
  • The feature allows for real-time image modifications based on user selections.

Transcript


OpenAI just released a new update a couple of hours ago. I haven't heard anybody talking about it yet, but I think it's absolutely fascinating: you can now edit DALL-E images in ChatGPT in a very interesting new way. I've seen this in some other programs, but this is the first time I've seen OpenAI get into it, and it's really powerful. So I want to tell you a little bit about what they're doing and why I think it's important.

The first thing I'll say is that they made this announcement on LinkedIn and on X: you can now edit DALL-E images in ChatGPT across web, iOS, and Android. That's impressive. Sometimes companies roll out an update just to the web version and it comes to mobile later; this is already out on web, and I've been playing with it and testing it. Apparently it's also out on iOS and Android, which I haven't tried, but I highly recommend checking it out if you have the app.

This is amazing, and it's going to get into so many more people's hands. When I see a big rollout like this to all platforms at once, it really says: we want all of our users using this as soon as possible. Essentially, once you generate an image, you can immediately select parts of it and edit them. In their demonstration, they created an image of a cute poodle celebrating a birthday, so it's a dog with a hat. They then go to edit it, highlight two spots on the dog's head, and say, "add bows."

A lot of people have been commenting on that demonstration, because they released a clip of the whole generation happening to social media. The video is about a minute long, and most of it is literally just sitting there watching the generation, but it does generate bows that appear on the dog's head exactly where they highlighted, which is impressive.

There are some interesting comments on that video, and all in all I think people are happy they did it this way. Someone in the comments said: "I appreciate that OpenAI chose to not speed up the video demonstrating the generation process in this preview. This shows integrity and helps set realistic expectations for the product's capabilities. An authentic preview goes a long way with potential users; trust and credibility is key in the age of AI." I actually agree with this. Google Gemini put out a demo of their platform and got absolutely roasted, because it showed this platform where you could talk to it, it could see what you were seeing, it could create images and video, all this crazy stuff. Then we found out it was essentially faked, or staged. They heavily edited the video: they gave it far longer prompts than they told us they were giving it. In the demo it looked like they could just ask, "What's this?" and it would say, "That's you playing rock, paper, scissors." In reality the prompt was more like, "I'm playing a game with my hands. It's very popular. What is it?" and then it would respond, but they cut out all that context.

Anyway, it was really sketchy, and I think it cost Google and Gemini a lot of trust. I'm sure they've learned their lesson and won't do that again. And I think OpenAI and other AI companies are learning the same lesson: when they give demos now, they're literally just letting you watch, because they know people would rather sit through a full minute of an image loading than wonder whether it's fake. So we know this is real.

So I went and tested this new feature, and I think it's really impressive. I just went to ChatGPT thinking, oh my gosh, this is available right now.

And at first I thought it wasn't, to be honest; I had to go back and watch the video again to learn how to do it. So I'll tell you how, in case you want to try this. I said, "Create a photo of a pirate ship in battle with Blackbeard and his crew," and it generated the image for me. At first I thought there was no way to edit it. What you actually have to do is click on the image itself, and it expands to full view. In the top right-hand corner there's a tool called Select: you can change the size of the paintbrush, making the selector really big, or really small if you want to get at finer details in the image.

I tried a bunch of different things. First, since I had generated Blackbeard on a pirate ship, I selected his face and told it to give him a grinning scowl. In the second version of the image, to be 100% honest, he's got a beard covering his mouth, and the details are imprecise enough that you can't really tell whether he's scowling or grinning. I'll be honest: I think Midjourney is still the best at image generation, by quite a bit. But this is an impressive feature, and there are things here I'm not seeing Midjourney do, so for that reason I think it's interesting.

I wanted to test it with something a little more obvious, so I selected the entire pirate ship, including the mast, and told it to turn the pirate ship pink and make it a car. It was actually able to do that. It kind of looks like a car is crashing into the pirate ship, which I guess is fine; it's its own rendition. But to be fair, nothing in the image I generated looks off or broken, I guess is the best way to explain it, even though a car crashing into a pirate ship looks funny. Blackbeard is still standing on top of the car, though there are some weird things coming out of it.
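For the curious, the select-and-prompt flow described above maps naturally onto mask-based inpainting, which OpenAI's public Images API also exposes: `images.edit` accepts a mask image whose transparent pixels mark the region to regenerate, while opaque pixels are kept as-is. Here's a minimal, self-contained sketch of that idea; the grid-of-alpha-values representation and the `make_mask` helper are my own illustration, not the app's actual internals.

```python
# Conceptual sketch: the in-app "select" brush amounts to painting a mask.
# In mask-based editing, alpha = 0 (transparent) marks "regenerate this",
# and alpha = 255 (opaque) marks "keep this pixel as-is".

def make_mask(width, height, selected):
    """Build a 2D alpha grid: 0 where the user painted (editable region),
    255 everywhere else (left untouched)."""
    return [
        [0 if (x, y) in selected else 255 for x in range(width)]
        for y in range(height)
    ]

# Simulate brushing over a small 2x2 patch (say, the dog's head) in an
# 8x8 image; everything outside the patch stays opaque.
selected = {(x, y) for x in range(2, 4) for y in range(2, 4)}
mask = make_mask(8, 8, selected)
```

In the real API, a mask like this would be encoded as a PNG with an alpha channel and uploaded alongside the original image and the text prompt; the model then only repaints the transparent region.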

So I do think Midjourney is better for pure image generation, but I'm very, very impressed with this tool, and I think you'll be able to do some really impressive things with it. Something else that's quite interesting is that ChatGPT is linking in with DALL-E, so you can do image uploads: you can select an image and upload it to ChatGPT.

When I first discovered this feature, I wanted to see whether it could edit images you uploaded. I wasn't able to find that exact capability. On the LinkedIn post, people were saying things like, "Great, I no longer have to spend hours explaining to my sister how to use Photoshop to edit her vacation photos." I thought that was kind of funny. But at the same time, she can't just upload her images and edit them there, so it's not completely taking over Photoshop, although the selector tool reminds me a lot of Photoshop's selection tool, if you're familiar with that. Unfortunately, when you upload an image, you're not actually able to go and edit it, which is honestly kind of a shame, because I was looking forward to that particular feature; the uploaded images are just there. I'm sure this is something they'll add in the future. There are workarounds: you can do this with Midjourney specifically, and there are a lot of tools out there where you can upload photos of yourself and have them edited. I just wasn't able to do it directly within ChatGPT.

All in all, an amazing feature that I'm really excited about. I think this is going to take image generation to the next level, because instead of just generating an image and hoping it's exactly what you want, you can generate the image and then refine it. In the past you'd say, "Okay, do it again, but change this; do it again, but change that," and every regeneration wasn't quite the same and wouldn't change exactly what you wanted. Now you can literally select the part of the image you want to change, and it changes it.

I think this is going to be big for graphic design. This might be the way graphic design is going, and Canva, Photoshop, and other tools like them could get very disrupted. There are hundreds of millions of dollars in this area that could get disrupted, whether today or tomorrow. I can see a world where OpenAI releases many more of these image generation and editing tools, which I think is going to be really powerful.

You also have to start extrapolating where this is going. Right now it's, okay, cool, an image; next it's going to be video. With Sora and video generation, I assume they'll follow the same precedent: you'll be able to select areas within the video and say, okay, I have this actor running and skydiving off a building; now I want him wearing a red shirt. Okay, now a blue shirt. Okay, now I want him jumping into a helicopter. It's going to be fascinating to see how that video generation flow works, but I imagine they'll do things like this, where you select a character and change it with a prompt, and it changes what's happening in the video.

Very exciting times, with a lot coming down the pipe. I'll definitely keep you up to date on everything happening in this field; I think we're going to see a lot of disruption, whether that's video, image, audio, or multimedia. Thanks so much for tuning in. If you wouldn't mind, I'd really appreciate it if you could hit the like button if you're on YouTube, follow us on Apple Podcasts or Spotify, and leave a review or a comment. I really appreciate every single comment and every single review; I read them all and try to respond. Hope you all have an amazing rest of your day.