
Edit DALL-E Images with ChatGPT's Latest Update

2024/4/9

No Priors AI

People
Host
A podcast host who helps learners improve their skills through rich content and interactive formats.
Topics
Host: ChatGPT's latest update lets users edit DALL-E-generated images in a novel, engaging way, greatly strengthening its image-editing capabilities, with full coverage across web, iOS, and Android. The feature lets users directly select parts of an image and edit them, and OpenAI wants all users to have access as soon as possible. In the demo video, the user highlights a region of the image and adds new elements, such as bows on a dog's head. OpenAI showed the real, unedited image-generation process in its demo, which helps build user trust and avoids losing credibility over a faked video, as happened with Google Gemini. The host tried ChatGPT's image editing first-hand and found it impressive. To edit, users click the image, then use the "Select" tool in the top right corner to choose the size of the editing region. ChatGPT's editing is not as precise as Midjourney's on fine details, but it can still pull off impressive edits, such as turning a pirate ship into a pink car. ChatGPT supports image uploads, but does not yet support directly editing uploaded images. The feature can change specific parts of an image without regenerating the whole thing, which matters for fields like graphic design; it may significantly affect the graphic design industry and could eventually extend to video editing, letting users select specific regions of a video to edit.

Chapters
This chapter explores the recent ChatGPT update that integrates DALL-E image editing. It discusses the cross-platform rollout and contrasts OpenAI's transparent demo with a previous, criticized demo from Google Gemini.
  • ChatGPT now allows DALL-E image editing on web, iOS, and Android.
  • OpenAI's demo emphasized authenticity, unlike a previous misleading Google Gemini demo.
  • The new feature lets users select parts of generated images and edit them with text prompts.

Transcript


OpenAI just released a new update a couple of hours ago. I haven't heard anybody talking about this, but I think it's absolutely fascinating: you can now edit DALL-E images in ChatGPT in a very interesting new way. I've seen this in some other programs, but it's the first time I've seen OpenAI getting into it, and it's really powerful. So I want to tell you a little about what they're doing and why I think this is important.

The first thing I'll say is that they made this announcement on LinkedIn and on X. They said you can now edit DALL-E images in ChatGPT across web, iOS, and Android. That's impressive; sometimes companies roll an update out just to the web version and it comes to mobile later. This is already out on the web, where I've been playing with it and testing it, and apparently it's out on iOS and Android too. I haven't tried those, but I highly recommend checking it out if you have the app.

This is amazing; it's going to get into so many more people's hands. When I see a big rollout like this to all platforms at once, it really says: we want all of our users using this as soon as possible. Essentially, once you generate an image, you can immediately select parts of that image and edit them.

They give a demonstration where they've generated an image of a cute poodle celebrating a birthday, so it's a dog wearing a party hat. They then go to edit it, highlight two spots on the dog's head, and say, "Add bows to it."
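For the developers listening: the in-app selector isn't something you can drive from code, but the underlying idea, an image plus a mask plus a text prompt, is exposed in OpenAI's Images API as the edit endpoint. Here's a minimal sketch assuming the official openai Python SDK; the filenames and prompt are placeholders, and the public endpoint targets DALL-E 2 rather than whatever ChatGPT runs internally.

    # Sketch of mask-based image editing ("inpainting") with the OpenAI
    # Images API. The mask marks editable pixels with transparency.
    # Filenames and prompt are placeholders, not from the episode.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    result = client.images.edit(
        model="dall-e-2",                    # model served by the public edit endpoint
        image=open("poodle.png", "rb"),      # the originally generated image
        mask=open("poodle_mask.png", "rb"),  # transparent where the bows should go
        prompt="A cute poodle celebrating a birthday, with bows on its head",
        n=1,
        size="1024x1024",
    )
    print(result.data[0].url)  # URL of the edited image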

Now, a lot of people have been commenting on the demonstration, because they released a clip of this whole generation happening to social media. The video is about a minute long, and literally most of it is just you sitting there watching the generation; but it does go and generate bows that appear on the dog's head exactly where they highlighted, which is impressive. There are some interesting comments on that video, and I think, all in all, people are pretty happy that they did this.

Someone in the comments said, "I appreciate that OpenAI chose not to speed up the video demonstrating the generation process in this preview. This shows integrity and helps set realistic expectations for the product's capabilities. An authentic preview goes a long way with potential users; trust and credibility are key in the age of AI." I actually agree with this.

We had Google Gemini put out a demo of their platform, and they got absolutely roasted, because it was this platform you could talk to: it could see what you were seeing, it could create images and video, it was doing all this crazy stuff. Then we found out it was essentially faked, or staged. They heavily edited the video and gave it far longer prompts than they told us they were giving it.

So in the video it looked like they could just ask, "What's this?" and it would say, "Oh, that's you playing rock, paper, scissors." But in reality the prompt was more like, "I'm playing a game with my hands. It's very popular. What is it?" and then it would respond; they cut out all of that context.

Anyway, it was just really sketchy, and I think it lost Google and Gemini a lot of trust. I'm sure they've learned their lesson and won't do that again. But I think OpenAI and other AI companies are learning their lessons too, and when they give these demos now, they're literally just letting you watch: they know people would rather sit through a full minute of an image loading than find out it was faked. So we know this is real. So I went and tested this new feature, and I think it's really impressive. I went straight to ChatGPT thinking, oh my gosh, this is available right now.

At first, to be honest, I thought it wasn't; I had to go back and watch the video again to learn how to do it. So I'll tell you, in case you want to try this. I said something like, "Create a photo of a pirate ship in battle with Blackbeard and his crew," and it generated the image for me. At first I thought there was no way to edit it. What you actually have to do is click on the image itself, which expands it to full view, and in the top right-hand corner there's a tool called Select.

It's essentially a paintbrush selector: you can make the brush really big, or make it really small if you want to get at finer details in the image. I tried a bunch of different things. For one, since I had generated Blackbeard on a pirate ship, I selected his face and told it to give him a grinning scowl.

In the second version of the image, to be 100% honest, I couldn't really tell. He's got a beard covering his mouth, and the details are so imprecise that you can't tell whether he's scowling or grinning or whatever. I'll be honest: I think Midjourney is still the best at image generation, by quite a bit. But this is quite an impressive feature, and there are things here I'm not seeing Midjourney do, so for that reason I find it interesting. I wanted to test it with something a little more obvious.

So I went and selected the entire pirate ship, including the mast, just the whole thing, and told it to turn the pirate ship pink and make it a car. It actually was able to do that. The result kind of looks like a car crashing into a pirate ship, which I guess is fine; it's its own rendition.

But to be fair, while it looks funny that a car is crashing into a pirate ship, nothing in the image looks off or broken, I guess is the best way to explain it. Blackbeard is still standing on top of the car, though there are some weird things coming out of it.
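For anyone wanting to reproduce this selection workflow outside ChatGPT: the Select brush maps naturally onto the mask the API's edit endpoint expects, namely a copy of the image in which the brushed region has been made transparent. Here's a rough sketch using Pillow, with coordinates and brush radius invented purely for illustration:

    # Rough sketch: turn "brush strokes" into an API-style mask with Pillow.
    # Pixels painted to alpha 0 become the editable region; everything else
    # is preserved. Coordinates and radius below are invented placeholders.
    from PIL import Image, ImageDraw

    image = Image.open("pirate_ship.png").convert("RGBA")
    mask = image.copy()
    draw = ImageDraw.Draw(mask)

    brush_radius = 40  # the in-app slider that resizes the "paintbrush"
    stroke = [(300, 220), (340, 230), (380, 240)]  # points brushed over the ship

    for x, y in stroke:
        # Punch a fully transparent circle at each brushed point.
        draw.ellipse(
            (x - brush_radius, y - brush_radius,
             x + brush_radius, y + brush_radius),
            fill=(0, 0, 0, 0),
        )

    mask.save("pirate_ship_mask.png")  # usable as the mask in images.edit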

So I do think Midjourney is better for pure image generation, but I'm very, very impressed with this tool, and I think you'll be able to do some really impressive things with it. Now, something else that's quite interesting: since ChatGPT is linked up with DALL-E, you can also do image uploads, meaning you can select an image and upload it to ChatGPT.

When I first discovered this, I wanted to see whether it could edit images you uploaded. I wasn't able to find that exact capability. On the LinkedIn post, people were saying things like, "Great, I no longer have to spend hours explaining to my sister how to use Photoshop to edit her vacation photos." I thought that was kind of funny.

But at the same time, it's not like she could just upload her images in there and edit them, so it's not completely taking over Photoshop. Although the Select tool does remind me a lot of the selection tool in Photoshop, if you're familiar with that.

Unfortunately, when you upload an image, you're not actually able to go and edit it, which is honestly kind of disappointing, because I was looking forward to that particular feature; being able to edit your own photos would be pretty interesting. Otherwise, uploaded images just sit there. I'm sure this is a feature they'll add in the future. There are workarounds: you can do this with Midjourney specifically, and there are a lot of different tools out there where you can upload photos of yourself and have them edited.

I just wasn't able to do it directly within ChatGPT. All in all, it's an amazing feature, and I'm really excited about it. I think this is going to take image generation to the next level, because instead of generating an image and hoping it comes out exactly how you want, you can generate the image and then refine it.

In the past you would say, "Okay, do it again, but change this; do it again, but change that," and every regeneration came out slightly different and never changed exactly what you wanted. Now you can literally select the part of the image you want to change, and it changes just that.
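To make that workflow difference concrete, here's a short sketch, again assuming the openai SDK, contrasting a full regeneration with a targeted, mask-scoped edit; prompts and filenames are placeholders:

    # Old loop: "do it again, but change this." Each call re-rolls the
    # whole image, so details you liked can drift between attempts.
    from openai import OpenAI

    client = OpenAI()

    retry = client.images.generate(
        model="dall-e-3",
        prompt="A pirate ship in battle with Blackbeard, but the ship is pink",
    )

    # Targeted edit: keep the image and constrain the change to the masked
    # region; pixels outside the transparent area are left untouched.
    targeted = client.images.edit(
        model="dall-e-2",
        image=open("pirate_ship.png", "rb"),
        mask=open("pirate_ship_mask.png", "rb"),  # e.g. the mask sketched earlier
        prompt="The same naval battle scene, but the pirate ship is a pink car",
    )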

I think this is going to be big for graphic design. This might be where graphic design is headed, and Canva, Photoshop, and these other tools could get very disrupted; there are hundreds of millions of dollars in this area that are going to get disrupted, whether it's today or tomorrow. I can see a world where OpenAI releases many more of these image generation and editing tools, which I think is going to be really powerful. You also have to start extrapolating where this is going.

Right now it's images, but next it's going to be video. When you're doing Sora and video generation, I assume they'll follow the same precedent: you'll be able to select areas within the video and say, okay, I have this actor and he's running and skydiving off a building; now I want him wearing a red shirt. Okay, a blue shirt. Okay, now I want him jumping into a helicopter.

It's going to be very fascinating to see how that video generation flow works, but I imagine they'll do things like this, where you select a character and change it with a prompt, and it changes what's happening in the video. So, very exciting times, and a lot coming down the pipe. I'll definitely keep you up to date on everything happening in this field; I think we're going to see a lot of disruption, whether that's in video, image, audio, or multimedia. Thanks so much for tuning in.

If you wouldn't mind, I'd really, really appreciate it if you could hit the like button if you're on YouTube, follow us if you're on Apple Podcasts or Spotify, and leave us a review or a comment. I appreciate every single comment and every single review; I read them all and try to respond. Hope you all have an amazing rest of your day.