People
NLW
A well-known podcast host and analyst focused on cryptocurrency and macroeconomic analysis.
Topics
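NLW: Google's Veo 2 has made significant progress in video generation, outperforming OpenAI's Sora and excelling at physics simulation and cinematography techniques. Veo 2 opens up new possibilities for applying video generation in many fields, such as social media content creation, advertising, and filmmaking. In social media, Veo 2 can help brands create more creative video content and change how social media content is made. In advertising, Veo 2 can significantly reduce the cost and time of ad production, democratizing it so that small companies can also easily produce high-quality ads. Veo 2 can also respond in real time to cultural moments, producing more timely ads. In filmmaking, Veo 2 can be used to create establishing shots, B-roll, and drone footage, and to assist with storyboarding and brainstorming, lowering production costs and expanding the creative space. In short, Veo 2 will bring more, more personalized, and more timely video content, and drive the broad adoption of video generation technology.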

Deep Dive

Key Insights

What are the key features of Google's Veo 2 video generation model?

Veo 2 can produce 2-minute clips at resolutions up to 4K, which is 4 times the resolution and 6 times the duration of OpenAI's Sora. It excels in physics and user control, understands cinematography techniques, and offers professional-grade video generation.

Why is Veo 2 considered a leap forward in video generation AI?

Veo 2 stands out for its understanding of physics, allowing it to handle complex tasks like cutting a tomato or shuffling a deck of cards, which other models struggle with. It also replicates professional cinematography techniques, such as camera motion and lens effects.

How does Veo 2 compare to OpenAI's Sora in terms of performance?

Google claims Veo 2 outperforms Sora on preference and prompt adherence, and it appears stronger at handling physics and generating realistic videos. It also offers higher resolution and longer clip durations, making it a more advanced consumer-facing model.

What are some current use cases for AI video generation like Veo 2 and Sora?

Use cases include social media creations; advertising; establishing shots, B-roll, and drone footage; and storyboarding and brainstorming for filmmaking and business activities. These tools are also poised to disrupt traditional stock video libraries and ad production workflows.

How is AI video generation impacting the advertising industry?

AI video generation is dramatically reducing the cost and time required for ad production. For example, an entire ad for eToro was created in just one and a half weeks, a process that traditionally takes much longer. This democratizes ad creation for smaller companies and allows for real-time responses to cultural moments.

What role does AI video generation play in social media content creation?

AI tools like Pika enable creative social media videos with effects like 'Cakeify' and 'Squish,' allowing brands to produce more engaging and unique content. This opens up new possibilities for creative expression and brand marketing on social platforms.

How might AI video generation change professional filmmaking?

AI video generation could lead to a renaissance in filmmaking by lowering production costs and expanding creative possibilities. Models like Veo 2 can imitate cinematography techniques and understand physics, making them suitable for high-level professional productions.

Shownotes Transcript

Today on the AI Daily Brief, Google has announced Veo 2, and today we're discussing the most interesting use cases that are available right now. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes.

Hello, friends. Quick note before we dive into today's episode. The main episode, five use cases for Veo 2, got fairly long. And so we're just doing a main episode, no headlines today. We will be back with the headlines as normal tomorrow. But for now, let's dig in and see how people are actually getting value out of AI video generation right this moment.

Welcome back to the AI Daily Brief. Something really interesting has been happening recently, and it was summed up in this tweet from Riley Brown who said, why does it feel like Google is the underdog that everyone is rooting for?

Of course, what he's referring to is the fact that Google is really getting its groove back from a product perspective. We've talked extensively and we'll continue to talk about NotebookLM, but today it's all about Veo, or more specifically Veo 2, which has really stolen a lot of thunder from OpenAI's Sora. What we're going to do today is discuss the announcement a little bit and then dig deep on a set of use cases that the combination of Veo 2 and Sora opens up.

So what is included in this announcement? First of all, it's both Veo 2 and Imagen 3, so it's not just their video generation model but also their image generation model, although our focus will be on video generation. Veo 2 can produce 2-minute clips at resolutions up to 4K. That's 4 times the maximum resolution of Sora and 6 times the duration, with both being industry-leading for a consumer-facing model. A big part of Google's selling point is improved physics and user control.

Physics in particular seems like a notable weak point for Sora. And the difference did seem clear across a number of clips that circulated yesterday. The model can generate a pair of hands cutting a tomato, a task that Sora failed at spectacularly. Then there's a very impressive video of a deck of cards being shuffled, a task that Dennis Kardonsky of Sovereign AI referred to as the Turing test for video.

There's this video of a truck speeding down a road and then veering off to go over a waterfall. And by the way, at this point, if you are a listener, I would suggest you either subscribe to the YouTube or just fire up Spotify where you can watch the video as well, because this is definitely an episode that benefits from the visual. Anyways, this truck video that we're referring to demonstrates a range of really tricky physics problems where other models have been challenged. Then, of course, there's the classic throwback to those Will Smith videos from about a year and a half ago with a successful video creation of a man eating spaghetti.

It's definitely the control of physics that has people most excited. AI design consultant Marco wrote, What stands out the most to me about Google's Veo 2 model is that it appears to actually understand physics. That's a big leap forward. Another leap forward is the ability to recreate professional cinematography techniques like replicating camera motion and the look of different equipment. Google wrote, Veo 2 understands the language of cinematography. Ask it for a genre, specify a lens, suggest cinematic effects, and Veo 2 will deliver.

Ask for a low-angle tracking shot that glides through the middle of a scene or a close-up shot on the face of a scientist looking through her microscope and Veo 2 creates it. Video benchmarks are inherently subjective, but Google is claiming that Veo 2 outperforms Sora and other rival models on preference and prompt adherence. The model is now available through Google Labs' VideoFX platform, but you will have to join the waitlist for the moment, which is probably the biggest downside.

So as you would expect, there are a ton of comparisons to Sora. Marques Brownlee writes, if these handpicked examples are real, they look better than anything I've gotten out of Sora. And entrepreneur Bindu Reddy writes, Google has officially turned the tables on OpenAI. All you have to do is out-announce and drown out the other side. OpenAI was hoping for a big press cycle against Google, given that their search is now free. However, Google stole the limelight with video and image models.

Still, I think for me, the conversation about whether Veo 2 or Sora is better is much less interesting, if only because it's very, very temporal. What's more interesting to me is thinking about, given where the overall state of video generation is, inclusive not only of Sora and Veo 2, but also Pika 2.0, Runway, and Luma Labs, what are the use cases that are actually online right now?

The first use case I want to discuss is social media creations. And this is really where Pika has tried to carve out a niche. For example, with Pika, they've preloaded a bunch of effects such as this Cakeify effect, which you can see in this video that looks like a hot air balloon in the sky, but then is actually a giant piece of cake.

There's also their squish effect, where people can take a photo of a daily life object, and then Pika will squish it in a video that's definitely purposed for social media. Similarly, there's a crush it feature, a melt it feature, a dissolve feature, which looks very much like what happened when Thanos snaps his fingers in Avengers Infinity War. And the point is that when it comes to really creative and cool social media videos, we absolutely have the tools right now to totally change what you can do.

Now, of course, thinking about this from a business context, that means that brands can be doing more creative social media generation right now.

The line, however, between social media and advertising is increasingly blurry. Pierrick Chevalier combined reference images of a woman, a Red Bull, a particular set of kitty-eared headphones, and a neon gamer girl background to show how quickly a branded video could come together. He pointed out that we're just at the beginning, saying, just imagine the power once we achieve 100% object consistency. And when it comes to advertising, some companies are already jumping ahead and going full AI for their ads.

Last month, for example, eToro released a completely generated ad featuring a dancing bear and bull in the middle of Times Square. The results were far from perfect, particularly the scenes where the animals were dancing, where the physics were noticeably off. What's more interesting, though, was that the Dor Brothers, who produced the ad, said that the entire project was wrapped up in one and a half weeks from conception to final cut.

The idea of producing an entire ad in a week and a half is absolutely insane and totally game-changing. This doesn't mean that everyone is going to use AI for all sorts of advertising, but the dramatic collapse of the cost of advertising production will inevitably change the way that industry works.

Certainly it's going to democratize ad creation for smaller companies and brands. Video advertising is likely to move from something that requires an ad agency and a production team to something that an intern can whip up in a few days. Also, the speed of production means that people will be able to respond to pop culture and cultural moments with near real-time ad generation as well.

All of this hits my thesis that I've shared here before, that the word that most sums up the AI future is more. We're just going to have more of everything. And certainly we're going to have more advertising. The advertising is going to be more customized, more of the moment, probably more fleeting, and represent more of the world of business.

In the fashion and lifestyle space, you're already seeing a ton of this. Flair AI is a platform that specifically optimizes video and image models for ad creation, and back in October it showed an example of a very professional-looking commercial generated for Mulberry handbags that was made 100% with AI.

Salma on X, who focuses on AI product photography and video, has also done tutorials on creating ads for a makeup brand, once again, entirely using AI.

Whether you're an operations leader, marketer, or even a non-technical founder, Plumb gives you the power of AI without the technical hassle. Get instant access to top models like GPT-4o, Claude Sonnet 3.5, AssemblyAI, and many more. Don't let technology hold you back. Check out Use Plumb, that's Plumb with a B, for early access to the future of workflow automation. Today's episode is brought to you by Vanta. Whether you're starting or scaling your company's security program, demonstrating top-notch security practices and establishing trust is more important than ever.

Vanta automates compliance for ISO 27001, SOC 2, GDPR, and leading AI frameworks like ISO 42001 and NIST AI Risk Management Framework, saving you time and money while helping you build customer trust. Plus, you can streamline security reviews by automating questionnaires and demonstrating your security posture with a customer-facing trust center all powered by Vanta AI.

Over 8,000 global companies like LangChain, Leela AI, and Factory AI use Vanta to demonstrate AI trust and prove security in real time. Learn more at vanta.com slash nlw. That's vanta.com slash nlw. Today's episode is brought to you, as always, by Superintelligent.

Have you ever wanted an AI daily brief but totally focused on how AI relates to your company? Is your company struggling with AI adoption, either because you're getting stalled figuring out what use cases will drive value or because the AI transformation that is happening is siloed at individual teams, departments, and employees and not able to change the company as a whole? Superintelligent has developed a new custom internal podcast product that inspires your teams by sharing the best AI use cases from inside and outside your company.

Think of it as an AI Daily Brief, but just for your company's AI use cases. If you'd like to learn more, go to besuper.ai slash partner and fill out the information request form.

I am really excited about this product, so I will personally get right back to you. Again, that's besuper.ai slash partner. Today's episode is brought to you by Rocket Money. We are coming up on the beginning of the new year, and that is a perfect time to get organized, set goals, prioritize what matters most, which for many of us is going to be financial wellness.

Thanks to Rocket Money, those goals, especially around money, feel achievable. Rocket Money shows you all of your subscriptions right in one place, helping you easily cancel those that you've maybe forgotten that you're actually paying for. Rocket Money also pulls together all of your spending across your different accounts so that you can clearly track spending habits and see where you can cut back.

Rocket Money is a personal finance app that helps find and cancel unwanted subscriptions, monitors your spending, and helps lower your bills so you can grow your savings. Their dashboard gives you a clear view of your expenses across all of your accounts. You can easily create a personalized budget with custom categories.

You can see your monthly spending trends in each category to know exactly where your money is going. Rocket Money will even try to negotiate lower bills for you. They automatically scan your bills to find opportunities to save, and then you can ask them to negotiate. They'll deal with customer service so that you don't have to.

Rocket Money has over 5 million users and has saved a total of $500 million in cancelled subscriptions, saving members up to $740 a year when using all of the app's premium features. Cancel your unwanted subscriptions and reach your financial goals faster with Rocket Money. Go to rocketmoney.com slash AI breakdown today. That's rocketmoney.com slash AI breakdown.

Our third use case that is available right now is establishing shots, B-roll, and drone footage. This type of video is used in everything from ads to social media content to professional film. Big budget productions can afford to go out and get their own, but in many cases, this sort of imagery is purchased from stock libraries. That's a business model that seems very suspect in the future, as Veo and Sora are both already incredibly adept at this sort of establishing natural world shot.

AI design consultant Marco did an entire sizzle reel with this sort of shot, showing just what an incredible library of video imagery is now available to people on demand and based solely on their imagination. A fourth use case for AI video right now is storyboarding and brainstorming. One of the things that people are most excited about when it comes to Sora is the fact that they've built a storyboard timeline editor directly into the product.

So you can basically plan out an entire sequence of videos that add up to a complete story. Now, initially, this will, I think, be used by filmmakers. But in the long run, it wouldn't surprise me to start to see video brainstorming as something that actually becomes part of a broader set of business activities.

You could see internal teams doing video brainstorming to lay out their ads, even if they work with an ad agency. You could see events and marketing teams planning out and experimenting with what their event setup might look like at big trade shows.

One of the most important things to remember about generative AI is that it's very hard for us to think beyond the one-to-one replacement phase. In other words, we tend to focus on how it replaces things that already exist. For example, I was just talking about how stock video libraries are going to have a tough time given that Veo and Sora can now create all the types of things that they've previously made money on. However, I think when the dust settles in a decade or so, the far more interesting use cases will be the things that simply weren't possible before.

So ultimately, I don't know whether trade show and event sponsorship planning is going to involve video storyboarding and brainstorming, but it wouldn't surprise me.

Lastly, as much as we've talked about these use cases that are for advertisers and businesses and social media creators, it's very clear that this is going to infiltrate Hollywood and professional filmmaking very soon. The fact that these models are starting to understand physics, the fact that Veo can imitate cinematography techniques, all of this means that these models are more ready for prime time, even at the highest levels of production, than they've ever been.

The coolest thing about that, though, is that it's not just going to be Hollywood that has access to them. There could be an absolute renaissance of film and video storytellers given the decreased cost of production and the expanded realm for creativity. As Andrew Marmon, a research engineer at Google DeepMind, put it, "...the world we will be able to create."

Veo 2 looks awesome. I'm excited to get a chance to play around with it. For now, though, that is going to do it for today's AI Daily Brief. Appreciate you listening as always. And until next time, peace.