Video generation with realistic motion

2025/1/23
Practical AI: Machine Learning, Data Science, LLM

People
Paras Jain
Topics
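Paras Jain: Video generation has made significant progress but still faces many challenges. First, video data is enormous in volume, and models must learn physical laws and the rules of the real world, which makes training difficult and expensive. Second, high-quality video data is hard to obtain and filter, because most video on the internet lacks high-quality motion. Third, simulating complex motion, such as gymnastics, is extremely demanding for models. Finally, evaluating video generation models requires both qualitative and quantitative methods, accounting for human visual preference as well as the model's grasp of physics. Genmo's video generation models have gone through three stages of development, each one drawing on lessons learned to improve the model architecture and training methods. Genmo's decision to open-source its video generation model reflects a trade-off between model size and compute resources, aiming to balance model capability with accessibility for the community. Training video generation models consumes enormous GPU resources, and long sequence lengths add further challenges. Genmo's Mochi model uses a staged architecture, first compressing the video and then training a diffusion model, to reduce compute cost. Mochi has made significant progress in motion simulation and prompt adherence, and it is on par with top closed-source models on benchmarks. Applications of video generation models span entertainment and professional content creation, for example replacing stock footage, creative ideation, and video editing. Video editing techniques built on video generation models are emerging, such as adding, removing, or modifying objects in a video. Future creativity will be the product of human-AI collaboration: humans supply the ideas, and AI amplifies and realizes them. Genmo's long-term vision is to drive innovation in AI through video generation, ultimately reaching an understanding and simulation of the real world.

Chris Benson: Raised questions about how video generation models are evaluated, and discussed with Paras Jain how to balance quantitative and qualitative evaluation and how to design effective test cases for assessing a model's understanding of physics.

Daniel Whitenack: Discussed with Paras Jain the application scenarios for video generation models and how video generation technology can be brought into people's everyday lives.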

Deep Dive

Chapters
This chapter explores the history and current state of video generation technology. It highlights the challenges in creating realistic motion and the increasing role of compute power in enabling larger models.
  • Video generation has lagged behind other AI advancements.
  • Creating realistic motion is a major challenge.
  • Compute power is crucial for scaling video generation models.
  • Sora's release was a watershed moment.

Shownotes

We seem to be experiencing a surge of video generation tools, models, and applications. However, video generation models generally struggle with some basic physics, like realistic walking motion. This leaves some generated videos lacking true motion with disappointing, simplistic panning camera views. Genmo is focused on the motion side of video generation and has released some of the best open models. Paras joins us to discuss video generation and their journey at Genmo.

Join the discussion

Changelog++ members save 2 minutes on this episode because they made the ads disappear. Join today!

Sponsors:

  • Domo – The AI and data products platform. Strengthen your entire data journey with Domo’s AI and data products.

Something missing or broken? PRs welcome!