Full-duplex, real-time dialogue with Kyutai

2024/12/4
Practical AI: Machine Learning, Data Science, LLM

People
Alexandre Défossez
Topics
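Alexandre Défossez introduces the background, mission, and research directions of the Kyutai lab, emphasizing its independence as a non-profit organization and its contributions to open-source research. He describes in detail the features of Moshi, the lab's latest speech model, including full-duplex real-time dialogue and low latency, and compares Kyutai's approach with that of commercial labs. He also discusses the current state and direction of the French AI ecosystem and the role Kyutai plays in it, and shares his views on open science and how it can help democratize AI. As hosts, Chris Benson and Daniel Whitenack guide Alexandre Défossez through Moshi's technical details, data processing methods, choice of model size, and future research directions. They also discuss the French AI ecosystem, the significance of open science, and how large language models compare with small models.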

Chapters
Kyutai, a non-profit research lab based in Paris, developed Moshi, a full-duplex, real-time speech-to-speech AI assistant. Moshi allows for fluid, human-like conversations with minimal latency and has potential applications in various fields.
  • Kyutai is a non-profit, open-source AI research lab funded by three donors.
  • Moshi is a full-duplex model, meaning it can listen and speak simultaneously.
  • Moshi has a latency of around 200 milliseconds.
  • Kyutai prioritizes on-device models, which are harder to protect as intellectual property but offer wider accessibility.
  • The French ecosystem is conducive to AI research due to a strong emphasis on mathematics, engineering, and PhD residencies in private companies.

Shownotes

Kyutai, an open science research lab, made headlines over the summer when they released their real-time speech-to-speech AI assistant (beating OpenAI to market with their teased GPT-driven speech-to-speech functionality). Alex from Kyutai joins us in this episode to discuss the research lab, their recent Moshi models, and what might be coming next from the lab. Along the way we discuss small models and the AI ecosystem in France.

Join the discussion

Changelog++ members save 10 minutes on this episode because they made the ads disappear. Join today!

Sponsors:

  • Fly.io – The home of Changelog.com. Deploy your apps close to your users: global Anycast load-balancing, zero-configuration private networking, hardware isolation, and instant WireGuard VPN connections. Push-button deployments that scale to thousands of instances. Check out the speedrun to get started in minutes.

  • Timescale – Purpose-built performance for AI. Build RAG, search, and AI agents in the cloud with PostgreSQL and purpose-built extensions for AI: pgvector, pgvectorscale, and pgai.

  • WorkOS – AuthKit offers 1,000,000 monthly active users (MAU) free. The world’s best login box, powered by WorkOS + Radix. Learn more and get started at WorkOS.com and AuthKit.com.
