
We're All Finetuning Incorrectly // Tanmay Chopra // #304

2025/4/8
MLOps.community

People
Demetrios
Tanmay Chopra
Topics
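Tanmay Chopra: I think there is a huge gap between current AI systems and their potential. In practice, AI systems are not living up to that potential, and many large tech companies acknowledge this. We are in a transitional phase in which AI systems have enormous potential but limited real-world application. AI systems can play two roles: supervisor and assistant. For expert users, AI tools may reduce efficiency; for novice users, they can help. General-purpose AI models struggle to meet the needs of specific tasks, and specialized models are more efficient. Traditional machine learning was meant to handle systems and processes that could not be explained, whereas today's AI requires users to explain their process in the prompt. Evaluating the effectiveness of an AI system means considering two things: whether the system itself is good, and whether it is a good fit for the user. Effective evaluation should be based on quantifiable metrics rather than subjective impressions. The difficulty of evaluation lies in defining what "good" means and in mapping technical metrics to business metrics. AI systems are not one-and-done; they need continuous retraining and improvement. Before fine-tuning a model, first try optimizing the prompt, and consider fine-tuning only once prompt optimization has hit its limit. Prompt optimization cannot change the model's objective function, and enterprise tasks are usually not about predicting the next word. Only by owning the model can you improve an AI system with respect to its objective function. Building AI systems should be simplified, with the goal of making it more convenient than prompt optimization. LLMs lower the barrier to building AI systems: a small amount of data is enough to launch a first version. You should not rely on future updates to large language models to improve your AI system, because large models will not pay attention to every task. LLMs play different roles at different stages: they offer flexibility during prototyping and excel at fluent expression in production. Setting thresholds can improve the accuracy and reliability of an AI system. When the system lacks confidence, it should avoid giving an answer and fall back to traditional methods. Future AI systems will become one component within traditional machine learning pipelines. LLMs can be used as classifiers to solve problems that traditional machine learning can solve. Machine learning experience and intuition are valuable but take time to accumulate. Platforms can help AI engineers build richer experience and intuition. Once an AI system reaches a certain level of maturity, local deployment should be considered to improve efficiency and reduce latency. Python has rich support and resources in machine learning, so building AI backend services in Python is recommended. Whether to migrate to Python depends on how important the AI system is to the business. When building AI systems, the model is not the hardest part; the system around the model is what matters.

Demetrios: Current AI models struggle with complex tasks such as video editing, where using professional software directly is more efficient. The idea of AI fully replacing application software is unlikely to be realized in the short term. AI centers should prioritize the major problems that can significantly improve business outcomes.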
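The thresholding point in the summary above (answer only when the model is confident, otherwise fall back to a traditional method) can be sketched roughly as follows. This is a minimal illustration, not anything described in the episode: the function names, the 0.8 threshold, and the keyword-rule fallback are all hypothetical, and the "model" is represented only by a list of logits.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_fallback(
    logits: Sequence[float],
    labels: Sequence[str],
    fallback: Callable[[], str],
    threshold: float = 0.8,  # hypothetical cut-off; tune on held-out data
) -> Tuple[str, float, str]:
    """Return (label, confidence, source).

    The model's top label is used only when its probability reaches the
    threshold; otherwise the caller-supplied fallback decides.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] >= threshold:
        return labels[best], probs[best], "model"
    return fallback(), probs[best], "fallback"


def rule_based_fallback(text: str) -> str:
    """Hypothetical traditional method: a simple keyword rule."""
    return "spam" if "free money" in text.lower() else "ham"


# Example: a confident prediction is kept, an uncertain one is deferred.
labels = ["spam", "ham"]
message = "Claim your FREE MONEY now"
confident = classify_with_fallback([4.0, 0.0], labels, lambda: rule_based_fallback(message))
uncertain = classify_with_fallback([0.1, 0.0], labels, lambda: rule_based_fallback(message))
```

Raising the threshold trades coverage for precision: fewer inputs are answered by the model, but the ones that are tend to be more reliable.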


We're All Finetuning Incorrectly // MLOps Podcast #304 with Tanmay Chopra, Founder & CEO of Emissary.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract

Finetuning is dead. Finetuning is only for style. We've all heard these claims. But the truth is we feel this way because all we've been doing is extended pretraining. I'm excited to chat about what real finetuning looks like (modifying output heads, loss functions and model layers) and its implications for quality and latency. Happy to dive deeper into how DeepSeek leveraged this real version of finetuning through GRPO and how this is nothing more than a rediscovery of our old finetuning ways. I'm sure we'll naturally also dive into when developing and deploying your own specialized models makes sense and the challenges you face when doing so.
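For readers unfamiliar with GRPO (Group Relative Policy Optimization), its central step is to score each sampled response relative to the other responses drawn for the same prompt, rather than against a learned value function. The sketch below shows only that normalization step, under stated assumptions: the function name is hypothetical, the population standard deviation and the small `eps` term are illustrative choices, and none of this is taken from DeepSeek's code or from the episode.

```python
import statistics
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of responses to the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so a
    response is rewarded for being better than its siblings, not for its
    absolute score. `eps` (an assumption here) avoids division by zero
    when every response in the group gets the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Example: four sampled answers to one prompt, scored 1 (correct) or 0 (wrong).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

In a full training loop these advantages would weight the policy-gradient update for each response's tokens; that part is omitted here.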

// Bio

Tanmay is a machine learning engineer at Neeva, where he's currently engaged in reimagining the search experience through AI: wrangling with LLMs and building cold-start recommendation systems. Previously, Tanmay worked on TikTok's Global Trust & Safety Algorithms team, spearheading the development of AI technologies to counter violent extremism and graphic violence on the platform across 160+ countries. Tanmay has a bachelor's and master's in Computer Science from Columbia University, with a specialization in machine learning.

Tanmay is deeply passionate about communicating science and technology to those outside its realm. He's previously written about LLMs for TechCrunch, held workshops across India on the art of science communication for high school and college students, and is the author of Black Holes, Big Bang and a Load of Salt - a labor of love that elucidated the oft-overlooked contributions of Indian scientists to modern science and helped everyday people understand some of the most complex scientific developments of the past century without breaking into a sweat!

// Related Links