
Alignment is Real // Shiva Bhattacharjee // #260

2024/9/13

MLOps.community


People

Shiva Bhattacharjee

Topics

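Shiva Bhattacharjee: This interview focuses on applying large language models in the legal domain, in particular the trade-offs and suitable use cases of fine-tuning versus prompt engineering. TrueLaw builds custom AI solutions for law firms and has to balance model precision, quality, and speed. In practice, they found that simple RAG performs poorly in the legal domain and needs to be combined with techniques such as query rewriting to improve retrieval quality. They chose the DSPy framework for query rewriting because it is modular, easy to improve iteratively, and exposes intermediate steps, which gives users more visibility. They also fine-tuned embedding models to improve retrieval speed and quality, and fine-tuned the generated answers to each law firm's requirements so that style and format match expectations. On infrastructure, they chose to use existing model-training services and built an orchestration layer for their data-generation and fine-tuning pipelines to improve efficiency and flexibility. They also discussed the build-versus-buy trade-off for infrastructure, and how asynchronous communication and durable workflows help with large-scale inference jobs. Overall, TrueLaw built high-quality legal AI solutions by combining fine-tuning and prompt engineering with suitable tools and infrastructure.

Shiva Bhattacharjee: When choosing a tech stack, TrueLaw prioritized rapid iteration and cost-effectiveness. They found that building foundation models is very difficult, so they focus on fine-tuning existing foundation models and take an approach similar to a "compound system", in which several small language models work together. On infrastructure, they make full use of existing model-training services and built an orchestration layer that can run on different cloud platforms. They also discussed the importance of data flow, and how model outputs are fed back into the data flow to keep improving model performance. On build versus buy, they initially built an in-house messaging system but later adopted Temporal to handle large-scale inference jobs. This made long-running inference and training jobs easier to manage and spared them from building complex infrastructure themselves.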

Deep Dive

Key Insights

Why did TrueLaw choose DSPy over other tools like LangChain?

TrueLaw chose DSPy due to its modular nature, ease of customization, and better object hierarchy, which allowed for more efficient and transparent query rewriting and iterative improvements.
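
To make "modular" and "transparent" concrete, here is a minimal plain-Python sketch of a rewrite-then-retrieve pipeline in which each stage is swappable and every intermediate query is recorded. It is not DSPy's API and not TrueLaw's implementation: `Trace`, `expand_abbreviations`, `keyword_retriever`, and `run_pipeline` are hypothetical names, and the rewrite rule and retriever are toy stand-ins for what would be model calls in a real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stage types (assumptions for illustration only).
Rewriter = Callable[[str], str]
Retriever = Callable[[str, List[str]], List[str]]


@dataclass
class Trace:
    """Records each intermediate step so it can be shown to the user."""
    steps: List[str] = field(default_factory=list)


def expand_abbreviations(query: str) -> str:
    # Toy rewrite rule; a real system would call a language model here.
    return query.replace("SOL", "statute of limitations")


def keyword_retriever(query: str, corpus: List[str]) -> List[str]:
    # Toy retriever: rank documents by how many query words they share.
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in corpus]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [doc for score, doc in ranked if score > 0]


def run_pipeline(
    query: str,
    rewriters: List[Rewriter],
    retriever: Retriever,
    corpus: List[str],
) -> Tuple[List[str], Trace]:
    """Apply each rewriter in order, then retrieve, logging every step."""
    trace = Trace()
    trace.steps.append(f"original: {query}")
    for rewrite in rewriters:
        query = rewrite(query)
        trace.steps.append(f"{rewrite.__name__}: {query}")
    docs = retriever(query, corpus)
    trace.steps.append(f"retrieved {len(docs)} document(s)")
    return docs, trace


if __name__ == "__main__":
    corpus = [
        "The statute of limitations for contract claims is six years.",
        "Trademark registration requires distinctiveness.",
    ]
    docs, trace = run_pipeline(
        "SOL contract claims", [expand_abbreviations], keyword_retriever, corpus
    )
    for step in trace.steps:
        print(step)
```

Because rewriters and the retriever are passed in as plain callables, a custom re-ranker or a different rewrite step can be swapped in without touching the rest of the pipeline, which is the property the answer above attributes to DSPy.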

Why is fine-tuning necessary for domain-specific tasks in legal AI?

Fine-tuning is necessary for domain-specific tasks in legal AI because off-the-shelf models often lack the precision and quality required by lawyers, who are not skilled prompt engineers. Fine-tuning helps contextualize queries and align the output with the firm's specific expectations.

Why is embedding model fine-tuning cost-effective for legal AI tasks?

Embedding model fine-tuning is cost-effective because the models are relatively small and the main cost is generating contrastive data, which is cheaper than training large models from scratch.
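
The episode summary does not say how TrueLaw generates its contrastive data. As one common shape for such data, the sketch below builds (query, positive, negative) triplets from a mapping of queries to relevant passages, drawing negatives from passages that are relevant to other queries. `Triplet`, `build_triplets`, and the sample data are hypothetical and only meant to show the structure of the training examples.

```python
import random
from typing import Dict, List, NamedTuple


class Triplet(NamedTuple):
    query: str
    positive: str  # passage that should embed close to the query
    negative: str  # passage that should embed far from the query


def build_triplets(
    relevant: Dict[str, List[str]],
    negatives_per_positive: int = 1,
    seed: int = 0,
) -> List[Triplet]:
    """Build contrastive triplets from query -> relevant-passage labels."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    triplets: List[Triplet] = []
    for query, positives in relevant.items():
        # Candidate negatives: passages labeled relevant to other queries,
        # excluding anything that is also relevant to this query.
        others = {p for q, ps in relevant.items() if q != query for p in ps}
        pool = sorted(others - set(positives))
        if not pool:
            continue  # no usable negatives for this query
        k = min(negatives_per_positive, len(pool))
        for positive in positives:
            for negative in rng.sample(pool, k):
                triplets.append(Triplet(query, positive, negative))
    return triplets


if __name__ == "__main__":
    labels = {
        "deadline to file a contract claim": [
            "Contract claims must be filed within the limitations period.",
        ],
        "requirements for trademark registration": [
            "A mark must be distinctive to be registered.",
        ],
    }
    for triplet in build_triplets(labels):
        print(triplet)
```

Triplets in this form can be fed to a contrastive or triplet loss when fine-tuning a small embedding model. Random negatives are the simplest choice; harder negatives (passages that look similar but are not relevant) usually need more effort to produce.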

Why did TrueLaw decide to use SaaS providers for infrastructure rather than building it in-house?

TrueLaw decided to use SaaS providers for infrastructure to leverage existing services, reduce costs, and focus on their core IP, which is data generation and fine-tuning. This approach is more efficient and scalable, especially for a startup with resource constraints.

Why did TrueLaw choose to use Temporal for managing long-running workflows?

TrueLaw chose Temporal for managing long-running workflows because it provided a robust and flexible workflow engine that handled retries, interruptions, and notifications, which would have taken significant time and effort to build in-house.
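
To give a sense of what a workflow engine takes off a team's hands, here is a minimal standard-library sketch of two of the behaviors mentioned above: retrying a failed step and resuming after an interruption without redoing finished work. It does not use Temporal's SDK and covers only a small part of what a real engine does; `load_state`, `run_step`, and the JSON state file are hypothetical choices for illustration.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable, Dict


def load_state(path: Path) -> Dict[str, Any]:
    """Load previously completed step results, or start with none."""
    return json.loads(path.read_text()) if path.exists() else {}


def run_step(
    name: str,
    fn: Callable[[], Any],
    state: Dict[str, Any],
    state_path: Path,
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> Any:
    """Run a step with retries and persist its result once it succeeds."""
    if name in state:
        # Completed in an earlier run: reuse the stored result.
        return state[name]
    for attempt in range(1, max_attempts + 1):
        try:
            result = fn()
            break
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure
            time.sleep(backoff_seconds * attempt)  # linear backoff
    state[name] = result  # results must be JSON-serializable
    state_path.write_text(json.dumps(state))
    return result


if __name__ == "__main__":
    path = Path("workflow_state.json")
    state = load_state(path)
    # Hypothetical long-running steps, stubbed with constants.
    batch = run_step("prepare_batch", lambda: ["doc-1", "doc-2"], state, path)
    summary = run_step("run_inference", lambda: len(batch), state, path)
    print(batch, summary)
```

Even this small version has to decide how state is stored, how failures are retried, and what counts as "done". Timeouts, notifications, and concurrent workers add much more, which is the in-house effort the answer above says TrueLaw avoided.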

Why does Shiva believe his broad experience across different tech stacks has been beneficial?

Shiva believes his broad experience has been beneficial because it allows him to draw parallels between seemingly unrelated areas, apply fundamental principles across different domains, and understand the core concepts that are universally applicable in solving performance and system-level issues.

Chapters

Shiva discusses the use of DSPy in production, its modular nature, and how it compares to LangChain in terms of prompt engineering and modularity.

DSPy is used for its modularity and iterative improvement capabilities.
Prompt engineering in DSPy is more adaptable than in LangChain.
DSPy's modular design allows for easy integration of custom re-rankers.

Shownotes

Shiva Bhattacharjee is the Co-founder and CTO of TrueLaw, where we are building bespoke models for law firms for a wide variety of tasks.

Alignment is Real // MLOps Podcast #260 with Shiva Bhattacharjee, CTO of TrueLaw Inc.

// Abstract
If the off-the-shelf model can understand and solve a domain-specific task well enough, either your task isn't that nuanced or you have achieved AGI. We discuss when fine-tuning is necessary over prompting and how we have created a loop of sampling, collecting feedback, and fine-tuning to create models that seem to perform exceedingly well in domain-specific tasks.

// Bio
20 years of experience in distributed and data-intensive systems spanning work at Apple, Arista Networks, Databricks, and Confluent. Currently CTO at TrueLaw, where we provide a framework that takes user feedback, such as lawyer critiques of a given task, and folds it into proprietary LLMs through fine-tuning, resulting in 7-10x improvements over the base model.

// MLOps Jobs board https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch https://mlops-community.myshopify.com/

// Related Links Website: www.truelaw.ai

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Shiva on LinkedIn: https://www.linkedin.com/in/shivabhattacharjee/

Timestamps:
[00:00] Shiva's preferred coffee
[00:58] Takeaways
[01:17] DSPy Implementation
[04:57] Evaluating DSPy risks
[08:13] Community-driven DSPy tool
[12:19] RAG implementation strategies
[17:02] Cost-effective embedding fine-tuning
[18:51] AI infrastructure decision-making
[24:13] Prompt data flow evolution
[26:32] Buy vs build decision
[30:45] Tech stack insights
[38:20] Wrap up
