847: AI Engineering 101, with Ed Donner

2024/12/24
Super Data Science: ML & AI Podcast with Jon Krohn

People
Ed Donner
Jon Krohn
Topics
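Jon Krohn: In this episode, Ed Donner describes the role of the AI engineer (also called an LLM engineer) in detail and presents data showing that demand for AI engineers is on par with demand for data scientists.
Ed Donner: The AI engineer is a hybrid role that blends the skills of a data scientist, a software engineer, and an ML engineer. There are currently about 4,000 open LLM engineer positions in the U.S., roughly the same as the number of open data scientist positions. An AI engineer's first task is to select the right LLM for a given problem. That choice depends on data quality and quantity, evaluation criteria, and non-functional factors such as budget and time. Before building an LLM solution, it is best to build a baseline model first so that there is something to compare against. The first decision when choosing an LLM is whether to use a closed-source or an open-source model. The usual advice is to prototype with a closed-source model (such as GPT-4) and then consider open-source models if circumstances call for it (for example, a large amount of proprietary data, privacy requirements, or high inference costs). AI engineers use techniques such as RAG, fine-tuning, and agentic AI to optimize model applications.
Ed Donner: Choosing models and techniques usually takes trial and error. AI engineers are sometimes also responsible for deploying models to production. They can use platforms such as Modal.com for serverless AI model deployment, or technologies such as Docker and Kubernetes to build a full production service. For agentic AI systems, dedicated deployment platforms are available.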

Deep Dive

Key Insights

What does an AI engineer do?

An AI engineer is a hybrid role combining data science, software engineering, and ML engineering. They select models, optimize them for specific tasks using techniques like fine-tuning and RAG, and deploy them into production. Their responsibilities include choosing the right LLM, building baseline models, and ensuring models meet business requirements.

Why are AI engineers in high demand?

AI engineers are in demand because they bridge the gap between data science, software engineering, and ML engineering. There are currently around 4,000 job openings for LLM engineers in the U.S., comparable to the number of data science jobs.

How do AI engineers decide which LLM to use?

AI engineers evaluate models based on data quality, evaluation criteria, and non-functional requirements like budget and time to market. They often start with closed-source models like GPT-4 for prototyping and may switch to open-source models if proprietary data or privacy concerns are involved.
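The selection step can be pictured as a filter on non-functional requirements followed by a ranking on your own evaluation results. The sketch below is only illustrative: the `Candidate` fields, the cost unit, and the model names are hypothetical placeholders, not figures from the episode.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str                     # hypothetical model label
    open_source: bool             # True if the weights can be self-hosted
    cost_per_1k_requests: float   # hypothetical cost unit
    eval_score: float             # score on your own evaluation set, 0..1


def shortlist(candidates, max_cost, require_open_source=False):
    """Drop candidates that break the budget or privacy constraint,
    then rank the rest by evaluation score (best first)."""
    eligible = [
        c for c in candidates
        if c.cost_per_1k_requests <= max_cost
        and (c.open_source or not require_open_source)
    ]
    return sorted(eligible, key=lambda c: c.eval_score, reverse=True)


# Placeholder candidates, invented purely for illustration.
candidates = [
    Candidate("closed-model-a", False, 5.0, 0.90),
    Candidate("open-model-b", True, 1.0, 0.82),
    Candidate("open-model-c", True, 8.0, 0.88),
]
```

Setting `require_open_source=True` mirrors the case where proprietary data or privacy requirements rule out a hosted closed-source model.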

What are some key techniques used by AI engineers?

Key techniques include fine-tuning models with domain-specific data, RAG (Retrieval Augmented Generation) for enhancing responses with relevant context, and agentic AI for creating autonomous, proactive systems that can solve complex problems and use tools.

What is RAG in AI engineering?

RAG (Retrieval-Augmented Generation) is a technique in which relevant documents or other information are retrieved from a datastore and added to the LLM's prompt to improve its responses. It involves encoding the query into a vector and finding the closest matching documents to provide context to the model.
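A minimal sketch of the retrieval step, assuming a toy bag-of-words vector in place of a real embedding model and a plain list in place of a vector database. The function names `embed`, `retrieve`, and `build_prompt` are illustrative, not from any library.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. An assumption standing in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents whose vectors are closest to the query vector."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Place the retrieved context ahead of the question, ready to send to an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "refund policy allows returns within 30 days",
    "shipping takes five business days",
    "our office is closed on public holidays",
]
```

Calling `build_prompt("what is the refund policy", docs)` produces a prompt whose context is the refund document. A production system would swap in a learned embedding model and an indexed vector store, but the encode, rank, and prepend steps stay the same.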

What is agentic AI?

Agentic AI refers to systems that can autonomously solve complex problems by breaking them into smaller steps, using tools, and even acting proactively beyond a single interaction. For example, an agentic AI could detect a price drop for a flight and notify the user without being prompted.
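The proactive part of the flight example can be sketched as a monitor that acts on new observations without waiting for a user request. This covers only the trigger logic, not planning or LLM tool use. The `watch_price` function, its parameters, and the sample prices are hypothetical.

```python
from typing import Callable, Iterable


def watch_price(
    prices: Iterable[float],
    baseline: float,
    drop_fraction: float,
    notify: Callable[[str], None],
) -> int:
    """Scan successive price observations and notify on each sufficiently large drop.

    Returns the number of notifications sent.
    """
    sent = 0
    for price in prices:
        if price <= baseline * (1 - drop_fraction):
            notify(f"Price dropped to {price:.2f} (baseline {baseline:.2f})")
            sent += 1
            baseline = price  # only notify again after a further drop
    return sent


messages: list[str] = []
# Invented price observations; a real agent would poll a flight-search tool instead.
count = watch_price([500.0, 480.0, 440.0, 445.0, 390.0], 500.0, 0.1, messages.append)
```

In a full agent, `prices` would come from a scheduled tool call and `notify` would send a message to the user.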

What are some important benchmarks for evaluating LLMs?

Important benchmarks include GPQA (Google-Proof Q&A) for expert-level knowledge, MMLU-Pro for language understanding, and BBH (BIG-Bench Hard) for testing advanced capabilities like sarcasm detection. These benchmarks help evaluate model performance across various tasks.

How can AI engineers deploy models into production?

AI engineers can deploy models using platforms like Modal.com for serverless deployment, Lightning Studios for moving from prototype to production, or Docker and Kubernetes for full production services. For agentic AI, platforms like LangGraph and CrewAI Enterprise can be used to deploy multi-agent systems.

What is the Outsmart game, and how does it evaluate LLMs?

Outsmart is a game where four LLMs compete against each other in a strategic environment. Each model starts with 12 coins and must decide whom to take coins from and whom to give coins to, using private messages to strategize. The game evaluates how well models can form alliances and outsmart each other, providing an Elo rating based on their performance.
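For readers unfamiliar with Elo ratings, here is the standard two-player update as a sketch. How Outsmart adapts Elo to four-player games is not described here, and the K-factor of 32 is an assumed, commonly used default.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b).

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is the K-factor (assumed default, not taken from the episode).
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b
```

With two equally rated players, a win moves the winner up by k/2 points and the loser down by the same amount, so the total rating in the pool is conserved.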

What are some useful leaderboards for selecting LLMs?

Useful leaderboards include Hugging Face's Open LLM Leaderboard, Vellum.ai for cost and context window comparisons, and LMArena.ai (formerly LMSYS) for head-to-head human evaluations. These leaderboards help compare models based on performance, cost, and hardware requirements.

Chapters
The episode opens by defining the role of an AI engineer as a hybrid of data science, software engineering, and ML engineering. It then discusses the high demand for AI engineers, noting that the number of job openings is comparable to that for data scientists.
  • AI engineering is a hybrid role combining data science, software engineering, and ML engineering.
  • There are approximately 4,000 job openings for AI engineers in the US, comparable to the number of data scientist openings.
  • AI engineers select models, apply techniques like RAG and agentic AI, and deploy models into production.

Shownotes

Ed Donner co-founded the AI-driven recruitment platform Nebula.io with The SuperDataScience Podcast’s host, Jon Krohn. Ed and Jon reminisce about how they launched their company, the growing opportunities for data scientists, how to choose an LLM, and today’s top technical terms in AI.

Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

In this episode you will learn:

  • (11:15) What an AI engineer does

  • (19:23) Defining today’s key terms in AI: RAG, fine-tuning, and agentic AI

  • (27:09) How to select an LLM

  • (49:41) Pitting LLMs against each other in a game

  • (53:14) What to do once you’ve selected an AI model

Additional materials: www.superdatascience.com/847