Our 201st episode with a summary and discussion of last week's big AI news! Recorded on 03/02/2025
Join our brand new Discord here: https://discord.gg/nTyezGSKwP
Hosted by Andrey Kurenkov and guest host Sharon Zhou. Feel free to email us your questions and feedback at [email protected] and/or [email protected]
Read our text newsletter and comment on the podcast at https://lastweekin.ai/.
In this episode:
Timestamps + Links:
(00:00:00) Intro / Banter
(00:01:36) News Preview
Tools & Apps
(00:02:33) OpenAI announces GPT-4.5, warns it’s not a frontier AI model
(00:07:22) Anthropic launches a new AI model that ‘thinks’ as long as you want
(00:11:14) New Grok 3 release tops LLM leaderboards
(00:16:43) Sesame is the first voice assistant I’ve ever wanted to talk to more than once
(00:18:30) Google launches a free AI coding assistant with very high usage caps
(00:20:45) Rabbit shows off the AI agent it should have launched with
(00:22:23) Mistral’s Le Chat tops 1M downloads in just 14 days
Applications & Business
(00:24:06) OpenAI Tops 400 Million Users Despite DeepSeek’s Emergence
(00:27:37) Google’s new AI video model Veo 2 will cost 50 cents per second
(00:29:52) HP is buying Humane and shutting down the AI Pin
Projects & Open Source
(00:31:44) Microsoft launches next-gen Phi AI models
(00:33:47) OpenAI introduces SWE-Lancer: A Benchmark for Evaluating Model Performance on Real-World Freelance Software Engineering Work
(00:37:12) SWE-Bench+: Enhanced Coding Benchmark for LLMs
Research & Advancements
(00:40:00) Towards an AI co-scientist
(00:42:52) Magma: A Foundation Model for Multimodal AI Agents
Policy & Safety
(00:47:32) Demonstrating specification gaming in reasoning models
(00:51:03) Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs