“o3 Will Use Its Tools For You” by Zvi
01:41:36
2025/4/19
LessWrong (30+ Karma)
Chapters
What's In a Name?
My Current Model Use Heuristics
Huh, Upgrades?
Use All the Tools
Search the Web
On Your Marks
The System Prompt
The o3 and o4-mini System Card
Tests o3 Aced
Hallucinations: A Concern?
Instruction Hierarchy and Image Refusals
METR Evaluations: Task Duration and Misalignment
Apollo Evaluations: Scheming and Deception
Are We Underestimating Alignment Failures?
GPT-4.1: What Issues Remain?
Pattern Lab Evaluations: Cybersecurity
Preparedness Framework Tests
Biological and Chemical Risks (4.2)
Cybersecurity (4.3)
AI Self-Improvement (4.4)
Perpetual Shilling: A New Threat?
High Praise for o3
Sycophancy: What It Means
Mundane Utility Versus Capability Watch
o3: Offering Mundane Utility?
o3: Not Offering Mundane Utility?
o4-mini: What's Its Role?
Colin Fraser's Dumb Model Watch
Can o3 Forecast Accurately?
Is This AGI?