“o3” by Zach Stein-Perlman

2024/12/20
LessWrong (30+ Karma)

I'm editing this post. OpenAI announced (but hasn't released) o3 (skipping o2 for trademark reasons). It gets 25% on FrontierMath, smashing the previous SoTA of 2%. (These are really hard math problems.) Wow. 72% on SWE-bench Verified, beating o1's 49%. Also 88% on ARC-AGI.


First published: December 20th, 2024

Source: https://www.lesswrong.com/posts/Ao4enANjWNsYiSFqc/o3

---

Narrated by TYPE III AUDIO.