“Thought Crime: Backdoors and Emergent Misalignment in Reasoning Models” by James Chua, Owain_Evans
18:58
2025/6/17
LessWrong (30+ Karma)
Chapters
What is Emergent Misalignment in Reasoning Models?
Introduction to the Paper
How Do We Test for Misalignment?
Main Paper Figures: What Do They Reveal?