“Model Organisms for Emergent Misalignment” by Anna Soligo, Edward Turner, Mia Taylor, Senthooran Rajamanoharan, Neel Nanda
12:16
2025/6/17
LessWrong (30+ Karma)
Chapters
What is Emergent Misalignment and Why Does It Matter?
Introduction to the Study
How Coherent is Emergent Misalignment?
Can Emergent Misalignment Occur in Smaller Models?
Does Full Supervised Finetuning Lead to Emergent Misalignment?
Is a Single Rank-1 LoRA Adapter Enough for EM?
What's Next for Research on Emergent Misalignment?
What Are the Key Contributions of This Work?
Acknowledgments