“Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)” by Neel Nanda, lewis smith, Senthooran Rajamanoharan, Arthur Conmy, Callum McDougall, Tom Lieberum, János Kramár, Rohin Shah
57:32
2025/4/12
LessWrong (Curated & Popular)
Chapters
What's the TL;DR?
Why Did They Start This Research?
What Was Their Main Task?
What Did They Conclude?
How Did They Train Chat SAEs?
Can SAEs Help with Out-of-Distribution Probing?
What Were the Results of the Probing?
Is It Surprising That SAEs Didn't Work?
How Can SAEs Be Used for Dataset Debugging?
What Happens When You Remove High Frequency Latents?
How Did They Evaluate Interpretability?
What Are the Final Conclusions?