“SAE on activation differences” by Santiago Aranguri, jacob_drori, Neel Nanda
10:54
2025/7/1
LessWrong (30+ Karma)
Chapters
Why Do We Need to Understand Model Changes?
What is SAE on Activation Differences?
How Do We Identify Relevant Latents?
Exploring KL Dashboards
What is an Inhibitory Latent?
Understanding the Roleplay Latent
The Uncertainty Latent: What Does It Reveal?