“7+ tractable directions in AI control” by Julian Stastny, ryan_greenblatt
25:13
2025/4/28
LessWrong (30+ Karma)
Chapters
Can You Do Elicitation Without Learning and Study Overelicitation?
How Can We Generate Indistinguishable Synthetic Inputs?
What Are the Implications of Teaching Models Synthetic Facts?
Why Should We Further Study Exploration Hacking?
Is Malign AI Agent Substitution a Real Concern?
How Does Data Poisoning Impact AI Control?
Bonus: 3 Slightly More Challenging Directions
Can Few-Shot Catastrophe Prevention Be Achieved?
What Countermeasures Can We Develop Against Collusion?
Steganography in AI: How Can We Detect and Prevent It?