“The 80/20 playbook for mitigating AI scheming risks in 2025” by Charbel-Raphaël
10:42
2025/6/1
LessWrong (30+ Karma)
Chapters
What Are the Key Mitigation Strategies for AI Scheming Risks?
Why Are Architectural Choices Crucial for Ex-ante Mitigation?
How Do Control Systems Help in Post-hoc Containment?
What Are White Box Techniques and How Do They Detect Post-hoc Issues?
Can Black Box Techniques Enhance AI Security?
What Is Sandbagging and How Can We Avoid It?
How Effective Are These Combined Strategies in Reducing AI Risks?
What Is the Real Challenge in Enforcing AI Safety Measures?