“MONA: Managed Myopia with Approval Feedback” by Seb Farquhar, David Lindner, Rohin Shah
17:10
2025/1/23
LessWrong (30+ Karma)
Chapters
Can Multi-Step Reward Hacking Be Reduced to Single-Step?
Exploring Different Kinds of Myopic Agents
Approval vs. Reward: What's the Difference?
Experimental Evidence: What Does It Show?
What Are the Limitations of MONA?
Where Do We Go from Here?