“It’s hard to make scheming evals look realistic” by Igor Ivanov, dan_moken
07:47
2025/6/2
LessWrong (Curated & Popular)
Chapters
How Do We Detect Scheming in LLMs?
Our Pipeline
What Strategies Improve the Realism of Evaluation Scenarios?
Example of Grading a Rewritten Scenario