
“Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?” by Alex Mallen, charlie_griffin, Buck Shlegeris

2025/3/26

LessWrong (30+ Karma)
