“How training-gamers might function (and win)” by Vivek Hebbar
25:38
2025/4/12
LessWrong (30+ Karma)
Chapters
What are the core claims of reward-seeking models?
Characterizing reward-seekers: How do they operate?
When do models think about reward, and why?
What should we expect from schemers in AI models?
How do terminal reward seekers behave in unfamiliar scenarios?
What factors influence scheming and terminal reward seeking?
A story about goal reflection: What can we learn?
Thoughts on compression: How does it impact reward-seeking behavior?
Appendix: Distribution over worlds
Canary string
Acknowledgements