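[Artificial Intelligence] What Is Reward Hacking in Reinforcement Learning | Reward Hacking | Latest Long-Form Post by Former OpenAI Safety Lead Lilian Weng | Reward Functions | RLHF | Goodhart's Law | ICRH | Mitigations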
2024/12/6
最佳拍档