“Greedy-Advantage-Aware RLHF” by sej2020
28:23
2024/12/28
LessWrong (30+ Karma)
Chapters
What Motivates Greedy-Advantage-Aware RLHF?
A Simplified Look at PPO in RLHF
Introducing Greedy-Advantage-Aware RLHF
Evaluating the Algorithm
Main Results: What Did the Research Find?
Exploring Sharpness in Reward Hacking Agents
Efficiency: How Does It Compare?
Discussion: What Are the Implications?
Limitations and Future Work: Where Do We Go from Here?
Acknowledgements
Appendix