“Vestigial reasoning in RL” by Caleb Biddulph
17:10
2025/4/14
LessWrong (30+ Karma)
Chapters
Why is RL not as smart as I thought?
How does vestigial reasoning occur in models?
Can we demonstrate vestigial reasoning through experiments?
What role does reward correlation play in reasoning?
Why does process supervision lead to longer CoTs?
What are the key takeaways from this discussion?