“Try training token-level probes” by StefanHex
22:45
2025/4/14
LessWrong (30+ Karma)
Chapters
What Surprising Results Did the Research Yield?
Why Train Token-Level Probes?
How Was the Training Data Generated?
What Methods Were Used to Train the Probe?
What Were the Initial Results?
How Did the Probe Perform on Metrics?
What Insights Did the Probe Score Analysis Provide?
What Are the Key Limitations of This Approach?
What Are the Common Failure Modes of the Probe?
How Does Regularization Impact the Mean-Probe?
How Well Does the Probe Generalize?
Does the Individual-Token Probe Outperform the Mean Probe?