“Reducing LLM deception at scale with self-other overlap fine-tuning” by Marc Carauleanu, Diogo de Lucena, Gunnar_Zarncke, Judd Rosenblatt, Mike Vaiana, Cameron Berg
Duration: 12:23
Published: 2025/3/13
Feed: LessWrong (30+ Karma)
Chapters
What's the Summary of the Research on Reducing LLM Deception?
How Was the LLM Experimental Setup Designed?
What Were the LLM Experimental Results?
How Did the SOO Fine-Tuning Impact LLM Capabilities?
What Generalisation Experiments Were Conducted?
What Are Some Example Outputs from the LLMs?
What Are the Key Takeaways from This Research?