
LessWrong (30+ Karma)

Audio narrations of LessWrong posts.

Episodes

Total: 1064

One key hope for mitigating risk from misalignment is inspecting the AI's behavior, noticing that it…

There's an implicit model I think many people have in their heads of how everyone else behaves. As…

This is a link post. Definition: In 1954, Roger Bannister ran the first officially sanctioned sub-4-minute mile…

Epistemic status: These are results of a brief research sprint and I didn't have time to investigate…

Dario Amodei, CEO of Anthropic, recently worried about a world where only 30% of jobs become automated…

Epistemic status: a model I find helpful to make sense of disagreements and, sometimes, resolve them…

TL;DR: If we optimize a steering vector to induce a language model to output a single piece of harmful…

TL;DR: I claim that many reasoning patterns that appear in chains-of-thought are not actually used…

Thanks to Linda Linsefors for encouraging me to write my story. Although it might not generalize to…

I'm graduating from UChicago in around 60 days, and I've been thinking about what I've learned these…

Introduction: This is a nuanced “I was wrong” post. Something I really like about AI safety and EA/rationalist…

Epistemic status: Noticing confusion. There is little discussion happening on LessWrong with regard to…

In this post I lay out a concrete vision of how reward-seekers and schemers might function. I describe…

Cross-posted from Substack. AI job displacement will affect young people first, disrupting the usual…

Paper is good. Somehow, a blank page and a pen make the universe open up before you. Why paper has…

Summary: OpenAI recently released the Responses API. Most models are available through both the new…

It's generally agreed that as AIs get more capable, risks from misalignment increase. But there are…

Google Lays Out Its Safety Plans. I want to start off by reiterating kudos to Google for actually…

Authors: Eli Lifland, Nikola Jurkovic[1], FutureSearch[2]. This is supporting research for AI 2027. We…