We're sunsetting PodQuest on 2025-07-28. Thank you for your support!
Export Podcast Subscriptions

LessWrong (Curated & Popular)

Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you

Episodes

Total: 543

“Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or o

This is the abstract and introduction of our new paper, with some discussion of implications for AI

The Cake. Imagine that I want to bake a chocolate cake, and my sole goal in my entire lightcone and e

This post offers an accessible model of the psychology of character-trained LLMs like Claude. Epistemic

This is a link post. This is a blog post reporting some preliminary work from the Anthropic Alignment

One hope for keeping existential risks low is to get AI companies to (successfully) make high-assura

Cross-posted from Telescopic Turnip. As we all know, humans are terrible at building butterflies. We c

This is a link post. A story I wrote about living through the transition to utopia. This is the one st

This is a link post. Present alongside President Trump: Sam Altman, Larry Ellison (Oracle executive ch

The AI Control Agenda, in its own words:… we argue that AI labs should ensure that powerful AIs are

I think a lot of people have heard so much about internalized prejudice and bias that they think the

(Both characters are fictional, loosely inspired by various traits from various real people. Be care

From AI scientist to AI research fleet. Research automation is here (1, 2, 3). We saw it coming and p

So we want to align future AGIs. Ultimately we’d like to align them to human values, but in the shor

Traditional economics thinking has two strong principles, each based on abundant historical data: Pr

All quotes, unless otherwise marked, are Tolkien's words as printed in The Letters of J.R.R. Tol

The anonymous review of The Anti-Politics Machine published on Astral Codex X focuses on a case stud

Crossposted from my personal blog. I was inspired to cross-post this here given the discussion that

TL;DR: There may be a fundamental problem with interpretability work that attempts to understand neu

Funding for $150bn training systems just turned less speculative, with OpenAI o3 reaching 25% on Fro