Audio narrations of LessWrong posts.
In this episode of our podcast, Timothy Telleen-Lawton and I talk to Oliver Habryka of Lightcone Infrastructure.
John: So there's this thing about interp, where most of it seems to not be handling one of the stan
Introduction: Focusmate changed my life. I started using it mid-2023 and have been a power user since then.
GPT-4o tells you what it thinks you want to hear. The results of this were rather ugly. You get ex
tl;dr This post is an update on the Proceedings of ILIAD, a conference journal for AI alignment research.
In this post, we list 7 of our favorite easy-to-start directions in AI control. (Really, projects t
This is post 2 of a sequence on my framework for doing and thinking about research. Start here. Bef
Our universe is probably a computer simulation created by a paperclip maximizer to map the spectrum
This is a link post. I've gotten a lot of value out of the details of how other people use LLMs, so
This is a link post. So this post is an argument that multi-decade timelines are reasonable, and the
This is a link post. Dario Amodei posted a new essay titled "The Urgency of Interpretability" a couple
For a lay audience, but I've seen a surprising number of knowledgeable people fretting over depress
This is the first post in a sequence about how I think about and break down my research process. Po
As I think about "what to do about AI x-risk?", some principles that seem useful to me: Short time
This is a personal post and does not necessarily reflect the opinion of other members of Apollo Research.
A common claim is that concern about [X] ‘distracts’ from concern about [Y]. This is often used as
Enjoy it while it lasts. The Claude 4 era, or the o4 era, or both, are coming soon. Also, welcome t
What in retrospect seem like serious moral crimes were often widely accepted while they were happening.
Something's changed about reward hacking in recent systems. In the past, reward hacks were usually a
We've published an essay series on what we call the intelligence curse. Most content is brand new,