One of my takeaways from EA Global this year was that most alignment people aren't explicitly focused on LLM-based agents (LMAs)[1] as a route to takeover-capable AGI. I want to better understand this position, since I estimate this path to AGI as likely enough (maybe around 60%) to be worth specific focus and concern. Two reasons people might not care about aligning LMAs in particular:
1. Thinking this route to AGI is quite possible, but that aligning LLMs mostly covers aligning LLM agents.
2. Thinking LLM-based agents are unlikely to be the first takeover-capable AGI.
I'm aware of arguments/questions like "Have LLMs Generated Novel Insights?", "LLM Generality is a Timeline Crux", and LLMs' weakness on what Steve Byrnes calls discernment: the ability to tell their better ideas/outputs from their worse ones.[2] I'm curious if these or other ideas play a major role in your thinking. I'm even more curious about [...]
The original text contained 3 footnotes which were omitted from this narration.
First published: March 2nd, 2025
---
Narrated by TYPE III AUDIO.