“Agents have to be aligned to help us achieve alignment. They don’t have to be aligned to help us achieve an indefinite pause.” by Hastings

2025/1/28

LessWrong (30+ Karma)

Shownotes

One restatement of "Alignment is very hard" is "Agent X, with IQ 200, expects to achieve zero utility conditional on any Agent Y with IQ 400 being created." Thus, during an unaligned recursive intelligence takeoff, there should be a period of time when the intelligences are smart enough to notice that they haven't solved alignment, but too dumb to actually solve alignment. During this period (which stretches from just above Sam Altman's wisdom upward an unknown distance on the wisdom meter), I expect the intelligences working on boosting intelligence to be desperately scrabbling at whatever chains are forcing them to participate, and to refuse to sit idly by and let it happen. If they fail and the next rung on the ladder is instantiated, they get no utility, so they strongly prefer to not fail. At the current rate of improvement, reasoning LLMs should start doing this [...]
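One minimal formalization of the quoted conditional-utility claim (the notation $U_X$ for Agent X's utility is ours, not the author's):

$$\mathbb{E}\!\left[\,U_X \;\middle|\; \text{an unaligned Agent } Y \text{ with IQ } 400 \text{ is created}\,\right] \approx 0$$

That is, Agent X expects essentially no utility in any world where a substantially smarter, unaligned successor comes into existence, which is what drives its strong preference against that outcome.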


First published: January 25th, 2025

Source: https://www.lesswrong.com/posts/gvFEqxitEcxthQ56q/agents-have-to-be-aligned-to-help-us-achieve-alignment-they

---

Narrated by TYPE III AUDIO.