In this new paper, I discuss what it would mean for AI systems to be persons — entities with properties like agency, theory-of-mind, and self-awareness — and why this matters for alignment. In this post, I say a little more about why you should care. The existential safety literature focuses on the problems of control and alignment, but these framings may be incomplete and/or untenable if AI systems are persons. The typical story goes approximately as follows.

AI x-safety problem (the usual story):
1. Humans will (soon) build AI systems more intelligent/capable/powerful than humans;
2. These systems will be goal-directed agents;
3. These agents' goals will not be the goals we want them to have (because getting intelligent agents to have any particular goal is an unsolved technical problem);
4. This will lead to misaligned AI agents disempowering humans (because this is instrumentally useful for their actual goals).
Solution [...]
First published: January 26th, 2025
Source: https://www.lesswrong.com/posts/Gh4fyTxGCpWS4jT82/why-care-about-ai-personhood
---
Narrated by TYPE III AUDIO.