
“Cognitive Biases Contributing to AI X-risk — a deleted excerpt from my 2018 ARCHES draft” by Andrew_Critch

2024/12/4
LessWrong (30+ Karma)


**Preface** Several friends have asked me what psychological effects I think could distort human judgement about x-risk. This isn't a complete answer, but in 2018 I wrote a draft of "AI Research Considerations for Human Existential Safety" (ARCHES) that included an overview of cognitive biases I thought (and still think) will impair AI risk assessments. Many cognitive bias experiments had already failed to reproduce well in the psychology reproducibility crisis, so I thought it would be a good idea to point out some that did reproduce well and that were obviously relevant to AI risk. Unfortunately, one prospective coauthor asked that this content be removed, out of concern that academic AI researchers would be too unfamiliar with the relevant field: cognitive science. That coauthor ultimately decided not to appear on ARCHES at all, and in retrospect I probably should have added this material back. Just to [...]


Outline:

(00:08) Preface

(01:42) Notes on definitions

(02:25) Risk Type 1b: Unrecognized prepotence

(06:38) Risk Type 1c: Unrecognized misalignment


First published: December 3rd, 2024

Source: https://www.lesswrong.com/posts/aoEnDEmoKCK9S99hL/cognitive-biases-contributing-to-ai-x-risk-a-deleted-excerpt

---

Narrated by TYPE III AUDIO.