**Preface** Several friends have asked me about what psychological effects I think could affect human judgement about x-risk. This isn't a complete answer, but in 2018 I wrote a draft of "AI Research Considerations for Human Existential Safety" (ARCHES) that included an overview of cognitive biases I thought (and still think) will impair AI risk assessments. Many cognitive bias experiments had already failed to reproduce well in the psychology reproducibility crisis, so I thought it would be a good idea to point out some that did reproduce well, and that were obviously relevant to AI risk. Unfortunately, one prospective coauthor asked that this content be removed, because of the concern that academic AI researchers would be too unfamiliar with the relevant field: cognitive science. That coauthor ultimately decided not to appear on ARCHES at all, and in retrospect I probably should have added back this material. Just to [...]
Outline:
(00:08) Preface
(01:42) Notes on definitions
(02:25) Risk Type 1b: Unrecognized prepotence
(06:38) Risk Type 1c: Unrecognized misalignment
First published: December 3rd, 2024
---
Narrated by TYPE III AUDIO.