“AI companies are unlikely to make high-assurance safety cases if timelines are short” by ryan_greenblatt

2025/1/23
LessWrong (30+ Karma)

One hope for keeping existential risks low is to get AI companies to (successfully) make high-assurance safety cases: structured and auditable arguments that an AI system is very unlikely to result in existential risks given how it will be deployed.[1] Concretely, once AIs are quite powerful, high-assurance safety cases would require making a thorough argument that the level of (existential) risk caused by the company is very low; perhaps they would require that the total chance of existential risk over the lifetime of the AI company[2] is less than 0.25%[3][4].

The idea of making high-assurance safety cases (once AI systems are dangerously powerful) is popular in some parts of the AI safety community and a variety of work appears to focus on this. Further, Anthropic has expressed an intention (in their RSP) to "keep risks below acceptable levels"[5] and there is a common impression that Anthropic would pause [...]


Outline:

(03:19) Why are companies unlikely to succeed at making high-assurance safety cases in short timelines?

(04:14) Ensuring sufficient security is very difficult

(04:55) Sufficiently mitigating scheming risk is unlikely

(09:35) Accelerating safety and security with earlier AIs seems insufficient

(11:58) Other points

(14:07) Companies likely won't unilaterally slow down if they are unable to make high-assurance safety cases

(18:26) Could coordination or government action result in high-assurance safety cases?

(19:55) What about safety cases aiming at a higher risk threshold?

(21:57) Implications and conclusions

The original text contained 20 footnotes which were omitted from this narration.


First published: January 23rd, 2025

Source: https://www.lesswrong.com/posts/neTbrpBziAsTH5Bn7/ai-companies-are-unlikely-to-make-high-assurance-safety

Narrated by TYPE III AUDIO.