“FLAKE-Bench: Outsourcing Awkwardness in the Age of AI” by annas, Twm Stone

2025/4/1

A key part of modern social dynamics is flaking at short notice. However, anxiety in coming up with believable and socially acceptable reasons to do so can instead lead to ‘ghosting’, awkwardness, or implausible excuses, risking emotional harm and resentment in the other party. The ability to delegate this task to a Large Language Model (LLM) could substantially reduce friction and enhance the flexibility of the user's social life while greatly minimising the aforementioned creative burden and moral qualms.

We introduce FLAKE-Bench, an evaluation of models’ capacity to effectively, kindly, and humanely extract themselves from a diverse set of social, professional and romantic scenarios. We report the efficacy of 10 frontier or recently-frontier LLMs in bailing on prior commitments, because nothing says “I value our friendship” like having AI generate your cancellation texts. We open-source FLAKE-Bench on GitHub to support future research, and the full paper is available [...]

Outline:

(01:33) Methodology

(02:15) Key Results

(03:07) The Grandmother Mortality Singularity

(03:35) Conclusions

First published: April 1st, 2025

Source: https://www.lesswrong.com/posts/niJCS6sSAF2i4sDCY/flake-bench-outsourcing-awkwardness-in-the-age-of-ai

---

Narrated by TYPE III AUDIO.

Images from the article: Prior state-of-the-art methods for excusing oneself (Munroe, 2025), which demonstrate the potential for LLMs to significantly advance the field. Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

LessWrong (30+ Karma)

Can AI Help Us Flake More Gracefully?

What Did the Tests Reveal?

The Grandmother Mortality Singularity

What Does the Future Hold for FLAKE-Bench?
