“Prover-Estimator Debate: A New Scalable Oversight Protocol” by Jonah Brown-Cohen, Geoffrey Irving

2025/6/17

Audio note: this article contains 33 uses of latex notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.

Linkpost to arXiv: https://arxiv.org/abs/2506.13609.

Summary: We present a scalable oversight protocol where honesty is incentivized at equilibrium. Prior debate protocols allowed a dishonest AI to force an honest AI opponent to solve a computationally intractable problem in order to win. In contrast, prover-estimator debate incentivizes honest equilibrium behavior, even when the AIs involved (the prover and the estimator) have similar compute available. Our results rely on a stability assumption, which roughly says that arguments should not hinge on arbitrarily small changes in estimated probabilities. This assumption is required for usefulness, but not for safety: even if stability is not satisfied, dishonest behavior will be disincentivized by the protocol.

How can we correctly reward desired behaviours for AI [...]

Outline:

(02:46) The Prover-Estimator Debate Protocol

(06:09) Completeness

(07:26) Soundness

(08:48) Future research

First published: June 17th, 2025

Source: https://www.lesswrong.com/posts/8XHBaugB5S3r27MG9/prover-estimator-debate-a-new-scalable-oversight-protocol

---

Narrated by TYPE III AUDIO.

Images from the article: Diagram comparing original recursive debate with prover-estimator debate structures showing decision trees. Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

LessWrong (30+ Karma)