This is a link post.I’ve spent a lot of the last few years working on issues related to acausal cooperation. With LLMs being clearly dominant over recent years, I’ve now led a team to make a benchmark to figure out how good LLMs are at decision theory and whether and when they lean more CDT or EDT. We hope to expand this dataset in the future, including by incorporating questions that try to measure the updatelessness dimension. Hopefully, this dataset will be useful for future interventions aimed at improving acausal interactions. Abstract: We introduce a dataset of natural-language questions in the decision theory of so-called Newcomb-like problems. Newcomb-like problems include, for instance, decision problems in which an agent interacts with a similar other agent, and thus has to reason about the fact that the other agent will likely reason in similar ways. Evaluating LLM reasoning about Newcomb-like problems is [...]
The original text contained 8 images which were described by AI.
First published: December 16th, 2024
---
Narrated by TYPE III AUDIO).
Images from the article:
)
)
)
)
)
)
)
)
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts), or another podcast app.