“A long list of concrete projects and open problems in evals” by Marius Hobbhahn

2025/3/23
LessWrong (30+ Karma)

We made a long list of concrete projects and open problems in evals with 100+ suggestions!

https://docs.google.com/document/d/1gi32-HZozxVimNg5Mhvk4CvW4zq8J12rGmK_j2zxNEg/edit?usp=sharing

We hope that makes it easier for people to get started in the field and to coordinate on projects.

Over the last 4 months, we collected contributions from 20+ experts in the field, including people who work at Apollo, METR, Redwood, RAND, AISIs, frontier labs, SecureBio, AI Futures Project, many academics, and independent researchers (suggestions have not necessarily been made in an official capacity). The doc has comment access, and further well-intentioned contributions are welcome!

Here is a screenshot of the table of contents:


First published: March 22nd, 2025

Source: https://www.lesswrong.com/posts/LhnqegFoykcjaXCYH/a-long-list-of-concrete-projects-and-open-problems-in-evals

---

Narrated by TYPE III AUDIO.


Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.