“Open Philanthropy Technical AI Safety RFP - $40M Available Across 21 Research Areas” by jake_mendel, maxnadeau, Peter Favaloro

2025/2/6

This is a link post.

**Open Philanthropy Technical AI Safety RFP - $40M Available Across 21 Research Areas**

Open Philanthropy is launching a big new Request for Proposals for technical AI safety research, with plans to fund roughly $40M in grants over the next 5 months, and available funding for substantially more depending on application quality. Applications start with a simple 300-word expression of interest and are open until April 15, 2025.

**Overview**

We're seeking proposals across 21 different research areas, organized into five broad categories:

Adversarial Machine Learning
* Jailbreaks and unintentional misalignment
* Control evaluations
* Backdoors and other alignment stress tests
* Alternatives to adversarial training
* Robust unlearning

Exploring sophisticated misbehavior of LLMs
* Experiments on alignment faking
* Encoded reasoning in CoT and inter-model communication
* Black-box LLM psychology
* Evaluating whether models can hide dangerous behaviors
* Reward hacking of human oversight

Model transparency
* Applications of white-box [...]

Outline:

(00:14) Open Philanthropy Technical AI Safety RFP - $40M Available Across 21 Research Areas

(00:47) Overview

First published: February 6th, 2025

Source: https://www.lesswrong.com/posts/wbJxRNxuezvsGFEWv/open-philanthropy-technical-ai-safety-rfp-usd40m-available

---

Narrated by TYPE III AUDIO.

LessWrong (30+ Karma)