This is a link post.
**Open Philanthropy Technical AI Safety RFP - $40M Available Across 21 Research Areas**
Open Philanthropy is launching a big new Request for Proposals for technical AI safety research, with plans to fund roughly $40M in grants over the next 5 months, and available funding for substantially more depending on application quality. Applications (here) start with a simple 300-word expression of interest and are open until April 15, 2025.
**Overview**
We're seeking proposals across 21 different research areas, organized into five broad categories:
Adversarial Machine Learning: *Jailbreaks and unintentional misalignment; *Control evaluations; *Backdoors and other alignment stress tests; *Alternatives to adversarial training; Robust unlearning
Exploring sophisticated misbehavior of LLMs: *Experiments on alignment faking; *Encoded reasoning in CoT and inter-model communication; Black-box LLM psychology; Evaluating whether models can hide dangerous behaviors; Reward hacking of human oversight
Model transparency: Applications of white-box [...]
Outline:
(00:14) Open Philanthropy Technical AI Safety RFP - $40M Available Across 21 Research Areas
(00:47) Overview
First published: February 6th, 2025
---
Narrated by TYPE III AUDIO.