
Is AI Weird Enough to Actually Make Scientific Discoveries?

2025/3/9
The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis

AI Deep Dive Transcript
People
Neil Lawrence
Thomas Wolff
Topics
Thomas Wolff: I think current AI models, such as large language models (LLMs), hold vast amounts of knowledge and can answer questions brilliantly, like excellent students, but they are limited when it comes to making genuine scientific breakthroughs. They excel at reasoning and predicting within known frameworks of knowledge, yet lack the ability to pose disruptive new questions or to challenge existing theories. Many of history's great scientific discoveries came from questioning and breaking through established knowledge, and that is precisely what today's AI models are missing. Copernicus's heliocentrism, for example, challenged the prevailing view of the cosmos, and the application of CRISPR likewise grew out of reinterpreting and creatively repurposing what was already known. These breakthroughs were not simple accumulations of knowledge but fundamental shifts in the existing paradigm. Current AI evaluations typically focus on a model's ability to answer known questions, which pushes models toward being "excellent students" rather than "scientists" capable of independent, creative thought. To achieve real scientific breakthroughs, we need to develop AI models that can challenge their own training data, propose bold hypotheses, and generalize from just a few hints. What we need is a "B student" who notices and questions the details everyone else overlooks, not an "A student" who can answer every question perfectly.

Neil Lawrence: My experience competing in Academic Decathlon in high school gave me a deeper appreciation of the different kinds of excellence. People who shine in academic competitions tend to be very good at scoring within established rules and frameworks, but they are not necessarily innovative or entrepreneurial. Many of history's great innovators were not "good students" in the traditional sense; they had the courage and ability to challenge the status quo and break with convention. Comparing today's LLMs to those academic-competition stars: they hold vast amounts of knowledge and process information quickly, but they lack the capacity for independent thought and creation. So we need to think about how to get LLMs to think differently: does that require changing how we prompt them, or an entirely different architecture? The pattern of scientific discovery is also worth pondering: is it lone individual geniuses who drive science forward, or could large numbers of AI models working together, colliding and iterating, eventually produce breakthroughs? Either way, Thomas Wolff's perspective gives us valuable food for thought and deserves deeper discussion.

Shownotes Transcript


Today on the AI Daily Brief, will AI actually lead to scientific breakthroughs or not? The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. ♪

Hello, friends. Welcome back to another long reads episode of the AI Daily Brief. This week, we have something really interesting. Hugging Face co-founder Thomas Wolff just wrote a blog post this week challenging a very prominent essay by Anthropic CEO Dario Amodei.

Dario had written this essay, which we read here as well, called Machines of Loving Grace, in which he talked about what he thought AI was going to do positively in the next century. One of the big areas was scientific breakthrough, and Thomas Wolff, it turns out, doesn't agree. The piece that he wrote was called The Einstein AI Model. And so first, what we're going to do, as we always do, is we're going to turn it over to the ElevenLabs version of me to listen to this piece, and then we will come back and talk about it a little bit more.

I shared a controversial take the other day at an event, and I decided to write it down in a longer format. I'm afraid AI won't give us a compressed 21st century. The compressed 21st century comes from Dario's Machines of Loving Grace, and if you haven't read it, you probably should. It's a noteworthy essay.

In a nutshell, the essay claims that over a year or two, we'll have a country of Einsteins sitting in a data center. And it will result in a compressed 21st century during which all the scientific discoveries of the 21st century will happen in the span of only 5 to 10 years. I read this essay twice. The first time I was totally amazed. AI will change everything in science in five years, I thought.

A few days later I came back to it, and while rereading I realized that much of it seemed like wishful thinking. What we'll actually get, in my opinion, is a country of yes-men on servers, if we just continue on current trends. But let me explain the difference with a small part of my personal story.

I've always been a straight-A student. Coming from a small village, I joined the top French engineering school before getting accepted to MIT for my PhD. School was always quite easy for me. I could just get where the professor was going, where the exam creators were taking us, and could predict the test questions beforehand. That's why, when I eventually became a researcher, more specifically a PhD student, I was completely shocked to discover that I was a pretty average, underwhelming, mediocre researcher.

While many colleagues around me had interesting ideas, I was constantly hitting a wall. If something was not written in a book, I could not invent it unless it was a rather useless variation of a known theory. More annoyingly, I found it very hard to challenge the status quo, to question what I had learned. I was no Einstein; I was just very good at school. Or maybe I was no Einstein in part because I was so good at school.

History is filled with geniuses struggling during their studies. Edison was called "addled" by his teacher. Barbara McClintock got criticized for "weird thinking" before winning a Nobel Prize. Einstein failed his first attempt at the ETH Zurich entrance exam. And the list goes on. The main mistake people usually make is thinking Newton or Einstein were just scaled-up good students, that a genius comes to life when you linearly extrapolate a top 10% student.

This perspective misses the most crucial aspect of science: the skill to ask the right questions and to challenge even what one has learned. A real scientific breakthrough is Copernicus proposing, against all the knowledge of his day (in ML terms, we would say despite all his training data), that the Earth may orbit the Sun rather than the other way around.

To create an Einstein in a data center, we don't just need a system that knows all the answers, but rather one that can ask questions nobody else has thought of or dared to ask. One that writes, what if everyone is wrong about this, when all textbooks, experts, and common knowledge suggest otherwise?

Just consider the crazy paradigm shift of special relativity and the guts it took to formulate a first axiom like, "Let's assume the speed of light is constant in all frames of reference," defying the common sense of those days and even of today.

Or take CRISPR, generally considered just an adaptive bacterial immune system since the 80s, until, 25 years after its discovery, Jennifer Doudna and Emmanuelle Charpentier proposed using it for something much broader and more general, gene editing, leading to a Nobel Prize.

This type of realization, "We've known XX does YY for years, but what if we've been wrong about it all along? Or what if we could apply it to an entirely different concept instead?", is an example of outside-of-knowledge thinking, or paradigm shift, which is essentially what drives the progress of science. Such paradigm shifts happen rarely, maybe one or two times a year, and are usually awarded Nobel Prizes once everybody has taken stock of their impact. However rare they are, I agree with Dario in saying that they take the lion's share in defining scientific progress over a given century, while the rest is mostly noise.

Now let's consider what we're currently using to benchmark recent AI model intelligence improvements. Some of the most recent AI tests are, for instance, the grandiosely named "Humanity's Last Exam" or "Frontier Math." They consist of very difficult questions, usually written by PhDs, but with clear, closed-ended answers. These are exactly the kinds of exams where I excelled in my field. These benchmarks test whether AI models can find the right answers to a set of questions we already know the answers to.

However, real scientific breakthroughs will come not from answering known questions, but from asking challenging new questions and questioning common conceptions and previous ideas. Remember Douglas Adams' Hitchhiker's Guide? The answer is apparently 42, but nobody knows the right question. That's research in a nutshell.

In my opinion, this is one of the reasons LLMs, while they already have all of humanity's knowledge and memory, haven't generated any new knowledge by connecting previously unrelated facts. They're mostly doing "manifold filling" at the moment, filling in the interpolation gaps between what humans already know, somehow treating knowledge as an intangible fabric of reality. We're currently building very obedient students, not revolutionaries.
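The "manifold filling" idea above can be illustrated with a toy sketch (my own illustration, not anything from the episode or from Wolff's post): a simple model fit to known data points does fine *between* them, interpolation, but degrades badly once asked to go *beyond* them. The choice of sine as the "true" function and the piecewise-linear model are purely illustrative assumptions.

```python
import math

# Known data: samples of a smooth "ground truth" on [0, 5].
# This stands in (very loosely) for a model's training distribution.
xs = [i * 0.5 for i in range(11)]        # 0.0, 0.5, ..., 5.0
ys = [math.sin(x) for x in xs]

def linear_model(x):
    """Piecewise-linear fit through the known points; beyond the data,
    the last segment's slope is simply extended (naive extrapolation)."""
    if x >= xs[-1]:
        slope = (ys[-1] - ys[-2]) / (xs[-1] - xs[-2])
        return ys[-1] + slope * (x - xs[-1])
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])

err_interp = abs(linear_model(2.25) - math.sin(2.25))   # inside the data
err_extrap = abs(linear_model(9.0) - math.sin(9.0))     # far beyond it

print(f"interpolation error: {err_interp:.3f}")
print(f"extrapolation error: {err_extrap:.3f}")
```

The interpolation error comes out tiny while the extrapolation error is enormous, which is the rough analogue of Wolff's claim: filling gaps between known facts is easy; predicting outside the training manifold is where things break.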

This is perfect for the field's main goal today: creating great assistants and overly compliant helpers. But until we find a way to incentivize them to question their knowledge and propose ideas that potentially go against past training data, they won't give us scientific revolutions yet.

If we want scientific breakthroughs, we should probably explore how we're currently measuring the performance of AI models and move to a measure of knowledge and reasoning able to test whether scientific AI models can, for instance:

1. Challenge their own training-data knowledge
2. Take bold counterfactual approaches
3. Make general proposals based on tiny hints
4. Ask non-obvious questions that lead to new research paths

We don't need an A-plus student who can answer every question with general knowledge. We need a B student who sees and questions what everyone else missed.
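The four capabilities above could in principle become rubric dimensions rather than answer-matching checks. Here is a hypothetical sketch of what such a rubric might look like as a data structure; the criterion names, scores, and equal weighting are all my own illustrative assumptions, not an existing benchmark.

```python
from dataclasses import dataclass

# Hypothetical rubric dimensions mirroring the four capabilities above.
CRITERIA = [
    "challenges_training_data",   # disputes something it "knows"
    "counterfactual_boldness",    # explores what-if directions
    "generalizes_from_hints",     # big proposals from small evidence
    "asks_new_questions",         # opens a new research path
]

@dataclass
class RubricScore:
    scores: dict  # criterion name -> score in [0, 1]

    def overall(self) -> float:
        """Equal-weight average: a perfectly obedient answer-matcher
        that never questions anything scores 0, however accurate it is."""
        return sum(self.scores.get(c, 0.0) for c in CRITERIA) / len(CRITERIA)

obedient_student = RubricScore({c: 0.0 for c in CRITERIA})
curious_b_student = RubricScore({
    "challenges_training_data": 0.8,
    "counterfactual_boldness": 0.6,
    "generalizes_from_hints": 0.5,
    "asks_new_questions": 0.9,
})

print(f"{obedient_student.overall():.2f}")   # expect 0.00
print(f"{curious_b_student.overall():.2f}")  # expect 0.70
```

The hard part, of course, is not the bookkeeping but scoring these dimensions reliably; the sketch only shows how such a measure would differ in shape from today's closed-answer benchmarks.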

Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001.

Centralize security workflows, complete questionnaires up to 5x faster, and proactively manage vendor risk. Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time.

For a limited time, this audience gets $1,000 off Vanta at vanta.com slash nlw. That's v-a-n-t-a dot com slash nlw for $1,000 off.

There is a massive shift taking place right now from using AI to help you do your work to deploying AI agents to just do your work for you. Of course, in that shift, there is a ton of complication. First of all, of these seemingly thousands of agents out there, which are actually ready for primetime? Which can do what they promise? And beyond even that, which of these agents will actually fit in my workflows? What can integrate with the way that we do business right now? These are the questions at the heart of the Superintelligent agent readiness audit.

We've built a voice agent that can scale across your entire team, mapping your processes, better understanding your business, figuring out where you are with AI and agents right now in order to provide recommendations that actually fit you and your company.

Our proprietary agent consulting engine and agent capabilities knowledge base will leave you with action plans, recommendations, and specific follow-ups that will help you make your next steps into the world of a new agentic workforce. To learn more about Super's agent readiness audit, email agent at bsuper.ai or just email me directly, nlw at bsuper.ai, and let's get you set up with the most disruptive technology of our lifetimes. All right, now we are back to the real NLW.

I absolutely love this piece. I actually had a weirdly proximate experience to this, if you'll indulge me for a minute. When I was in high school, I did a thing called Academic Decathlon. It's a national competition in the United States. At the time that I was doing it, there were something like 25,000 kids around the country, and it was very competitive.

A version of it was later featured in a Spider-Man movie, but that's neither here nor there. Basically, this thing was a 10-event academic competition that kids would study all year. And when I say study, I mean 5, 6, 10 hours a day. To put a fine point on this, I would literally skip school to study. I would go to school, but instead of going to my classes, I would go to the coach's office and just sit there and study all day.

For two years in a row, I was top five in the country, and I had a chance to learn a lot about the other kids who were also at the top of the list. The one thing they all shared was an insane willingness to work hard, but most of them, as I would later find out by following them through college and then their careers, were very inside-the-box thinkers. They came from schools that had good programs, that knew what to do to turn out champions, and so they put in the work and got out the result.

I had always sort of thought that those people would go on to be very successful. And I guess by the qualifications of following a very specific career path, getting advanced degrees, and getting a consistent and well-paying job, they were. But none of them were disruptors. None of them were entrepreneurs. None of them were builders. And obviously, if you've heard the stories of entrepreneurs, the most famous ones, the ones we hold up as societal examples, tended not to be those types of people.

They tended to be iconoclastic. Very often they were bad in traditional schools. They had a restlessness, a curiosity, a set of qualities that drove them to yearn for more and to be willing to play outside the rules of the system to get it.

Now, what I am not doing here is drawing any sort of value judgment on which of these is a better way to live. Lord knows, as someone who can't escape my entrepreneurial bent, a lot of points in my life would have been a lot easier if I had been one of those other types of kids. But I do think it's relevant for this conversation as we assume this straight line between the LLMs of today, which are basically like the best academic decathlon students you could have ever possibly imagined: having read all the things, studied all the things, and who now can remember all the things and tell you all the things, but who aren't creating anything for themselves.

Now, my question to Thomas would be, how hard would it be to take that base that we have now and get the LLM to quote-unquote think in different ways? In other words, does it require just a different prompt, or is it really about a totally different architecture that's necessary?

Given how much we point to scientific achievement and scientific advancement as the universally agreed upon upside of AI, I actually think that these questions are worth pondering and worth really digging into. Now, perhaps the big labs are, and have already come to some conclusions about how this is going to work.

Perhaps, for example, it's wrong to think about the independent iconoclastic genius as the model for LLMs when actually the way that scientific discovery is going to happen is a thousand different agents powered by all sorts of different LLMs smashing ideas against one another, running war game scenario testing, and seeing what comes up. Still, I'm really glad that Thomas wrote this post. I think it's very good food for thought, and I'm excited to see what people actually go do with it.

For now, though, that is going to do it for today's AI Daily Brief. Appreciate you listening, as always. And until next time, peace.