Today on the AI Daily Brief, Google's AI co-scientist shows the future not only of scientific research, but of multi-agent systems. Before that in the headlines, is Meta AI going enterprise? The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes.
We kick off today with some interesting intrigue around Meta and their AI strategy. This is something I've been watching closely. I think a lot of people have wondered, is Meta going to try to compete in the enterprise or corporate domain, or are they just going to try to own consumer AI, or at least the foundations and underpinnings of consumer AI?
Certainly, Meta is at core a consumer company. Facebook, WhatsApp, Instagram, these are all consumer products that touch a huge portion of the world's population. And even their interactions with businesses are mostly in the form of selling them ad space to reach those consumers. Llama, though, is really interesting, because Llama, as a platform and as the vanguard US open source AI leader, has a ton of potential in that enterprise sphere. And so I've been wondering if they were going to make a play for that space.
This got a little bit more interesting last November when Meta poached Clara Shih from Salesforce. She had been the CEO of Salesforce AI and joined Meta to explicitly start a new group building AI tools for business. Subsequent to joining Meta, it sounds like she went and recruited a bunch of people from throughout the company who had worked on some of the key products.
The speculation is now about what they're actually going to do. One thing that The Information points out is that last year Meta had posted a role called Director of Public Sector Engagement that was responsible for, quote, building and leading a high-performing team from the ground up to drive adoption of safe and transformative solutions for AI across federal, state, and local government agencies. And just before it posted that role, interestingly, Meta had also started to allow its AI models to be used by the U.S. military.
Right now, it's all still very mum, but you're starting to get the type of leaks and intrigue that suggest that maybe they're going to make a bigger play in the enterprise. Certainly from the standpoint of Superintelligent, that's something we're watching closely, as Meta would absolutely be a player and would change the calculus for big enterprise companies thinking about their AI and agentic strategies.
Speaking of big companies, apparently money talks and BS walks, because despite years of lambasting Microsoft and all the other big companies, Salesforce is now in talks with Microsoft, Google, and Oracle, the company Salesforce was basically designed to disrupt, about cloud deals to handle their AI.
Salesforce's president and chief engineering officer said in an interview earlier this week that they're in advanced negotiations with Microsoft, Google, and Oracle for a deal worth more than a billion dollars over the next several years. Salesforce apparently wants to rent the servers to run its customer management, AI agent, and other applications. Obviously, if you've been watching Salesforce at all, Agentforce is the big focus of the company. It is very clear to me, frankly, that Benioff is betting the entire farm on Agentforce and what it becomes.
A billion dollars isn't nothing, obviously, but will Microsoft get over the fact that Benioff has just been out here screaming about what a disaster Copilot has been? Get the popcorn because this is going to get interesting. Moving over into the world of startups, sort of, I don't exactly know what you would consider DeepSeek, given that they came out of a multi-billion dollar hedge fund. But in any case, DeepSeek, as the lab that has been putting out these models that have everyone in a tizzy, is apparently considering raising outside money for the first time.
Based on the reporting, this seems to be both an opportunity consideration and a constraint consideration. Because it came out of this quantitative hedge fund, DeepSeek hadn't needed to raise outside funding so far, but it now has so much usage that it probably needs to resource up. The Information reports that they have been fielding a bunch of inbound interest from players like Alibaba, as well as a number of Chinese state-affiliated funds. That includes China's sovereign wealth fund, as well as their National Social Security Fund.
More broadly, it sounds like they're trying to consider how much they want to keep focusing on research and competing on that level versus building a revenue business based on the success of their products. At this stage, DeepSeek is officially in the conversation. They are a player in this AI battle. And I think a lot of the next moves they make over the next couple of months are going to tell us a lot about where they want to sit in that fight.
Lastly today, another AI unicorn round. Together has jumped to a $3.3 billion valuation in their latest funding. Together calls themselves the AI Acceleration Cloud, and they're highly focused on the enterprise. The round was led by General Catalyst, with participation from Salesforce Ventures, NVIDIA, and Aramco's venture capital fund.
Now, Together was a unicorn even before this. They had previously raised $106 million last year at a $1.25 billion valuation. But tripling that valuation in just a few short months ain't bad, especially for a company that's not building its own foundation models. Anyways, that is going to do it for today's AI Daily Brief Headlines Edition. Next up, the main episode. Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded.
Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001, centralize security workflows, complete questionnaires up to 5x faster, and proactively manage vendor risk.
Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time.
If there is one thing that's clear about AI in 2025, it's that the agents are coming. Vertical agents by industry, horizontal agent platforms, agents per function. If you are running a large enterprise, you will be experimenting with agents next year. And given how new this is, all of us are going to be back in pilot mode.
That's why Superintelligent is offering a new product for the beginning of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks, we dig in with your team to understand what type of agents make sense for you to test, what type of infrastructure support you need to be ready, and to ultimately come away with a set of actionable recommendations that get you prepared to figure out how agents can transform your business.
If you are interested in the agent readiness and opportunity audit, reach out directly to me, nlw at bsuper.ai. Put the word agent in the subject line so I know what you're talking about. And let's have you be a leader in the most dynamic part of the AI market.
Welcome back to the AI Daily Brief. Today, we are talking about something that I'm really excited to dig into. On the one hand, we're talking about AI and how it's going to advance science, a big theme and in fact, one of the big motivations for some of the leading actors in the AI space. But we're also going to be talking about the emergence of multi-agent systems, how agents come together, each with individual purposes, and are coordinated to do something much bigger.
When I look at this news and I look at what Google has shared, I don't just see some really cool advancements for science, although I certainly see that. I also see a template for the type of multi-agent system that is going to become absolutely ubiquitous in the years to come.
Now, before we get into Google's announcement, let's look at what Sam Altman had to say about scientific discovery and AI. There are many reasons I'm excited about AI. We touched on this a little bit earlier. Personally, the single thing I'm most excited about is what this is going to do for scientific discovery. I believe that if we can accelerate scientific discovery, we can do 10 years of science in one year, and then someday 100 years of science in one year. What that'll do to quality of life, like solving our most pressing problems, addressing the climate,
making life just better in all sorts of ways, curing disease, that will be an incredible gift. And I think AI is finally going to enable that. So that's a really big, bold pronouncement. And it's something that Altman and others have said a lot. The question for many has, of course, been, okay, but how? Because one thing that's clear is that the current crop of LLMs isn't off making scientific discoveries on their own. So then, is it about underlying models and them just needing to be smarter? Or is it about something else?
Well, what it looks like after reviewing this new Google announcement is that it's not just about the model. It's about how specific agents using models come together to do work. So earlier this week, Google posted on their company blog, Today, Google is launching an AI co-scientist, a new AI system built on Gemini 2.0 designed to aid scientists in creating novel hypotheses and research plans.
Researchers can specify a research goal, for example, to better understand the spread of a disease-causing microbe, and the AI co-scientist will propose testable hypotheses along with a summary of relevant published literature and a possible experimental approach. AI co-scientist is a collaborative tool to help experts gather research and refine their work. It's not meant to automate the scientific process. Okay, so that's the overview, but where it gets really interesting is in their longer blog post where they actually explain this all.
First of all, it's clear right from the beginning that part of the opportunity comes from new long-term planning and reasoning capabilities of models.
The goal is really ambitious. They write, beyond standard literature review, summarization, and quote-unquote deep research tools, the AI co-scientist is intended to uncover new, original knowledge and to formulate demonstrably novel research hypotheses and proposals. So how does it do this? Google writes, to do this, it uses a coalition of specialized agents: generation, reflection, ranking, evolution, proximity, and meta-review.
In their simplest pictorial diagram, they show three elements. The first ingredient is test-time compute, the new approach underpinning reasoning models. Then they show a group or team of individual agents who all have a different function, organized under a supervisor agent, all of which are, of course, organized under a scientist. Then lastly, they have a research ideas tournament that shows how new ideas are proposed, evaluated, and refined. So let's get more into these specific agents.
First comes, of course, the scientist. The scientist in question specifies the research goal in natural language. Google also points out that they can suggest their own ideas and proposals and interact via a chat interface to guide the system throughout the process. After the scientist inputs the research goal, a research plan is configured and then the co-scientist multi-agent system goes into effect.
First up is the generation agent. The generation agent conducts literature exploration and simulates scientific debate. And at each step of the way, each agent introduces a new set of research hypotheses that are put into what they call ranking agent tournaments.
Google describes those tournaments in this way: research hypothesis comparison and ranking with scientific debate in tournaments. Limitations and top win-loss patterns are summarized and provided as feedback to other agents. This enables iterative improvement in the quality of research hypothesis generation, creating a self-improving loop. The point being that this approach to tournament-style ranking, review, and contesting of different ideas happens at each agentic step.
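To make that concrete, here's a minimal sketch in Python of how a ranking tournament with Elo-style ratings could work. Google hasn't released the co-scientist's code, so everything here, the names, the K-factor, the random stand-in for the debate judge, is a hypothetical illustration of the general technique rather than their implementation.

```python
import itertools
import random
from dataclasses import dataclass

K = 32  # standard chess K-factor; Google hasn't published theirs

@dataclass
class Hypothesis:
    text: str
    elo: float = 1200.0  # arbitrary starting rating

def expected_score(a: float, b: float) -> float:
    """Probability that a player rated `a` beats one rated `b` under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((b - a) / 400))

def debate_winner(h1: Hypothesis, h2: Hypothesis) -> Hypothesis:
    """Stub for the simulated scientific debate; in the real system an
    LLM judge would pick the stronger hypothesis."""
    return random.choice([h1, h2])

def run_tournament(hypotheses: list[Hypothesis], rounds: int = 3) -> list[Hypothesis]:
    """Round-robin tournament; ratings update after every pairwise debate."""
    for _ in range(rounds):
        for h1, h2 in itertools.combinations(hypotheses, 2):
            winner = debate_winner(h1, h2)
            loser = h2 if winner is h1 else h1
            # Standard Elo update: the winner gains exactly what the loser sheds.
            gain = K * (1 - expected_score(winner.elo, loser.elo))
            winner.elo += gain
            loser.elo -= gain
    return sorted(hypotheses, key=lambda h: h.elo, reverse=True)

ranked = run_tournament([Hypothesis(f"hypothesis {i}") for i in range(4)])
print([(h.text, round(h.elo)) for h in ranked])
```

In the real system, the debate_winner stub would be a model judging a simulated scientific debate between the two hypotheses, and the resulting ratings are what feed the self-improving loop.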
Okay, so first we have that generation agent, then we move on to the reflection agent. The reflection agent does a full review with web search, a simulation review, a tournament review, and what they call deep verification. Next up is the evolution agent. Its job is to take inspiration from other ideas, simplify them, and extend the research. Finally, there's the proximity check agent and the meta-review agent, which ultimately formulates a research overview that is sent back to the scientist for feedback.
All of this is organized by a supervisor agent, which allocates resources between the different specialized agents. And so again, here we are not only seeing the scaffolding of a co-scientist, but a scaffolding for how a multi-agent system might work for a variety of different problems.
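Again, Google hasn't published the actual scaffolding, but as a rough sketch of the shape of such a pipeline, here's a hypothetical Python version where each specialized agent is a function over shared state and a supervisor allocates turns under a fixed compute budget. All names and stub bodies are invented for illustration, and the ranking tournament and proximity check are omitted for brevity.

```python
# Hypothetical sketch only: the agent bodies are stubs standing in for LLM calls.
State = dict

def generation(state: State) -> State:
    # Would explore the literature and simulate debate to propose hypotheses.
    state["hypotheses"] = [f"hypothesis {i} for: {state['goal']}" for i in range(4)]
    return state

def reflection(state: State) -> State:
    # Would do the full review with web search, simulation review, deep verification.
    state["reviews"] = {h: "plausible" for h in state["hypotheses"]}
    return state

def evolution(state: State) -> State:
    # Would take inspiration from other ideas, simplify, and extend the research.
    state["hypotheses"] = [h + " (refined)" for h in state["hypotheses"]]
    return state

def meta_review(state: State) -> State:
    # Formulates the research overview that goes back to the human scientist.
    state["overview"] = "Research overview:\n" + "\n".join(state["hypotheses"])
    return state

def supervisor(goal: str, budget: int = 2) -> str:
    """Allocates turns among the specialized agents, looping until the
    compute budget (a stand-in for test-time compute scaling) is spent."""
    state: State = {"goal": goal}
    for _ in range(budget):
        for agent in (generation, reflection, evolution):
            state = agent(state)
    return meta_review(state)["overview"]

print(supervisor("better understand the spread of a disease-causing microbe"))
```

The point is less the stub logic than the structure: independent agents with narrow jobs, coordinated by a supervisor, iterating until a test-time compute budget is spent.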
When it comes to how they decide which are the best ideas, Google writes, the AI co-scientist leverages test-time compute scaling to iteratively reason, evolve, and improve outputs. Key reasoning steps include self-play-based scientific debate for novel hypothesis generation, ranking tournaments for hypothesis comparison, and an evolution process for quality improvement. They continue, the self-improvement relies on the Elo auto-evaluation metric derived from its tournaments.
Due to their core role, we assessed whether higher Elo ratings correlate with higher output quality. And they found that they did. And so while all of this sounds really cool, what were the actual results in some of these early tests?
Well, first of all, when experts assessed the ideas from the AI co-scientist, they found that they had higher potential for novelty and impact as compared to other models. But they also tested them in real-world laboratory experiments around drug repurposing, proposing novel treatment targets, and elucidating the mechanisms underlying antimicrobial resistance. So first, drug repurposing, which is basically about taking existing drugs and finding new therapeutic applications beyond their original intended use.
Google applied the AI co-scientist to, quote, assist with the prediction of drug repurposing opportunities and found that the AI co-scientist proposed novel repurposing candidates for acute myeloid leukemia. Google writes, subsequent experiments validated these proposals, confirming that the suggested drugs inhibit tumor viability at clinically relevant concentrations in multiple AML cell lines.
Next up, they moved to something even more complex than drug repurposing: identifying novel treatment targets. They focused on liver fibrosis, with the AI co-scientist identifying epigenetic targets with significant antifibrotic activity. Once again, they found a lot of success, even saying that the findings will be detailed in an upcoming report led by collaborators at Stanford University.
Finally, in their third validation test, they looked to generate hypotheses to explain bacterial gene transfer evolution mechanisms related to antimicrobial resistance. For this test, researchers instructed the AI co-scientist to explore a topic that had already been subject to novel discovery in their group, but had not yet been revealed in the public domain. And while the terminology here is so dense that I won't even get into it, the TLDR is that the AI co-scientist was able to independently propose discoveries that had already been made but had not yet been revealed. So that's three examples of really profound success. This is what led Professor Ethan Mollick to say, we're starting to see what "AI will accelerate science" actually looks like. And that's incredibly exciting. However, again, as Rohit here points out, what's more interesting is that it's a way to make the multiple-LLMs-with-tools idea actually work in practice.
I am very, very excited to continue to see what Google can do with this particular line of research and the AI co-scientist specifically. But I'm also excited to see, even more broadly, these types of multi-agent systems actually be deployed and accomplish really interesting and novel things. I think a lot of 2025 and even 2026 is going to be spent on very discrete agent experiments that are about a specific task or a specific workflow. But where it gets really interesting is in these multi-agent systems. And who knows, maybe this will happen even faster than I think.
For now, that's going to do it for today's AI Daily Brief. Until next time, peace.