Today on the AI Daily Brief, Meta's LlamaCon, and is open source falling behind? Before that, in the headlines, up to 30% of Microsoft's code has now been written by AI. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Thanks to today's sponsors, Vanta and Superintelligent. And for an ad-free version of the show, go to patreon.com slash AI Daily Brief. Welcome back to the AI Daily Brief headlines edition, all the daily AI news you need in around five minutes.
Well, friends, it turns out that AI coding is not just for the vibe coders. At Meta's LlamaCon event, which will be the topic of our main episode today, Microsoft CEO Satya Nadella made a crossover appearance in a fireside chat with Meta CEO Mark Zuckerberg. One of the more interesting topics was the takeover of AI code in big tech. Nadella said that between 20 and 30% of the code in Microsoft's repositories was generated by AI.
In other words, he's saying that this is not just a significant portion of the new code being written, but that AI-generated code is now a big part of the overall codebase. He also got a little detailed, which was interesting. He mentioned that the company was seeing mixed results across different languages, with the strongest performance in Python and less progress being made with C++. Throwing the question back at Zuck, the Meta CEO said that he didn't know how much of the company's code was being generated by AI, but aims to get it to 50% by the end of next year.
You might remember that late last year, Google CEO Sundar Pichai said that his company was using AI to generate 25% of their code. But earlier this month, he actually updated that, stating that it's now, quote, well over 30%. Next up today, OpenAI has apparently fixed GPT-4o's personality, or at least attempted to, to make it less sycophantic.
As we discussed on Monday's show, the personality of the default ChatGPT model went haywire over the weekend, leading it to agree with basically everything and overly compliment the user. We talked about all the various ways that was bad, so check out that episode if you haven't heard it yet. But in any case, yesterday Sam Altman posted, "We started rolling back the latest update to GPT-4o last night. It's now 100% rolled back for free users, and we'll update again when it's finished for paid users, hopefully later today. We're working on additional fixes to model personality and will share more in the coming days."
The company also published a post-mortem blog explaining, "When shaping model behavior, we start with baseline principles and instructions outlined in our Model Spec. We also teach our models how to apply these principles by incorporating user signals like thumbs-up/thumbs-down feedback on ChatGPT responses. However, in this update, we focused too much on short-term feedback and did not fully account for how users' interactions with ChatGPT evolve over time. As a result, GPT-4o skewed towards responses that were overly supportive but disingenuous."
OpenAI model designer Aidan McLaughlin had previously commented that they originally launched with a system message that had unintended behavioral effects but found an antidote. Now, the post implied that most of the personality change had to do with a new system prompt rather than additional post-training. Jailbreaker Pliny the Liberator had, of course, found the hidden system prompt, giving us a look under the hood.
The old, malfunctioning prompt said, "Over the course of the conversation, you adapt to the user's tone and preference. Try to match the user's vibe, tone, and generally how they're speaking." The new prompt, inserted on Monday, read, "Engage warmly yet honestly with the user. Be direct. Avoid ungrounded or sycophantic flattery. Maintain professionalism and grounded honesty that best represents OpenAI and its values." When asked if he believed that this would fix the problem, Pliny said, "The full scope of the problem runs much deeper for sure. It's a silly fix but probably does give like 10-20% improvement for that particular behavior."
In their blog post, OpenAI committed to refining their training techniques and system prompts to steer away from sycophancy. But beyond that, we didn't get a ton of specifics. Overall, this is another reminder of how new these technologies are and how little changes can make big differences.
Lastly today, Duolingo is the latest company going AI first. In an all-hands email, CEO Luis von Ahn wrote, "AI is already changing how work gets done. It's not a question of if or when, it's happening now. When there's a shift this big, the worst thing you can do is wait. In 2012, we bet big on mobile. While others were focused on mobile companion apps for websites, we decided to build mobile-first because we saw it was the future. Betting on mobile made all the difference. We're making a similar call now, and this time, the platform shift is AI."
Von Ahn discussed how the company has already adopted AI to help automate their content production process. The company also recently introduced a video feature allowing users to chat with an AI avatar, a feature that, as the CEO pointed out, was impossible to build before. He continued:
AI is not just a productivity boost. Being AI-first means we'll need to rethink much of how we work. Making minor tweaks to systems designed for humans won't get us there. In many cases, we'll need to start from scratch. We're not going to rebuild everything overnight, and some things, like getting AI to understand our codebase, will take time. However, we can't wait until the technology is 100% perfect. We'd rather move with urgency and take occasional small hits on quality than move slowly and miss the moment.
Von Ahn also spoke to the practical changes coming at the company.
Now, the memo did include a caveat that the company still, quote, deeply cares about its employees and will provide training, mentorship, and tooling to support the transition. It said that the initiative is about, quote, removing bottlenecks so we can do more with the outstanding employees we already have. We want you to focus on creative work and real problems, not repetitive tasks.
Now, of course, the memo had clear echoes of the Shopify memo released earlier this month, which told the company that increased headcount would not be approved unless teams demonstrate that they cannot get what they want done using AI. AI advisor Allie K. Miller posted, "First Shopify, now Duolingo. If you're a digital native business and haven't gotten the memo, here is the literal memo."
Now, this is something we'll be talking about a lot more in the days to come, so I'll leave it there for now. But I think, and you will not be surprised that I think this, that this is the beginning of a trend. For now, that's going to do it for today's AI Daily Brief Headlines Edition. Next up, the main episode. Today's episode is brought to you by Vanta.
Vanta is a trust management platform that helps businesses automate security and compliance, enabling them to demonstrate strong security practices and scale. In today's business landscape, businesses can't just claim security, they have to prove it.
Achieving compliance with a framework like SOC 2, ISO 27001, HIPAA, GDPR, and more is how businesses can demonstrate strong security practices. And we see how much this matters every time we connect enterprises with agent services providers at Superintelligent. Many of these compliance frameworks are simply not negotiable for enterprises.
The problem is that navigating security and compliance is time-consuming and complicated. It can take months of work and use up valuable time and resources. Vanta makes it easy and faster by automating compliance across 35+ frameworks. It gets you audit-ready in weeks instead of months and saves you up to 85% of associated costs. In fact, a recent IDC whitepaper found that Vanta customers achieved $535,000 per year in benefits, and the platform pays for itself in just three months.
The proof is in the numbers. More than 10,000 global companies trust Vanta, including Atlassian, Quora, and more. For a limited time, listeners get $1,000 off at vanta.com slash nlw. That's v-a-n-t-a dot com slash nlw for $1,000 off.
Today's episode is brought to you by Superintelligent, and I am very excited today to tell you about our consultant partner program. The new Superintelligent is a platform that helps enterprises figure out which agents to adopt, and then with our marketplace, go and find the partners that can help them actually build, buy, customize, and deploy those agents.
At the core of that experience is what we call our agent readiness audits. We deploy a set of voice agents which can interview people across your team to uncover where agents are going to be most effective in driving real business value. From there, we make a set of recommendations which can turn into RFPs on the marketplace or other sorts of change management activities that help get you ready for the new agent-powered economy.
We are finding a ton of success right now with consultants bringing the agent readiness audits to their client as a way to help them move down the funnel towards agent deployments, with the consultant playing the role of helping their client hone in on the right opportunities based on what we've recommended and helping manage the partner selection process. Basically, the audits are dramatically reducing the time to discovery for our consulting partners, and that's something we're really excited to see. If you run a firm and have clients who might be a good fit for the agent readiness audit,
Welcome back to the AI Daily Brief. Today we are talking about Meta's big developer conference, LlamaCon: everything that they announced, what people were excited about. We're going to do a little bit of a review of Zuckerberg's whistle-stop tour of media as well.
But kind of lurking behind all of this are some big questions, both for Meta and for open source. And I think to kick us off, it's important to go back and give a little bit of context. Now, Meta has firmly planted its flag as the big tech company that has most wrapped up its future in the triumph of open source AI as opposed to closed source models.
This was, for many, an unexpected turn from Zuckerberg. And there are plenty of people who feel like it was largely opportunistic. But at the same time, for those who have been watching for a long time, Mark Zuckerberg really did have a sort of conversion experience when Apple almost killed their business with changes to the way that the iPhone platform worked. And so the open source push is more philosophically coherent than one might think.
Whatever the motivation was, it was certainly working. Throughout a lot of 2023, one of the big freakouts from Google was that Meta's developer ecosystem was beating them and OpenAI. It also felt like throughout 2024, open source was getting ever closer to the performance of closed source models, really closing the gap.
And yet, Meta has had a rough run of it this year. First of all, back in January, as DeepSeek released its reasoning models, reports were that Meta started freaking out. We had lots of what appeared to be leaks from inside, with engineers reporting that the company was scrambling and assembling war rooms to try to reverse engineer how DeepSeek had done what it had done with so few resources. And by and large, things just seemed in a state of upheaval.
Another moment of controversy for Meta came after they released the Llama 4 family of models, with people accusing them of effectively artificially boosting their benchmark scores and submitting a differently optimized model for some of the benchmark tests than the model they released to the public. We're not going to rehash that here. The point is just to say that Meta wasn't coming into this LlamaCon riding the top of the wave. In some ways, they were fighting to get back on the horse a little bit.
So first of all, let's talk about what was released at this event. Remember, we got the announcement of the new models about a month ago, so no one was expecting some big announcement on that front. A couple of the big headline reveals included, first, a native API for Llama. The Llama API is now available in a limited preview and is paired with Meta's SDKs to allow developers to build on the model family.
The company didn't reveal pricing, but did boast of lightning-fast speed. Through a partnership with Cerebras, Meta claims that their API can run 18 times faster than the traditional GPU inference used by OpenAI. The comparison is even better when you consider DeepSeek's native API, which crawls along at less than one hundredth of that speed.
Now, the API does what you'd expect, offering tools for fine-tuning and evaluation alongside serving the models for app integration. It may be basic infrastructure, but it's still an important step that Meta has begun to offer their own access points. The other big announcement, and one that got even more consumer attention at least, was a standalone chatbot app for Llama models. Now, there's been no shortage of ways to access Meta's chatbots. They've been, of course, integrated into WhatsApp, Instagram, Facebook, and Messenger. But having a standalone app brings Meta more into parity with their peers.
We saw something similar from Grok, which first released its tools exclusively through Twitter slash X, but then spun out its own app as well. One interesting feature, which is perhaps not surprising coming from Meta, is that the Llama app has a social feed. Users can elect to share their prompts and responses with their friends across Meta's ecosystem. Now, I don't think right now there's any sort of latent demand, quote unquote, for this kind of feature.
That said, Sam Altman has very publicly talked about the idea of potentially doing a social network from within ChatGPT. And just in general, it is always surprising what sort of things people actually like sharing and discovering about their peers and friends. Meta's VP of Product, Connor Hayes, said that the idea is to show people what they can do with AI.
Now, this is actually highly utilitarian. One of the things that we've seen for the last couple of years via Superintelligent is that a lot of the barriers to AI usage are people just not knowing what to use it for. With every other technology, the pattern has been that a tiny handful of use case inventors and discoverers go out and figure out how to use a thing, and then we all copy them. And yet, for a couple of years, we kind of expected everyone to figure out how to use AI for themselves, which, again, just runs counter to the way that technology has rolled out in the past.
Anyways, as for big announcements, those were definitely the highlights. There were a few more technical additions that might move the needle for some developers. In their blog post, for example, Meta highlighted the first of several infrastructure integrations they're calling Llama Stack. Meta said that they envision Llama Stack as the industry standard for enterprises looking to seamlessly deploy production-grade turnkey AI solutions. They also announced a set of security and moderation tools and developer grants. But overall, it was fairly muted.
When it came to people's response to this, TechCrunch argued that the entire conference was all about undercutting OpenAI. And for some, it's hard not to feel like at this stage, Meta is pretty clearly behind. They're behind leaders OpenAI and Anthropic in the consumer and coding assistant markets,
at least according to the benchmarks. Their latest model has been overtaken by new open source releases out of China. And yet, during his keynote, Zuckerberg laid out how he sees the next chapter of the AI race playing out. He said, "Part of the value around open source is that you can mix and match. So if another model like DeepSeek is better, or if Qwen is better at something, then as developers, you have the ability to take the best parts of the intelligence from different models and produce exactly what you need. This is part of how I think open source basically passes in quality all the closed source models."
It feels like sort of an unstoppable force. AI entrepreneur Ted Benson unpacked his takeaways, posting, "The first LlamaCon keynote just wrapped seconds ago, and I feel like I'm getting a sense of Meta's AI strategy for the first time. They didn't say it directly, but you could hear it between the lines." Many had speculated Zuckerberg was pursuing a commoditize-your-competitors approach, out of fear of being trapped as an app within yet another company's platform again. Benson doesn't think that's it.
His argument: if AI and AR represent an entirely new computing paradigm, that new paradigm will require a new operating system. And that new operating system will require a host of standard utilities, like the GNU utilities were to Linux. Small fine-tuned models, large stock models, real-time voice models, 3D understanding models, image segmentation models, scene generation models... Collectively, that sounds like a lot of the standard library for a completely different platform of AI and AR computing. The insistence that all Llama derivatives be prefixed with "Llama-" feels telling. For the last 40 years, we've been building atop GNU/Linux; Benson thinks that in five years, Meta wants us all to be building atop Llama-something. And adding some credence to that was the fact that throughout the entire event, and on his numerous podcast appearances, Zuckerberg wore the Meta Ray-Bans.
Now, let's take a step back and move away from Meta to the broader question of where open source stands. It's important to remember that while DeepSeek R1 was a phenomenon, it wasn't because it outperformed things like OpenAI's o1 on the benchmarks. And indeed, in performance terms, it was quickly buried by releases from all of the major AI labs.
Why it had such resonance was that it was the first freely available reasoning model, the first time that consumers got their hands on reasoning in a free chat app, and because of all the scuttlebutt around how cheaply it had been trained.
In an appearance on the Dwarkesh Podcast released alongside the conference, Dwarkesh asked Zuckerberg straight up how he felt about Llama 4 Maverick now ranking 35th on LMArena and generally sitting behind and underwhelming on most of the benchmarks. Zuckerberg responded:
The prediction that this would be the year where open source generally overtakes closed source as the most used models out there is generally on track to be true. Touching on the benchmark dominance of reasoning models, Zuckerberg said that the new paradigm of scaling test-time compute is compelling and that a Llama 4 reasoning model would be coming soon. However, he added that for a lot of the things that we care about, latency and good intelligence per cost are actually much more important product attributes.
He also made the argument that benchmarks are gameable, especially when it comes to LMArena, and said that tuning for benchmark performance had often led the company astray. He said, "I think you just need to be a little careful with some of the benchmarks, and we're going to index primarily on the products." Now, if you look around, there continues to be plenty of skepticism about where Meta is right now. Earlier in the month, Fortune, for example, published a piece called "Some Insiders Say Meta's AI Research Lab Is Dying a Slow Death."
I'm not really sure. There's no doubt that open source competition is increasing, that the models out of China are putting intense competitive pressure on Zuckerberg and everyone else who's thinking about open source. It is also the case that open source models have not surpassed the big closed source models, especially as reasoning has become the dominant paradigm. I also do think, though, that Zuckerberg is playing an extremely long game here.
I do not believe that he views winning as who has the most downloaded app on the Apple App Store charts. I think he views winning as who owns the infrastructure of the future, which is basically what Ted Benson was arguing in that post. There is no doubt that certain competitive pressures may have forced Meta's timelines in ways that were a little uncomfortable and left the appearance of being behind, but I am far from counting them out yet. But that, at least, is the story for now.
Appreciate you guys listening or watching as always. And until next time, peace.