In 2024, a variety of new open models from major players like Google, Cohere, Alibaba, and the Allen Institute for AI have emerged, significantly expanding the field. This growth is driven by the increasing availability of detailed training data and methodologies, which has allowed researchers to catch up to the performance of closed models more effectively.
Open models are crucial for research because they allow for transparency, reproducibility, and the ability to conduct in-depth studies on model behavior, evaluation, and mechanistic interpretability. For AI builders, open models offer stability and the flexibility to adapt models for specific use cases, such as edge AI applications and retrieval tasks.
The open source AI license requires that model weights be fairly available, the code be released under an open source license, and there should be no clauses blocking specific use cases. However, the license does not require that the training data be freely available, only that detailed information on the data pipeline be provided.
The data component in the open source AI license is considered less than ideal because it only requires providing detailed information on the data pipeline, not the actual data. This can be problematic as the process to replicate the data might be extremely costly, making it less accessible to smaller players.
The compute requirement for open model research has become a significant barrier because the amount of computational resources needed to train and refine models has increased. This has led to a compute-rich club of major players with 10,000 to 50,000 GPUs, making it challenging for smaller entities to keep up with the state-of-the-art advancements.
The availability of open training data is becoming a critical issue because many content owners have started to block web crawls due to concerns about AI models. This disproportionately affects smaller and newcomer entities, who lack the resources to access or crawl data that closed labs already have.
Lobbying is necessary for open model development because it helps to advocate against overly restrictive legislation that can stifle innovation and collaboration. While building and researching open models is exciting, lobbying is often seen as boring and unsexy, but it is crucial to ensure that the ecosystem can thrive.
Closed models like ChatGPT generally perform better on low-resource languages compared to open models. To address this, experts from regions where these languages are spoken need to collaborate and provide better data support. Efforts are already underway, such as those by groups focusing on multilingual crawl support, to improve this aspect in 2025.
Mistral released several key models and products in 2024, including Mistral Small and Mistral Large, Le Chat (a chat interface), an embedding model, Mixtral 8x22B (a powerful open source MoE model), Codestral (a code model supporting 80+ languages), and two multimodal models, Pixtral 12B and Pixtral Large. They also updated Mistral Large to version 2, featuring improved function calling capabilities.
Involving regional communities in open model development is crucial because it ensures that the data and content specific to those regions are accurately represented and utilized. Experts who are native to these regions can provide valuable insights and access to local resources, which is essential for building robust and effective models for diverse languages and contexts.
Welcome to Latent Space Live, our first mini-conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co-host. As a special treat this week, we're recapping the best of 2024 going domain by domain. We sent out a survey to the over 900 of you who told us what you wanted, and then invited the best speakers in the Latent Space Network to cover each field.
200 of you joined us in person throughout the day, with over 2,200 watching live online. Our next keynote covers the state of open models in 2024 with Luca Soldaini and Nathan Lambert of the Allen Institute for AI, with a special appearance from Dr. Sophia Yang of Mistral. Our first hit episode of 2024 was with Nathan Lambert on RLHF 201 back in January,
where he discussed both reinforcement learning for language models and the growing post-training and mid-training stack, with hot takes on everything from constitutional AI to DPO to rejection sampling, and also previewed the sea change coming to the Allen Institute and to Interconnects, his incredible Substack on the technical aspects of state-of-the-art AI training. We highly recommend subscribing to get access to his Discord as well. It is hard to overstate how much open models have exploded this past year.
In 2023, only five names were playing in the top LLM ranks: Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.AI, and of course Meta's Llama 1 and 2.
This year a whole cast of new open models has burst on the scene, from Google's Gemma and Cohere's Command R to Alibaba's Qwen and DeepSeek's models, to LLM360 and DCLM, and of course to the Allen Institute's OLMo, OLMoE, Pixmo, Molmo, and OLMo 2 models.
Pursuing open model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe, California, and the White House. We were also honoured to hear from Mistral, who presented a great session at the AI Engineer World's Fair Open Models track. As always, don't forget to check the show notes for the YouTube link to their talk, as well as their slides. Watch out and take care.
LUCA SOLDAINI: Cool. Yeah, thanks for having me over. I'm Luca. I'm a research scientist at the Allen Institute for AI. I threw together a few slides on sort of a recap of interesting themes in open models for 2024. I have maybe 20, 25 minutes of slides, and then we can chat if there are any questions. If I can advance to the next slide.
Okay, cool. So I did a quick check to get a sense of how much 2024 was different from 2023. I went on Hugging Face and tried to get a picture of what kind of models were released in 2023 and what we got in 2024. In 2023, we got things like both Llama 1 and 2. We got Mistral, we got MPT,
the Falcon models, and I think the Yi model came at the tail end of the year. It was a pretty good year. But then I did the same for 2024, and it's actually a quite stark difference. You have models that are, you know, rivaling the frontier-level performance of what you can get from closed models, from Qwen, from DeepSeek. We got Llama 3, we got all sorts of different models.
I added our own OLMo at the bottom. There's this growing group of fully open models that I'm going to touch on a little bit later. But, you know, just looking at this slide, it feels like 2024 was just smooth sailing, happiness, much better than the previous year. And, you know, you can pick your favorite benchmark, or least favorite, I don't know, depending on what point you're trying to make,
and plot your closed model, your open model, and sort of spin it in ways that show that, oh, you know, open models are much closer to where closed models are today versus last year where the gap was fairly significant. So one thing that I think
I don't know if I have to convince people in this room, but usually when I give these talks about open models, there is always this background question in people's minds of, why should we use open models? There's the just-use-model-APIs argument: it's just an HTTP request to get output from one of the best models out there. Why do I have to set up infra and use local models? And there are really two answers. There is the more
researchy answer, which is where my background lies, which is just research. If you want to do research on language models, research thrives on open models. There is a large swath of research on modeling, on how these models behave, on evaluation, on inference, and on mechanistic interpretability that could not happen at all if you didn't have open models.
For AI builders, there are also good use cases for using local models. This is a very non-comprehensive slide, but there are some applications where local models just blow closed models out of the water. Retrieval is a very clear example.
You might have constraints like edge AI applications where it makes sense. But even just in terms of stability, being able to say this model is not changing under the hood, there's plenty of good cases for open models.
And the community is not just models. I stole this slide from one of the Qwen 2 announcement blog posts, but it's super cool to see how much tech exists around open models, serving them, making them efficient, and hosting them. It's pretty cool.
If you think about where the term open comes from, it comes from open source. Open models really meet the core tenets of open source, specifically when it comes to collaboration. There is truly a spirit that,
through these open models, you can build on top of other people's innovation. We see a lot of this even in our own work. As we iterate on the various versions of OLMo, it's not like we collect all the data from scratch every time. No, the first step is, okay, what other cool data sources and datasets have people put together for language model training? Or when it comes to
our post-training pipeline, one of the steps is you want to do some DPO, and you use a lot of outputs of other models to build your preference data. So really, having an open ecosystem benefits and accelerates the development of open models.
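To make that DPO step a bit more concrete, here is a minimal sketch of the core DPO objective, where the chosen/rejected completions would come from sampling other open models. The tensor values and beta are illustrative assumptions, not AI2's actual pipeline.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (Rafailov et al., 2023).

    Each argument is the summed log-probability that the policy or the
    frozen reference model assigns to the chosen/rejected completion.
    """
    # How much more the policy prefers chosen over rejected,
    # relative to the reference model.
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    # Push the policy's margin above the reference's margin.
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Toy batch: in practice, chosen/rejected pairs are built from outputs
# of other open models, ranked by a judge or preference annotations.
loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.9]),
                torch.tensor([-13.0]), torch.tensor([-14.2]))
print(loss)
```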
One thing that we got in 2024, which is not a specific model, but I thought it was really significant, is our first open source AI definition. This is from the Open Source Initiative. They've generally been the stewards of a lot of the open source licenses when it comes to software. They embarked on this journey of trying to figure out what an open source license for a model looks like. The majority of the work is very dry, because licenses are dry, so I'm not going to walk through the license step by step. I'm just going to pick out one aspect that is very good and one aspect that personally feels like it needs improvement. On the good side, this open source AI definition
is actually very intuitive. If you have built open source software and you have some expectations around what open source looks like for software, for AI it matches your intuition. The weights need to be fairly available, the code must be released with an open source license, and there shouldn't be license clauses that block specific use cases. Under this definition, for example, Llama or some of the Qwen models are not open source, because the license says you can't use this model for this, or it says if you use this model, you have to name the output this way, or derivatives need to be named that way. Those clauses don't meet the open source definition, and so the Llama license will not be covered under the open source definition. It's not perfect. One of the things that,
internally, in discussions with OSI, we were sort of disappointed by, is the language around data. You might imagine that an open source AI model means a model where the data is freely available. There were discussions around that, but at the end of the day, they decided to go with a softened stance where they say
a model is open source if you provide sufficiently detailed information on how to replicate the data pipeline, so that you can build an equivalent system. "Sufficiently detailed" is very fuzzy, I don't like that. "An equivalent system" is also very fuzzy. And this doesn't take into account the accessibility of the process, right? It might be that you provide enough
information, but this process costs, I don't know, $10 million to do. Now the open source definition, like any open source license, has never been about accessibility. So that's never a factor in open source software, how accessible software is. I can make a piece of software open source, put it on my hard drive and never access it. That software is still open source. The fact that it's not widely distributed doesn't change the license.
But practically, there are expectations of what we want good open source to be. So it's kind of sad to see that the data component in this definition is not as open as some of us would like it to be. And I linked a blog post that Nathan wrote on the topic that is less rambly and easier to follow. One thing that, in general,
I think is fair to say about the state of open models in 2024 is that we know a lot more than what we knew in 2023. Both on the training data side, like the pre-training data you curate, and on how to do all the post-training, especially on the RL side.
You know, 2023 was a lot of throwing random darts at the board. In 2024, we have clear recipes that, okay, maybe don't get the same results as a closed lab, because there is a cost in actually matching what they do, but at least we have a good sense of, okay, this is the path to get a state-of-the-art language model.
I think one downside of 2024 is that we are more resource constrained than in 2023. It feels like the barrier of compute that you need to move innovation along has just been rising and rising. So if you go back to this slide, there is now this cluster of models that are released by the compute-rich club.
Membership is hotly debated. Some people don't want to be called rich because it comes with expectations. Some people want to be called rich, but I don't know, there's debate. These are players that have 10,000 to 50,000 GPUs at minimum. And so they can do a lot of work and a lot of exploration in improving models that is not very accessible.
To give you a sense of how I personally think about research budgets,
for each part of the language model pipeline: on the pre-training side, you can maybe do something with a thousand GPUs. Really, you want 10,000. And if you want real state of the art, you know, your DeepSeek minimum is like 50,000, and you can scale to infinity. The more you have, the better it gets. Everyone on that side still complains that they don't have enough GPUs. Post-training is a super wide spectrum. You can do something with as little as eight GPUs; as long as you're able to run, say, a good version of a Llama model, you can do a lot of work there. A lot of the methodology just scales with compute, right? If you're interested in, you know, an open replication of what OpenAI's O1 is, you're going to be on the 10K end of the GPU spectrum. Inference, you can do a lot with very few resources. Evaluation, you can do with, well, I should say, at least one GPU, if you want to evaluate open models. But in general, if you care a lot about interventions on these models, which is my preferred area of research, then the resources that you need are quite significant.
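To put rough numbers behind those tiers, here's a back-of-envelope sketch using the standard ~6·N·D FLOPs rule of thumb for dense transformer training. The model size, token count, and per-GPU throughput below are illustrative assumptions, not figures from the talk.

```python
# Back-of-envelope pre-training time via the 6 * params * tokens rule.
# All concrete numbers are assumptions for illustration.
params = 7e9                 # a 7B-parameter dense model
tokens = 2e12                # 2T training tokens
total_flops = 6 * params * tokens          # ~8.4e22 FLOPs

peak_per_gpu = 312e12        # A100 bf16 peak FLOP/s
mfu = 0.45                   # assumed model FLOPs utilization
effective = peak_per_gpu * mfu

for n_gpus in (8, 1_000, 10_000, 50_000):
    days = total_flops / (n_gpus * effective) / 86_400
    print(f"{n_gpus:>6} GPUs -> ~{days:,.1f} days")
# 8 GPUs: ~2.4 years; 1,000 GPUs: ~1 week. That gap is the compute barrier.
```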
One of the trends that has emerged in 2024 is this cluster of fully open models. So OLMo, the model that we build at AI2, is one of them. And it's nice that it's not just us; there's a cluster of other, mostly research, efforts working on this.
And so it's good to give you a primer of what fully open means. The easy way to think about it is: instead of just releasing a model checkpoint that you run, you release a full recipe, so that other people working in that space can pick and choose whatever they want from your recipe and create their own model, or improve on top of your model. You're giving out the full pipeline and all the details there, instead of just the end output. So I pulled up a screenshot from our recent MoE model. For this model, for example, we released the model itself, the data it was trained on,
the code both for training and inference, all the logs that we got through the training run, as well as every intermediate checkpoint.
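As a concrete illustration, each of those released pieces can be pulled independently from the Hugging Face Hub. A minimal sketch below; the revision string is a hypothetical example of how intermediate checkpoints are typically exposed as repo revisions, not a verified branch name.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Final released checkpoint of the OLMoE mixture-of-experts model.
model_id = "allenai/OLMoE-1B-7B-0924"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Intermediate checkpoints ship as repo revisions, so you can also study
# the model mid-training (the branch name below is illustrative).
mid_training = AutoModelForCausalLM.from_pretrained(
    model_id, revision="step100000-tokens400B"
)
```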
And the fact that you release different parts of the pipeline allows others to do really cool things. For example, in this tweet from early this year, Nous Research used our pre-training data to do a replication of the BitNet paper in the open. So they took just the initial part of the pipeline and then did their thing on top of it.
It goes both ways. For example, for the OLMo 2 model, a lot of our pre-training data for the first stage of pre-training was from the DCLM initiative, which was led by folks at a variety of institutions. It was a really nice group effort. And for us, it was nice to be able to say, okay, the state of the art in terms of what is done in the open has improved. We don't have to do all this work from scratch to catch up to the state of the art. We can just take it directly, integrate it, and do our own improvements on top of that.
I'm going to spend a few minutes doing a shameless plug for some of our fully open recipes, so indulge me in this. A few things that we released this year: as I was mentioning, this OLMoE model, which I think is still the state-of-the-art MoE model in its size class, and it's also fully open, so every component of this model is available. We released a multimodal model called Molmo. Molmo is not just a model; it's a full recipe of how you go from a text-only model to a multimodal model. And we applied this recipe on top of Qwen checkpoints, on top of OLMo checkpoints, as well as on top of OLMoE. And I think there's been a replication doing that on top of Mistral as well.
On the post-training side, we recently released Tulu 3. Same story: this is a recipe for how you go from a base model to a state-of-the-art post-trained model. We used the Tulu recipe on top of OLMo and on top of Llama, and there's been an open replication effort to do that on top of Qwen as well. It's really nice to see that when your recipe is kind of turnkey, you can apply it to different models, and it kind of just works.
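To give a flavor of what turnkey means here, below is a minimal sketch of kicking off supervised fine-tuning on the open Tulu 3 SFT mixture with the TRL library (assuming a recent TRL version). The base model choice and arguments are illustrative, and this is only the first stage; the full Tulu 3 recipe also includes preference tuning and RL.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Open SFT data released as part of the Tulu 3 recipe.
dataset = load_dataset("allenai/tulu-3-sft-mixture", split="train")

# Swap in any base checkpoint here -- OLMo, Llama, Qwen -- the point of a
# turnkey recipe is that it transfers across base models.
trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",   # illustrative base model
    train_dataset=dataset,
    args=SFTConfig(output_dir="tulu3-sft"),
)
trainer.train()
```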
And finally, the last thing we released this year was OLMo 2, which so far is the best state-of-the-art fully open language model. It combines aspects from all three of these previous projects: what we learned on the data side from OLMoE, and what we learned about making models that are easy to adapt from the Molmo project and the Tulu project.
I will close with a little bit of reflection on the ways this ecosystem of open models is not all roses, not all happy. It feels like, day to day, it's always in peril. I talked a little bit about the compute issues that come with it, but it's really not just compute.
One thing that is top of mind is that, due to the environment and, you know, growing feelings about how AI is treated, it's actually harder to get access to a lot of the data that was used to train a lot of the models up to last year. This is a screenshot from really fabulous work from Shayne Longpre, who I think is in Europe, about the diminishing access to data for language model pre-training. What they did is they went through every snapshot of Common Crawl. Common Crawl is this publicly available scrape of a subset of the internet. And they looked, for any given website, at whether a website that was accessible in, say, 2017 was still accessible in 2024.
And what they found is that, as a reaction to the existence of closed models like GPT or Claude, a lot of content owners have blanket-blocked any type of crawling of their website. And this is something that we also see internally at AI2.
One project that we started this year is we wanted to understand: if you're a good citizen of the internet, and you crawl following the norms and policies that have been established in the last 25 years, what can you crawl? And we found that there are a lot of websites where the norms of how you express a preference about whether to crawl or not are broken. A lot of people block a lot of crawling but do not advertise that in robots.txt; you can only tell that they're blocking your crawling when you try doing it. Sometimes you can't even fetch the robots.txt to check whether you're allowed or not.
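For reference, the 25-year-old norm he's describing is the robots.txt protocol, which a polite crawler consults before fetching anything. A minimal sketch with the standard library (the domain and user agents are placeholders):

```python
from urllib.robotparser import RobotFileParser

# A well-behaved crawler fetches robots.txt and honors it before crawling.
rp = RobotFileParser("https://example.com/robots.txt")
rp.read()  # may itself fail on sites that block all automated access

for agent in ("GPTBot", "CCBot", "MyResearchCrawler"):
    allowed = rp.can_fetch(agent, "https://example.com/some/article")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

The failure mode described in the talk is exactly that this file is increasingly missing, unreachable, or silently contradicted by network-level blocking.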
And then, on a lot of websites, there are all these technologies that historically have existed to make web serving easier, such as Cloudflare or DNS, that are now being repurposed to block AI or any type of crawling, in a way that is very opaque to the content owners themselves. You go to these websites, you try to access them, and they're not available. You get a feeling of, oh, something changed on the DNS side that is blocking this. And likely the content owner has no idea; they're just using Cloudflare for better load balancing. And this is something that was sort of sprung on them with very little notice. And I think the problem is that this blocking impacts people in different ways.
It disproportionately helps companies that have a head start, which are usually the closed labs, and it hurts newcomer players, where you either have to do things in a sketchy way, or you're never going to get content that the closed labs might already have. There was a lot of coverage of this. I'm going to plug Nathan's blog post again; I think the title of this one is very succinct, which is: before thinking about running out of training data, we're actually running out of open training data. So if we want better open models, this should be top of our minds. The other thing that has emerged is that there are strong lobbying efforts to define any kind of open source AI as
a new, extremely risky danger. And I want to be precise here. The problem is not considering the risks of this technology; every technology has risks that should always be considered. The thing that is disingenuous, to me, is putting this AI on a pedestal and calling it an unknown alien technology that has new and undiscovered potential to destroy humanity. When in reality, all the dangers, I think, are rooted in dangers that we know from the existing software industry, or issues that come with using software in a lot of sensitive domains, like medical areas. It also ignores a lot of efforts that have actually been going on to try to make these open models safe. I pasted one here from AI2, but there's actually a lot of work that has been going on on, okay, if you're distributing this model openly, how do you make it safe?
What's the right balance between accessibility of open models and safety? And then there's also this annoying brushing of concerns under the rug that are then proved to be unfounded. If you remember the beginning of this year, it was all about the bio-risk of these open models. The whole thing fizzled out because finally there's been rigorous research, not just this paper from the Cohere folks, but rigorous research showing that this is really not a concern that we should worry about. Again, there are a lot of dangerous uses of AI applications, but this one was just a lobbying ploy to make things sound scarier than they actually are.
So I have to preface this part and say this is my personal opinion, not my employer's, but I look at things like SB 1047 from California, and I think we kind of dodged a bullet with this legislation. A lot of the open source community came together at the last minute and made a very good effort to explain all the negative impacts of this bill.
But I feel like there's a lot of excitement about building these open models and researching these open models, while lobbying is not sexy. It's boring, but it's necessary to make sure that this ecosystem can really thrive.
That's the end of the presentation. I have some links and emails, sort of the standard thing, in case anyone wants to reach out. And if folks have questions or anything they want to discuss, I'll open the floor.
We have Sophia, who wants to... well, one very important open model that we haven't covered is Mistral. Yeah, yeah. Well, it's nice to have the Mistral person recap the year in Mistral. But while Sophia gets set up, does anyone have thoughts or questions about the progress in this space? You always have questions. Always.
I'm very curious how we should build incentives for building open models, things like François Chollet's ARC Prize and other initiatives like that. What is your opinion on how we should better align incentives in the community so that open models stay open? The incentive bit is really hard. It's something that we actually think a lot about internally, because building open models is risky. It's very expensive, and so people don't want to take risky bets. I definitely think the challenges, like the ARC Prize, are very valid approaches to it. And then,
in general, any kind of effort to participate in those challenges, if we can promote doing that on top of open models and really lean into this multiplier effect, I think that is a good way to go. It would also help if there were more money for research efforts around open models. I think there's a lot of investment in companies that at the moment are releasing their models in the open, which is really cool, but it's usually more because of commercial interest than wanting to support open models in the long term. It's a really hard problem, because everyone is operating at their local maximum, right? In ways that really optimize their position in the market. The global maximum is harder to achieve.
Can I ask one question? Yeah. So I think one of the gaps between the closed and open source models is multilinguality. The closed source models, like ChatGPT, work pretty well on low-resource languages, which is not the same for the open source models, right? So is it in your plans to improve on that front?
I think in general, yes, and I think we'll see a lot of improvements there in 2025. There are groups, on the crawling side, that are already working on better multilingual crawl support. But I think what I'm trying to say is you really want experts who are actually in those countries, who speak those languages, participating in the effort. To give you a very easy example: I'm originally from Italy, and I think I'm terribly equipped to build a model that works well in Italian, because one of the things you need is that knowledge of, okay, how do I access libraries or content that is from this region, that covers food, science, and so on. I've been in the US long enough that I no longer have that. So I think the efforts that folks are making, for example, to tap into regional communities, to get access to data and bring in collaborators from those areas, are going to be very crucial for making progress there. Hello, everyone.
Yeah, I'm super excited to be here to talk to you guys about Mistral. A really short and quick recap of what we have done, what kind of models and products we have released in the past year and a half. So most of you probably already know that we are a small startup, founded about a year and a half ago in Paris. In May 2023, it was founded by our three co-founders,
and in September 2023, we released our first open source model, Mistral 7B. Yeah. How many of you have used or heard about Mistral 7B? Hey, pretty much everyone. Thank you. Yeah, it's pretty popular, and our community really loves this model. And in December 2023, we released another popular model, with the MoE architecture, Mixtral 8x7B.
Going into this year, you can see we have released a lot of things. First of all, in February 2024, we released Mistral Small, Mistral Large, and Le Chat, which is our chat interface; I will show you it in a little bit. We released an embedding model for converting your text into embedding vectors. All of our models are available through the big cloud providers, so you can use our models on Google Cloud, AWS, Azure, Snowflake, IBM. Very useful for enterprises who want to use our models through the cloud. And in April and May this year, we released another powerful open source MoE model, Mixtral 8x22B. And we also released our first code model, Codestral, which is amazing at 80-plus languages.
And then we provided a fine-tuning service for customization. Because we know the community loves to fine-tune our models, we provide a very nice and easy option for you to fine-tune our models on our platform. And we also open sourced our fine-tuning code base, called mistral-finetune, so feel free to take a look.
More models. From July to November this year, we released many, many other models. First of all, two new best-in-class small models: we have Ministral 3B, great for deploying on edge devices, and we have Ministral 8B. If you used to use Mistral 7B, Ministral 8B is a great replacement, with much stronger performance than Mistral 7B.
We also collaborated with NVIDIA and open sourced another model, Mistral NeMo 12B, another great model. And just a few weeks ago, we updated Mistral Large to version 2, with updated state-of-the-art features and really great function calling capabilities. It supports function calling natively.
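For a rough sense of what native function calling looks like from the API side, here is a minimal sketch assuming the current mistralai Python client; the tool schema and the weather function are made-up examples, not part of Mistral's API.

```python
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# A made-up tool the model may decide to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Vancouver?"}],
    tools=tools,
)
# If the model chose to call the tool, the arguments arrive as JSON here.
print(resp.choices[0].message.tool_calls)
```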
And we released two multimodal models: Pixtral 12B, which is open source, and Pixtral Large. Just amazing models, not only for understanding images, but also great at text understanding. A lot of image models are not so good at text understanding, but Pixtral Large and Pixtral 12B are good at both image understanding and text understanding. And of course, we have models for research: Codestral Mamba, built on the Mamba architecture, and Mathstral, which is great for working with math problems.
Here's another view of our model offerings. We have several premier models, which means these models are mostly available through our API. I mean, all of the models are available through our API, except for Ministral 3B. But the premier models have a special license, the Mistral Research License: you can use them for free for exploration, but if you want to use them for enterprise, for production use, you will need to purchase a license from us.
On the top row here, we have Ministral 3B and 8B as our premier edge models. Mistral Small is best for low-latency use cases, Mistral Large is great for your most sophisticated use cases, and Pixtral Large is the frontier-class multimodal model. We have Codestral, which is great for coding, and then, again, the Mistral Embed model.
And at the bottom of the slide here, we have several Apache 2.0 licensed open-weight models, free for the community to use. And if you want to fine-tune them, use them for customization or production, feel free to do so. The latest we have is Pixtral 12B. We also have Mistral NeMo, Codestral Mamba, and Mathstral, as I mentioned. And we have three legacy models that we don't update anymore, so we recommend you move to our newer models if you are still using them. And then, just a few weeks ago, we made a lot of
improvements to our chat interface, Le Chat. How many of you have used Le Chat? Oh no, only a few. Okay, I highly recommend Le Chat. It's chat.mistral.ai. It's free to use, and it has all the amazing capabilities I'm going to show you right now. But before that: Le Chat in French means the cat, so this is actually a cat logo. Yeah, if you can tell, these are cat eyes.
Yeah, so first of all, I want to show you something. Maybe let's take a look at image understanding. So here I have a receipt, and I want to ask... just going to get the prompt. Cool. So basically I have a receipt, and I said, I ordered a coffee and a sausage, how much do I owe? Add an 18% tip. So hopefully it was able to get the cost of the coffee and the sausage and ignore the other things. And yeah, I don't really understand this receipt, but I think this is the coffee; it's nine. Yeah. And then the cost of the sausage, we have 22 here. Yep. And then it was able to add the costs and calculate the tip and the total.
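For reference, a quick sanity check of the arithmetic the model is doing in this demo, using the two amounts read off the receipt (9 and 22, currency unspecified):

```python
coffee, sausage = 9.00, 22.00
subtotal = coffee + sausage           # 31.00
tip = subtotal * 0.18                 # 5.58
print(f"Total with 18% tip: {subtotal + tip:.2f}")  # 36.58
```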
Great. So it's great at image understanding, and it's great at OCR tasks. If you have OCR tasks, please use it. It's free on Le Chat, and it's also available through our API. I also want to show you a canvas example. A lot of you may have used canvas features with other tools before; with Le Chat, it's completely free. Here I'm asking it to create a canvas that uses PyScript to execute Python in my browser. Let's see if it works. Import this. Oh, yep. Okay. So yeah, it's executing Python here, exactly what we wanted. And the other day I was trying to ask Le Chat to create a game for me. Let's see if we can make it work.
Yeah, the Tetris game. Yeah, let's just get one row. Maybe. Oh no! Okay, never mind. You get the idea. I failed my mission. Okay, here we go. Yay! Cool. Yeah, so as you can see, Le Chat can write the code for a simple game pretty easily, and you can ask Le Chat to explain the code or make updates, however you like. Another example: there's a bar here I want to move. Right, okay. And let's go back to another one.
Yeah, we also have web search capabilities; you can ask, what's the latest AI news? Image generation is pretty cool: generate an image about researchers in Vancouver. Yeah, it's Black Forest Labs' Flux Pro. Again, this is free. So, oh, cool.
I guess researchers here are mostly from the University of British Columbia. That's smart. Yeah, so this is Le Chat. Please feel free to use it, and let me know if you have any feedback. We're always looking for improvements, and we're going to release a lot more powerful features in the coming years. Thank you.
Okay, maybe we can kick things off now. Yeah, yeah. Okay, cool. Yeah. Hi, everyone. Thank you so much for coming today. Huge shout-out to swyx and the Latent Space team. I think it's been great. Yeah, let's just give it up for swyx. Just real quick: I did a little bit in terms of helping with the planning, but I work at Notable Capital. Some of you may have heard of GGV, which was our former name. I'm on the cloud infrastructure team, so basically anything data, dev tools, AI infrastructure, as well as AI applications. And so we like to stay close to those that are smarter than us, which is all of you in this room. So if anyone ever wants to brainstorm, or is thinking about starting a company, we're happy to collaborate. We've had the opportunity to partner with amazing companies such as HashiCorp, Vercel, Neon, and many others over the years. We're based in San Francisco and New York. So yeah, feel free to find me, Laura Hamilton, on X, LinkedIn... you know, if we become friends, Instagram. Yeah.
Thank you all for coming. And then we'll kick off some of the chats with AWS after everyone gets lunch. All right. All right.