OpenAI has added a new model to ChatGPT, and that model is GPT-4.1. Now, this is a brand new model that they just dropped, at least on ChatGPT. It was actually released back in April, but they never added it to the ChatGPT platform until now. So today on the podcast, I'm going to be breaking down what the new model does, why they didn't release it before, and some of the controversy around this release. Some people speculate the delay was due to safety reasons, among other things. It's officially live, and we're going to be getting into all of that.
Before we do, I wanted to say that my startup AI Box has officially launched our very first product, which is an AI Box Playground. This is essentially a place where you can get access to all of the top AI models on one platform, and you're able to test all of them for $20 a month. So you don't need subscriptions to the 20 to 40 different top AI companies anymore. You can pay $20 a month, get access to everything, and use them on a per need basis. You essentially get tokens every month, and you can use those towards whatever you want.
We have access to audio models like ElevenLabs, access to some of the top image models like OpenAI's, of course, and a lot of other ones that you may not have used or heard of that are actually really impressive. We also have something called the media storage. This is a place where every file that you ever create gets stored. You can go back and easily find what conversations you had and what prompts you used to generate different things like images or audio. You can click a little button in the media storage on whatever media you created and go view the actual chat that was used to generate it.
There are a ton of cool features in here, like comparing back and forth between different AI models, getting multiple AI models to run the same prompt, for example, and comparing things side by side. So a lot of really cool features we've added in here. If you have more ideas, we are rapidly developing and adding new features, so we would love to hear from you on what you'd want to see. You can check it out in the description. It is AIbox.ai. All right, let's get into this new model from OpenAI.
The thing that I found really interesting here, and I'll just break the news, is that GPT-4.1 is specifically designed for math and for coding. This is something that OpenAI seems to be, I don't want to say struggling with, but it's essentially the one area that's running away from them. Look at their main competitors: Claude is kind of smoking them with Claude Code, and everyone's using that. Even Google Gemini is gaining some big ground. They just recently announced that the new Google Gemini chatbot can now integrate with and more easily analyze GitHub projects. So it's really building directly into GitHub, which is owned by Microsoft, who heavily invested in OpenAI, and yet Gemini is making some big moves in that space.
So this code area is really, really valuable, and a lot of companies are looking at it. So much so that OpenAI is actually about to acquire, for $3 billion, one of the top AI coding companies, which is called Windsurf. It's pretty much the most popular one. Cursor is probably the second most popular and has about a $1 billion valuation based off its last round of funding. But Windsurf is looking to be acquired by OpenAI for $3 billion, and they're pulling a bunch of moves here. Now,
I think the Windsurf acquisition and the timeline for that is probably what's pushed them to make this new GPT-4.1 model live on ChatGPT. So if you go over to ChatGPT, you can hit the dropdown. What's interesting is it doesn't actually show in their priority AI models. You've got to click on their more models section. That's where you're going to see GPT-4.5, which is a quote unquote research preview, and then you see GPT-4.1 and GPT-4.1 mini.
Now, what a lot of people have asked is, okay, why the heck would I be using GPT-4.1 when I could just use GPT-4.5? Isn't 4.5 better than 4.1 or 4.1 mini? And it's actually interesting: OpenAI specifically said that for coding tasks, GPT-4.1 is going to be better than GPT-4.5. This is getting to kind of a weird place where we're coming out with newer models, quote unquote, or allegedly more advanced models, that are worse at certain tasks than older models. The old model could do X, Y, and Z really well, and the new model can do mostly everything better, but not this specific thing. And so it gets to a weird place for OpenAI where you're mixing and matching what
model you have. That's why they have their dropdown with four different models to choose from, and then in their more models section, you've got three more. So really, if you're on ChatGPT, you have seven options to choose from for what you're going to talk to. I've talked at length about how this is a terrible marketing thing and how other companies are doing a great job. xAI, for example, with Grok: you can use the old version of Grok or you can use Grok 3. Now they have new features within Grok 3, like a deep research kind of dive, or a think button where it gives it more compute and it really thinks, and I've found great results with that. That is more what I would like to see from OpenAI. Even if it is completely switching the model, I just want an easy UI.
Now, they've created some UI inside of the search box, but I think it's a little ridiculous. They have a search button for the internet, which is fine. They have a deep research button, which is for when you want a really extensive document. I understand that one; I think keep it. Then they have a create image button. Now, in my opinion, if you're coming here to create an image and you know you can create images, you should just say what you want it to create an image of, and it should just know and automatically generate it. And it actually does that, but maybe they're just trying to show new people that they can create an image by typing it here. So maybe it's kind of a marketing thing. But in any case, it's not incredibly useful. I mean, it's redundant. You could just talk to the model and tell it to create an image; you don't need a button that specifically does that. If you click the create image button, it just adds text into the chat that says create image, and now you're good to go.
And actually, maybe that's not a bad idea, to tell people that they can create images. I might actually steal that for AI Box. So, after all my flaming, don't get mad at me if you go over to AI Box and see me add that to my search bar. All right. So here's what's actually embedded and what's interesting about this new GPT-4.1 model.
This came out back in April, but it was only released for developers on the API platform, meaning average people on chatgpt.com could not use it. You could only use it if you had a developer account with an API access token for OpenAI and were embedding it into the software or project that you're building. It was only for developers. And you could say, hey, that's fine. It's a code tool. Only developers need code tools, and developers know how to get access. But in reality, I think a lot of people, even developers, might be using Claude or other platforms directly, and they maybe don't want to have to go through that headache in order to use it on a special portal that they might make. So it was just getting embedded in software.
Now, why did they do this? Why didn't they just roll it out on chatgpt.com like everything else? That is where the controversy comes in. So some people are saying,
this is due to safety concerns, and that they didn't release a proper safety report. So they essentially got a bunch of criticism for this. A bunch of researchers who were talking about this claimed that OpenAI was lowering its standards around transparency in its AI models. OpenAI argued that despite GPT-4.1 being faster compared to GPT-4o, the model was not a frontier model, and because of that it didn't need the same safety reporting as some of the more capable models. So OpenAI's response was like, "Yeah, we didn't release the safety report like you're criticizing us for, but that's just because this isn't really our frontier model. It's kind of just our side model that we're letting developers use. It doesn't really need as much vetting." Now,
If I'm being 100% honest, I'm not actually advocating for more safety reviews on these models. For a code generation model, I'm not super concerned about that. That's just not my wheelhouse. I'd rather get the model sooner than focus a ton on safety. So that's just me personally.
But at the end of the day, it's kind of interesting that that was OpenAI's response. So what exactly can this do? According to Shaokyi Amdo, this new model is going to help software developers who are using ChatGPT to write or debug code. Those are kind of the two specific things. It's essentially better at instruction following compared to GPT-4o, and it's also faster than the o-series reasoning models. So it's not a reasoning model. It's much faster, and it's better at code. It's kind of interesting because some people like the reasoning models for code, and evidently they've moved away from that in this particular update. So that is, I think, definitely an interesting fact. So
this is what they specifically said. They said GPT-4.1 doesn't introduce new modalities or ways of interacting with the model and doesn't surpass o3 in intelligence. This means that the safety considerations, while substantial, are different than frontier models. That's their head of safety explaining why they didn't do a lot of safety testing on this.
As I mentioned, this is very interesting timing for this model to come out because we have a ton of competition. We have, of course, OpenAI now trying to get their $3 billion acquisition of Windsurf pushed through. But we also have
a ton of other players putting out coding tools. We have Cursor; allegedly, I think OpenAI might have made a bid to acquire them as well. It didn't go through, so then it went with Windsurf. Those are kind of rumors. But then we also have, of course, Gemini connecting more deeply with GitHub. We have Claude Code, which has run away with most developers and is increasing in popularity. So I think there's just a ton of competition, and it's going to be fascinating to see who is the ultimate winner in this space.
All right. Thank you so much for tuning in. Make sure to go check out the AI Box platform if you are interested in
getting a platform that lets you chat with all of the text, image, and audio models inside one chat, switch between all the models in the same chat, and use different models that are good at different things. Like we've been talking about in this podcast today, some models are great at code. Even some older OpenAI models are great at code, and some of them are worse at code. In the AI Box platform, you have the ability to go and start a new chat. And we specifically have GPT-4.1, which we've been talking about in this episode, plus 4.1 mini and 4.1 nano. We have all of these here. You can test them all out if you're interested in coding, or you can use all of the other ChatGPT models, and Anthropic and DeepSeek and Google and Meta and Microsoft and Mistral and NVIDIA, all of them. All right. So go check it out: AIbox.ai. Thanks so much for tuning into the podcast today. I will catch you in the next episode.