This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. There's a new breed of large language models that you probably haven't used.
Even among the dorkiest of us who follow generative AI every day, so many people haven't really used or figured out or found use cases for these new reasoning models like OpenAI's O1 and O1 Pro. So today I'm going to be specifically going into O1 Pro, telling you exactly what it is, telling you who it's for,
how it works, and ultimately if it's worth the price tag. Yeah, have you seen the price tag on this new O1 Pro? $200 a month.
There's other things that are included in that, but I think we've kind of been spoiled by how cheap and available large language models have been, especially as prices continue to plummet. And then you see something like a $200 a month subscription for a model like O1 Pro, and you're like, what is this and is it worth it?
All right, we're going to be tackling that and hopefully a lot more on today's edition of Everyday AI. What's going on, y'all? My name is Jordan Wilson. I'm the host of Everyday AI. This thing's yours. This is your cheat code. This is a daily livestream podcast and free daily newsletter helping us all not just keep up with AI, but how we can actually use it to grow our company and our career. This is how you become the smartest person in your company, in your department, in AI.
All right. And we all need it, whether you know it or not, right? I've been saying this for many years. Even if you don't think that you're a dorky or techie person, we all have to learn how to get the most out of AI. All right. And where you start doing that is, well, here you're listening, but also at youreverydayai.com. That is our website. There, you need to go sign up for the free daily newsletter. Each and every day, we recap
our show for the day, as well as a lot of other information in that daily newsletter. You know, the latest news, trends, fresh finds from across the internet, tutorials, everything you need to know.
It is your guide every single day. Go read it. It doesn't take long. Also on our website, there's more than 430 episodes that you can go listen to from the world's leading experts, all for free, sorted by your category. So go click on that AI Learning Tracks or that episodes on our website. Go find whatever you care about. It can be marketing, legal, tech, governance. I don't care. It's all there for you for free. So make sure you go check that out.
Speaking of checking things out, Monday, January 20th, mark your calendars, y'all. We are next week going to be doing five episodes. These are not just our 2025 AI predictions, but they are more like a roadmap on how to deal with everything that's coming. Yeah, I literally spent
thousands of hours in 2024, talking to smart people, thinking about AI, reading about it, writing about it. This is the culmination of all that. You are not going to want to miss it. Get your board together. Get your team together. You need to tune in. All right. Before we get into today's show, I'm excited about it. Let's first go over the AI news for the day. Livestream audience, thanks for joining. Got a couple of questions for you. Let me know.
All right. So first, Google is partnering with the Associated Press to enhance Gemini with real-time news updates. So Google's AI chatbot Gemini is set to integrate real-time news from the Associated Press, marking a pretty significant collaboration between the tech giant and a major news publisher. So the deal allows AP to deliver a continuous feed of real-time information to the Gemini app.
AP's chief revenue officers emphasized the importance of this collaboration, highlighting a commitment to nonpartisan reporting and accurate journalism. Financial terms of the agreement have not been disclosed, raising questions about compensation and how AP's content will be credited within the Gemini app.
This is another piece of news related to this. This Google and AP deal comes literally at the same time as OpenAI announcing a new partnership with Axios to expand local newsrooms. So OpenAI will fund Axios' local newsroom expansion into a couple cities, Pittsburgh, Kansas City, Boulder, and Huntsville, marking the first time it has directly funded newsrooms in a publisher deal.
So this three-year partnership allows OpenAI to use Axios journalism for ChatGPT responses while providing Axios access to AI tools for content creation and distribution systems. Yeah, I'm actually excited. We're going to have a specific episode coming up very soon on AI's impact specifically on journalism, so make sure you tune in. I was a former journalist, so, you know, it's going to be a conversation near and dear to my heart. Last piece of AI news: Microsoft has unveiled a new usage-based,
AI-powered Copilot Chat for corporate users. So Satya Nadella, Microsoft's chairman and CEO, introduced a new category of PCs and of Copilot with built-in generative AI tools at a recent event. So the newly launched Microsoft 365 Copilot Chat offers an alternative to the existing Copilot service, which costs organizations $30 per employee
per month. So now this is based on usage. So yes, Microsoft 365 Copilot Chat is based on usage. The new model allows organizations to pay based on actual usage, right? So if you have thousands of employees at your company, maybe everyone's not ready to, you know, throw down a couple million dollars. So the new model, like I said, allows organizations to be charged for actual use, with charges calculated per message sent, starting at just one cent per message,
which could encourage wider adoption among companies. So Copilot Chat, like regular Copilot, can summarize documents, fetch web information, and create task-performing agents. But unlike the traditional Microsoft 365 Copilot, which is also integrated into applications like Word and Excel, Copilot Chat is accessible via the Microsoft 365 Copilot app on various platforms. All right. So let me know. Do you guys want to see more on Copilot Chat? Let me know.
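To put that per-message pricing in perspective, here's a quick back-of-the-envelope sketch. The $30-per-seat and one-cent-per-message figures are the ones described above; the break-even math is just an illustration, and real metered pricing may differ by message type.

```python
# Rough comparison of Microsoft's two Copilot pricing models, using the
# figures mentioned above: a $30/employee/month flat license vs. metered
# Copilot Chat at roughly $0.01 per message. Illustration only -- actual
# metered rates can vary.

SEAT_PRICE = 30.00    # flat Microsoft 365 Copilot license, per employee/month
PER_MESSAGE = 0.01    # usage-based Copilot Chat, per message sent

def monthly_cost_usage(messages_per_month: int) -> float:
    """Monthly cost under the pay-per-message model, in dollars."""
    return round(messages_per_month * PER_MESSAGE, 2)

def break_even_messages() -> int:
    """Messages per month at which metered cost matches the flat license."""
    return round(SEAT_PRICE / PER_MESSAGE)

print(monthly_cost_usage(200))    # a light user: $2.00 for the month
print(break_even_messages())      # 3,000 messages/month to match the $30 seat
```

So unless an employee is sending thousands of Copilot messages a month, the metered plan comes out cheaper, which is presumably the adoption play.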
You know, all right. I'm excited to get going, so let's talk about it: OpenAI's O1 Pro. Livestream audience, thanks for joining us. Have you all used this? Same thing, podcast peeps. I always say, check the show notes. You can reach out to me on LinkedIn, email the show. I want to hear from you all, right? I can only make this better the more information you tell me. But I'm curious for our livestream audience: are y'all using OpenAI's
O1 Pro? Let me know, but let's just dive straight into it, enough chit-chat. So here's the gist of O1 Pro. All right, so it is a reasoner model.
Okay, there are two different classes. So the GPT family of models, those are transformer models, right? This is just the easiest way to separate the two. And a lot of other companies have followed: you know, Google came out with their kind of Flash Thinking, which is a reasoning model, Amazon Nova has a reasoning version, and DeepSeek, the Chinese AI company, has one too. So just in the last like four weeks,
just about all the big tech companies have said, all right, we need a reasoning model because we see how powerful it is. So O1 is a reasoner model, different than a GPT model, right? GPT-4o and all these other models that we've been using for, you know, now two-plus years, they've been these kind of transformer models, and they're completely different. And we're going to talk about the difference between the two, but that's the biggest thing: it uses chain-of-thought thinking kind of under the hood.
And I like to say, when you're working with ChatGPT: do any of you have different types of colleagues, right? For those of us that may be still going in office or are hybrid, you have those colleagues, you know, you're getting work done, but you have those colleagues that work by talking.
Right. You're just going back and forth, back and forth. And then you have those employees or, you know, coworkers that work by putting their headphones on. You know, you talk to them once and then you check in with them at the end of the day. Right. Those are those deep work employees that put their headphones on. They just crank it. Think of that as like those two different types of large language models. Right. There's some people.
that you work with, right? That require a lot of talking, right? Require a lot of conversation, require a lot of collaboration, you know, and that's how they work best. And then there's others, just like, they don't want to talk: dump all the information on them, give them their instructions, let them ask questions up front if they have any, but then they go to work and you see them later. So the latter, that is what these new O1 models are. All right. So as an example for our livestream audience here, you probably see it on the screen.
I took a screenshot of this, but yeah, generally when I ask these models questions, it's going to take anywhere from four to 15 minutes to get a reply, at least how I use O1, or at least O1 Pro. So I do use the ChatGPT Pro account, right? So that is $200 a month, and I'll share what that includes. But you know, for the most part, when I'm using O1 Pro, I know this is my deep work colleague. I'm still using GPT-4o
all the time, all day, every day. I like using it in Canvas mode. I just started really using the new Tasks mode, which, if you listened to our show yesterday, we went over that new mode. By the way, for those of you that shared the episode yesterday, I spent probably...
about three hours on a document that showed you how to use Tasks and to go over this concept of task stacking. If I'm being honest, it's one of the best documents I've ever created in my life. So if you haven't gone and shared that Tasks episode, go do that and I'll share that document with
you. All right. So that's the gist of it. This is a reasoning model. This is not a GPT model. It takes a long time to think. It uses this chain-of-thought process under the hood, right? So you really have to use it at the right time, for the right purpose, for the right reason.
Hey, this is Jordan, the host of Everyday AI. I've spent more than a thousand hours inside ChatGPT and I'm sharing all of my secrets in our free Prime Prompt Polish ChatGPT course that's only available to loyal listeners like you. Check out what Mike, a freelance marketer, said about the PPP course. I just got out of Jordan's webinar.
It was incredible, huge value. It's live, so you get your questions answered. I'm pretty stoked on it. It's an incredible resource. Pretty much everything's free. I would gladly pay for a lot of the stuff that Jordan's putting out. So if you're wondering whether you should join the webinar, just make the time to do it. It's totally worth it. Everyone's prompting wrong, and the PPP course fixes that.
If you want access, go to podpp.com. Again, that's podpp.com. Sign up for the free course and start putting ChatGPT to work for you. So let's just take a quick look.
at the tiers or subscriptions. So there is a free version of ChatGPT. I've had a lot of ChatGPT shows this week because it's early in the year. A lot of people are asking and we had some popular episodes like a year ago and I'm like, yo, these are old. So literally this week, we did an episode on ChatGPT free versus plus. All right. So you have your free ChatGPT, which is actually...
pretty good now, right? I used to tell people, don't touch it with a 10-foot pole, it's dangerous. It's not like that anymore. The free version is pretty good. You have the $20 Plus plan that I think a lot of people are on, that just gives you essentially almost all the features except the O1 Pro model. So even on ChatGPT Plus, you have
O1, it's much more limited in terms of messaging, and you have O1 mini. But on the Pro plan, so if you want to use O1 Pro, which is technically OpenAI's most powerful large language model,
you do have to pay $200 a month to be on that Pro plan. But there are a lot of other, I guess, features and benefits. So when we talk about Sora, you have way more usage in Sora. You have unlimited usage of GPT-4o, whereas normally, even on a ChatGPT Plus plan, you
run into limits. So essentially the limits mostly go away on the Pro plan: unlimited GPT-4o, unlimited advanced voice mode, where on the Plus plan it's a little limited. And then you get access to O1 Pro, which you can only access via that $200 a month subscription. Yes, it is confusing. Yeah, Allison's kind of talking about the naming here. It is weird, because the Plus
is that $20 a month, where a lot of companies like Microsoft call their $20 a month tier Pro. So even I get confused all the time. But yeah: ChatGPT Plus is $20, Pro is $200, not to be confused with all these other products that say Pro for their $20 tier. Yeah. All right.
So let's talk about how OpenAI describes their model. So they say more thinking power for more difficult problems. So they say ChatGPT Pro, in this instance, they're talking about O1 Pro actually, provides access to a version of our most intelligent model that thinks longer for the most reliable responses. In evaluations from external expert testers, O1 Pro mode produces more reliably accurate and comprehensive responses,
especially in areas like data science, programming, and case law analysis. Compared to both O1 and O1 preview, O1 Pro mode performs better on challenging machine learning benchmarks across math, science, and coding. All right. So yeah, speaking of benchmarks, essentially O1 Pro, it's PhD level, right? It's no longer where you
really have to work for it. So that's the other thing. I think with a model like GPT-4o, you can get these, you know, quote unquote, PhD-level responses; you just have to prompt at a master's-degree level to get it there, right? It's different with O1, especially with O1 Pro.
You don't have to have a lot of experience to get it to that kind of, we'll just say, quote unquote, PhD level. It kind of can do it on its own, because it uses this under-the-hood, step-by-step chain-of-thought reasoning, right? So it's weird. And I've talked about this all the time, and if you've taken our free Prime Prompt Polish course, you understand that
the GPT-4o family of models is extremely capable, but to get the most out of it, you've got to know the basics of prompt engineering, right? Without getting too technical, there are things called shots, right? When working with a transformer model or a transformer family of models, regardless of whether you're talking ChatGPT, Gemini, Claude, et cetera, if you do some basics of prompt engineering, it's going to be better. So a five-shot prompt, as an example, is almost always going to do better than a zero-shot prompt. So what that means is,
and I'm going to oversimplify it here, so sorry, machine learning PhDs: a shot is when you give a model an example. An input and an output, and you tell it good, bad, why. That's what I like to say: input/output pairing, good, bad, why. So you are essentially "shotting" this model. So that's what
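To make the "shot" idea concrete, here's a tiny hypothetical sketch. It doesn't call any model API; it just assembles the text of a zero-shot versus a few-shot prompt, so you can see how the input/output example pairs get prepended before the real task.

```python
# Sketch of zero-shot vs. few-shot prompting: a "shot" is a worked
# input/output example included in the prompt before the real task.
# No model API is called here -- this only builds the prompt text.

def zero_shot(task: str) -> str:
    """No examples: the model sees only the instruction."""
    return f"Task: {task}\nAnswer:"

def few_shot(task: str, examples: list[tuple[str, str]]) -> str:
    """Prepend input/output example pairs, then the real task."""
    demos = "\n".join(f"Task: {inp}\nAnswer: {out}" for inp, out in examples)
    return f"{demos}\nTask: {task}\nAnswer:"

# Two shots showing the pattern we want the model to imitate.
shots = [
    ("Rewrite 'utilize' in plain English", "use"),
    ("Rewrite 'commence' in plain English", "start"),
]
print(few_shot("Rewrite 'terminate' in plain English", shots))
```

The same task sent through `few_shot` gives the model a pattern to imitate, which is why a few-shot prompt tends to beat a zero-shot one on transformer models.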
the O1, and technically O3, right? OpenAI teased the O3 model. It's not out, and I don't think it'll be out anytime soon, but this O family of models kind of goes through that process on its own, right? It doesn't give itself examples, but it goes through that chain-of-thought thinking. And I'm going to show some examples for you live on the screen. But from a benchmark perspective, the gains are huge, right? So some of the biggest gains
between the O1 family and the GPT-4 family are in math. I mean, you're automatically at, like, math Olympiad gold or silver medal level, right? So it's smarter than 99.9999995% of humans
in math. Physics, same thing, a huge jump from GPT-4o to O1, and I'm getting ahead of myself, skipping over O3. GPT-4o, O3, O1, it's alphabet soup already. Other categories too. So yeah, mathematics, physics,
LSATs, right? So, you know, they actually have the models take exams. So, huge jumps. So obviously, if you are in software development, if you are in research, if you work in anything that has to do with complex math, complex equations, business intelligence, right?
If you are essentially working with numbers, working with research, I think O1 makes the case for itself. But I'm going to keep going, and we're going to talk about even some everyday use cases. So first, you might've gotten confused, right? Because I'm dropping all these different
words on your head. O1 this, O1 that, right? Because technically, O1 has been out for a while. We saw O1 preview and O1 mini in September. So yeah, all those other big companies releasing these kinds of quote unquote reasoning models, that's just the last few weeks. OpenAI has been here for a few months, since September, when they released O1 preview and O1 mini.
And then in December, they essentially knocked the preview off of it and said, okay, now this is O1. So technically, if you think of power, and not all these models are still here, you have O1 mini, O1 preview, which is now no longer there, O1, and O1 Pro. So O1 Pro and the kind of full version of O1 are new-ish. They've only been out for a couple of weeks. I've been using it.
I didn't get it right away, probably about a week or so after it was released, so I've been using it now for about three weeks pretty heavily. All right. So let's just go over this quick. And livestream audience, yeah, keep getting your questions in. I'm going to try to tackle some of these at the end. Mark says, it looked like a lot of work, Jordan. Thanks for sending it. Oh yeah, that's the Tasks thing from yesterday, task stacking. Everyone slept on the ChatGPT Tasks yesterday.
Even I think OpenAI missed the point. All right. So let's go over the bullet-point details. I'm going to go through this quickly, because I'm going to show you here at the end of the show, and for the podcast audience, I'm going to try to walk you through it: O1 in action, even for non-technical use cases. So what is O1 Pro? All right. Let's go through all the bullet points here. I want to make sure I give you all the details. So...
O1 Pro is OpenAI's premium model, available to ChatGPT users for $200 a month. Also, there are other third-party platforms where you can just pay by usage, right? You won't get access to all the other tools and all that, but here's the thing: at least right now, the O1 Pro model doesn't have all the other tools anyway. It doesn't have internet access, right?
With O1 Pro, you can actually upload files, which is nice, and you can do that on O1 as well. Whereas previously, with the O1 family of models, you couldn't upload files. And still, on O1 mini, you can't upload files. But O1 Pro is described as an AI colleague for complex tasks, and it's more about reasoning than collaboration.
So how does it work? Well, like we talked about, enhanced reasoning: it has this chain-of-thought processing to kind of enable better logical breakdowns. Accuracy and reliability: so we shared about this on the show before, OpenAI kind of went over this 4/4 reliability concept, right? Where there's a little more variability in the kind of GPT family of models, with the O series of models, there's much more reliability.
And also, this is specialized for professionals. If you are a specialized professional, O1 is for you. So it's strong in STEM, coding, legal, and data science. Advantages.
What the heck are the advantages of this? Well, like we said, enhanced reasoning: the chain-of-thought processing under the hood enables better logical breakdowns. We talked about the 4/4 reliability. That's essentially, you know, when OpenAI did their internal benchmarks, they wouldn't just run a test once and say, oh, okay, yeah, this passes. They would actually do it four times and require consistent results. All right.
So where does it excel? So I said, hey, here's where you can use it; there are strong use cases here. It excels in scientific research, so analyzing data sets, developing hypotheses, designing experiments. Financial modeling: forecasts, complex calculations. Legal workflows, right? Analyzing case law, summarizing documents, right?
It excels at anything STEM related, right? Anything. So it specializes in kind of synthesizing and analyzing dense data sources. And here's the thing, and I'm going to show you some more non-technical use cases at the end and how I'm using it. We all have access to data.
Over the past five to 10 years, data went from being something for the geeks to something we all have access to. There's more and more data being collected, which is why I think there are actually broad use cases for people to be using the O1 Pro model.
So what is it good for? I kind of already talked about this, but this is what people are always asking: where does it excel, what is it good for, who should use it? So I want to tackle this from all areas. Who is it good for? Professionals in STEM, finance, law, and healthcare. Great healthcare
use cases as well. Any users with high-stakes tasks requiring accuracy and advanced reasoning. Developers too: it's great for anyone in software development handling intricate coding and debugging requirements, and professionals in fields like medicine that require precision. All right, now let's do the breakdown. People are always asking, well, what's the difference?
Should I just be using GPT-4o? Should I be using O1 Pro? The way I like to put it: think of those two types of colleagues, right? For the most part, we've been kind of spoiled by having these transformer models that are highly capable, if you know how to use them. I would still say 80%
of the business world has no clue how to use something like ChatGPT, something that has now become synonymous with AI and has name recognition like Google. People still don't know how to use it, right? And people in positions of leadership, kind of scary. But O1 Pro excels in complex reasoning. GPT-4o excels in, well, being fast: speed, general tasks, right? Also, GPT-4o has access to more tools, right?
which is important. So you can upload files in O1 Pro, but you can't use things like Canvas. You can't use things like Tasks. You can't use things like
ChatGPT search; you can't connect it to the internet. That's an important thing, because when you are using a non-connected model, you have to keep in mind: you'd better be feeding it a ton of up-to-date data, or whatever you are asking it should hopefully not require a lot of up-to-the-minute real-world information, because that model does not have it. So how should you prompt O1 Pro? This is where it's completely different. And again, think of my comparison.
Does anyone else have that chatty coworker? And then the coworker that just has the headphones on, right? I'm the latter. You never would have guessed, right? Someone that just talks about AI nonstop and sometimes talks for way too long. But, you know, speaking of my journalist days, right? I used to be the guy: I would go in there after an assignment, I would go talk to my editor, get all the information I would need, right? Then I would go sit in the back and put my headphones on. They weren't even headphones.
I don't know if this makes me weird; I would have, like, the noise-canceling ones. There's no music. I would just put those things on and go to work, do the whole thing, right? But then there are other people, you know, hey, they want to check in at every single point, right? But that's the difference. So with O1,
you have to begin with, number one, a lot of data, a lot of context. You need clear and structured prompts to define task parameters effectively. You need to provide examples or templates to guide the model's output format. And you need to use concise yet informative phrasing to maximize response relevance. If y'all have taken our free Prime Prompt Polish course, yes, we're going to have new ones in 2025.
Give me a minute, I'll explain why later. But, you know, we walk through something called refine queue. So if you have taken our free course, and there have been like, I don't know, 8,000 of you, use that refine queue method for setting up your first prompt to O1. You're still going to have to answer a question or two, because that's how we set up that refine queue. But try that out; it's going to work fairly well. So let's get to the big question.
Is it worth $200 a month? So let's talk about the pros and the cons. Well, the pros:
high accuracy and reliability in complex domains. Unique reasoning capabilities, especially if you are in some of those more technical professions: software development, engineering, anything with math, research, data science, STEM, right? If you're there, yeah, it probably is. Probably a no-brainer. What about for everyone else? Because there are cons. $200, it's not cheap. Although, if I'm being honest,
I think we've been spoiled by these free and $20-a-month world-class, state-of-the-art models that essentially now have mini-RAG, right? We've been spoiled, right? Because the big companies, a lot of them, are losing money, right? Like OpenAI reportedly lost, I don't know, $4 or $5 billion in 2024, because they're not worried about making money yet. What we are getting, if you're a power user, you're getting way
way more than that $20 a month, right? OpenAI CEO Sam Altman said that even on this $200 Pro plan, they're losing a ton of money. This is still relatively cheap, whether we're talking about the $20 a month or, if you have a use case for it, I think even the $200 a month: pretty affordable, right? All things considered. And we're going to see an example of that. So ultimately, there are the pros and the cons. It's having a PhD-level
companion that will think about things and give you better results, higher accuracy, if you know how to direct it. But it is much slower, right? So if you're used to just jabbing back and forth, and that's how you like working with large language models, and you don't see any problems right now with outputs, then it's not for you.
But I think it's actually for more people than you'd think. I think people are literally just thinking, oh, O1, that's for, you know, engineers, data scientists, researchers, et cetera. I don't think so. So you also need to just ask yourself: do you need the advanced reasoning and professional tools? Are there use cases in your domain that are worth that premium price?
So you really have to ask those questions. There's no blanket answer. I think if anyone asked me, you know, ChatGPT Free or ChatGPT Plus, it's easy. I don't care what you're doing: ChatGPT Plus, $20, it's a steal, right? I've always said all along, if ChatGPT Plus was $200, I would still pay for it, right? Obviously, I have a ChatGPT Pro plan.
All right. So let's look live. And please keep getting your questions in. I'm just scrolling through the comments, so thanks, everyone, for getting your questions in. I'm going to try to tackle them at the end. I'm just scrolling through all the comments, looking for question marks. So yeah, if you do have a question,
then, you know, get it in. Douglas is talking about our show yesterday: great Tasks write-up, I think yesterday was the first day I saw the link to your post on your website. Yeah. All right. Let's get after it. Let's do some stuff live here. So bear with me, y'all. All right. Here's what we're going to do.
Live stream audience, as always, I never know if this works or if my audio is still coming through. Can you let me know? Can you all see my screen? And can you see what's going on here? So I'm going to explain to you what I'm doing after I get this started. Okay. So I'm going to be copying and pasting a bunch of information in here. All right. Give me a second. There we go.
So this is information that I've exported from my podcast stats, right? I really want to make sure I have. Okay, good. All right. Thanks, y'all. All right. Everyone says they can see. Thanks, y'all. Thanks, y'all. Okay. So
I am in O1 Pro mode. All right. I'm going to read this to you, but I'm going to get it going first, because like I said, this might take a couple of minutes. All right. So here's my first kind of tip, right? Provide a lot of context. I'm going to walk you through the context that I provided. But also, something that's changed recently, I don't know when, probably a year ago: you can run concurrent chats now.
Generative AI, even O1 Pro, it's generative. You can run the same prompt, even when you give it a ton of information, and you might get very different things, you might get similar things. So I'm going to go ahead, even though it might slow it down; I'm going to follow my best practices, right? If I'm waiting, I'm just going to wait. So I'm literally running the exact same prompt in another tab. All right. So let's go ahead, let's check on it here. Okay. And we're going to walk through. So sometimes...
it will give you details. Sometimes it'll tell you what it's doing under the hood, right? And I know I have my token counter here, but the context window is much different. I probably should have mentioned the context window differences because that's important as well, right? So essentially, O1 Pro has a much, much larger context window. So
Let's go ahead, and I'm going to read now what I actually put in. All right. I'm going to try to go quick, but like I said, I exported some recent podcast episode stats, and there's a ton of stats. This is an example, and I want you to think: what data or what large amounts of context do you have? Because I found myself, when O1 came out,
I'm like, okay, I may not be in STEM, I may not be in data analysis, but I have access to a lot of data, and either I don't have time, or when I am analyzing it, I'm really just looking for the low-hanging fruit. There are probably so many deeper angles I could go into if I had time. All right. So I'm going to go fast here.
So I'm saying, these are my podcast stats. So remember when I said, when you prompt O1, think of it like that coworker that wants all the information up front, and then they're going to go work in the corner.
So I'm saying these are my podcast stats. Keep in mind, today's date is January 16, 2025. For all questions, always exclude the top 2% and bottom 2% of episodes unless otherwise noted, right? I have a bunch of episodes, their downloads, some other stats, and sometimes there's anomalies, right? Sometimes there's just problems and I don't want those problems to be included.
All right. So already you're seeing that could be a lot of manual work, even if you're good with business intelligence, good in spreadsheets, right? I'm also saying to always give the episode number and name. Then, keeping that in mind: please carefully answer and tell me, question one, give me the average downloads per episode. Question two, give me the complete list of episodes with the performance percentage over or under the adjusted average. So I'm asking it:
Find the adjusted average number of downloads. Take out the top 2%. Take out the bottom 2%. Then go give me for each one. I want how it compares versus the average. So let's just say the average was, I don't know, 4,000 downloads, right? I want to see.
the percentage. Once you take out the top 2% and bottom 2%, I want to see, for each episode, is it higher than that kind of median? I don't know, what's the math term, right? Is it higher or lower than that? All right. Then I'm saying: question three, give me the top 10 and bottom 10 episodes and their respective percentages over or under the adjusted average. All right.
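By the way, what I'm asking for here is essentially a trimmed mean. If you wanted to sanity-check the model's math yourself, here's a rough Python sketch of that calculation; the function names and data shape are mine, made up for illustration:

```python
def adjusted_average(downloads, trim=0.02):
    """Mean downloads after dropping the top and bottom 2% of episodes."""
    s = sorted(downloads)
    k = int(len(s) * trim)  # how many episodes to drop at each end
    trimmed = s[k:len(s) - k] if k else s
    return sum(trimmed) / len(trimmed)

def over_under(episodes, trim=0.02):
    """Percent each episode sits above or below the adjusted average."""
    avg = adjusted_average([d for _, d in episodes], trim)
    return [(title, round((d - avg) / avg * 100, 1)) for title, d in episodes]
```

So an episode at 5,000 downloads against a 4,000-download adjusted average would show up as 25% over.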
All right. Question four: for the top 10 from question three above the adjusted average, please suggest three slightly adjusted episode titles for each, if I were to rerun them. So every once in a while, I'd say, maybe, you know, I don't know,
depends, anywhere from one to five times a month, I'll rerun an episode. Yeah, sometimes I get sick. Sometimes I can't be here with y'all live, you know, at 7:30 every single day, although I try to. So I'm essentially saying, for the 10 episodes that performed highest above the adjusted average,
suggest three additional titles. But don't just look at it and randomly suggest them. Look at the trends, right? Find common themes. So here's where we're really working with structured and unstructured data. This is where it's great to work with a large language model with natural language processing, right? So I'm like, yo, here's hundreds of episodes.
Go find the ones that are really good. And for those top performers, try to develop, you know, some way to see what's working and what's not, and then apply that to some of these top episodes.
All right. And then I'm saying like, look, find common naming trends. Example, title length, psychological marketing angles, superlatives, word choice, et cetera. Be exhaustive in your pursuit of spotting common and hidden trends. Question five, what are the most common patterns among underperforming episodes?
And how can I avoid them in the future? Question six: how does title length or structure correlate with episode performance? Break it down from every angle you can think of. Be pinpoint specific. Question seven: how does release day impact episode performance? Please exclude Mondays,
as that is usually our AI News That Matters day, and we don't usually run other types of shows on those days. For needed context, even though the large language model should know the day of the week, I'm giving it, hey, this date was a Friday, so it can make sure it has it correct.
Question eight: how does the release time or hour affect episode performance? Do not group them together. Go individually by hour. Be exhaustively precise and give me a chart that shows hourly performance, right? Sometimes we get our episode out by 8:15. Today's not going to be that if I keep yapping; it's already 8:06 a.m. Sometimes something comes up and we might not get it published until 11 a.m.,
so I want to know, hourly, how does that impact it? Then I'm saying, here's one that would take a long time to figure out. I'm saying: staying power and average decay. So in this document, when I pasted all of this in there, I didn't upload a spreadsheet. I just pasted it in there. Essentially, it was information from a CSV. But for hundreds of podcast episodes,
it gave seven-day downloads, 30-day downloads, 90-day downloads, all-time downloads. So what I'm asking here is to essentially figure out staying power and average decay. So I'm saying, hey, across hundreds of episodes, when do they normally, quote unquote, go stale?
Right? When do they stop really, you know, getting listened to? Because people are searching for these all the time. It's not just people who, hopefully, subscribe, and thank you if you do, right? Everyone else is searching for podcasts and discovering them. So I'm trying to see which ones have staying power, which ones are more evergreen. And then I'm asking it to show me the top ones, because then I can develop new episodes based off that.
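If you're curious what that staying-power idea looks like as a calculation, here's a minimal sketch: for each episode, what fraction of its all-time downloads arrived by day 7, 30, and 90. An episode near 1.0 early went stale fast; lower early fractions suggest evergreen reach. This is my illustration, not necessarily how O1 computed it:

```python
def decay_profile(d7, d30, d90, all_time):
    """Fraction of all-time downloads that had arrived by day 7, 30, and 90."""
    return tuple(round(d / all_time, 2) for d in (d7, d30, d90))
```

So an episode with downloads (50, 80, 90, 100) gives (0.5, 0.8, 0.9): half its lifetime downloads came in the first week.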
Question 10, how do episodes featuring specific brands or keywords, example, OpenAI, ChatGPT, Google, Large Language Model, AI, Claude, compare in performance?
Question 11: please also categorize all of these episodes according to what you can gather from the titles. Example: marketing, ChatGPT, enterprise AI use cases, et cetera. Only put each episode in one category, and try to create at least 20 different categories. In doing so, please also give me the category performance versus the averages that we identified earlier, right? Okay.
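Keyword-based title categorization plus per-category averages is something you could also roughly approximate yourself. A hedged sketch, with made-up category keywords; an LLM infers categories far more flexibly than a keyword match like this:

```python
def categorize(title, categories):
    """Return the first category whose keyword appears in the title."""
    t = title.lower()
    for cat, keywords in categories.items():
        if any(k in t for k in keywords):
            return cat
    return "Other"

def category_averages(episodes, categories):
    """Mean downloads per inferred category; episodes are (title, downloads)."""
    buckets = {}
    for title, downloads in episodes:
        buckets.setdefault(categorize(title, categories), []).append(downloads)
    return {c: sum(v) / len(v) for c, v in buckets.items()}
```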
A lot of this, I want to see what sticks. What do y'all like? What do listeners actually care about, right? And I think after you have thousands, tens of thousands of data points, yes, I can go figure out some of these things with some simple calculations in, you know, Microsoft Excel, Google Sheets, et cetera. But this is where we're really combining a lot of data with unstructured data too. This is bringing in unstructured data.
Right? Structured data: numbers, things that you can plot on a graph. Unstructured data: words, right? You can't necessarily plot them. So we're combining structured data and unstructured data with a reasoning model, right? And I gave it a ton of information. All right. And then I'm saying, essentially, you know, I'm giving it some additional information,
encouragement, how to format it, all that stuff. And then I'm also saying at the end, give me essentially a quick summary. And then here's all the data. So I pasted in about 13 pages of those questions and data. All right. That was a lot.
Marie said, how long did it take you to come up with these amazing, detailed questions? I'm a fast typer, and I think about this all the time. Probably took me, I don't know, 12 or 13 minutes to type these up. So yeah, there was no AI helping me formulate these questions. We always talk about human in the loop, right? What role do humans have?
And one of the things that AI, and I think especially these reasoning models like O1 and O1 Pro, allows you to do is really let your expertise shine. Some of my expertise, I think: I have a background in journalism. I have a background in marketing and advertising. And, I don't know, maybe you saw some of that in play there. This is how my brain works.
I'm like, y'all, we got so much data. I need to be able to identify trends and to build something better to help you all, right? All right. So let's now jump back in and see how our chats are doing. So you'll see it's been... oh, live stream, are you still with us? Can you still hear me? I got a comment that said, can't hear, but let me know if you can. I got something on my screen that just said I lost audio. So we'll see. All right. So,
here are the details. All right. Thank you, Sam Sarah from YouTube. All right. So I can click details. So essentially you can kind of, sometimes, see how the O1 model is actually thinking about this.
All right. It's weird. I've done, you know, similar prompts like this, and I'll always do A/B testing, right? I'll run the same prompt on O1 Pro twice. I'll run the prompt on O1 Pro versus O1 normal, right? I do a lot of testing. And even on O1 Pro, sometimes it'll give you all of the details, sometimes it won't. And then it says, oh, sometimes O1 does better when it doesn't share the details with you. So
is there ultimate transparency? Not really. But I'd say, more times than not, you do kind of get to click that details button and you can see kind of what's happening under the hood. So in my other one... it looks like I timed out in my other one. Bummer. So maybe I shouldn't have been doing two at once, because this one thought for about nine minutes and then it said, oh, I'm done for. All right. But luckily...
All right, luckily we finished over here in our first chat. So I'm going to go ahead, why not, and regenerate the other one. That one thought for about 10 minutes and ran out of steam. So this one, let's see exactly how long this one thought. Let me scroll up. Lot of information here, y'all. Lot of information, my gosh. All right, this one thought for 11 minutes and 22 seconds. So I think my...
I think my record is maybe like 15 or 18 minutes. I give it a lot. All right. So I'm not going to read all of these one by one, because it's going to take a while and I don't want this to go too long. But let's just see,
very quickly, an overview of how well it did. So it's saying: below is a comprehensive, step-by-step response that follows all of your instructions precisely. You know, it counted all the episodes, so it's telling me what it did. So it took out the top 2% and bottom 2%, which in this case was six total episodes, and computed and listed everything after removing those six, right? So, pretty good. It kind of first gave me
an overview of how it did it. Then it gave me the preliminary steps. So it went through, identified the total episode count, identified the top 2% and bottom 2%.
Right, listed them all. That's good. And then it kind of said, hey, then there were 122. I didn't give this all 400 of our episodes, because when I did testing it worked fine, but it took way too long and it timed out too much. So I only uploaded, like, the last six months or so of episodes.
Okay, so here we go. Question one. So now it's getting... I told it to label it. So here we go: question one, average downloads per episode. So here it did a little bit of math. So thank you for that; I don't like math. All right. And then it says: answer to question one. Oh, I was about right. About 4,000 downloads per episode. And downloads are weird. Downloads, streams, everyone counts them differently.
So yeah, I think we're almost at like 2 million downloads. So thank you all for listening. All right. So question two: a list of all remaining episodes versus the adjusted average. This is what I wanted. All right. So here it says: below is the performance calculation for each of the 122 remaining episodes.
Is there anyone smart in math? I don't even know what this means. I don't even know how to read it. It created some kind of formula. Sometimes I ask ChatGPT to create algorithms. I just give it a bunch of data and I'm like, create new algorithms for me and tell me things that I can't find out in a spreadsheet. That's fun to do. I didn't do that here. So there's a calculation. So let's see if it gave me the full table. Sometimes it does, sometimes it doesn't, right? So, full table.
Okay, here we go. There we go. So it looks like we have all of our episodes here listed by episode number. It gave me the all-time downloads. It did the performance. So I can see this one right here was about 0.56% below average, right? So I can go through here. I could ask it, or I could copy and paste this and give it to, as an example, o1-mini or GPT-4o, and have it turn it into an actual spreadsheet. One thing I realized O1 Pro isn't great at is creating documents. I don't even know if it technically has that,
you know, that capability or functionality, but the advanced data analysis mode inside GPT-4o is great. GPT-4o is great at creating different types of documents. So if I wanted to, I could copy and paste this. But let's see. Yep, there we go. It's giving me, you know, 28% above the adjusted average, 18% below, 3% below, 15% below, right? So this is great.
Oh, let's see. It looks like it might have truncated. I had a feeling that even the O1 model was not going to complete this in its entirety, because it says, and so on. So it didn't do all 160. It says, due to the length of this list,
to fully comply with your request, this table would extend for well over a hundred lines. I have demonstrated the exact calculation method and the format above. The same format applies for every single remaining episode. Below, I continue the listing in concise bullet form. Each line follows the same pattern: episode number and title, all-time downloads, then the resulting percentage. Okay, so it did actually go through and do it. It just didn't show me the math
for each one, which is fine. I didn't need that. All right, here we go. Question three, top 10 and bottom 10 episodes and their percentage over. This is what I wanted to know.
So here are the top 10 overperformers with their percentage over the adjusted average. So, When Will We Achieve AGI, that performed well. AI Agents: Everything You Need to Know. Top AI Tools and Features of 2024. How AI Agents Can Bridge the Gap to the Future of Enterprise Work. Google's $1 Trillion AI Mistake, right? So there we go. This is good.
I mean, again, I could have sorted this by downloads and I could have found out some of this, but I wanted to see how much higher because there's some anomaly episodes where I'm like, okay,
Were these actually... you know, is this a bug? Sometimes, as an example, Apple Podcasts or Spotify will feature an episode, you know, if their algorithm says it's good, and then it'll put it on, like, a top episodes in technology page. So I know sometimes some of our episodes get way more downloads, but I'm like, I don't really want those. I want to just focus on the core there. So it did a pretty good job. Bottom 10 episodes, there we go.
Slightly adjusted title names for each of the top 10. There we go. It's giving me all of those. Yep, for each of them, it's giving me adjusted
episode titles, and also why, right? That's interesting. I didn't even ask for why, but it gave me that for each of the 10. Question five: common patterns among underperforming episodes and how to avoid them. So it says: titles that are too generic or vague, overly long titles without a clear hook, insufficient mention of strong keywords. And in each of these, it's giving me very specific examples, right? It's not just giving me general guidelines.
It's telling me how to avoid it. Then question six: how does title length or structure correlate with episode performance? It's all good. Question seven: how does release day impact episode performance? Let's see. Tuesdays show moderate to good performance. Wednesdays and Thursdays, an engagement lift because listeners have midweek energy. Fridays, it says, can be hit or miss. All right. So maybe I shouldn't schedule big shows for Friday, since they can be hit or miss. Yeah, sometimes people check out.
Let's see. I did specifically ask for a table for time or hour. So for question eight, let's see here. Good, it did it. So it gave me the release hour and then the average all-time downloads. So I can see...
Yeah, apparently sometimes I release some late. That's weird. Some of those might have been bugs. I should have asked it in this one to also give me the total number of episodes that have been published in each release hour. Because, yeah, sometimes there are bugs with our host; we use Buzzsprout. Yeah, sometimes there are just weird anomalies. So yeah,
I should have asked for the number of episodes, but it looks like, for the most part, maybe our sweet spot is when it gets published by 9 a.m. So maybe when we do publish it super early, maybe it misses people. Maybe people are listening on their commute to work. But it looks like, for whatever reason, that sweet spot is releasing the episode by 9 a.m. This is all our local time.
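That release-hour table is, under the hood, just a group-by-hour average, which you could double-check yourself. A sketch; the (hour, downloads) rows here are hypothetical:

```python
from collections import defaultdict

def downloads_by_hour(rows):
    """Average all-time downloads per release hour; rows are (hour, downloads)."""
    buckets = defaultdict(list)
    for hour, downloads in rows:
        buckets[hour].append(downloads)
    # Average each bucket, sorted by hour for easy charting
    return {h: sum(v) / len(v) for h, v in sorted(buckets.items())}
```

Adding `len(v)` per bucket to the output would also give the episode count per hour that I forgot to ask for.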
All right. Question nine: staying power and average decay. This is one I was really looking forward to. So I'll go through and read this if you're interested; you know, you can let me know. But it did good. It gave me kind of the average seven-day, 30-day, 90-day, and total, and it identified certain episodes that extended past that,
you know, some of the agent episodes. All right. So it did a pretty good job. I wish it would have given me a little more depth on this. But again, what I would do, in theory: I would look at the responses, I would update that prompt, and I would just run it again, right? Because I can see some of these things and I'm like, ah, I forgot a little bit here. I should probably go back and add some. Question 10: episodes featuring specific brands or keywords. So
there we go. You know, obviously, OpenAI and ChatGPT episodes typically see an average of about 10 to 30% higher than our overall adjusted average. Google, Gemini, or Claude are about 5 to 15% higher. Large language model, it doesn't really show any strong difference. Okay, that's pretty good. And then here's the one that would have taken me forever, right? Hundreds of episode titles, and then to categorize them and then compare against averages.
So it went through, and it gave me what looks like a list of 20 different categories. It didn't give me the category average downloads, all right, but it did break everything down by category. All right. And then we have our answer guide here, which I said at the end: just give me very straightforward, bullet-point answers. So how did this do? What do you all think? I know this took a while. Do y'all think O1 Pro was worth $200?
Because I'm trying to think: if I went through as a human and did this myself, right, if someone gave me the exact same questions, I don't know. It probably would have taken me three or four days, right? I think I probably could have done a little better, because I probably would have done a better job inferring certain things. In certain instances, you saw even O1 Pro truncated responses or didn't give me full answers.
Right. That's frustrating. So in the future, what I probably would have done is break this up.
Right? I gave it 11 extremely difficult tasks. And if I was using GPT-4o, as an example, I would have done each of those as dedicated chats, or, you know, tackled them one by one, going back and forth with ChatGPT at least probably three to 10 times on each of those 11 questions. Right? So, from a time-savings perspective, I think absolutely.
Would it beat me? Maybe not. Although on the math and some of those more complex things, absolutely. I wouldn't have known. Especially without AI, you put me on a computer where I can't use AI, just a spreadsheet only, I don't know if I could have gotten these answers, right? And I'm decent at basic math, right? I have an analytical brain.
Obviously, me knowing this is Everyday AI, I run it, right? But if someone else came to me with this same data and said, you can't use a large language model, or they said, you can just use GPT-4o, I think if I would have had to do it by myself, it would have been at least three or four days. If I would have used GPT-4o, it probably would have been, I don't know, I'm guessing, three to five hours,
because it would have required a lot of back and forth for each of those 11 questions. You have to worry about the context window. To get that kind of, quote unquote, chain-of-thought reasoning, you, the human, have to be the one pushing that chain of thought, right? You have to be the one giving it examples, going back and forth, steering and guiding it. Whereas, you know, O1 is more like that full self-driving car; it kind of guides itself,
right? With the GPT family of models, you have to do that. So if I'm being honest: we got this done in 10 minutes with O1. It probably would have taken me three or four hours with GPT-4o, and it probably would have taken me a couple of days if I just had, you know, the internet and a spreadsheet and no AI. So is it worth it? I don't know. I don't know, for me...
This isn't perfect, but what I would have done is: I would have gone back through those responses, I would have updated my prompts, and I probably would have, obviously, broken this down into, you know, three or four questions. It was too much for it to handle, even though it was within the
context window. You know, I didn't put in too much context; it was a little too much thinking, right? Or, you know, there's probably something in OpenAI's training that's like, hey, when someone asks for hundreds of things, and it's part of multiple other queries,
just, you know, showcase your ability to understand, right? So, that one where it kind of cut things short: if I just would have done that one question and given it to O1, it probably could have done it. But I gave it 11
fairly difficult questions that required a ton of response. So I do think this isn't a capabilities thing; this is more of a compute and training thing. O1 Pro probably could have done this in its entirety, but I'm sure there's something OpenAI has worked in there to say, hey, you know, at a certain point, if there are this many questions and all the questions are multi-step, maybe you have to truncate. I don't know.
All right. There were a couple of questions. Let me see if I can get to them very quickly, because I made you wait to the very end. I'm just scrolling through; if I see a question mark, I'm starring it. All right. Let's see. Dennis: if you have Teams, can you upgrade a single user to Pro? No, as far as I know. I'll ask my contacts at OpenAI. I did ask them about this like three weeks ago, because I have Free, Plus, Teams, Enterprise, and Pro accounts.
I don't have the option to upgrade anything on Teams. So as far as I know right now, the $200 Pro plan, which gives you O1 Pro, is only available for individual users. Actually, the last time I checked on that was
like a week or two ago, so I should go back and double-check. But previously there was no option to upgrade Teams. And I'm not sure about Enterprise accounts, because on any Enterprise account I'm on, I'm not the admin of it; I'm an individual Enterprise user.
So yeah, people don't know that. Someone DMed me on LinkedIn; they're like, oh, you do trainings? I'm like, yeah, that's what we do. So if your team, whether you're on, you know, ChatGPT Teams or ChatGPT Enterprise or Copilot, right, that's what we do. We train people. I talk about AI every day. And, you know, if your company, if your department needs help, you can call us in. All right, let's see.
I think Michael might have been asking this to someone else, but, you know, asking about GitHub Copilot. Yeah, there are other platforms, you know, Cursor, GitHub Copilot, that do great for some of these things: database work, coding, software engineering. Yeah, I think Cursor and Microsoft's GitHub Copilot are great.
Kieran says, isn't the time it takes to respond a con? Absolutely, right. But that's why, generally, I'm not just giving an 11-minute task to ChatGPT and then, you know, sitting there sipping my Nespresso and judging it. I'm doing other work, right? I'm opening another window, another account, you know, putting something similar into Claude or Gemini or AI Studio, right? I'm always running things in parallel, especially when I, you know,
go through that time to put this content together. Obviously I would have to break it down into smaller chunks for non-reasoning models. But yeah, Kieran, absolutely. It's, not a waste of time, but the time it takes is a con, right? Especially since we're in this society where we want everything now, right?
I don't want to wait 11 minutes. But I waited 11 minutes, and it did, like I said, work that would have taken me either multiple hours with GPT-4o or days without any AI. So is the time worth it? Patience is a virtue, and doing things right
pays off in this age of, you know, instant gratification, right? Because now, if I wanted to, I would go back. Like I said, I would improve the info that I gave O1. I would probably break it down into, you know, two or three prompts, and I'm sure it would do better. If I had to grade it, I would give it an 85%. If I broke it down and improved how I asked for the information... this was user error.
That's user error, right? I gave it too much information. Although I think, for that part, it should have been able to handle it. But for a lot of the other things, I'm like, oh, I should have worded that differently, right? I didn't do a good enough job. People always think a bad output means, oh, like, ChatGPT sucks, it's dumb. No.
In that case, I was dumb. I didn't do a good enough job. Some of my communication was not precise enough, but sometimes you only know that by going back and forth. And I love being able to look at the details and see how ChatGPT is reasoning; that's a cheat code. If you are using O1, even on the ChatGPT Plus plan, look at how it's reasoning. Look at those details. That's going to improve how you communicate with a large language model. Because if it's struggling with something, you know, if it's halfway through the process, or in the first 10 or 20% of the process, if it's already tripped up, guess what? Then it's going to get even worse. So you might need to move some of the information from the bottom up top, you know, provide a better summary, give it a more clear role, priority goals, all that stuff, right?
So yes, it is a con. Marie: does it go down any rabbit holes with its chain-of-thought reasoning? It depends on how open-ended your input is. With an open-ended one, yeah, absolutely, right? Sometimes, for fun, I say: solve the world's problems. Solve hunger. Solve violence, right? Like, solve
inner-city problems or whatever, right? Solve this big problem, right? And then I like to see it think. And I think that has more to do with the training of the model than the model's capabilities. But yeah, it can go down rabbit holes if you give it the opportunity. In this case, there were no rabbit holes, because it was pretty well refined and defined, right? Then,
Juliet said, sorry, I'm not up to date on the lingo. I have the $20 paid subscription to ChatGPT; is that Pro? I think someone already answered that, but no. With $20 ChatGPT Plus, you get the general O1 model, and it is very limited in terms of the amount of work that you can do with it. If you want O1 Pro, you have to be on the ChatGPT Pro account, which is $200 a month.
Fred: do you compare different models all the time? Yeah, I think I probably answered that earlier. All the time, I compare different models. So I like using the AI chatbot arena, whatever it's called, lmarena.ai, right, to do that. I've shared some videos on how I use a tool called ChatHub a lot, where you can put one prompt in and it'll give you up to eight different large language models. So yeah, I compare model responses all the time.
Ada: can it access your website and do the analysis from your website? So the O1 series of models, at least right now, do not have access to the internet. They also do not have access to the full suite of tools. And that's a good way to end this, Ada, because I'm going to say this: I have a prediction coming next week in one of my shows
on the future of the O1 models and what that means for not just agentic AI, but what it means for AGI. Because I do think once you start giving a model like this that can reason, once you start giving it tools, once you start giving it agency to make decisions on what tools to use, how to go about solving a problem. Right now, what O1 Pro can do, it's kind of in a box.
And I get what OpenAI is doing there. They're doing it for safety, right? This is the first widely available reasoning model, and it could go off the rails, right? And you can't jailbreak models like this. So I get that they're not giving it tools right now. You know, they're working on artificial general intelligence. They have their sights set on artificial superintelligence. So I get why they're keeping it in the box right now. But as soon as this O1 Pro model gets a little better... this is the first version of it, right? The first version is always the worst.
It's only been out for a couple of weeks. Give it a couple of months, after they've updated this once or twice, and when and if it does get tools, if it gets agentic capabilities (you know, we're talking about OpenAI's Operator when that's coming out, the new Tasks)... It's an extremely exciting time to be on the cutting edge of AI. And that's what you're doing here. So thank you for joining me. I hope this is helpful. I know this was a longer show, but there you go. Let me answer this: is it worth $200 a month?
I'm going to go ahead, and OpenAI is not paying me: I'm going to say yes. I'm going to say, anyone that has access to data. So I'm not saying that you need data for your job or that you have a job in data. If you have access to data, if you are a decision-maker, right, if you are a knowledge worker, I'd say it's 100% worth it. You just saw my use case, right?
I can make it better. But the value that I get from there, what I just saw in there, that's going to help me grow my podcast, right? That's going to help me bring in, you know, other great sponsors like Microsoft, right? Microsoft is one of the sponsors of this podcast. That's going to help me reach more people who want to learn AI because these are all insights that would take me so much longer, right? If you have data and you need to make decisions, right?
And if you understand the basics of the O1 reasoning model: a hundred percent worth it. It's a hot take. People are going to disagree with me, but I think it is.
People are going to say, oh, GPT-4o is enough. Try doing something similar with GPT-4o. Time is money, y'all. Can it accomplish the same things that I just showed you in O1? Yes. But like I said, it probably would have taken me three to four hours, and during that time, I couldn't have done anything else, right? During the 11 minutes I had to wait, I didn't have to do anything, right? Yeah, I need to improve it and go back and iterate. But I think if you have
data to work with, if you have to make decisions, and if you learn the basics, it's 100% worth it, even at that steep price tag.
All right, I hope this is helpful. Make sure to join us Friday, or sorry, Monday, January 20th, all week: five episodes, the first series we've ever done. You need to listen in. You need to pay attention. Thank you for tuning in. If this was helpful, please let us know. If you're listening on the podcast, sorry, this was a long one. You can listen to me on 2x; I'm not going to be
mad. I would too. All right. But please leave us a rating, follow the show on Spotify or Apple Podcasts, wherever you get your podcasts. If this was helpful and you're listening on LinkedIn, please share, repost with your friends, someone who needs it. Thank you for tuning in. Go to our website, youreverydayai.com, and I'll see you back tomorrow and every day for more Everyday AI. Thanks, y'all.
And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up for our daily newsletter so you don't get left behind. Go break some barriers, and we'll see you next time.