(Preview) Experimenting With OpenAI’s Deep Research, Another ChatGPT Moment, Won’t Someone Think of the Entry-Level Employees?

2025/2/6
Sharp Tech with Ben Thompson

People
Ben Thompson
Founder and operator of the subscription newsletter Stratechery, focused on business and strategy analysis of the technology industry.
Topics
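Ben Thompson: I used OpenAI's Deep Research feature to analyze Apple's earnings, and the results show that it can produce a high-quality analysis report, good enough for a small-scale investor or a junior employee. The report is clearly organized and logically rigorous, but it lacks novel insight and the content is redundant in places. The quality of Deep Research's output depends on the quality of the input prompt; a clear prompt yields better results. It is good at filling in an established thesis but still not good at generating a thesis itself. For now, whether AI can produce truly novel insights remains an open question. Deep Research is not yet capable of replacing human analysts, because human analysts still have insight that AI cannot replace, such as the ability to quickly identify key data and draw conclusions. However, many white-collar jobs do not require creative insight; they require collating information and presenting it to decision-makers in an easily digestible form, and AI is capable of this kind of work. Deep Research searches the web for information and compiles it into a report; it searches efficiently and finds information that is hard to find on your own, saving a lot of time. Deep Research's usefulness depends on the clarity and level of detail of the user's instructions; clear instructions yield better results.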

Chapters
Ben Thompson tested OpenAI's Deep Research by analyzing Apple's earnings. Although the analysis was good and helpful for a small-scale investor or a new employee, it lacked novel insights and the writing could be improved. The output was well-organized but didn't offer groundbreaking analysis.
  • Deep Research produced a solid report on Apple's earnings, suitable for a junior analyst's report.
  • The analysis lacked novel insights; the critical insights were provided in the prompt.
  • The writing, while organized, wasn't concise and lacked momentum.

Transcript

Hello and welcome to a free preview of Sharp Tech. ♪

And so my second prompt to Deep Research was: give me an analysis of Apple's earnings. And by the way, I think these are the three most important points. Number one, just the services bit; it's worth calling out. Number two, the China problem, and it probably goes back further. And then number three, the Apple Intelligence story seems overstated, XYZ. Analyze the earnings with sort of these points in mind. And I thought that analysis was really good.

It still wasn't, you know, fortunately, I think, as good as my analysis. The writing goes on too long. It's not really dense in information. I think one of the Stratechery strengths is, um, a good Stratechery article is, I think good writing just has its own momentum, and every sentence should be pushing you forward and adding, layering on new information, and there shouldn't be sort of emptiness in there. There's quite a bit more emptiness, I think, in these results.

And I also don't think there was any novel insight. It was insightful, but that's because I infused the critical insight into it with the prompt. And even then, I think it still got a couple of things that I don't think were right, um, that were sort of based on more conventional wisdom, or I told it to reference old Stratechery stuff and it would, like, over-reference some sort of old things. That said, um,

I thought it was good. Like, if I were working in, say, you know, or I were a small-scale investor or something, and I wanted an analysis along these lines, like if I were to have sort of a research intern or a new employee and I wanted them to generate a report, this is a solid report.

Is it right? If you're going into a meeting and want to be briefed on the issues and capable of holding a conversation about Apple's earnings and Apple's future in China, this report does the job in a pretty solid way.

It's not Stratechery analysis, but I'm also very biased to say it's not Stratechery analysis. But there's a few takeaways here. And actually, let's come back to your example, like going into a meeting, because I think that's another one that I want to get to that I think was really beneficial to me. It's just the,

And this is the weird thing. This took me back a bit to the original ChatGPT moment, where it was just so startling that it wrote in paragraphs and sort of, uh, was well-rounded, that it took a few days to sort of dive deeper and see, like, is it actually making sense here? It's saying a lot of stuff sort of very confidently. There's an aspect of the quality of this output that is well-formatted, is well-organized, that is sort of cogent in a way that it can potentially play tricks on you. And I have another example of where it sort of really dropped the ball, but it matters. It's very convincing. And like, the sort of the way that I,

To me, this is the reasoning aspect of o3 really coming out. It's going back. It's recycling. It's going through all its sources. But it's reorganizing itself to be very cogent and consumable in a way that's very convincing. And if you think about the concern about...

You can see how this could be a problem where it just – Well, hallucinations can be dangerous because it's also very comprehensive and thorough, and so you get lulled into a false sense of security with the output. That's right.

It's also probably all true. Yeah, there are citations and sort of all those sorts of things. But I think just a few takeaways on those two bits were the quality difference between – so the overall writing and organization was pretty solid between both of them. The insight level between the two I thought was pretty stark.

But that insight came from the prompt. That has always been the case with LLMs. The quality of your output is downstream from the prompt, but that seems dramatically expanded in this case, where if you give it a really clear direction, a clear thesis, and I think you put my line in here in the show notes, like, the capability of Deep Research to fill in a thesis is really strong, because its capability to generate a thesis is still, you know, there really wasn't a thesis in the first summary.

That's good. It was just a relief. We don't want it generating theses going forward. That's our human's job, at least for another six to 12 months. But to your point, what does this mean, sort of generally, like for how and how?

This is a conversation that in some respects, I have more to say about it than anyone. I am basically a professional thesis generator in some respects. On the other hand, I'm heavily biased because no matter what I say or what I try to think, obviously my sort of livelihood is at stake here in a certain respect. So take whatever I say with a necessary grain of salt. But then number three is,

The market evidence is that I am a good thesis generator. That's why people subscribe to Stratechery. So by definition, a lot of people aren't, because they're not making the profits that I'm making. Right. Like, I feel my Midwestern sheepishness is coming out here. But it's very tangible. I don't know where you're going with this. It's very tangible to this conversation. When you think about what does this mean for work?

Because I still feel pretty secure. And I think to me this only accentuates the big question about AI: can it generate truly novel insights? Could Deep Research, how many versions of Deep Research, how many models are we going to need to get to now to have the insight?

And for me, the insight was immediate. The moment I saw those China numbers, I knew I'm going to make a chart about Apple's revenue in China that's going to show that the growth was very flat or declining. It shot up during Huawei and then it was flat and declining afterwards. Like, yeah, and again, I'm the sort of human LLM in this regard, but there is some aspect of,

I think this bit about Huawei and Apple ships, of course, people talk about it now, but that's because I raised it immediately. Like, when people were over the moon when Apple's results shot up, I'm like, no, this is because of Huawei. And when are we going to get to the point, what's going to be necessary for that to be generated from the LLM, as opposed to me infusing that into the LLM? On the other hand, and this is where I come back to the market points, because,

That's why I have a livelihood writing about this because that's what I do is think about those things. Again, I feel so sheepish saying this, but again, that's why I point to sort of the market results. Like people clearly find my thinking about this valuable, but how many jobs and how many people who are writing reports, that's actually valuable.

why they're writing reports? Are we looking for novel insights from the vast majority of junior analysts and the vast majority of entry-level lawyers or paralegals or whatever it might be? Or is there actually a huge aspect of the economy of white-collar work that is about going out, doing some research, collating some data, and putting it in a digestible format for people above them, their executives or whatever it might be, to bring to bear their knowledge and sort of make a decision on it?

Yeah, no, exactly. Well, and I want to just ground people in what this product is, because I would imagine most of our audience aren't subscribed to the, like, elite tier. I don't even know what the OpenAI tiers are, but the $200. Yeah, Plus is $20, Pro is $200 a month. This is available to Pro users right now. Sorry, this is the Rich Versus podcast. You can read sample output if you're subscribed to Stratechery. It doesn't cost $200 a month to check it out. But it's...

Input a prompt and, to your point, the tool becomes more useful the more detail and structure you provide. And so in that respect, Deep Research is like having a real employee: if you give an employee an assignment and it's vague and ambiguous, you're going to get back mediocre work.

And if you give an employee a very clear assignment with what you're looking for, chances are you're going to get back better work. And that's the way Deep Research is operating. And this is agentic in that it's going out and surfing the web. Basically, that's what it's doing. And it's finding stuff.

One of the things that's remarkable to me is it's really effective at searching the web. I don't know how they distinguish between sources, but like I was recording Sharp China today. So I asked Deep Research to generate a report about the recent history of China's activity in Panama and why that has been setting off alarm bells with U.S. leaders.

Number one, the work product was a nice little overview of the shifts that have taken place since 2017 when Panama severed ties with Taiwan and established diplomatic relations with China. But it also took me to an in-depth article that provided even more backstory in terms of the security concerns the U.S. has and some of the controversy the PRC has created locally in Panama.

And I honestly don't think I would have found that article on my own, in part because it was just sort of an arcane source that was based in Latin America. And most of what I learned did not make it into the show. Spoiler alert for anybody who hasn't listened to Sharp China. Uh, but it allowed me to relay various bullet points from that follow-up article that provided context for people who have no idea why the Panama Canal is now a source of controversy in the U.S.-China relationship.

And it also took what would have been like an hour and a half of research and reading about the history and condensed it to about 20 minutes. And I didn't have to go, like, wading through the Google morass to find a couple really good articles, and Deep Research did it for me. That's the sort of thing that could be useful to literally anyone in any industry. And it's already doing it at a really high level. And again, it provides nice little links and footnotes for all its findings, which makes the follow-up sort of a natural next step as you consume any of these results.

Yep. No, no. The question though, you mentioned the Google morass. This is one of the real challenges facing Deep Research.

All right, and that is the end of the free preview. If you'd like to hear more from Ben and I, there are links to subscribe in the show notes, or you can also go to sharptech.fm. Either option will get you access to a personalized feed that has all the shows we do every week, plus lots more great content from Stratechery and the Stratechery Plus bundle. Check it out, and if you've got feedback, please email us at email@sharptech.fm.