The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research with Nur Ahmed

2020/12/5

Last Week in AI

People
Nur Ahmed
Topics
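Nur Ahmed: Power, data, and skills in AI are concentrated in the hands of a few companies, but systematic evidence of this has been lacking. The study asks whether AI is becoming democratized, who the main participants are, and whether the activity of large companies or organizations affects other organizations. The methods include analyzing the author affiliations of papers at top computer science conferences, a statistical analysis using the synthetic control method, and using the result of the ImageNet 2012 contest to examine how the combination of deep learning and GPUs affected the participation of companies and organizations. The study finds that the participation of large technology companies and top universities in AI research has increased significantly, while that of lower-ranked universities has declined, and that large companies collaborate mainly with top universities. Large companies lead in deep learning research, while lower-ranked universities are more active in traditional machine learning methods. This suggests that deep learning and computing power may be why non-elite universities are falling behind. The results also reveal a decline in diversity in AI research and a disadvantage for non-elite universities and researchers from different backgrounds. A limitation of the study is that it covers only major computer science conferences. Future research directions include explaining why companies participate in AI research and what that participation leads to, such as the effect on startups and whether the compute gap will widen the digital divide.

Andrey Kurenkov: As the interviewer, Andrey Kurenkov mainly asks about and summarizes Nur Ahmed's findings, and discusses their significance, impact, and future research directions with him. He stresses how surprising and important the findings are, and points to the need for government intervention and policy solutions to promote diversity and inclusion in AI research.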

Chapters
The study was motivated by concerns over the concentration of power, data, and skills in the hands of a few companies and the need to understand whether AI is becoming more or less democratized.

Hello and welcome to Skynet Today's Let's Talk AI podcast, where you can hear from AI researchers about what's actually going on with AI and what is just clickbait headlines. We release weekly AI news coverage and also occasional interviews, such as today's. I am Andrey Kurenkov, a third-year PhD student at the Stanford Vision and Learning Lab and the host of this episode.

In this interview episode, we'll get to hear from one of the authors of a recent paper, "The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research." And that author is Nur Ahmed.

Nur Ahmed is a strategy PhD candidate at Ivey Business School at Western University, Canada, and a research fellow at the Scotiabank Digital Banking Lab. His focus is broadly on innovation and computational social science. Currently, he is exploring artificial intelligence. Thank you so much, Nur, for joining us on this episode.

Thank you so much for having me.

All righty, well, let's get going. Our focus will be this recent paper, "The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research," which just came out a month ago. And you started with a quote from Yoshua Bengio that I think is quite informative. The quote is: "In AI currently, the power, the expertise, the data are all concentrated in the hands of a few companies."

So before we dive into the details of how you explored this message in the paper, how about I just let you tell us what motivated the study and the main questions you sought to address in it?

Sure. So as you quoted, Bengio made this interesting comment that the power, the data, and the skills are all concentrated in the hands of a few companies, and a lot of experts were actually talking about these kinds of issues. They were concerned that maybe large corporations or large companies are more active than most other organizations, but we did not have any systematic evidence that that is the case. And also, people have been talking a lot about how we need to democratize AI. We need to make AI more accessible to everyone.

In that way, we can make the world a better place, because AI is a pretty powerful technology. So basically, we wanted to study whether AI is becoming more democratized or not. Who are the major actors that are contributing to AI research? And if certain companies or organizations are being more active than in the past, does it have any consequences for other organizations that have been active in research? So we wanted to explore these questions, and that's why we wrote the paper, basically.

Makes a lot of sense. I think a lot of us working in AI know sort of anecdotally that this seems to be true. But of course, as you say, it's good to have actual solid research to back that up.

Exactly.

Maybe you can tell us, at a high level, what approach you took to actually answer these questions quantitatively?

Right. So what we did was examine the major computer science conferences, to see what's actually happening there. First we consulted csrankings.org, a very well-known website among computer scientists.

They have listed around 60 or so conferences that are relatively comparable in terms of prestige, number of submissions, and quality as well. And these conferences actually affect assistant professors' tenure decisions.

So we wanted to see like, OK, let's compare what's happening in AI with other non-AI conferences that would allow us to, you know, find some evidence that maybe something different is going on in AI or maybe not. So we just wanted to explore that.

So to give you a high-level overview, if you just look at the first two figures in our paper, I think that would actually convince a lot of people that something interesting is going on in AI relative to non-AI. The first two figures basically show the share of papers that have at least one co-author affiliated with companies, and we find that only in AI conferences do we have a consistent trend, which is that large companies have actually increased their presence in these 10 computer science AI conferences. On the other hand, we do not find any evidence in this descriptive analysis that companies or large organizations have increased their presence in non-AI conferences. So basically we have 10 AI conferences and 47 non-AI conferences.

And we have also done some statistical analysis to give more solid evidence. We used a statistical method known as the synthetic control method. What it does is create a counterfactual for your treated unit, in our case AI, from your non-treated units, in this case the non-AI conferences.

So to use this synthetic control method, we used the result of the 2012 ImageNet contest, which was surprising to a lot of computer scientists as well as a lot of industry observers. People did not expect that deep learning would work better with GPUs and that deep learning would produce such astonishing results.

And since then, since 2012, we have observed that a lot of companies and organizations started to use deep learning in their products and in their research. That's why a lot of people are dividing the whole of AI into two eras, and they say modern AI started in 2012 because of the surprising ImageNet contest result.
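For readers who want to see the idea in code, here is a minimal sketch of what a synthetic control does, not the authors' implementation. It assumes an outcome-only version of the method: find non-negative weights that sum to 1 so that a weighted mix of untreated series tracks the treated series before the event, then read the post-event gap as the estimated effect. The solver (Frank-Wolfe with exact line search), the function names, and every number are illustrative assumptions; the full method in the literature also matches on covariates.

```python
# Toy illustration only: every number below is made up, not taken from the paper.

def synthetic_series(weights, donors):
    """Weighted average of the donor series, period by period."""
    n_periods = len(donors[0])
    return [sum(w * d[t] for w, d in zip(weights, donors)) for t in range(n_periods)]


def fit_synthetic_weights(treated_pre, donors_pre, iterations=2000):
    """Find weights (>= 0, summing to 1) that minimize the squared
    pre-treatment gap between the treated unit and the weighted donors.
    Uses Frank-Wolfe with exact line search, chosen here for simplicity."""
    n_donors = len(donors_pre)
    n_periods = len(treated_pre)
    weights = [1.0 / n_donors] * n_donors
    for _ in range(iterations):
        synthetic = synthetic_series(weights, donors_pre)
        residual = [synthetic[t] - treated_pre[t] for t in range(n_periods)]
        # Gradient of the squared error with respect to each weight (up to a factor of 2).
        gradient = [sum(residual[t] * donors_pre[j][t] for t in range(n_periods))
                    for j in range(n_donors)]
        best = min(range(n_donors), key=lambda j: gradient[j])
        # Moving weight toward donor `best` shifts the synthetic series along `direction`.
        direction = [donors_pre[best][t] - synthetic[t] for t in range(n_periods)]
        denom = sum(d * d for d in direction)
        if denom == 0.0:
            break
        step = -sum(residual[t] * direction[t] for t in range(n_periods)) / denom
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break  # no feasible descent direction left
        weights = [(1.0 - step) * w for w in weights]
        weights[best] += step
    return weights


# Hypothetical yearly shares of papers with a company co-author.
donors_pre = [            # untreated units (think: non-AI areas), before the event
    [0.10, 0.11, 0.10, 0.12],
    [0.20, 0.19, 0.21, 0.20],
    [0.30, 0.31, 0.29, 0.30],
]
donors_post = [           # the same units after the event
    [0.12, 0.11, 0.12],
    [0.20, 0.21, 0.20],
    [0.30, 0.29, 0.31],
]
treated_pre = [0.22, 0.23, 0.22, 0.23]   # treated unit (think: AI) before the event
treated_post = [0.35, 0.40, 0.45]        # treated unit after the event

weights = fit_synthetic_weights(treated_pre, donors_pre)
counterfactual_post = synthetic_series(weights, donors_post)
estimated_effect = [actual - synthetic
                    for actual, synthetic in zip(treated_post, counterfactual_post)]

print("weights:", [round(w, 3) for w in weights])
print("post-event gap:", [round(e, 3) for e in estimated_effect])
```

The weights are constrained to be non-negative and sum to 1 so that the counterfactual stays inside the range of what the untreated units actually did, rather than extrapolating beyond them.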

Thank you. Yeah, that's quite clear. And I do recommend that listeners look up the paper and take a look at those figures themselves. As you say, they're quite convincing. So I'm looking at figure one right now, and you can see that over the past decade, let's say for NeurIPS, one of the biggest conferences, the share of papers with at least one company co-author jumped from something like, I don't know, 0.2 to more like 0.4. So 40 percent of all papers at this conference have an industry affiliation, which, I believe, compared to the other non-AI CS conferences, as you say, is super high, and it has been on the rise.

Exactly. I mean, if you look at KDD, it went from less than 40 percent to more than 50 percent in 2019. Basically, in those papers at least one co-author was affiliated with a company. I mean, that's pretty interesting and surprising. And to some people, it might be concerning as well.
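The quantity discussed here, the share of papers with at least one company-affiliated co-author, is simple to compute once each paper's author affiliations are labeled. A minimal sketch, assuming a hypothetical record layout (`venue`, `year`, and an `affiliations` list of type labels); the records below are invented for illustration and are not the paper's data.

```python
from collections import defaultdict


def company_share_by_year(papers, venue):
    """Per year, the fraction of a venue's papers with at least one company co-author."""
    totals = defaultdict(int)
    with_company = defaultdict(int)
    for paper in papers:
        if paper["venue"] != venue:
            continue
        totals[paper["year"]] += 1
        if "company" in paper["affiliations"]:
            with_company[paper["year"]] += 1
    return {year: with_company[year] / totals[year] for year in sorted(totals)}


def make_paper(venue, year, *affiliations):
    """Build one hypothetical paper record."""
    return {"venue": venue, "year": year, "affiliations": list(affiliations)}


# Invented records: one affiliation label per co-author.
papers = [
    make_paper("NeurIPS", 2012, "university", "university"),
    make_paper("NeurIPS", 2012, "university"),
    make_paper("NeurIPS", 2012, "university", "company"),
    make_paper("NeurIPS", 2012, "university", "university"),
    make_paper("NeurIPS", 2019, "company", "university"),
    make_paper("NeurIPS", 2019, "university"),
    make_paper("NeurIPS", 2019, "company"),
    make_paper("NeurIPS", 2019, "university", "university"),
    make_paper("SIGCOMM", 2019, "university"),  # other venue, ignored by the filter
]

shares = company_share_by_year(papers, "NeurIPS")
print(shares)
```

A paper counts once no matter how many of its co-authors are at companies, which matches the "at least one co-author" definition used in the conversation.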

Definitely. And speaking of the results that you found in your analysis, part of it had to do with the increased presence of large companies. But you also did some analysis comparing the presence of elite universities, so by some ranking the top 50 research universities, with universities that are less elite, ranked more like 200 to 500. Can you tell us a bit about the results that you found in that area?

Sure. So broadly, what we found was, yes, companies in general, or firms in general, have increased their presence. And in particular, within that, we find that 46 large technology companies have increased their presence significantly. We also looked at elite universities, ranked from 1 to 50. We use two different rankings and two different years, and the results are pretty reliable. So we find that the top 50 universities in the whole world have increased their presence significantly in AI relative to non-AI. And we find that large companies are mostly collaborating with elite universities. So the potential explanation is that large companies have computing power and data. On the other hand, elite universities have talent, human capital, people who actually have expertise in deep learning, which is why they were able to increase their presence. Unfortunately, what we find is that universities ranked between 201 and 500 have actually lost ground in AI research.

So the results are surprising in a couple of different ways. In particular, innovation research over the last seven to 10 years actually has documented that corporations have actually reduced corporate R&D. However, we find that actually that is not the case, at least in AI. At least in AI, corporations have increased their presence. In particular, large companies have increased their presence in AI research.

So that is a very surprising result for most innovation scholars.

The results are also concerning for non-elite universities, the ones ranked from 201 to 500. They're losing ground. That is concerning for a lot of people, because some of these universities actually represent diverse populations, not only from the U.S. but from the whole world. So that means that AI research is actually becoming less democratized, and AI research is now less diverse than before.

And when you say they're losing ground, is that in terms of collaboration with big firms or is that also just in terms of numbers of papers they publish?

Yeah, so what happened with non-elite universities is that they have slightly increased their presence. But when you think about the overall number of papers being published, and relative to their counterfactual, they have actually reduced their presence pretty significantly. Take, say, universities ranked between 201 and 300: they are now publishing around 25 percent fewer papers than their counterfactual.
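"Relative to their counterfactual" means the comparison is against the estimated no-event scenario, not against the group's own past, which is how a group can grow slightly in absolute terms and still lose ground. A tiny sketch with hypothetical counts (the 30 and 40 below are invented to reproduce a 25 percent gap and are not from the paper):

```python
def relative_gap(actual, counterfactual):
    """Fractional difference between the observed value and its counterfactual."""
    if counterfactual == 0:
        raise ValueError("counterfactual must be non-zero")
    return (actual - counterfactual) / counterfactual


# Hypothetical: 30 papers observed vs. 40 expected under the counterfactual.
gap = relative_gap(actual=30, counterfactual=40)
print(f"{gap:.0%}")  # -25%
```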

So that is pretty concerning, because now we have a less diverse presence in these top AI conferences.

I see. And I also found it very interesting that you have figure eight in the paper, some more analysis where you actually looked at selected keywords from AAAI, one of the big conferences, and showed that there is some distinction in the types of research these groups do. Can you speak a bit to that?

Sure. So AAAI, as you know, is one of the most well-known AI conferences, and we wanted to analyze what's happening there: what kind of research large companies are doing, what kind of research elite universities are doing, and what kind of research non-elite universities are doing. So we did some TF-IDF analysis, which basically measures how prominent particular keywords are among your papers. When you look at the result in figure eight, we find that large companies are pretty far ahead in terms of deep learning research, so in convolutional neural networks, machine learning, recurrent neural networks, and things like that. In those methods, they are well ahead of, say, non-elite universities and also slightly ahead of elite universities. On the other hand, non-elite universities are more active in, say, traditional machine learning methods such as support vector machines and things like that.

So they have not been able to catch up with large companies or with elite universities. We think this provides some suggestive evidence that deep learning, and in particular access to computing power, might be a reason why non-elite universities are lagging behind in AI research.
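To make the TF-IDF idea concrete, here is a small sketch that treats each group's paper keywords as one "document" and scores how distinctive each keyword is for that group. The keyword lists are invented, and the plain `tf * log(N / df)` weighting is an assumption; the conversation does not say which TF-IDF variant the paper uses.

```python
import math
from collections import Counter


def tf_idf(keywords_by_group):
    """Score each keyword per group: term frequency within the group times
    the log inverse of how many groups use the keyword at all."""
    n_groups = len(keywords_by_group)
    doc_freq = Counter()
    for keywords in keywords_by_group.values():
        doc_freq.update(set(keywords))  # count each keyword once per group
    scores = {}
    for group, keywords in keywords_by_group.items():
        counts = Counter(keywords)
        total = len(keywords)
        scores[group] = {
            term: (count / total) * math.log(n_groups / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def top_term(scores, group):
    """Keyword with the highest TF-IDF score for a group."""
    return max(scores[group], key=scores[group].get)


# Invented keyword lists, one entry per keyword occurrence.
keywords_by_group = {
    "large_companies": [
        "convolutional neural network", "convolutional neural network",
        "recurrent neural network", "deep learning", "optimization",
    ],
    "elite_universities": [
        "deep learning", "reinforcement learning", "optimization", "planning",
    ],
    "non_elite_universities": [
        "support vector machine", "support vector machine",
        "optimization", "planning",
    ],
}

scores = tf_idf(keywords_by_group)
for group in keywords_by_group:
    print(group, "->", top_term(scores, group))
```

A keyword that every group uses, such as "optimization" in this toy data, scores zero everywhere, which is why TF-IDF highlights what distinguishes one group's research from the others rather than what is merely common.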

Definitely. And I also really like figure seven, where you illustrate the share of deep learning papers that have at least one co-author from each of the different groups. Basically it's a fairly obvious trajectory where at the top are the top 50 universities and firms. So over time, those were doing all the deep learning research.

And far fewer papers have co-authors from the lower-ranked universities, like 100 to 200 or 300 to 400, which again suggests that to do cutting-edge AI research, which nowadays involves deep learning and compute, there's a disadvantage there. And as you say, there's de-democratization there.

So given these conclusions, what kind of implications did you touch on in the paper? What should be done given these conclusions?

So computer scientists are saying that AI research actually needs a lot of diverse people so that the tools that we make are, you know, helpful for everyone.

So unfortunately, I think our results suggest that the opposite might be happening, that AI research might actually be becoming less diverse, because companies are actually less diverse than non-elite universities. If you just look at American universities, non-elite universities are more diverse than elite universities, and they are more diverse than most companies, in particular large technology companies. So if we have more researchers from those non-elite universities, they might actually represent the real population rather than a selective group of privileged people.

So that's the first concern: we need more diversity in AI research, and our results actually point in the opposite direction. The second is that this is, in our opinion, the first concrete evidence that governments may have to step up their efforts with regard to computing power. Stanford already has a center which put forward a proposal that the American government should have a national research cloud. But until now, we did not have any evidence that we actually need that. So our paper provides support for that.

And also, we now have more companies involved in AI research, but that doesn't mean it's actually bad. It should concern us a little bit, but it can also be good, because in the past companies have contributed a lot to basic research and development. But we also do not know what the consequences are now that these large companies are far more active than most other organizations. So we need to have more research. Basically, our paper is a call to economists, to sociologists of science, and to other researchers that we need to study more what's going on. Now that elite universities and large companies are more active in research, what are the consequences? We actually need to know more about this.

Definitely. That makes a lot of sense. And I'm curious, with respect to those implications, did you find that this trend is still ongoing? Is the divide still growing, or is there a kind of plateau? What can we expect as far as next year, let's say?

Next year, like 2020 or 2021? Like 2021, you mean?

Yeah. Like, is the trend still for the gap to grow, or is that maybe slowing down a bit?

So our data suggests that we should expect the divergence to grow a little bit more for the next couple of years at least. That's what our data suggests.

Again, it's harder to predict what's going to happen over the long run. But in the short run, we should probably expect that large companies and elite universities will sort of, you know, diverge away from non-elite universities in terms of AI research.

Yeah, and that's very interesting to me, because as you've demonstrated, the gap is already very substantial. So not only is it already there, but it is still growing, which does imply that there should probably be government action or some sort of policy solution to allow for more diversity, and not sort of leave only the elites to do the work.

I'm curious: when you set about this project and you crunched the numbers and got the results, were you surprised by the degree to which there is this gap and this distinction?

Oh, yes, definitely. We were definitely surprised that the gap was so significant. We did not expect that. I mean, we had a hunch that we would probably observe that elite universities, because they have those trained computer scientists, would have a higher presence. And because we knew a lot about Google and Facebook, that they were super active in research, maybe, yes, these few companies would be active in AI. However, we did not expect the extent of the presence they actually have. Without doing, I think, a solid analysis, we would not know that the divergence is actually so significant. Even if you remove all these 10 large companies that are active in AI research, you would still find that companies in general have increased their presence.

So something interesting is going on in AI, which actually, you know, pulled or pushed them. I don't know what the main reason is why these companies are active in AI research, so we need to investigate that: why, beyond just large companies, companies in general are active in AI research.

Yeah. And on that note, actually, I was going to ask: are there any limitations in this paper, and are there any next steps? You just mentioned one, looking into why firms are doing so much AI research. What are some other future directions you might take with this sort of research on the AI field?

So yes, of course, every piece of research has its limitations. For our paper in particular, we do not have data on other conferences or journals. We limited our attention only to major computer science conferences, because it's harder to find comparable conferences. We could have gotten more data on AI conferences or other journals, but we did not know how to get comparable data. So that's one limitation: we are only focusing on major conferences in computer science. We would definitely ask others to keep looking at such publications, to see whether the results hold in other areas, in non-elite conferences and journals, or not. That's definitely worth looking into.

As for us, I am actually trying to explain why firms in general have increased their presence in AI. Because I'm a PhD student, I will be on the job market soon, and in my job market paper I'm trying to answer why firms have increased their presence in AI research. So that's what I'm planning to do immediately. I would also like to continue my research on the consequences of their increased presence.

So we hope to do more machine-learning-based text analysis and use other advanced statistical methods. Using those methods, we want to know what the consequences are now that we have a small group of people who are more active in AI than before. So we hope to continue this line of research in the near future.

I see. Yeah, that makes a ton of sense. And I personally look forward to seeing this sort of analysis because given the size of the gap that you point out here, certainly there should be some effects observed and it should be kind of looked into.

I think we've touched on a lot of aspects of the paper, and I think we've given listeners a pretty good overview. Is there anything from the paper that we haven't touched on that you'd also like to mention or point out?

Yeah, so I would like to talk about two particular policy implications, maybe. One is for startups: what does this mean? If startups do not have the resources to get started in AI, could that hurt innovation? That's another research line policymakers and other social scientists can look into. I think it is important to know for the future of the innovation ecosystem in AI.

And another line of research might be: could the compute divide be a new form of digital divide? Think about all the other universities that are not even ranked in these top research rankings. What's going to happen to them? Are they going to fall way behind because of this divide, the requirement of a higher amount of computing, a large amount of data, or well-trained computer scientists? What's going to happen to those universities, and what are the consequences of that for those countries, say developing countries around the world?

Will there be negative consequences because of that? There are those concerns as well. So maybe international organizations can think about how to help those other universities and those other actors catch up with these elite universities and large companies in AI research.

I see. Very interesting.

So I think with that, we can actually go ahead and wrap up. That was a great, interesting overview. I hope listeners enjoyed that. And thank you, Nur, for joining us on this episode.

Thank you for inviting me.

Great. Thank you. And listeners, once again, the name of the paper is "The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research." You can Google it and actually go through it yourself and look at some of the additional figures. It gets a little bit technical, but I was able to get a lot out of it. So I think anyone listening should also be able to browse it.

And with that, thank you so much for listening to this episode of Let's Talk AI. You can find articles on similar topics to today's and subscribe to our weekly newsletter at skynettoday.com. Subscribe to us wherever you get your podcasts, and don't forget to leave us a rating and a review if you like the show. We always like your feedback, and be sure to tune in to our future episodes.