The AI winter was more of an 'AI autumn,' where expectations fell but important foundational work continued. Researchers like Hinton, LeCun, and Schmidhuber made gradual progress, laying the groundwork for later breakthroughs in deep learning.
GPUs, originally designed for gaming, became crucial for accelerating the matrix math required by neural networks. This enabled the training of deeper and more complex models, which was a key factor in the deep learning revolution.
The democratization of AI allows independent researchers and small teams to apply powerful models to novel domains, leading to diverse and innovative applications. This approach bypasses the incentive structures of traditional academia and large labs, fostering creativity.
AlexNet demonstrated the power of deep learning on GPUs, significantly outperforming previous methods like SVMs. It marked a tipping point in computer vision and showed that deep neural networks could be effectively trained on large datasets, sparking widespread interest in neural networks.
The Nobel Prizes signal AI's growing impact across scientific disciplines, marking a 'crossing the chasm' moment where AI moves from niche technology to mainstream scientific tooling. They validate AI's role as a meta-discipline that benefits other fields.
Boltzmann machines are a type of neural network developed in the 1980s that use probabilistic rules inspired by statistical physics. They were crucial for learning complex probability distributions and finding hidden patterns in data, paving the way for modern deep learning techniques.
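To make the "probabilistic rules inspired by statistical physics" concrete, here is a minimal NumPy sketch (not from the episode) of a toy Boltzmann machine: binary units with symmetric weights, an energy function, and Gibbs sampling, where each unit is resampled from its conditional distribution given the others. The weights here are illustrative values, not learned ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Boltzmann machine over 3 binary units: symmetric weights, zero diagonal.
W = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0, 0.5],
              [-1.0, 0.5, 0.0]])
b = np.zeros(3)

def energy(s):
    # Lower-energy states are more probable: P(s) ∝ exp(-E(s)).
    return -0.5 * s @ W @ s - b @ s

def gibbs_step(s):
    # Resample each unit from P(s_i = 1 | rest) = sigmoid(W[i] @ s + b[i]).
    for i in range(len(s)):
        p = 1.0 / (1.0 + np.exp(-(W[i] @ s + b[i])))
        s[i] = 1.0 if rng.random() < p else 0.0
    return s

s = rng.integers(0, 2, size=3).astype(float)
for _ in range(100):
    s = gibbs_step(s)   # after burn-in, s is an approximate sample from the model
```

Training adjusts W and b so that low-energy (high-probability) states match patterns in the data, which is how these networks learn probability distributions.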
In the 1990s, AI research was often seen as stagnant, with limited practical success in neural networks. However, in hindsight, this period was marked by foundational work that set the stage for the deep learning breakthroughs of the 2010s.
The 'bitter lesson' suggests that general-purpose techniques like search and learning tend to outperform domain-specific, hand-engineered methods. This principle is reflected in the increasing reliance on computation and data scaling in modern AI research.
Transformers introduced the attention mechanism, allowing models to focus on relevant parts of input data and capture long-range dependencies. This was a significant leap from earlier models like Boltzmann machines, which were more limited in their ability to process complex data.
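The attention mechanism described above can be sketched in a few lines of NumPy (an illustrative example, not code from the episode): each query scores every key, the scores are normalized with a softmax, and the output is the resulting weighted average of the values. This is scaled dot-product attention as used in Transformers, here without the learned projection matrices for brevity.

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: scores[i, j] measures how much
    # query i should attend to key j.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))   # 4 tokens, 8-dimensional embeddings
out = attention(X, X, X)          # self-attention: tokens attend to each other
```

Because every token scores every other token directly, dependencies between distant positions cost a single step rather than many recurrent ones, which is what makes long-range context tractable.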
Universities often lack access to the compute resources and data engineering expertise needed for large-scale AI research. Bridging this gap requires better collaboration between academia and industry, as well as open-source tools that allow researchers to focus on domain-specific applications.
In this episode of AI + a16z, General Partner Anjney Midha shares his perspective on the recent collection of Nobel Prizes awarded to AI researchers in both Physics and Chemistry. He talks through how early work on neural networks in the 1980s spurred continuous advancement in the field — even through the "AI winter" — which resulted in today's extremely useful AI technologies.
Here's a sample of the discussion, in response to a question about whether we will see more high-quality research emerge from sources beyond large universities and commercial labs:
"It can be easy to conclude that the most impactful AI research still requires resources beyond the reach of most individuals or small teams. And that open source contributions, while valuable, are unlikely to match the breakthroughs from well-funded labs. I've even heard some dismissive folks call it cute, and undermine the value of those contributions.
"But on the other hand, I think that you could argue that open source and individual contributions are becoming increasingly important in AI development. I think that the democratization of AI will probably lead to more diverse and innovative applications. And I think, in particular, the reason we should expect an explosion in home scientists — folks who aren't necessarily affiliated with a top-tier academic, or for that matter, industry lab — is that as open source models get more and more accessible, the rate limiter really is the creativity of somebody who's willing to apply the power of that model's computational ability to a novel domain. And there are just a ton of domains and combinatorial intersections of different disciplines.
"The blind spot of traditional academia [is that] it's not particularly rewarding to veer off the publish-or-perish conference circuit. And if you're at a large industry lab and you're not contributing directly to the next model release, it's not that clear how you get rewarded. And so being independent actually frees you up from the incentive structure, I think, of some of the larger labs. And if you get to leverage the millions of dollars that the Llama team spent on pre-training, applying it to data sets that nobody else has perused before, it results in pretty big breakthroughs."
Learn more:
They trained artificial neural networks using physics
They cracked the code for proteins’ amazing structures
Follow on X:
Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.