We're sunsetting PodQuest on 2025-07-28. Thank you for your support!

#196 - Nvidia Digits, Cosmos, PRIME, ICLR, InfAlign

2025/1/13

Last Week in AI

AI Deep Dive AI Insights AI Chapters Transcript

People

Andrey Kurenkov

Jeremie Harris

Topics

Andrey Kurenkov: 我在湾区一家做生成式AI的创业公司工作，之前在研究生院学习AI。英伟达发布的Digits超级计算机，价格为3000美元，可以运行参数高达2000亿的大型模型，这将降低大型模型开发者入门门槛。Digits不仅可以运行大型模型，还可以用于训练模型，这对于开发者来说非常重要。Meta推出AI角色账户功能，旨在优化平台内容，提高用户粘性，但由于用户批评其“令人毛骨悚然且不必要”而迅速下线。谷歌将更多AI团队整合到DeepMind中，以加速研究到开发的流程。英伟达发布了Cosmos世界基础模型平台，用于物理AI应用的模型开发。微软在Hugging Face上发布了Phi-4语言模型。 Jeremie Harris: 我从事AI国家安全方面的工作，在Gladstone AI工作。英伟达的GB10 Grace Blackwell超级芯片是GB200的低配版，但仍然比个人电脑强大得多。英伟达降低了数据中心超级芯片的密度，以解决供电和冷却问题。英伟达正试图通过定制芯片制造来与博通竞争，以满足客户对定制硬件的需求。Meta推出AI角色账户的目的是为了优化平台内容，提高用户粘性。OpenAI推迟发布代理的原因之一是担心提示注入攻击。TSMC计划在2025年将CoWoS产能提高到创纪录的75000片晶圆，是2024年的两倍。微软暂停了威斯康星州数据中心项目的一部分建设，以重新评估技术变化的影响。DeepMind曾经是一个纯粹的研究实验室，现在它正在转变为谷歌的一个产品开发部门。

Deep Dive

Key Insights

What is the NVIDIA Digits and what are its key features?

The NVIDIA Digits is a $3,000 personal AI supercomputer designed to lower the barrier for developers working on large models. It features the GB10 Grace Blackwell Superchip, can handle models with up to 200 billion parameters, and includes 128GB of coherent memory and 4TB of NVMe storage. It offers up to one petaflop of AI performance at FP4, making it a powerful tool for AI development on a local machine.

Why did Meta remove AI character accounts from Instagram and Facebook?

Meta removed AI character accounts after users criticized them as 'creepy and unnecessary.' The AI characters, part of a test, were managed by people but faced backlash for their perceived lack of authenticity. Meta cited a bug that affected users' ability to block these accounts as the reason for their removal.

What is the significance of NVIDIA's focus on custom chip manufacturing?

NVIDIA is focusing on custom chip manufacturing to compete with companies like Broadcom, which designs custom chips for AI applications. By establishing an R&D center in Taiwan and recruiting Taiwanese engineers, NVIDIA aims to develop ASIC (Application-Specific Integrated Circuit) solutions tailored to specific AI workloads, reducing reliance on off-the-shelf GPUs and improving efficiency for AI developers.

Why is OpenAI taking longer to launch AI agents?

OpenAI is delaying the launch of AI agents due to concerns about prompt injection attacks, where malicious inputs could bypass the model's restrictions. Agents, which can interact with the web and sensitive infrastructure, pose a higher risk if compromised. OpenAI is working to mitigate these risks before releasing the agents to the public.

What is the PRIME approach in online reinforcement learning for AI models?

PRIME is a novel approach to online reinforcement learning that uses process rewards to improve the reasoning abilities of AI models. It involves generating diverse solutions to problems, filtering out incorrect answers, and rewarding the most efficient and correct reasoning traces. This method has shown significant improvements in benchmarks, such as the Math Olympiad, by encouraging models to explore new solutions while maintaining accuracy.

What are the key findings of the ICLR paper on in-context learning of representations?

The ICLR paper found that language models shift from pre-trained semantic representations to context-aligned ones when given structured tasks. By using a graph-tracing approach, the study showed that models adapt their internal representations based on the context of the input sequence. This suggests that models can dynamically adjust the meaning of words based on their usage in specific contexts, which has implications for jailbreaks and adversarial attacks.

What is the significance of the METAGENE-1 metagenomic foundation model?

METAGENE-1 is a foundation model trained on metagenomic sequences, which are short DNA fragments from environmental samples like sewage. The model is designed to detect pathogens and disease indicators cost-effectively. By analyzing these sequences, it can provide early warnings of pandemics and other health threats, making it a valuable tool for public health monitoring.

What is the purpose of the TransPixar model in text-to-video generation?

TransPixar is designed to improve text-to-video generation by adding transparency (alpha channel) to video outputs. This allows for more realistic special effects, such as explosions or overlays, by enabling the model to predict both the RGB and alpha channels simultaneously. The model was trained on a dataset of high-resolution green screen videos and has shown significant improvements in video quality and motion alignment.

What are the key factors driving the growth in training compute for AI models?

The growth in training compute for AI models is driven by three main factors: an increase in hardware quantity (doubling annually since 2018), longer training durations (1.5x per year since 2022), and improvements in hardware performance (more flops per GPU). These factors together have contributed to a 4.2x annual growth in training compute since 2018.

What is the InfAlign approach to language model alignment?

InfAlign is an approach to language model alignment that accounts for inference-time scaling, where models generate multiple outputs and select the best one. Traditional alignment methods, like RLHF, don't account for this process, leading to misalignment. InfAlign uses a positive exponential transformation of rewards to prioritize the best outputs, ensuring that the model's alignment is consistent with its usage during inference.

Chapters

This introductory chapter welcomes listeners to the Last Week in AI podcast, briefly introduces the hosts Andrey Kurenkov and Jeremie Harris, and mentions the podcast's text newsletter and new Discord server. It also acknowledges listener comments, reviews, and the existence of another podcast with a similar name.

Podcast's text newsletter available at lastweekin.ai
New Discord server launched
Listener comments and reviews acknowledged
Another podcast with similar name exists

Shownotes Transcript

Our 196th episode with a summary and discussion of last week's* big AI news! *and sometimes last last week's Recorded on 01/10/2024

Join our brand new Discord here!) https://discord.gg/wDQkratW)

Hosted by Andrey Kurenkov) and Jeremie Harris). Feel free to email us your questions and feedback at [email protected] )and/or [email protected])

Read out our text newsletter and comment on the podcast at https://lastweekin.ai/).

Sponsors:

The Generator -) An interdisciplinary AI lab empowering innovators from all fields to bring visionary ideas to life by harnessing the capabilities of artificial intelligence.

In this episode:

Nvidia announced a $3,000 personal AI supercomputer called Digits, featuring the GB10 Grace Blackwell Superchip, aiming to lower the barrier for developers working on large models.
The U.S. Department of Justice finalizes a rule restricting the transmission of specific data types to countries of concern, including China and Russia, under executive order 14117.
Meta allegedly trained Llama on pirated content from LibGen, with internal concerns about the legality confirmed through court filings.
Microsoft paused construction on a section of a large data center project in Wisconsin to reassess based on new technological changes.

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form).

Timestamps + Links:

(00:00:00) Intro / Banter

(00:04:52) Sponsor Break

Tools & Apps

(00:05:55) Nvidia announces $3,000 personal AI supercomputer called Digits)

(00:10:23) Meta removes AI character accounts after users criticize them as ‘creepy and unnecessary’)

Applications & Business

(00:16:16) NVIDIA Is Reportedly Focused Towards “Custom Chip” Manufacturing, Recruiting Top Taiwanese Talent)

(00:21:54) AI start-up Anthropic closes in on $60bn valuation)

(00:25:38) Why OpenAI is Taking So Long to Launch Agents)

(00:30:08) TSMC Set to Expand CoWoS Capacity to Record 75,000 Wafers in 2025, Doubling 2024 Output)

(00:33:10) Microsoft 'pauses construction' on part of data center site in Mount Pleasant, Wisconsin)

(00:37:23) Google folds more AI teams into DeepMind to ‘accelerate the research to developer pipeline’)

Projects & Open Source

(00:41:59) Cosmos World Foundation Model Platform for Physical AI)

(00:48:21) Microsoft releases Phi-4 language model on Hugging Face)

Research & Advancements

(00:50:16) PRIME: Online Reinforcement Learning with Process Rewards)

(00:58:29) ICLR: In-Context Learning of Representations)

(01:07:38) Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs)

(01:11:44) METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring)

(01:15:45) TransPixar: Advancing Text-to-Video Generation with Transparency)

(01:18:03) The amount of compute used to train frontier models has been growing at a breakneck pace of over 4x per year since 2018, resulting in an overall scale-up of more than 10,000x! But what factors are enabling this rapid growth?)

Policy & Safety

(01:23:45) InfAlign: Inference-aware language model alignment)

(01:28:44) Mark Zuckerberg gave Meta’s Llama team the OK to train on copyrighted works, filing claims)

(01:33:19) Anthropic gives court authority to intervene if chatbot spits out song lyrics)

(01:35:57) US government says companies are no longer allowed to send bulk data to these nations)

(01:39:10) Trump announces $20B plan to build new data centers in the US)

#196 - Nvidia Digits, Cosmos, PRIME, ICLR, InfAlign 01:46:34 Share