The NVIDIA Digits is a $3,000 personal AI supercomputer designed to lower the barrier for developers working on large models. It features the GB10 Grace Blackwell Superchip, can handle models with up to 200 billion parameters, and includes 128GB of coherent memory and 4TB of NVMe storage. It offers up to one petaflop of AI performance at FP4, making it a powerful tool for AI development on a local machine.
Meta removed AI character accounts after users criticized them as 'creepy and unnecessary.' The AI characters, part of a test, were managed by people but faced backlash for their perceived lack of authenticity. Meta cited a bug that affected users' ability to block these accounts as the reason for their removal.
NVIDIA is focusing on custom chip manufacturing to compete with companies like Broadcom, which designs custom chips for AI applications. By establishing an R&D center in Taiwan and recruiting Taiwanese engineers, NVIDIA aims to develop ASIC (Application-Specific Integrated Circuit) solutions tailored to specific AI workloads, reducing reliance on off-the-shelf GPUs and improving efficiency for AI developers.
OpenAI is delaying the launch of AI agents due to concerns about prompt injection attacks, where malicious inputs could bypass the model's restrictions. Agents, which can interact with the web and sensitive infrastructure, pose a higher risk if compromised. OpenAI is working to mitigate these risks before releasing the agents to the public.
PRIME is a novel approach to online reinforcement learning that uses process rewards to improve the reasoning abilities of AI models. It involves generating diverse solutions to problems, filtering out incorrect answers, and rewarding the most efficient and correct reasoning traces. This method has shown significant improvements in benchmarks, such as the Math Olympiad, by encouraging models to explore new solutions while maintaining accuracy.
The ICLR paper found that language models shift from pre-trained semantic representations to context-aligned ones when given structured tasks. By using a graph-tracing approach, the study showed that models adapt their internal representations based on the context of the input sequence. This suggests that models can dynamically adjust the meaning of words based on their usage in specific contexts, which has implications for jailbreaks and adversarial attacks.
METAGENE-1 is a foundation model trained on metagenomic sequences, which are short DNA fragments from environmental samples like sewage. The model is designed to detect pathogens and disease indicators cost-effectively. By analyzing these sequences, it can provide early warnings of pandemics and other health threats, making it a valuable tool for public health monitoring.
TransPixar is designed to improve text-to-video generation by adding transparency (alpha channel) to video outputs. This allows for more realistic special effects, such as explosions or overlays, by enabling the model to predict both the RGB and alpha channels simultaneously. The model was trained on a dataset of high-resolution green screen videos and has shown significant improvements in video quality and motion alignment.
The growth in training compute for AI models is driven by three main factors: an increase in hardware quantity (doubling annually since 2018), longer training durations (1.5x per year since 2022), and improvements in hardware performance (more flops per GPU). These factors together have contributed to a 4.2x annual growth in training compute since 2018.
InfAlign is an approach to language model alignment that accounts for inference-time scaling, where models generate multiple outputs and select the best one. Traditional alignment methods, like RLHF, don't account for this process, leading to misalignment. InfAlign uses a positive exponential transformation of rewards to prioritize the best outputs, ensuring that the model's alignment is consistent with its usage during inference.
Our 196th episode with a summary and discussion of last week's* big AI news! *and sometimes last last week's Recorded on 01/10/2024
Join our brand new Discord here!) https://discord.gg/wDQkratW)
Hosted by Andrey Kurenkov) and Jeremie Harris). Feel free to email us your questions and feedback at [email protected] )and/or [email protected])
Read out our text newsletter and comment on the podcast at https://lastweekin.ai/).
Sponsors:
In this episode:
If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form).
Timestamps + Links:
(00:00:00) Intro / Banter
(00:04:52) Sponsor Break
Tools & Apps
(00:05:55) Nvidia announces $3,000 personal AI supercomputer called Digits)
(00:10:23) Meta removes AI character accounts after users criticize them as ‘creepy and unnecessary’)
Applications & Business
(00:16:16) NVIDIA Is Reportedly Focused Towards “Custom Chip” Manufacturing, Recruiting Top Taiwanese Talent)
(00:21:54) AI start-up Anthropic closes in on $60bn valuation)
(00:25:38) Why OpenAI is Taking So Long to Launch Agents)
(00:30:08) TSMC Set to Expand CoWoS Capacity to Record 75,000 Wafers in 2025, Doubling 2024 Output)
(00:33:10) Microsoft 'pauses construction' on part of data center site in Mount Pleasant, Wisconsin)
(00:37:23) Google folds more AI teams into DeepMind to ‘accelerate the research to developer pipeline’)
Projects & Open Source
(00:41:59) Cosmos World Foundation Model Platform for Physical AI)
(00:48:21) Microsoft releases Phi-4 language model on Hugging Face)
Research & Advancements
(00:50:16) PRIME: Online Reinforcement Learning with Process Rewards)
(00:58:29) ICLR: In-Context Learning of Representations)
(01:07:38) Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs)
(01:11:44) METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring)
(01:15:45) TransPixar: Advancing Text-to-Video Generation with Transparency)
(01:18:03) The amount of compute used to train frontier models has been growing at a breakneck pace of over 4x per year since 2018, resulting in an overall scale-up of more than 10,000x! But what factors are enabling this rapid growth?)
Policy & Safety
(01:23:45) InfAlign: Inference-aware language model alignment)
(01:28:44) Mark Zuckerberg gave Meta’s Llama team the OK to train on copyrighted works, filing claims)
(01:33:19) Anthropic gives court authority to intervene if chatbot spits out song lyrics)
(01:35:57) US government says companies are no longer allowed to send bulk data to these nations)
(01:39:10) Trump announces $20B plan to build new data centers in the US)