2025.05.06 | Voila achieves low-latency full-duplex dialogue; RM-R1 improves reward modeling for large-model reasoning.

2025/5/6
HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)

The 15 papers in this episode are:

[00:22] 🤖 Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play

[01:09] 🤔 RM-R1: Reward Modeling as Reasoning

[01:52] 🧠 Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers

[02:32] 🧮 FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models

[03:17] ✂ ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations

[03:59] 🧠 Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

[04:39] 🚀 Practical Efficiency of Muon for Pretraining

[05:18] ⚙ A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency

[06:01] 🤖 R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

[06:44] 🤔 Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

[07:24] 🤖 SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations

[08:03] 🤖 Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning

[08:50] 🖼 SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing

[09:30] 🧮 Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities

[10:11] 🎨 Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu (小红书): AI速递