This episode covers the following 9 papers:
[00:26] 🎙 Baichuan-Omni-1.5 Technical Report
[01:03] 📚 Qwen2.5-1M Technical Report
[01:47] 🤖 Towards General-Purpose Model-Free Reinforcement Learning
[02:25] 🗣 Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
[03:07] 🧠 ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
[03:52] 🧠 iFormer: Integrating ConvNet and Transformer for Mobile Application
[04:38] 🧠 Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
[05:19] 🧠 Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
[06:09] 📊 Feasible Learning
【Follow Us】
You can also find us on the following platform for more information beyond the podcast content:
Xiaohongshu (小红书): AI速递