The 14 papers in this episode are:
[00:22] 🤖 Qwen2.5 Technical Report
[01:00] 🧠 Progressive Multimodal Reasoning via Active Retrieval
[01:39] 🌐 MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
[02:26] 🧠 LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
[03:15] 📊 How to Synthesize Text Data without Model Collapse?
[03:56] 🌊 Flowing from Words to Pixels: A Framework for Cross-Modality Evolution
[04:37] 🎥 LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis
[05:20] 🖼 Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion
[06:05] 🌐 DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation
[06:46] 🧠 AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
[07:33] 🧠 Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception
[08:14] 🖼 UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency
[08:54] 🧪 TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation
[09:36] 🕺 Move-in-2D: 2D-Conditioned Human Motion Generation
[Follow us]
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递