
2025.06.09 | Evergreen question classification improves QA systems; multimodal fusion refines audio captioning.

2025/6/10

HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)


The 15 papers in this episode are:

[00:24] 🕰 Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA

[01:04] 🎧 FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion

[01:46] 🤔 Is Extending Modality The Right Path Towards Omni-Modality?

[02:23] 🎤 Audio-Aware Large Language Models as Judges for Speaking Styles

[03:00] 🧠 Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs

[03:36] 🖼 STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

[04:17] 🧠 MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

[04:56] 🧩 PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers

[05:33] 🤝 Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision

[06:18] 🤖 3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model

[07:00] 🚀 Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward

[07:45] 🧪 CodeContests+: High-Quality Test Case Generation for Competitive Programming

[08:35] 🤖 Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data

[09:13] 🤖 HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization

[09:55] 🧠 Truth in the Few: High-Value Data Selection for Efficient Multi-Modal Reasoning

[Follow us]

You can also find us on the following platform for more information beyond the podcast content:

Xiaohongshu: AI速递