
AI can't read the room

2025/4/28

Marketplace All-in-One

People
Leyla Isik
Topics
Leyla Isik: My research team and I used short videos — for example, two people chatting, two babies playing, two people performing a synchronized skate routine. We showed these videos to human participants and asked them questions like "Are these people communicating with each other?" and "Is the interaction positive or negative?" Then we gave the same videos to more than 350 open-source AI models. It turned out the AI models were far worse than humans at understanding what was happening in the videos. We found that essentially no model could reliably match human behavior or brain responses to the different social attributes, such as whether people were communicating. Surprisingly, the models couldn't even reliably tell whether people were facing each other. We expected AI to fall short in some areas, but how poorly the models performed overall genuinely surprised us. Of the 350 models we tested, some did better than others, which gave us interesting insights, but no model matched all of the human behaviors we measured. AI still has a long way to go in understanding human behavior, especially in settings that require inferring people's intentions and predicting their actions — a self-driving car making a left turn, for example. If AI can't handle something as basic as judging whether people are facing each other, that shows there is still a lot of work to do on human-AI interaction: we need to improve these systems and find new ways to stress-test them. Even at the most basic level of behavior understanding, such as judging where people are and how they relate to one another, AI falls short. AI has made astonishing progress over the past decade, but I think solving these problems may require a fundamentally different approach, not just more data and bigger networks. Many current AI customer-service applications are text-based, but extending AI to broader uses, such as assistive robotics, will require systems that can interact with people based on visual cues. Historically, AI drew much of its inspiration from humans, cognitive science, and neuroscience, but in the recent AI boom those three fields seem to have drifted apart. I think it's time to bring them back together and build the factors humans care about, and the structure we impose on the world, into the design of AI models.

Stephanie Hughes: In the episode's interview, I talk with professor Leyla Isik about the study's findings and what they mean for commercial applications of AI.


Shownotes

Leyla Isik, a professor of cognitive science at Johns Hopkins University, is also a senior scientist on a new study looking at how good AI is at reading social cues. She and her research team took short videos of people doing things — two people chatting, two babies on a playmat, two people doing a synchronized skate routine — and showed them to human participants, who were then asked questions like: Are these two communicating with each other? Is it a positive or negative interaction? Then the team showed the same videos to over 350 open-source AI models. (Which is a lot, though it didn't include all the latest and greatest ones out there.) Isik found that the AI models were a lot worse than humans at understanding what was going on. Marketplace's Stephanie Hughes visited Isik at her lab at Johns Hopkins to discuss the findings.