
Could We Speak to Dolphins? A Promising LLM Makes That a Possibility

2025/5/23

Science Quickly

People

Arik Kershenbaum
Denise Herzing
Melissa Hobson
Rachel Feltman
Thad Starner
Thea Taylor
Topics
Rachel Feltman: I think dolphins are remarkably intelligent and endearing, and I've always wondered what they're thinking. Melissa Hobson introduced me to the Dolphin Gemma project, a large language model designed to decode dolphin vocalizations, which left me hopeful about whether we might one day communicate with dolphins.

Melissa Hobson: I found that the underwater environment is full of sound, and dolphins communicate primarily acoustically. Their vocal repertoire is very broad, including whistles, clicks, and burst pulses. The Dolphin Gemma project uses artificial intelligence to analyze these sounds, in the hope of helping us understand dolphin communication.

Thea Taylor: As a marine biologist, I'm very interested in how AI models could advance our understanding of dolphin communication. I think we need to be careful to distinguish whether an animal truly understands language or is merely associating a sound with a reward. We have to approach the question scientifically and without bias.

Arik Kershenbaum: I don't think we fully understand how dolphins communicate, or how much they have to say. AI can help us find patterns humans might have missed. But dolphins probably don't have language in the sense that we do, because language is a very complex and expensive thing that only evolves when it confers an evolutionary advantage.

Thad Starner: My team and I have long tried to reproduce a particular sound called a VCM3, without success. To my surprise, Dolphin Gemma was able to generate it. We hope to use Dolphin Gemma to learn how dolphins complete sound sequences and to uncover patterns in their communication.

Denise Herzing: My team and I have spent 40 years studying Atlantic spotted dolphins and have collected a large body of acoustic data, which we used to train Dolphin Gemma. We also developed a technology called CHAT that enables two-way communication with dolphins. We hope to use it to show the dolphins how the system works and encourage them to mimic our sounds.


Transcript

Close your eyes, exhale, feel your body relax and let go of whatever you're carrying today.

Well, I'm letting go of the worry that I wouldn't get my new contacts in time for this class. I got them delivered free from 1-800-CONTACTS. Oh my gosh, they're so fast. And breathe. Oh, sorry. I almost couldn't breathe when I saw the discount they gave me on my first order. Oh, sorry. Namaste. Visit 1-800-CONTACTS.com today to save on your first order. 1-800-CONTACTS.

Hey listeners, Rachel here. It's been a year since I started hosting Science Quickly, and because of that, I have a quick favor to ask. We would love to get your feedback on how Science Quickly has been doing and how you might like to see us evolve.

That's why we're putting out a listener survey. If you complete it this month, you'll be eligible to win some awesome Scientific American swag. You can find the survey at sciencequickly.com/survey, or you can use the link in our show notes. It would mean a lot to us if you took a few minutes to complete the survey. We promise it won't take too much of your time. Again, you can find the survey at sciencequickly.com/survey. Thanks in advance for letting us know your thoughts.

For Scientific American's Science Quickly, I'm Rachel Feltman. There are a few animals that pretty much everyone likes. Fluffy pandas, cute kittens, regal tigers. Dolphins would probably make the list for most folks, too. They're intelligent, playful, and have those permanent goofy grins on their faces. But that's not all.

Watching them dart around in the water kind of makes you wonder, what are those guys thinking? It's a question many scientists have asked. But could we actually find out? And what if we could even talk back? Freelance ocean writer Melissa Hobson has been looking into a new project that's been making a big splash, sorry, in the media. It's being billed as the first large language model, or LLM, for dolphin vocalizations.

Could this new tech allow us to actually communicate with dolphins? Here's Melissa to share what she's learned. When you dip your head under the waves at the beach, the water muffles the noise around you and everything goes quiet for a moment. People often assume that means the ocean is silent, but that's really not true. Underwater habitats are actually full of noise. In fact, some marine animals rely heavily on sound for communication, like dolphins.

If you've ever been in the water with dolphins or watched them on TV, you'll notice that they're always chattering, chirping, clicking and squeaking. While these intelligent mammals also use visual, tactile and chemical cues, they often communicate with each other using vocalisations. They have a really, really broad variety of acoustic communication. That's Thea Taylor, a marine biologist and managing director of the Sussex Dolphin Project,

a dolphin research organisation based on England's south coast. She's not involved in the Dolphin LLM project, but she's really interested in how AI models such as this one could boost our understanding of dolphin communication. When it comes to vocalisations, dolphins generally make three different types of sounds: whistles for communication and identification, clicks to help them navigate,

and burst pulses, which are rapid sequences of clicks. These tend to be heard during fights and other close-up social behaviours. Scientists around the world have spent decades trying to find out how dolphins use sound to communicate, and whether the different sounds the mammals make have particular meanings. For example, we know each dolphin has a signature whistle that is essentially its name.

But what else can they say? Arik Kershenbaum is a zoologist at Girton College at the University of Cambridge in England. He's an expert in animal communication, particularly among predatory species like dolphins and wolves. Arik's not involved in the dolphin LLM work. Well, we don't really know everything about how dolphins communicate. And the most important thing that we don't know is we don't know how much they have to say.

It's not all that clear really in terms of the cooperation between individuals, just how much of that is mediated through communication. Over the years, researchers from around the world have collected vast amounts of data on dolphin vocalisations.

Going through these recordings manually looking for patterns takes time. AI can process data a lot faster than we can. It also has the benefit of not having a human perspective. We almost have an opportunity with AI to kind of let it have a little bit of free rein and look at patterns and indicators that we may not be seeing and we may not be picking up. So I think that's what I'm particularly excited about.

That's what a team of researchers is hoping to do with an AI project called Dolphin Gemma, a large language model for dolphin vocalizations created by Google in collaboration with the Georgia Institute of Technology and the nonprofit Wild Dolphin Project. I caught up with Thad Starner, a professor at Georgia Tech and research scientist at Google DeepMind, and Denise Herzing, founder of the Wild Dolphin Project, to find out how the LLM works.

The Wild Dolphin Project has spent 40 years studying Atlantic spotted dolphins.

This includes recording acoustic data that was used to train Dolphin Gemma. Then, teams at Georgia Tech and Google asked the LLM to generate dolphin-like sound sequences. What it created surprised them all. The AI model generated a type of sound that Thad and his team had been unable to reproduce synthetically using conventional computer programs. Could the ability to create this unique dolphin sound get us a step closer to communicating with these animals?
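To make the idea concrete, here is a minimal, hypothetical sketch of how an audio language model along these lines can be wired together: recordings are quantized into a vocabulary of discrete sound tokens, a small transformer learns to predict the next token, and new dolphin-like sequences are sampled one token at a time. This is not Dolphin Gemma's actual code; the tokenizer, vocabulary size, and model shape are all assumptions for illustration.

```python
# Hypothetical sketch of an audio-LLM pipeline -- not Dolphin Gemma's code.
# Assumes an upstream tokenizer has already mapped short windows of sound
# to integers in [0, VOCAB_SIZE).
import torch
import torch.nn as nn

VOCAB_SIZE = 1024  # assumed size of the discrete audio-token vocabulary
DIM = 256

class TinyAudioLM(nn.Module):
    """Decoder-style transformer trained to predict the next audio token."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(DIM, VOCAB_SIZE)

    def forward(self, tokens):
        # Causal mask so each position only attends to earlier sounds.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        return self.head(self.blocks(self.embed(tokens), mask=mask))

@torch.no_grad()
def generate(model, prompt, n_new=50):
    """Autoregressively extend a tokenized vocalization, one token at a time."""
    tokens = prompt  # shape (1, T), dtype long
    for _ in range(n_new):
        probs = model(tokens)[:, -1].softmax(dim=-1)
        next_tok = torch.multinomial(probs, num_samples=1)
        tokens = torch.cat([tokens, next_tok], dim=1)
    return tokens
```

Trained on enough real recordings, sampling from a model like this is roughly what "asking the LLM to generate dolphin-like sound sequences" amounts to.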

We've been having a very hard time reproducing particular types of vocalizations we call VCM3s.

And it's the way the dolphins prefer to respond to us when we are trying to do our two-way communication work. VCM type 3, or VCM3s, are a variation on the burst pulses we mentioned earlier. Traditionally, in experimental studies in captivity, dolphins, for whatever reason, mimicked the whistles they were given using a tonal whistle - right, you would hear it.

What we're seeing and what Thad was describing is the way the spotted dolphins that we work with seem to want to mimic. And it's using a click or two clicks. And it's basically taking out energy from certain frequency bands automatically.

And so when I first saw the results from the first version of Dolphin Gemma, half of it was, you know, mimicking ocean noise. But then the second half of it was actually doing the types of whistles we expect to see from the dolphins. And to my surprise, the VCM3s showed up. And I said, oh my word, the stuff that's hardest for us to do, we finally have a way to actually create those VCM3s.

Another way they will be using the AI is to see how the LLM completes sequences of dolphin sounds. It's a bit like when you're typing into the Google search bar and autocomplete starts finishing your sentence, predicting what you were going to ask.

Once we have Dolphin Gemma trained up on everything, we can fine-tune on a particular type of vocalization and say, okay, when you hear this, what do you predict next? We can ask it to do it many, many different times and see if it predicts a particular vocalization back. And then we can go back and look at Denise's 40 years of data and say, hey, is this consistent, right? It helps us

get a magnifying glass to see what we should be paying attention to. If the AI keeps spitting back the same answers consistently, it might reveal a pattern. And if the researchers found a pattern, they could then check the Wild Dolphin Project's underwater video footage to see how the dolphins were acting when they made a specific sound.
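As a rough illustration of that "magnifying glass", one could feed the model a cue and sample its continuation many times, tallying the answers; a strongly dominant response hints at a pattern worth checking against the archive. The helper below is a hypothetical sketch reusing the toy model above, not the team's actual analysis code.

```python
from collections import Counter
import torch

@torch.no_grad()
def prediction_consistency(model, cue_tokens, n_trials=200):
    """Sample the model's next-vocalization prediction many times and tally
    the answers; a lopsided tally suggests a recurring pattern."""
    counts = Counter()
    for _ in range(n_trials):
        probs = model(cue_tokens)[:, -1].softmax(dim=-1)
        counts[torch.multinomial(probs, 1).item()] += 1
    return counts.most_common(5)
```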

Checking the footage could add important context to the vocalization. Okay, what were they doing when we saw sequence A in these 20 sequences? Were they always fighting? Were they always disciplining their calf? I mean, we know they have certain types of sounds that are correlated with certain types of behaviors, but what we don't have is the repeated structure that would suggest some language-like structures in their acoustics.
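Cross-referencing a detected sequence against the video log could start as a simple co-occurrence count. The sketch below assumes hypothetical data structures: a list of timestamps where sequence A was heard, and a mapping from annotated time intervals to behaviour labels.

```python
from collections import Counter

def behaviour_context(detections, annotations):
    """Tally which annotated behaviours overlap detections of a sound
    sequence. `detections`: timestamps in seconds; `annotations`:
    {(start, end): behaviour_label} taken from the video log."""
    tally = Counter()
    for t in detections:
        for (start, end), behaviour in annotations.items():
            if start <= t <= end:
                tally[behaviour] += 1
    return tally

# e.g. behaviour_context([12.4, 97.1], {(10.0, 20.0): "fighting"})
# -> Counter({'fighting': 1})
```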

The team also wants to see what the animals do when researchers play dolphin-like sounds that have been created by computer programs to refer to items such as seagrass or a toy. To do this, the team plans to use a technology called CHAT that was developed by Thad's team. It stands for Cetacean Hearing Augmented Telemetry. The equipment, worn while freediving with the dolphins, has the ability to recognize audio and play sounds. Luckily for Denise, who has to wear it,

the technology has become much smaller and less cumbersome over the years and is now all incorporated into one unit.

It used to be made up of two parts: a chest plate and an arm panel. And when Denise would actually slide into the water, there was a good chance that she could knock herself out. I never knocked myself out. Getting in and out was the challenge. You need a little crane lift, right? Because the thing was so big and heavy until you got into the water, and it was hard to make something that you could put on quickly. And so we've iterated over the years with a system that was on the chest and on the arm.

And now we have this small thing that's just on the chest. And the big change here is that we discovered that the Pixel phones are good enough on the AI now that they can do all the processing in real time much better than the specialty machines we were making five years ago. And so we've gone down from something that was

I don't know, four or five different computers in one box to basically a smartphone. And it's really, really changed what we can do. And I'm no longer afraid every time that Denise slides into the water. The researchers use the chat system to essentially label different items. Two freedivers get into the water with dolphins nearby.

If the researchers can see they won't be disturbing the dolphins' natural behaviours, they use their chat device to play a made-up dolphin-like sound while holding or passing a specific object. The hope is that the dolphins might learn which sounds refer to different items and mimic those specific noises to ask for the corresponding objects.
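The matching step a CHAT-like device needs can be sketched as nearest-template lookup. Everything below is illustrative: the feature vectors stand in for whistle frequency contours, and the threshold is invented; the real system's recognizer is certainly more sophisticated.

```python
import numpy as np

# Illustrative templates: one contour per object with an assigned whistle.
TEMPLATES = {
    "scarf":    np.array([0.2, 0.5, 0.9, 0.5, 0.2]),
    "seagrass": np.array([0.9, 0.7, 0.5, 0.3, 0.1]),
}

def match_whistle(contour, threshold=0.9):
    """Return the object whose whistle template best matches the heard
    contour (normalized correlation), or None if nothing is close enough."""
    best, best_score = None, threshold
    for obj, template in TEMPLATES.items():
        a = (contour - contour.mean()) / (contour.std() + 1e-9)
        b = (template - template.mean()) / (template.std() + 1e-9)
        score = float(a @ b) / len(a)
        if score > best_score:
            best, best_score = obj, score
    return best

# A close-enough mimic of the "scarf" whistle would cue the researchers
# to hand over the scarf:
print(match_whistle(np.array([0.25, 0.5, 0.85, 0.5, 0.2])))  # -> scarf
```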

You want to show the dolphins how the system works, not just expect them to figure it out quickly and absorb it, right? So another human, another researcher, and I are asking each other for toys using our little synthetic whistles. We exchange toys. We play with them while the dolphins are around watching. And if the dolphins want to get in the game, they can mimic the whistle for that toy and we'll give it to them. For example, this is the sound researchers use for a scarf. The dolphins like to play with scarves.

and Denise has a specific whistle she uses to identify herself.

But could the team be unintentionally training the dolphins, like when you teach a dog to sit? Here's what Thea had to say. I think my hesitation is whether that's the animal actually understanding language or whether it's more like, I make this sound in relation to this thing, I get a reward. This is where we have to be careful that we don't kind of bring in the human bias and the, oh, it understands this kind of excitement, which I get, I totally get people want to feel like

We can communicate with dolphins because, I mean, who wouldn't want to be able to talk to a dolphin? But I think we do have to be careful and look at it from a very kind of unbiased and scientific point of view when we're looking at the concept of language and what animals understand. This is where we need to pause and get our dictionary out. Because if we're trying to discover whether dolphins have language, we

we need to be clear on exactly what language is. Well, there's no one really good definition of language, but I think that one of the things that really has to be present, if we're going to give it that very distinguished name of language, is that these different communicative

symbols or sounds or words, or whatever you want to call them, need to be able to be combined in different ways, so that you could say almost anything. If you can combine different sounds or different words into different sentences, then you have at your disposal an infinite range of concepts that you can convey. And it's that ability to be unlimited in what you can say that seems to be the important part of what language is.
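That combinatorial point can be made concrete with a little counting: with an inventory of $|\Sigma|$ combinable units, and meaningful sequences of up to length $n$, the number of possible messages is

$$\sum_{k=1}^{n} |\Sigma|^{k} = \frac{|\Sigma|^{\,n+1} - |\Sigma|}{|\Sigma| - 1},$$

which grows exponentially in $n$. A fixed repertoire of one-sound-per-object signals, by contrast, caps out at $|\Sigma|$ messages no matter how clever the animal.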

So if we understand language as the ability to convey an infinite number of things, rather than just assigning different noises to different objects, can we say that dolphins have language?

At the moment, Arik thinks the answer is probably no. So they clearly have the cognitive ability to identify objects and distinguish between different objects by different sounds. That's not quite the same. It's not even close to being the same as having language. And we know that it's possible to teach dolphins...

to understand human language. If I had to guess, I would say that I think dolphins probably don't have a language in the sense that we have a language. And the reason for that is quite simple. Language is a very complicated

and expensive thing to have. It's something that uses up an awful lot of our brain and it only evolves if it provides some evolutionary benefit. And it's not at all clear what evolutionary benefit dolphins would have from language. To Arik, this research project is not about translating the sounds the animals make.

but seeing if they appear to recognise complex AI sequences as having meaning. So there's that wonderful example in the movie Star Trek IV: The Voyage Home where the crew of the Enterprise are trying to communicate with humpback whales.

And Kirk asks Spock, can we reply to these animals? And he says, well, we could simulate the sounds, but not the meaning. We'd be responding in gibberish. Now, there's a couple of reasons why they would be responding in gibberish. One is that when you listen to a few humpback whales, you cannot possibly have enough information to

to build a really detailed map of what that communication looks like. When you train large language models on human language, you are using the entirety of the internet. Billions upon billions of utterances are being analyzed.

None of us investigating animal communication have a data set anywhere near the size of a human data set. And so it's extremely difficult to have enough information to reverse engineer and understand meaning just from looking at sequences. There's another problem. When we translate one human language to another, we know the meanings of both languages.

But that's not true for dolphin communication. When we're working with animals, we actually don't know what a particular sequence means. We can identify perhaps that sequences have meaning, but it's very, very difficult to understand what that meaning is without being able to ask the animals themselves, which of course requires language in the first place. So it's a very circular problem that we face in decoding animal communication.

Denise says this project isn't exactly about trying to talk to dolphins, at least not yet. The possibility of having a true conversation with these animals is a long way off. But researchers are optimistic that AI could open new doors in their quest to decode dolphins' whistles. Ultimately, they hope to find potential meanings within the sequences. So could Dolphin Gemma help us figure out if dolphins and other animals have language?

Thad hopes so. With language comes culture. And I'm hoping that if we start doing this two-way work, the dolphins will reveal to us new things we'd never expected before. I mean, we know that they dive deep in some of these areas and see stuff that humans have never seen.

We know they have lots of interactions with other marine life that we have no idea about. But even if it's unlikely we'll be having a chat with Flipper any time soon, scientists are interested to see where this might lead. Humans often see language as the thing that sets us apart from animals. Might people have more empathy for cetaceans - that's whales, dolphins and porpoises - if we discovered they use language?

As someone who's particularly interested, obviously, in cetacean communication, I think this could be a really vital step forward for being able to understand it, even in kind of the more basic senses. If we can start to get more of a picture into the world of cetaceans, the more we understand about them, the more we can protect them, the more we can understand what's important. So, yeah, I'm excited to see what this can do for the future of cetacean conservation.

That's all for this week's Friday Fascination. We're taking Monday off for Memorial Day, but we'll be back on Wednesday. In the meantime, we'd be so grateful if you could take a minute to fill out our ongoing listener survey. We're looking to find out more about our listeners so we can continue to make Science Quickly the best podcast it can be. And if you submit your answers this month, you'll be eligible to win some sweet SciAm swag. Go to sciencequickly.com/survey to fill it out now.

Science Quickly is produced by me, Rachel Feltman, along with Fonda Mwangi, Kelso Harper, Naima Marci, and Jeff DelViscio. This episode was reported and co-hosted by Melissa Hobson and edited by Alex Sugiura. Shayna Posses and Aaron Shattuck fact-check our show. Our theme music was composed by Dominic Smith. Subscribe to Scientific American for more up-to-date and in-depth science news. For Scientific American, this is Rachel Feltman. Have a great weekend.
