First, there was the famous blue dress. Now, there is the Yanny vs. Laurel Internet debate on what different people are hearing.
A Virginia Tech linguist weighed in on the latest controversy, pointing out that people are excited by the fact that we are all hearing the clip differently, even though it does not feel ambiguous.
Abby Walker, an assistant professor and co-director of the Speech Lab at Virginia Tech, responded Sunday to the viral Yanny or Laurel audio clip. She noted the clip itself is synthesized speech.
“There are lots of unnatural things going on, but it almost looks like there are two files on top of each other. At first I thought someone had intentionally made a weird file to mess with people, but apparently it is just a bad automatic synthesis from vocabulary.com,” she said.
There is always ambiguity in the signal, according to Walker. She said there are lots of studies, including her own, showing how different types of context affect what word or sound people hear.
“Contexts like: do you think it’s a woman or a man speaking? What shape is the person making with their mouth? Are you sitting in a car or in a lab? Are you thinking about Australia or New Zealand? etc…,” she said.
But in reality, depending on their ears or hearing strategy, or the quality of the equipment they’re listening through, different people are being influenced more by lower- or higher-frequency information.
The audio clip was posted online through an Instagram account, and within hours it had taken the Internet by storm, again causing controversy and a difference of opinion.
Three years ago, the blue dress phenomenon created a similar stir when people looking at the same image saw different colors.
“There’s lots of information in a speech signal – the speaker’s pitch, what position their mouth is in, and how big they are (the size of their vocal tract). How we interpret a sound/word is based on combinations of this information, and linguists have known for a long time that the exact same sound file can sound like a different word/sound if we mess with how big you think the speaker is (for example, by telling you the speaker is male or female),” Walker said.
This file has certain ambiguities that let you either hear it as a big, operatic male voice saying the word “Laurel” (the original recording), or a much more synthetic and smaller-bodied voice saying “Yanny”.
“The ambiguity comes from some strong higher frequency information – if you ignore this info, you’ll hear Laurel, and if you pay a lot of attention to it, you’ll hear Yanny,” she said.
In the case of the dress, onlookers were torn between black and blue or white and gold. The issue divided the Internet. The same seems to be true in Yanny versus Laurel.
The difference could also depend upon a person’s own hearing.
“If you’re unable to hear the higher frequency information, because your hearing is in decline, or because your equipment isn’t playing it very loudly, then that’s definitely going to impact the degree to which you’re going to get the Yanny reading,” Walker said.
The language specialist thinks the controversy highlights two faulty assumptions we have about speech.
The first is that there’s an objective, single, correct way to interpret a sound. According to Walker, this is not true because we’re *always* using context to interpret sound. The second is that other people hear things the same way we do.
“Since most of the time we all come to the same interpretation, it obscures the fact that there are lots of different acoustic cues listeners can use, and so even when we say we agree about what we heard, it doesn’t mean that we actually heard the same thing, and/or that we paid attention to the same information to get the answer,” she said.
Walker is an assistant professor and co-director of the Speech Lab at Virginia Tech. She researches sociolinguistics, phonetics, dialectal variation, language cognition and language change.