Abstract
Imagine you are at a party with loud music playing. What would it be like trying to speak to your friend in all that noise? Scientists call background noise like this “masking sound” because it covers up other sounds. It can do this in two ways. The background sound might be so loud that it blocks out other noises, or it might contain information that is distracting. Maybe it is your favorite song and you cannot help singing along! Which of these do you think affects you most when you are trying to talk? We decided to find out by putting people in a brain scanner and asking them to talk while we played various noises in the background. We found that the brain cares most about sounds that contain lots of information, even if they do not block out other noises very well. So, maybe being able to hear yourself is not as important as we thought!
Masking: The Problem With Talking in Noise
Most of us have conversations with other people every day, in all kinds of places—at home, in the street, at a party, or in the playground, for example. If you are at home in a quiet room, it is pretty easy to concentrate on what you are saying. But what if you are standing on a busy street, or meeting a friend at a fairground? Holding a conversation in a very loud place can be tricky. But what is it that makes such a conversation difficult?
Scientists who study speech have identified two ways that background sounds make it harder to hear and talk to other people. When one sound covers up another, we call this masking. Masking potential is how likely one sound is to mask, or cover up, another. The first type of masking potential happens when the background sound physically covers up your voice. This is called energetic masking potential because it is the energy of the sound wave that covers up your voice. “Energy” could include how loud the sound is (volume) or how high or low it is (pitch). The louder the sound, or the closer it is in pitch to the sound it is masking, the harder it is to pick the two apart.
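One common way to put a number on the loudness part of energetic masking is the signal-to-noise ratio, which compares how strong your voice is to how strong the background is. The short Python sketch below is only an illustration: the function names and the sample values are made up for this example and are not taken from the study, and the sketch covers loudness only, not pitch.

```python
import math

def rms(samples):
    """Root-mean-square level: a simple measure of how strong a sound is."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def snr_db(voice, background):
    """Signal-to-noise ratio in decibels.

    Positive values mean the voice is stronger than the background;
    negative values mean the background is stronger than the voice.
    """
    return 20 * math.log10(rms(voice) / rms(background))

# Made-up sample values for illustration only (not data from the study).
voice = [0.5, -0.5, 0.5, -0.5]
quiet_room = [0.05, -0.05, 0.05, -0.05]
loud_music = [1.0, -1.0, 1.0, -1.0]

print(round(snr_db(voice, quiet_room), 1))  # 20.0: the voice is easy to pick out
print(round(snr_db(voice, loud_music), 1))  # -6.0: the music is stronger than the voice
```

The lower the number, the more the background is likely to cover up the voice.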
The second type of masking potential happens when the background sound contains information that may distract you. This is called informational masking potential because it is the information in the background sound that covers up the other sound. “Information” can mean words or other things that have meaning to you, like sirens or music. Your brain must spend time figuring out what is relevant and what is not. All sounds have a little bit of both types of masking potential.
Imagine you are at a fairground and just got off the roller coaster. You spot your friend at the candyfloss stall and go over to tell them all about your ride. But the candyfloss stall is blasting out instrumental music, making it hard to hear yourself talking. This is mostly energetic masking, with a little bit of informational masking from the pattern of the music. Now, imagine you hear an announcement on a loudspeaker saying, “FREE RIDES ON THE TWISTER!” You would probably stop talking to listen to the announcement, then rush over to the twister for your ride. This is mostly informational masking, with a little bit of energetic masking from the loudness and pitch of the announcement.
Scientists have done a lot of research into how we listen to speech when there is background noise, so we know that energetic and informational masking work in different ways and are processed differently by the brain [1]. However, we do not know as much about what happens when we are trying to talk in a noisy environment. This work could help us understand how our brains control our voices, which might also help us figure out why some people have problems with their speech.
How Does the Brain Help Us Talk in Noisy Environments?
We looked at a part of the brain called the posterior superior temporal gyrus (pSTG). This area is found in both the left and right sides of the brain (Figure 1).
There are two things scientists think the pSTG might be doing when we talk in a noisy environment. First, it might be listening to your own voice to see if you are being clear enough. If you make a mistake, or if you cannot hear yourself properly over the noise, the pSTG will register an “error” and try to change your voice to fix it. This is what seems to happen when people talk in noise with low information content, such as traffic noise. As the noise gets louder and it gets harder to hear yourself, the pSTG gets more active [2]. The second thing the pSTG might be doing is keeping track of what is going on in the background, in case there is information we can use.
We know that the pSTG is activated when someone is trying to listen to one person while others are also talking, and we also know that we pay attention to what is going on in the background when we are talking. It is easier to talk when breaks in the background noise happen at regular times [3]. Someone who wants to speak will often wait until other people have stopped talking. In our fairground example, you would probably stop talking to your friend until the loudspeaker had finished.
While some studies look at how distracting speech can be, most have looked at how people react to “white noise,” similar to the sound of an airplane passing overhead. These studies have concluded that we do not pay attention to the content of background noise when we are trying to talk. Instead, we focus on how well we can hear ourselves and use that information to change our voices [4]. It makes sense that we would mostly ignore background noise when there is little information in it. We wondered if this is still true when the background noise consists of something potentially interesting, like speech. In that case, do we focus on listening to ourselves, or to what is going on in the background?
Testing What the Brain Does When We Speak in Noise
A brain scanner measures how much blood is traveling to various parts of the brain. The harder a brain area is working, the more blood it needs. The results are shown on a screen as a picture, in which brightly colored areas are the most active.
We asked people to lie in a brain scanner and read sentences aloud while we played sounds with various levels of energetic and informational masking. There were four types of sounds, starting with recordings of people talking and getting gradually less speech-like and more like white noise. We then asked the participants to read sentences silently to themselves while listening to the sounds. We compared the silent reading condition to reading aloud with the various maskers, to make sure that we were measuring changes related to talking in each condition, not just to hearing various kinds of noise. We wanted to know how active the pSTG was when people spoke in each kind of background sound.
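The idea behind this comparison can be sketched as a simple subtraction. The Python example below is a simplified illustration, not the analysis used in the study: real brain scanning studies use statistical models, and the masker labels and activity numbers here are invented.

```python
def speech_related_activity(read_aloud, read_silently):
    """For each masker, subtract activity during silent reading from
    activity during reading aloud. What is left over is the part
    related to talking, not just to hearing the background sound."""
    return {
        masker: read_aloud[masker] - read_silently[masker]
        for masker in read_aloud
    }

# Invented activity values in arbitrary units (NOT results from the study).
# "masker_A" and "masker_B" are hypothetical labels for two background sounds.
read_aloud = {"masker_A": 5.0, "masker_B": 4.0}
read_silently = {"masker_A": 3.0, "masker_B": 3.5}

difference = speech_related_activity(read_aloud, read_silently)
for masker, value in difference.items():
    print(masker, value)
```

With these made-up numbers, hearing each sound accounts for part of the activity in both cases, and the subtraction shows how much extra activity goes with talking.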
If people focus mainly on their own speech when talking in noise, then how well we can hear ourselves will be the most important thing when trying to speak in a noisy place. In this case, we would expect the pSTG to register more “errors” when background noise had more energetic content. In other words, the more effective the background noise is at blocking out the speaker’s voice, the harder the pSTG must work and the more activation we will see. But if people mainly focus on what is going on around them when trying to speak in a noisy place, we would expect the pSTG to be most active when background noise has more informational masking. In other words, the more interested we are in the background noise, the harder the pSTG must work and the more activation we will see.
Which Matters More?
Based on findings from previous brain scanning studies, we expected a strong response in the pSTG when people talked in noise with more energetic masking potential. Instead, we found something that surprised us (Figure 2). Although the pSTG was active when people spoke in maskers with high energetic masking potential, this was only a small response. The response was much bigger when people tried to talk in maskers with high information content. In fact, the more information there was in the background, the more active the pSTG was. In other words, our brains work harder to concentrate on talking when background noise contains information that we are interested in. Our brains are less bothered by how much the background noise covers up our voices.
It could be that the brain mistakes speech-like background noise for our own voice, causing the brain to register an “error.” However, there is not much evidence to suggest that we cannot tell our own voices apart from those in the background. We think it is more likely that this extra brain activity happens because people are monitoring the background noise for relevant information. This does not mean that the brain completely ignores energetic masking when we are speaking. Our participants did speak more loudly when we played sounds with more energetic masking potential, showing that our brains do monitor how well others might hear our voices. But overall, we found that informational masking made the most difference to brain responses, suggesting that speech-like sounds in the background were the most distracting. We think that the brain response to informational masking is so strong that it drowns out any response to the energetic content of sounds.
In the future, we would like to look at our data using more sensitive analysis techniques, to see if we can find brain areas that care more about energetic masking. While more studies and analysis will help us better understand our results, we now know that the ability to hear yourself when talking in noisy environments is not as important as we once thought! Understanding how humans talk is essential to figuring out why some people have difficulty with speech. So this is an exciting result that tells us more about ourselves, and may be able to help some people in the future.
Funding
SM is funded by a Royal Society Dorothy Hodgkin Fellowship [grant number DHF\R1\211078] and wrote this article while supported by a British Academy Postdoctoral Fellowship [grant number pf170122]. The research described in this article was supported by an ESRC Ph.D. studentship awarded to SM.
Glossary
Masker: A sound that covers up what you are trying to listen to.
Energetic Masking Potential: How good a sound is at blocking out other sounds.
Energy: Physical properties, like pitch and loudness, that block out the thing you are trying to listen to.
Informational Masking Potential: How good a sound is at distracting you from other sounds.
Information: Non-physical properties, like meaning, that may distract you from the thing you are trying to listen to.
Posterior Superior Temporal Gyrus (pSTG): The area at the back (posterior) of the superior temporal gyrus, which is the part of the temporal lobe that is on the outside (gyrus) near the top (superior) of the lobe.
Conflict of Interest
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Original Source Article
Meekings, S., Evans, S., Lavan, N., Boebinger, D., Krieger-Redwood, K., Cooke, M., et al. 2016. Distinct neural systems recruited when speech production is modulated by different masking sounds. J. Acoust. Soc. Am. 140:8–19. doi: 10.1121/1.4948587
References
[1] Evans, S., McGettigan, C., Agnew, Z. K., Rosen, S., and Scott, S. K. 2016. Getting the cocktail party started: masking effects in speech perception. J. Cogn. Neurosci. 28:483–500. doi: 10.1162/jocn_a_00913
[2] Christoffels, I. K., Formisano, E., and Schiller, N. O. 2007. Neural correlates of verbal feedback processing: an fMRI study employing overt speech. Hum. Brain Mapp. 28:868–79. doi: 10.1002/hbm.20315
[3] Cooke, M., and Lu, Y. 2010. Spectral and temporal changes to speech produced in the presence of energetic and informational maskers. J. Acoust. Soc. Am. 128:2059–69. doi: 10.1121/1.3478775
[4] Guenther, F. H. 2006. Cortical interactions underlying the production of speech sounds. J. Commun. Disord. 39:350–65. doi: 10.1016/j.jcomdis.2006.06.01