Frontiers for Young Minds

Core Concept | Neuroscience and Psychology | Collection Article | Published: January 17, 2022

Speech Prosody: The Musical, Magical Quality of Speech


When we speak, we can vary how we use our voices. Our speech can be high or low (pitch), loud or soft (loudness), and fast or slow (duration). This variation in pitch, loudness, and duration is called speech prosody. It is a bit like making music. Varying our voices when we speak can express sarcasm or emotion and can even change the meaning of what we are saying. So, speech prosody is a crucial part of spoken language. But how do speakers produce prosody? How do listeners hear and understand these variations? Is it possible to hear and interpret prosody in other languages? And what about people whose hearing is not so good? Can they hear and understand prosodic patterns at all? Let’s find out!

During their first year at Hogwarts, Harry Potter and his friends learn the levitation charm “Wingardium Leviosa”. While practicing, Harry’s best friend, Ron, has a hard time making the feather on his desk obey his command and lift into the air. Hermione knows exactly why: “You’re saying it wrong. It’s Levi-o-sa, not Levio-sa”. “You do it, then, if you’re so clever. Go on, go on!”, replies Ron. With a swish and a flick of her wand, Hermione speaks the charm “Win-gar-dium Levi-o-sa!” and her feather slowly rises from her desk. It turns out that the levitation charm only works if you say the magic words properly. In other words, what you say matters, but how you say it also makes a big difference. Hermione actually uses speech prosody to convince her feather to levitate.

What Is Speech Prosody?

Speech prosody is often described as the musical quality of speech [1]. If you think of vowels and consonants as the sounds of language that make up what you say, then prosody is related to how you say these sounds. When Hermione corrects Ron’s pronunciation, she does not correct any vowels or consonants. In fact, the vowels and consonants in “Levi-o-sa” and “Levio-sa” are the same. What Hermione corrects instead is the stress pattern. Ron mistakenly places emphasis on “sa” in “Leviosa” when he is supposed to place it on “o”. Without the correct stress pattern, the words “Wingardium Leviosa” no longer have the intended meaning and the levitation charm does not work. Now, you might think: “great example, but the Harry Potter stories are fictional, and I could never make an object fly”. This may be true, but speech prosody works just as well in our Muggle (non-magic) world. Have you ever thought about how “object” can be pronounced in two ways? When we mentioned “object” a few sentences ago, we meant the noun that refers to a thing that can be seen and touched. In this context, you would pronounce “object” with stress on the first part of the word, like “OB-ject”. But if you stress the second part instead, the word takes on a new meaning. The verb “ob-JECT” describes expressing disagreement. Think of how Hermione ob-JECTS to Ron’s pronunciation of the levitation charm. So, simply changing the stress pattern changes the word from a noun to a verb. Now that really is magic!

Speech prosody is more than just changing stress patterns though. When you speak, everything you do with your voice that is not directly related to pronouncing vowels and consonants is prosodic. Think of the rhythm and intonation of speech. Prosody makes speech sound less monotonous and boring. It can also change the meaning of speech—the meaning of a whole sentence can change by emphasizing different words! You can also make serious sentences sound sarcastic, or make happy stories sound sad, when you change the tone of your voice. Or you can turn statements into questions by changing the prosodic pattern. Try saying this sentence aloud: “See you tomorrow!”. Now say the sentence again, but this time turn it into a question: “See you tomorrow?” Notice how your voice goes up in pitch at the end of the sentence? This is speech prosody!
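If you like coding, here is a small sketch of the "voice goes up at the end" idea. It assumes we already have a list of pitch measurements (in hertz) taken at regular moments across a sentence. The function name `ends_with_rise`, the two settings `tail_fraction` and `min_rise`, and the pitch numbers are all made up for illustration; they are not real recordings and not a method from the research cited here.

```python
from statistics import mean


def ends_with_rise(pitch_hz, tail_fraction=0.25, min_rise=1.1):
    """Return True if the end of the sentence is clearly higher in pitch.

    pitch_hz      -- pitch values in hertz, in time order
    tail_fraction -- how much of the sentence counts as "the end" (assumed)
    min_rise      -- how much higher the end must be, as a ratio (assumed)
    """
    if len(pitch_hz) < 4:
        raise ValueError("need at least 4 pitch values")
    tail_len = max(1, int(len(pitch_hz) * tail_fraction))
    body = pitch_hz[:-tail_len]   # everything before the end
    tail = pitch_hz[-tail_len:]   # the last part of the sentence
    return mean(tail) >= min_rise * mean(body)


# Invented pitch values, only to show the idea.
statement = [220, 215, 210, 205, 200, 190, 180, 170]  # "See you tomorrow!"
question = [220, 215, 210, 205, 200, 210, 250, 290]   # "See you tomorrow?"

print(ends_with_rise(statement))  # False
print(ends_with_rise(question))   # True
```

Real speech is messier than this, and not every question ends with a rise, so treat the check as a toy rather than a question detector.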

How Do We Use and Understand Prosody?

When we speak, we can (and do!) vary how high or low, how loud or soft, and how fast or slow our speech is. This variation in pitch, loudness, and duration is what creates the prosodic patterns of speech [1]. Everyone uses prosody when they speak. Even Ron uses prosody when he says “Levio-sa” by pronouncing “sa” slightly higher, louder, and longer than the other parts of the word. Compare this to when Hermione says “Levi-o-sa”. She pronounces “o” higher, louder, and longer than the other parts of the word (Figure 1). Whatever the prosodic pattern may be, it is always described in terms of the relative increase or decrease in pitch, loudness, and duration.

Figure 1 - The different pronunciations of “Wingardium Leviosa” visualized in two ways. The speech waveform (top) shows the loudness of the recorded speech over time and the spectrogram (bottom) shows the loudness at different frequencies over time. In the spectrogram, the blue lines show the voice frequency, related to the perceived pitch, and the yellow lines show the intensity, related to the perceived loudness. You can see in (A) that “o” is higher (blue) and louder (yellow) than the other parts of “Levi-o-sa” and in (B) that “sa” is higher (blue) and louder (yellow) than the other parts of “Levio-sa”.
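The idea that stress is relative (higher, louder, and longer than the neighboring parts) can be sketched in a few lines of code. The sketch below compares each part of a word with the word's own average. The `Syllable` class, the function names, and every number are invented for illustration; they are not measurements from Figure 1, and averaging the three ratios with equal weight is a crude assumption, not how speech scientists actually measure stress.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Syllable:
    text: str
    pitch_hz: float       # voice frequency (perceived as pitch)
    intensity_db: float   # intensity (perceived as loudness)
    duration_s: float     # how long the syllable lasts


def prominence(syllables):
    """Score each syllable relative to the average of the whole word."""
    avg_pitch = mean(s.pitch_hz for s in syllables)
    avg_intensity = mean(s.intensity_db for s in syllables)
    avg_duration = mean(s.duration_s for s in syllables)
    return [
        (s.pitch_hz / avg_pitch
         + s.intensity_db / avg_intensity
         + s.duration_s / avg_duration) / 3
        for s in syllables
    ]


def stressed_syllable(syllables):
    """Return the text of the syllable that stands out the most."""
    scores = prominence(syllables)
    best = max(range(len(scores)), key=scores.__getitem__)
    return syllables[best].text


# Invented values: Hermione makes "o" stand out, Ron makes "sa" stand out.
hermione = [
    Syllable("Le", 200, 65, 0.15),
    Syllable("vi", 210, 66, 0.15),
    Syllable("o", 260, 72, 0.30),
    Syllable("sa", 190, 64, 0.20),
]
ron = [
    Syllable("Le", 200, 65, 0.15),
    Syllable("vi", 210, 66, 0.15),
    Syllable("o", 205, 65, 0.15),
    Syllable("sa", 260, 72, 0.30),
]

print(stressed_syllable(hermione))  # o
print(stressed_syllable(ron))       # sa
```

Because every value is divided by the word's own average, the same code would work for a speaker with a very deep or very high voice: what counts is the difference between the parts, not the absolute numbers.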

When we listen to speech, we can usually hear the variation in pitch, loudness, and duration that a speaker produces. These prosodic patterns help us understand what was said [2]. Over time, we learn to recognize commonly used prosodic patterns and attach meaning to them. For example, when you were very young, you learned that someone is asking a question if their voice rises in pitch at the end of the sentence. But you may not always be aware of such connections. As a Muggle, you might never have realized how important the correct stress pattern is in “Levi-o-sa”, since these magic words have no function in the Muggle world. Ron, on the other hand, must learn the correct stress pattern if he wants to make objects fly. As a wizard, he has to make the connection between the stress pattern “Levi-o-sa” and its function: producing a proper levitation charm.

How Does Our Native Language Influence Prosody?

When Harry, Ron, and Hermione are in their fourth year, students from Durmstrang and Beauxbatons visit Hogwarts to compete in the Triwizard Tournament. These international students speak English, but English is not their native language. Now, imagine that Hermione wants to teach one of these international students the levitation charm. You might think that the difference between “Levi-o-sa” and “Levio-sa” would be obvious to everyone, but in fact, people who speak another language might not be able to tell the difference as easily as you or Ron can. This is because not all languages use the same prosodic patterns. Have you, for instance, ever noticed how English or German sound very different from French or Italian? We have seen that, in English, a word can change meaning if you change the stress pattern (like in “OB-ject” and “ob-JECT”), but in some other languages, stress patterns are always fixed. In French, for instance, stress is always on the final part of a word. So, Fleur, a student from the French wizarding school Beauxbatons, will probably say “Levio-sa”, just like Ron does. But the question is: would Fleur realize that the correct pronunciation has a different stress pattern? Listeners tend to stick to what they know, and their native languages may influence how they perceive speech in another language. If Fleur listens to Hermione teaching her the spell, she might be able to hear that “Levi-o-sa” is different from “Levio-sa”, but she will probably not realize how important the stress contrast is for the meaning of the word, because stress contrasts do not exist in French. So, she may not recognize the stress contrast for what it is [3]. But do not worry, she can still learn to recognize it, and if anyone can teach her, it is Hermione!

How Does Our Hearing Ability Influence Prosody Perception?

Good hearing is important for understanding prosody. After all, it would be hard to link prosodic patterns to their function if you could not hear the patterns in the first place. This is the case for listeners who hear very little or are completely deaf. Fortunately, a device called a cochlear implant (Figure 2) can bring back some hearing for these listeners. A surgeon implants a wire with tiny electrodes into a part of the inner ear called the cochlea. This is the place where healthy ears transform soundwaves into electrical signals that are then sent to the brain via the auditory nerve. For listeners with cochlear implants, the device transforms soundwaves into electrical signals, and the electrodes send these signals directly to the auditory nerve. Listening with a cochlear implant is sometimes called electric hearing. In a sense, it is magical that this device can bring back some hearing, but electric hearing is far from perfect. Listeners with cochlear implants have difficulty hearing pitch differences [4]. If Ron had a cochlear implant, it would be very hard for him to hear that Hermione pronounces “o” slightly higher in pitch than the other parts of the word “Levi-o-sa”. However, he would still be able to hear it as louder and longer, so there is a chance he would be able to learn the correct stress pattern with practice. In time, he would probably still be able to make sense of the prosodic pattern based on what he could hear, although this would be much harder work than if his hearing were not impaired.
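The point that stress can still be found from loudness and duration when pitch is hard to hear can be shown with a toy model. In the sketch below, each cue gets a weight, and setting the pitch weight to zero stands in for a listener who cannot use pitch at all. This is a strong simplification: the function `most_prominent`, the weights, and all the numbers are invented for illustration, and the model does not simulate how a cochlear implant actually processes sound.

```python
def most_prominent(syllables, weights):
    """Pick the syllable that stands out most, given cue weights.

    syllables -- list of (text, pitch_hz, intensity_db, duration_s) tuples
    weights   -- (pitch_weight, loudness_weight, duration_weight), assumed
    """
    count = len(syllables)
    # Average of each cue across the word (tuple positions 1, 2, 3).
    averages = [sum(s[i] for s in syllables) / count for i in (1, 2, 3)]

    def score(syllable):
        # Each cue is compared with the word's average, then weighted.
        return sum(
            weight * syllable[i + 1] / averages[i]
            for i, weight in enumerate(weights)
        )

    return max(syllables, key=score)[0]


# Invented values for Hermione's "Levi-o-sa".
hermione = [
    ("Le", 200, 65, 0.15),
    ("vi", 210, 66, 0.15),
    ("o", 260, 72, 0.30),
    ("sa", 190, 64, 0.20),
]

all_cues = (1, 1, 1)   # pitch, loudness, and duration all available
no_pitch = (0, 1, 1)   # pitch ignored, loudness and duration remain

print(most_prominent(hermione, all_cues))  # o
print(most_prominent(hermione, no_pitch))  # o
```

In this made-up example the answer stays the same without pitch, because “o” is also louder and longer. With fewer cues to rely on, though, the stressed part stands out less clearly, which fits the idea that listening this way is harder work.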

Figure 2 - Ear with a cochlear implant. The wire with electrodes is implanted in the spiral-shaped cochlea, which is the blue part that looks like a snail. The electrodes of the cochlear implant send sound-like electrical signals directly to the auditory nerve. The yellow lines attached to the cochlea are part of the auditory nerve. The auditory nerve carries the electrical signals to the brain (Image credit:

What Is the Magic of Speech Prosody?

The fact that speech prosody can make objects fly is pretty magical. But do you know what is even more magical? That you now know how important speech prosody is! It is like you have been waiting for a Hogwarts letter announcing you are off to wizarding school so you can finally learn all about the magical powers of speech prosody. Well, here it is. Your letter has arrived. So, what are you waiting for? Get ready to go out into the Muggle world and use your speech prosody magic!


Glossary

Speech Prosody: The musical quality of speech, like stress, rhythm, and intonation. It can express sarcasm and emotions, and it can also change the meaning of speech.

Stress Pattern: The way parts of a word or sentence are stressed or unstressed. Stressed parts are emphasized by increasing the relative pitch, loudness, and duration.

Rhythm: The structured organization of speech parts over time, like the beat of a song. Speech usually has a rhythm that you can tap along to.

Intonation: The way pitch varies over time, like the melody of a song.

Pitch: How high or low speech is. What we hear as pitch can be measured as the frequency of a voice—when the frequency goes up, the perceived pitch goes up.

Loudness: How loud or soft speech is. What we hear as loudness can be measured as the intensity of a voice—when the intensity goes up, the perceived loudness goes up.

Duration: How long or short speech is. The duration of speech is measured over time.

Cochlear Implant: An electronic device that can bring back some hearing for people who are deaf or hear very little. It uses electrodes to send sound-like electrical signals directly to the auditory nerve.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Acknowledgments

This project was supported by the Center for Language and Cognition Groningen (CLCG) and by the VICI grant 918-17-603 from the Netherlands Organization for Scientific Research (NWO) and the Netherlands Organization for Health Research and Development (ZonMw). Further support was provided by the Heinsius Houbolt Foundation. We would like to thank Caryl Hart for her feedback on an earlier draft of the manuscript.


References

[1] Cole, J. 2015. Prosody in context: a review. Lang. Cogn. Neurosci. 30:1–31. doi: 10.1080/23273798.2014.963130

[2] Cutler, A., Dahan, D., and van Donselaar, W. 1997. Prosody in the comprehension of spoken language: a literature review. Lang. Speech 40:141–201. doi: 10.1177/002383099704000203

[3] Dupoux, E., Peperkamp, S., and Sebastián-Gallés, N. 2001. A robust method to study stress “deafness”. J. Acoust. Soc. Am. 110:1606–18. doi: 10.1121/1.1380437

[4] Everhardt, M. K., Sarampalis, A., Coler, M., Başkent, D., and Lowie, W. 2020. Meta-analysis on the identification of linguistic and emotional prosody in cochlear implant users and vocoder simulations. Ear Hear. 41:1092–102. doi: 10.1097/AUD.0000000000000863