Vocal capabilities

This is a brief discussion of this portion of the laryngology exam that I wrote in the 2000s. For the longer reasoning behind this portion of the examination, see Further details.

Vocal Capabilities - posted October 2015. 


This portion of the exam actually starts during the history interview, as many cues can be picked up during conversation with the patient. Then a series of vocal behaviors are elicited from the patient and recorded. The reason for performing these tasks is that they are a vocal stress test. As with the more familiar cardiac stress test, many problems show up only under exercise. The speaking voice is typically near the lowest end of the patient's pitch range, so using the speaking voice alone could be likened to measuring the heart's ability to pump while the patient sits at rest. Asking the patient to phonate throughout their range would then be like exercising. A generalization is that most voice disorders add stiffness to the vocal fold and are most detectable at high pitch, when extra physiologic stiffness is being applied to the vocal folds. Neuromuscular problems are typically best detected at lower pitches, when the additional physiologic flaccidity augments any pre-existing muscular problem.

This portion of the exam is audio recorded for documentation and later review.

Speaking voice

Anchor pitch is the lowest common pitch during a given task. We use a reading task: the patient states his or her name and reads a paragraph. In our clinic the paragraph "Man's First Boat" is used. It is at approximately a 4th grade reading level. There are other phonetically balanced paragraphs in use as well. It is probably most important to be internally consistent by using the same text each time. When the patient cannot read or cannot speak English, we resort to counting in the patient's own language. While the patient is reading, we match their most obvious lowest pitch and note this as their anchor/speaking pitch. This should be considered their fundamental frequency (F0). A typical F0 for men is about C3, or roughly 130 Hz. A typical F0 for women is about G3, or roughly 200 Hz.

Reading passages:

Man’s First Boat: Long ago, men found that it was easier to travel on water than on land. They needed a cleared path or road when traveling on land. But on water, a log of wood or any large object that would float became a man’s boat. It served to carry him across a stream or down a river. The earliest boats were probably made by fastening three or four logs of wood together. Such a boat we call a raft. A long pole was used by the man on the raft to help guide it across the water. The pole was also useful if the raft got stuck in the mud.

Other reading passages typically used for neurologic disorders: Early one morning, a man and a woman were ambling along a one mile lane running near Rainy Island Avenue. He saw half a shape mystically cross a single path at least fifty or sixty steps in front of his sister Kathy’s house.

Maximum phonation time (MPT)

I ask the patient to say /i/ (which can be translated as a prolonged eeeeee sound) at their anchor pitch. Typically they will need to be reminded that it is low in their voice. We try to use their anchor pitch to make the maximum phonation time as consistent as possible between exams, at least for a given patient. They are asked to breathe in fully and hold /i/ at anchor pitch for as long as possible. This is a very imprecise measure because variables other than pitch, such as degree of loudness or pressed phonation, can drastically affect the MPT. However, it is a useful measure in a given patient when measured before and after treatment, particularly when dealing with air wasting disorders.

Hertz vs Semitones

If you read medical papers on voice, chances are you have encountered a statement like “The pitch increased an average of 20 Hz (Hertz) with the procedure.” Unfortunately, this is a relatively meaningless statement. Pitch as we hear it follows a logarithmic/exponential scale, while Hertz simply counts cycles per second on a linear one. Averaging numbers (adding up a group of numbers and dividing by how many there are) across an exponential scale is not a simple task. It mixes up addition with multiplication, and the order in which one adds and multiplies has an effect on the answer. It can be done with logarithms, but not directly. The effect is perhaps most noticeable when you compare pitch notation in Hertz with pitch notation in semitones.
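As a rough illustration of the averaging problem, the Python sketch below averages two pitches both ways. The function names and the two frequencies are made up for the example, and the only assumption is that pitch is treated as the base-2 logarithm of frequency.

```python
import math

def mean_in_hertz(freqs_hz):
    """Plain arithmetic mean, taken directly on the Hertz values."""
    return sum(freqs_hz) / len(freqs_hz)

def mean_in_pitch(freqs_hz):
    """Average the logarithms (the pitch scale), then convert back to Hertz."""
    mean_log2 = sum(math.log2(f) for f in freqs_hz) / len(freqs_hz)
    return 2 ** mean_log2

# Hypothetical voices two octaves apart: 100 Hz and 400 Hz.
voices = [100.0, 400.0]
print(round(mean_in_hertz(voices), 1))  # 250.0
print(round(mean_in_pitch(voices), 1))  # 200.0
```

200 Hz sits exactly one octave above 100 Hz and one octave below 400 Hz, so it is the midpoint the ear would pick. The 250 Hz figure from straight averaging is not.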

Perhaps thinking about a piano might be helpful. If you play a major scale (Do, Re, Mi, Fa, So, La, Ti, Do), there is a full tone between the Do and the Re and between the Re and the Mi, while there is a half-tone, or semi-tone, between the Mi and the Fa. On the other hand, if you play every key in a row, including the black keys, the distance between each note and the next is a semi-tone. Whether you are at the bottom of the piano or at the top, the interval between a C note and a D note sounds like the same distance.

Each note in the 12 note scale goes up an equal amount, that is, an equal amount exponentially speaking. The jump between C3 and C#3 is about 7.78 Hertz and the very next jump between C#3 and D3 is about 8.24 Hertz. Although the Hertz jump is not equal between the notes, it is an equal jump in the exponent number and it sounds like an equal jump to our ears going up the scale.

For a more extreme example at the top of the piano, if you jump from C to D you may have jumped 256 Hz, while at the bottom of the piano, the interval between C and D measures only 8 Hz.
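The piano comparison can be checked with a few lines of Python. This is only a sketch: it assumes standard equal-tempered tuning with A4 = 440 Hz, and the helper name `note_to_hz` is invented for the example.

```python
# Semitone position of each natural note within an octave that starts on C.
SEMITONE_IN_OCTAVE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_to_hz(letter, octave, a4_hz=440.0):
    """Frequency of a natural note, assuming equal temperament and A4 = 440 Hz."""
    # Count semitones away from A4 (A is 9 semitones above the C of octave 4).
    semitones_from_a4 = SEMITONE_IN_OCTAVE[letter] + 12 * (octave - 4) - 9
    return a4_hz * 2 ** (semitones_from_a4 / 12)

# Compare the same C-to-D interval low and high on the keyboard.
for octave in (2, 7):
    c = note_to_hz("C", octave)
    d = note_to_hz("D", octave)
    print(f"C{octave} to D{octave}: {d - c:6.1f} Hz apart, ratio {d / c:.4f}")
```

Under these assumptions the gap in Hertz grows from about 8 (C2 to D2) to about 256 (C7 to D7), while the ratio between the two notes stays at about 1.12 in both octaves. That constant ratio is what the ear hears as the same interval.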

I utilize musical notation to describe pitch range. Middle C on the piano is called C4 (the beginning of the fourth octave on the piano). Men typically speak comfortably in a range from B2 to E3. That is to say, on the piano, B2 is the B in the second octave (13 notes, or 13 semi-tones, below middle C) and E3 is the E below middle C (8 semi-tones below middle C).
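For readers who would rather count by machine, here is a small Python sketch that turns a note name into its distance from middle C. The function name and the accepted spelling of note names (a letter, an optional # or b, then the octave number) are assumptions made for this example.

```python
import re

NATURAL_STEPS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"": 0, "#": 1, "b": -1}

def semitones_from_middle_c(note):
    """Signed semitone distance from C4; negative means below middle C."""
    match = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", note)
    if match is None:
        raise ValueError(f"unrecognized note name: {note!r}")
    letter, accidental, octave = match.groups()
    return NATURAL_STEPS[letter] + ACCIDENTALS[accidental] + 12 * (int(octave) - 4)

print(semitones_from_middle_c("B2"))  # -13
print(semitones_from_middle_c("E3"))  # -8
```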

Most females speak in a range between F3 and A3. These notes are found in the octave below middle C. There are some women who speak outside this range. One example is a woman who smokes cigarettes and has smoker's polyps; she might have a speaking pitch of E3 or D3, in the upper end of a typical male's range. Pitch is not the only component of the perception of male and female. Resonance plays into that perception as well. Thus a woman with a speaking pitch of E3 may still sound like a woman rather than a man despite being in the typical male speaking pitch range.

Because the perceived distance between adjacent notes on the piano is equal wherever they lie, it seems fair to say that a surgery that raises the voice from a C to a C# and a surgery that raises the voice from an F to an F# have had an equal effect: each raised the voice one semi-tone.
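A sketch of that comparison in Python, with the change expressed in semitones. The function name is invented, and the before-and-after frequencies are standard equal-tempered values for C4, C#4, F4 and F#4 (A4 = 440 Hz assumed) rather than patient data.

```python
import math

def semitone_change(before_hz, after_hz):
    """Size of a pitch change in semitones (positive means higher)."""
    return 12 * math.log2(after_hz / before_hz)

# (label, before Hz, after Hz): illustrative values, not measurements.
cases = [("C4 to C#4", 261.63, 277.18), ("F4 to F#4", 349.23, 369.99)]
for label, before, after in cases:
    change = semitone_change(before, after)
    print(f"{label}: {after - before:5.2f} Hz, {change:.2f} semitones")
```

The two changes differ by several Hertz, yet both come out at about one semitone, which is why reporting the result in semitones makes the two outcomes comparable.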

Speaking loudly

The patient re-reads the paragraph in their loudest possible voice. Some coaching is necessary, since some patients are hesitant to embarrass themselves: they know their voice is limited and may not sound good to others. Patients with non-organic disorders will have difficulty with this unexpected task. Underdoers may have a restrained quality.


I ask the patient to say "Hey!" as if they had an emergency and had to get someone’s attention. Disorders which cause a flaccidity of the vocal fold, such as paralysis or atrophy, will lack an edge to the sound or, if more severe, have the characteristics of a leaky valve. The harder a flaccid vocal fold is driven, the louder will be a luffing or fluttering sound. This sound will be apparent, especially at low pitch, since the additional energy imparted to the vocal fold combined with its flaccidity will cause the fold to buckle out rather than to draw in and start vibrating. At higher pitches this luffing may disappear as tensioning of the vocal folds from the cricothyroid muscle increases the ability of a flaccid vocal fold to recoil. 

  • The task may allow stiff vocal cords to actually produce sound, when quiet sounds were almost impossible.
  • Non-organic disorders will typically demonstrate an unusual pattern such as the voice getting softer instead of louder during a yell. 

Pitch range

Pitch range determinations almost always require coaching, since singers with voice problems are embarrassed when they cannot get their voice to perform properly, a failure that may even threaten their career. Other singers, who are shower or car radio singers only, are sometimes quite reluctant to sing in front of an audience, even an audience as small as one person: the examiner.

To confirm range we listen for characteristics of a vocal ceiling and vocal floor to determine overall pitch range. A muscular ceiling has a tight, strained quality. A mucosal ceiling has a breathy quality. A rapport ceiling has a completely normal sound. Ceilings tend to be the same for different vocal tasks so when several tasks reveal the same pitch, the upper range has been determined.

For the ceiling, the patient is asked to sing /i/ repeating the pitch of the examiner. Often we try working up the scale by intervals such as Do Mi Sol Do. Another method is to have the patient make a sound like a siren striving to reach the highest note possible. Above C5 we may switch to /a/ or if the patient is having difficulty we switch to /oo/. We verify the ceiling by asking the patient to sing the first phrase of Happy Birthday, again working up the scale until a ceiling is reached. A staccato task is also useful for confirming the pitch ceiling.

To determine the floor of the pitch range, the patient is asked to lower their pitch in a stepwise fashion. We ask them to keep going until we hear vocal fry (a popping sound like grease in a frying pan), they can’t move lower or the sound becomes breathy. Then they are asked to try to reach their lowest note by gliding down in pitch from a mid-range starting point. In some pathology the anchor pitch is often right at the bottom of the pitch range. The normal recoil position of the vocal folds should be 5 to 6 semitones higher.

Much pathology is revealed in the upper pitch range so this is an extremely important part of the vocal exam. The accuracy of the examiner is increased by having the patient perform multiple tasks and verifying that the extremes are the same in each task.

We utilize musical notation to describe pitch range. Middle C on the piano is called C4 (the beginning of the fourth octave on the piano). One advantage to using musical measures is the ease of communication with singers. See swelling tests below. It is possible to use Hertz as well.

Vegetative sounds

I ask the patient to cough and then to clear their throat. These tasks can be helpful in sorting out weakness of the glottis or psychogenic problems. For instance, if a patient could only whisper up to this point in the exam but can produce a robust cough, then the vocal cords have the capacity to come together and generate sound, and they were likely being held apart with muscle tension up to this point in the exam.

Swelling tests

Vocal fold swellings cause a very characteristic interruption of the voice as a patient sings a note higher and higher - the vocal ceiling effect. This was described by Robert Bastian as the vocal swelling tests (Bastian RW, Keidar A, Verdolini-Marston K. Simple vocal tasks for detecting vocal fold swelling. J Voice 1990;4:172-183).

Generally, we should be able to produce the extreme upper and lower notes of our range at both loud and soft volumes. When we cannot reach the same note softly that we can reach loudly, there is probably an impairment and the greater the difference in vocal pitch range, between loud and soft voicing, the more significant the problem.

One of the easiest ways to determine the upper soft range is to sing the first four words of the nearly universally known song, Happy Birthday. This is sung by the patient at the softest possible volume, in a pianissimo, boy soprano style. When singing the words “Happy Birthday to you”, between the word “day” and the word “to” is a melodic interval of a fourth. If no sound comes out on the word “to”, or if there is a significant onset delay to the start of vocal cord vibration on that word, then there is some mechanical change in the larynx in this interval of a fourth. This test can be repeated up or down a note, and the point where the voice cuts out denotes the soft cutoff point.

As pitch is elevated, a maximal vocal ceiling will be reached, that is, a pitch beyond which the patient cannot go any higher. A mucosal vocal ceiling will be characterized by onset delays (air escape prior to phonation) and by the ability to overcome the ceiling by driving the voice louder. A muscular vocal ceiling will be evident by a slight flatting or straining of the pitch. A central gap between the vocal cords, such as with vocal cord bowing, will produce gradually increasing amounts of air escape, and the onset delays will vary in pitch depending on how much supra-glottic squeeze is present.

Observing a patient's voice at high pitch is essential for finding disorders that cause stiffness of the vocal fold. Physiologically, pitch is increased by stretching the vocal fold, and this physiologic stiffness adds to any lesion-induced stiffness to stop or distort vocal fold vibrations. Thus, a lesion with minimal stiffness will show up as a loss of high, soft singing. As the lesional stiffness increases, the voice becomes pathologic at lower and lower pitches. Since the speaking voice is typically at the bottom end of a patient's pitch range, it takes a significant amount of stiffness to cause a disordered speaking voice.

Swelling tests can be taught to patients and can be essential in preventing recurrence of nodules and polyps. The patient is taught to use the tests as a monitoring device. The person would sit down at a piano and sing the first line of Happy Birthday at the softest possible volume. They would move up the scale until they experience an onset delay singing the word "to". That is likely the pitch where the vocal cords are touching because of the swelling.

If a person sings the tests daily, she can detect a swelling that is worsening and alter her behavior before the swelling becomes chronic and permanent. For example, let's say the person was out at a bar the night before speaking very loudly, and the swellings had increased in size from the vocal trauma. The next morning, with larger swellings, she would note that her onset delays occur at a lower than typical pitch. If she then rested her voice for several days, as the swellings went down in size, the note at which onset delays are experienced would gradually rise in pitch.


Part three is the laryngoscopy examination with a camera.