If I Can Diagnose Your Voice by Ear, So Can AI
I see a brief note on my desk: you woke up one day with a cold and a hoarse voice. The cold went away, but the voice problem stayed. With just that brief note — that you’re hoarse, and why you think you’re hoarse — I probably have at least a 10% chance of already knowing what’s wrong with your voice, and maybe a 50% chance that I can narrow the possibilities down to just a few.
I’ve been listening to voices for over 30 years. So when I hear you tell me directly, in your own words, why you’ve come to see me, something else happens. Even before I’ve finished interviewing you, I can often guess quite accurately what you have. Unbeknownst to you — and, frankly, to me — my brain is parsing the vowels in your speech, discarding the consonants and the words themselves, and recognizing how your vocal cords are vibrating. By the end of the interview, I suspect I have about a 30% chance of knowing the correct diagnosis with high certainty, and a 75% chance of narrowing it to a tight differential.
Then I put your voice through a few simple tests — soft sounds, loud sounds, high pitch, low pitch — and I’m probably making an accurate diagnosis about 75% of the time, with a narrow differential more than 90% of the time. So before I ever look, I know, with great accuracy, what your vocal problem is. I then use the endoscope to confirm what I heard — and every so often it teaches me something I’d missed, improving my accuracy the next time.
Here is my first prediction, and I don’t make it to alarm anyone. I’m certain there will soon be an app that can listen to a voice, process it, and return a reasonably accurate tentative diagnosis. For a good while yet, a physician will still need to look and confirm it — so this isn’t a threat to anyone’s practice. In time it may go further. But the point stands: the voice carries enough data that a machine will be able to read it.
And that is exactly what troubles me — because if a machine can extract that much from the voice, why are so many of us not even listening?
I’m fairly certain that more than 99% of primary care physicians would not know how to listen to a voice and reason toward a diagnosis. That’s understandable; it isn’t their field. Harder to excuse are the ENT doctors who never really listen. And most pointed of all — the specialists like me, the laryngologists and phoniatrists, who have every reason to listen and still don’t. We reach for the endoscope first and treat the voice as noise on the way to the picture.
The voice is not the preamble to the diagnosis. Very often, it is the diagnosis.
I work in the reverse order from most of my colleagues, listening first and looking second, and deliberately so. An app is going to prove the point soon enough. The only real question is whether we’d rather learn it from a machine or finally start listening ourselves.
