Spatial Sound

via Daily Prompt: Sound

The famous painter Vincent Van Gogh is supposed to have cut off his own ear and presented it to his love. Cutting off a prisoner’s ear is a barbaric punishment that many tyrants throughout history are said to have inflicted. But just what is the effect of cutting off the ear? After all, the ear canal is still there, and so is the eardrum, so presumably the person would still be able to hear.


Vincent Van Gogh

It turns out that the external ear (called the pinna) is important in determining the direction of sound. This is of great interest to companies like Microsoft working on virtual and augmented reality.


The Pinna

To understand this, let’s start with some basics of human hearing. Humans typically hear sounds in the range of 20Hz to 20,000Hz, but hearing is generally considered most sensitive between 1000 and 4000Hz.

There are several effects that go into our understanding of where a sound is coming from. The two simplest ones are differences in time and loudness. Suppose someone is talking to your right: the sound of their voice will reach your right ear before it reaches your left ear. This tiny time difference is called the interaural time delay (ITD) and is used by the brain to figure out where a sound is coming from. Another factor is that the sound in your right ear will be slightly louder than the sound in your left ear, by an amount that depends on the shape of your head. This is called the interaural level difference (ILD), a difference in loudness between the ears, and again it provides a cue to the brain about where a sound is coming from.
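To get a feel for how small the ITD is, here is a rough sketch in Python. It assumes the classic Woodworth spherical-head approximation, ITD = (r / c) * (theta + sin(theta)), which is not something this post derives. The function name, the head radius of 8.75 cm and the speed of sound of 343 m/s are all illustrative assumptions.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound_mps=343.0):
    """Approximate ITD in seconds for a distant source (spherical-head model).

    azimuth_deg: 0 means straight ahead, 90 means directly to one side.
    The formula is only meant for angles between 0 and 90 degrees.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    # Extra path length around a sphere is r * (theta + sin(theta)).
    return (head_radius_m / speed_of_sound_mps) * (theta + math.sin(theta))

# Print the delay for a few directions (values depend on the assumed head size).
for az in (0, 30, 60, 90):
    print(f"{az:3d} deg -> {woodworth_itd(az) * 1e6:6.1f} microseconds")
```

Even for a sound directly to one side, this model gives a delay of well under a millisecond, which is the scale of difference the brain is working with.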


The path of sound to both ears is different

BBC Radio did a series of horror dramas in what it called 3D sound for listeners with headphones. This basically used such effects to make the listener feel as if the sound effects and speakers were spatially located around them. So the sound of footsteps would be louder in one earpiece, and arrive there sooner, than in the other, and one speaker would be louder in one ear while another would be louder in the other ear.
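The basic trick of delaying and quietening one channel can be sketched in a few lines of Python. This is only an illustration of the idea, not how the BBC produced its dramas: the function name `render_stereo` and the example values (48 kHz sample rate, 0.5 ms delay, 6 dB level difference) are assumptions made up for the sketch.

```python
def render_stereo(mono, sample_rate_hz, itd_s, ild_db, source_on_right=True):
    """Turn a mono signal into (left, right) using only an ITD and an ILD.

    The ear farther from the source gets the signal later (by itd_s seconds)
    and quieter (by ild_db decibels). Both channels have the same length.
    """
    delay = round(itd_s * sample_rate_hz)      # delay in whole samples
    gain = 10 ** (-ild_db / 20)                # decibels to amplitude factor
    near = list(mono) + [0.0] * delay          # near ear: unchanged, padded at the end
    far = [0.0] * delay + [s * gain for s in mono]  # far ear: late and quieter
    return (far, near) if source_on_right else (near, far)

# A short click placed to the listener's right.
click = [1.0, 0.5, 0.25]
left, right = render_stereo(click, sample_rate_hz=48000, itd_s=0.0005, ild_db=6.0)
print(len(left), len(right))
```

Played over headphones, a signal processed like this tends to be heard as shifted toward the louder, earlier side.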

But there are some limitations here. For instance, neither ITD nor ILD can tell us whether a sound is coming from the front or the back. In fact, there is something called the “cone of confusion”: a cone-shaped set of positions to one side of the head that all produce nearly the same ITD and ILD, so from these two cues alone we cannot distinguish a sound at one point on the cone from a sound at any other point on it.
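The front-back ambiguity can be seen with a very simplified model. The sketch below assumes the two ears are just two points 17.5 cm apart with no head in between, and that the source is far away, so the ITD is d * sin(azimuth) / c. The function name and the numbers are illustrative assumptions.

```python
import math

def two_point_itd(azimuth_deg, ear_spacing_m=0.175, speed_of_sound_mps=343.0):
    """ITD in seconds for two point-like ears and a distant source.

    azimuth_deg is measured clockwise from straight ahead, so 30 is
    front-right and 150 is behind and to the right.
    """
    return ear_spacing_m * math.sin(math.radians(azimuth_deg)) / speed_of_sound_mps

front_right = two_point_itd(30)
back_right = two_point_itd(150)

# Both directions give the same delay, so the ITD alone cannot separate them.
print(math.isclose(front_right, back_right))
```

In this model, any front direction and its mirror image behind the listener are indistinguishable, which is exactly the gap that the reflections described next help to fill.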

It turns out the shape of our head and shoulders determines to a large extent whether we perceive a sound as coming from the front or the back. Lower frequencies are reflected from different parts of the head, nose and upper torso before they enter the ear. Our brain recognizes one pattern of reflections as meaning the sound is coming from the front. Sounds from the back produce a different set of reflections.

Audio researchers describe these reflections with the HRTF, or Head Related Transfer Function. The HRTF is also important in perceiving elevation. The way sound reaches your ear from a bird sitting up in a tree is different from the way it arrives from a bird sitting on the ground. Because the angle at which the sound arrives at your body is different, it reflects off different parts of your body, and so the final sound reaching your ears is different. This allows you to tell where the bird is even if you are blindfolded.
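In practice, an HRTF is often stored as a pair of short impulse responses (one per ear) for each direction, and a sound is placed in that direction by convolving it with the pair. Here is a minimal sketch of that step in plain Python. The impulse responses below are made-up toy numbers, not measured data, and the function names are assumptions for illustration.

```python
def convolve(signal, impulse_response):
    """Direct (slow but simple) convolution of two sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def apply_hrir(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left-ear and a right-ear impulse response."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy (hypothetical) impulse responses for a source on the right:
# the right ear gets the sound first and louder, the left ear later and weaker.
HRIR_LEFT = [0.0, 0.0, 0.6, 0.2]
HRIR_RIGHT = [1.0, 0.3, 0.1, 0.0]

left, right = apply_hrir([1.0, 0.0, -1.0], HRIR_LEFT, HRIR_RIGHT)
print(left)
print(right)
```

A real system would use measured or predicted impulse responses with hundreds of samples per ear and a much faster convolution, but the operation is the same.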

To study these effects, audio researchers place microphones in the ears of special (very expensive) dummies.

Dummies used for 3D audio research

The pinna is also important in such direction sensitivity, especially in the high frequency range. Sounds coming from different directions are reflected differently by the ear and the brain learns to associate these reflections with certain directions.

But since everyone’s head and ears are shaped differently, everyone hears the same sound differently. This is why 3D sound has to be tailored to each listener. If you listen to sound customized for someone else, it will seem odd.

At Microsoft, they have developed a depth camera that measures the key features of a face (distance between the eyes, height and width of the head, etc.) and can predict the corresponding HRTF. They can then predict how a sound coming from a particular direction would sound to that person. This is being used in their next generation of augmented reality products.

