Microphone Array Allows Researchers to Localize Sound Sources

11/2/2012 Katie Carr, ADSC

University of Illinois at Urbana-Champaign Electrical and Computer Engineering Professor Doug Jones leads a team of researchers working to create a system that provides a full 4D audio-visual remote reality, using three spatial dimensions plus time.


The human ear is an intricate organ that allows people to hear, to identify where sounds originate, and to keep a sense of where they are in the world.

University of Illinois at Urbana-Champaign Electrical and Computer Engineering Professor Doug Jones leads a team of researchers working to digitally recreate this natural phenomenon by developing low-cost, realistic, real-time audio-visual telepresence at the Advanced Digital Sciences Center in Singapore.

Jones’ team includes ADSC researcher Shengkui Zhao and ADSC software engineer Saima Ahmed. Their research project, Realistic Audio Telepresencing for Entertainment and Meetings (RATEM), aims to create a system that provides a full 4D audio-visual remote reality, using three spatial dimensions plus time. A system like this could revolutionize teleconferencing, augmented reality and gaming by providing true 4D sound and video perception. It would also make it possible to create a telepresence in which participants are placed in a virtual setting but interact as though they were physically present together.

"You would be able to talk to the person on your right or left and hear them and look at them in the right place and carry on simultaneous conversations," Jones said of these virtual meetings.

Jones’ research group is collaborating with University of Illinois at Urbana-Champaign Electrical and Computer Engineering Associate Professor Minh Do's Immersive Telepresence for Entertainment and Meetings (ITEM) research group at ADSC. While Do focuses on the vision portion of the research, Jones is working to recreate sounds accurately and from the correct direction, leading to a 4D audio experience.

According to Jones, it is easy for humans to tell where a sound is coming from, but it is much more difficult for a computer, which is what a teleconferencing system must rely on. The human ear can locate sounds from above and below, in front and behind, and to the left and right. Using the brain, the external ear and the internal ear, humans are able to estimate the location of a sound. Replicating this on a computer is difficult, however, because scientists don’t fully understand the neural processing the brain uses to make that judgment.

To solve this problem, most scientists are using microphone array signal processing to localize sound sources. Unfortunately, current methods require large arrays of many microphones and expensive, special-purpose computers to do so in real time.
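As a rough illustration of the conventional approach Jones describes (not ADSC's algorithm), the sketch below estimates the time difference of arrival between two microphones with the widely used GCC-PHAT cross-correlation method and converts it to a bearing. The microphone spacing and sample rate are assumed values chosen only for the example.

```python
# Minimal sketch of conventional time-delay-based localization (GCC-PHAT),
# the kind of processing large microphone arrays typically use.
# Illustration only, not ADSC's algorithm; spacing and sample rate are assumed.
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
MIC_SPACING = 0.15       # metres between the two microphones (assumed)
SAMPLE_RATE = 16000      # Hz (assumed)

def gcc_phat_delay(x, y, fs=SAMPLE_RATE):
    """Estimate the delay (in seconds) of signal y relative to signal x."""
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep only phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(fs)

def bearing_from_delay(delay, spacing=MIC_SPACING):
    """Convert a time delay into a bearing (0 = broadside, +/-90 = endfire)."""
    sin_theta = np.clip(delay * SPEED_OF_SOUND / spacing, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))
```

With a pair of microphones 15 centimeters apart, the sign and size of the estimated delay indicate which microphone the source is closer to and at what angle; large rooms typically need many such pairs, which is what drives the cost Jones mentions.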

ADSC's microphone array

ADSC's actual microphone array, which measures just a few millimeters. University of Illinois at Urbana-Champaign Electrical and Computer Engineering Professor Doug Jones developed the microphone array at Illinois in 2005.

“With those kinds of installations, you can do what we’re talking about,” Jones said. “But it costs several hundred thousand dollars per room. We want to do it at a reasonable price.”

Jones and Zhao aim to solve this fundamental research challenge by perfecting 3D audio direction-finding, as well as directional voice acquisition and binaural 3D audio reconstruction.

The pair recently developed a real-time 3D direction-finding algorithm for a microphone array. Using the array and an ordinary laptop, the program can track the location of a speaker’s voice. According to Jones, the new hardware and software are smaller, use less power, are less expensive and are more accurate than today’s conventional approaches, which rely on much larger microphone arrays.

In 2005, Jones developed the smallest microphone array in the world at the University of Illinois. The array has four microphones, each picking up sound from a different direction. He originally developed the array for use in hearing aids, but it is now being further developed at ADSC to use the relative strength of a sound at the different microphones to determine the direction it came from.

When a human hears a sound, the ears split it into different frequency bands to determine where the sound in each band is coming from. In addition, the ears are about 15 centimeters apart, so a sound reaches one ear slightly before the other; this time delay also helps the brain determine where the sound is coming from. Jones’ biologically inspired microphone array works differently: each microphone measures only a few millimeters and sits within a few millimeters of the others, so the microphones are too close together for time delays to be useful. Instead, the array uses the directional pattern of the sound to determine the source direction. Jones and his team are attempting to identify the sound source with just a few microphones, much as the brain does.
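The article does not spell out the RATEM algorithm, but the general idea of direction finding from directional patterns rather than time delays can be sketched as follows: if each microphone has a known pickup pattern, the direction whose predicted relative levels best match the measured per-microphone energies is chosen. Everything below (the cardioid pattern, the four orientations, the example energies) is a hypothetical illustration, not ADSC's method.

```python
# Hypothetical sketch of direction finding from relative signal strength
# rather than time delay, in the spirit of closely spaced directional
# microphones. Gain patterns and signals are invented for illustration.
import numpy as np

CANDIDATE_ANGLES = np.arange(0, 360, 5)          # candidate source bearings (degrees)
MIC_ORIENTATIONS = np.array([0, 90, 180, 270])   # four mics facing different ways (assumed)

def cardioid_gain(angle_deg, mic_orientation_deg):
    """Assumed cardioid pickup pattern: strongest on-axis, weakest behind."""
    diff = np.radians(angle_deg - mic_orientation_deg)
    return 0.5 * (1.0 + np.cos(diff))

def estimate_bearing(mic_energies):
    """Pick the candidate angle whose predicted energy ratios best match
    the measured (normalized) per-microphone energies."""
    measured = mic_energies / np.sum(mic_energies)
    best_angle, best_err = None, np.inf
    for angle in CANDIDATE_ANGLES:
        predicted = np.array([cardioid_gain(angle, o) for o in MIC_ORIENTATIONS])
        predicted /= np.sum(predicted)
        err = np.sum((measured - predicted) ** 2)
        if err < best_err:
            best_angle, best_err = angle, err
    return best_angle

# Example: a source near 90 degrees excites the 90-degree-facing mic most strongly.
print(estimate_bearing(np.array([0.5, 1.0, 0.5, 0.05])))
```

In practice such matching would be done separately in each frequency band, echoing the way the ear splits sound into bands before localizing it.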

Demonstration of sound direction finding

Researcher Shengkui Zhao and ADSC Director Marianne Winslett test ADSC's microphone array to track where a speaker's voice is originating.

Jones and Zhao aim to make their 4D audio technology practical, small and affordable, in hopes that it will be usable in many different areas.

While Jones began developing theoretical techniques at Illinois, he said at that time there wasn’t interest in taking the next step toward practical research. By bringing the project to ADSC, he has been able to work on building better algorithms and optimizing implementations.

“I don’t think we would’ve been able to get where we are today if we weren’t doing the research in Singapore,” Jones said. “It has been great to work with the computing and video researchers at ADSC to try to integrate our research with other modalities, like vision, to combine sound with images and video.” 

Now that Jones and Zhao have succeeded in tracking the direction of sounds with a microphone array, their research will shift to tackling the cocktail party effect, that is, separating voices and sounds from background noise, and to realistically reconstructing sounds.

Jones and his team had two research papers accepted to the 2012 Design, Automation and Test in Europe Conference (DATE) and the 2012 Conference on Industrial Electronics and Applications (ICIEA), and have submitted three other papers to various international conferences. Additionally, Zhao is close to finishing development of a real-time 3D sound reconstruction system, and the team is beginning to research other possible applications for the findings, such as defense, virtual sports and possibly robotics.

“I think this research definitely has market potential,” Jones said. “My dream would be that every video camera or cell phone would be able to record 3D sound and listen to the recording in full 3D. Microphones are cheap and arrays are small, so this technology could be on even the smallest cell phone or video camera. Technology is almost at that point.” 

The Advanced Digital Sciences Center is a University of Illinois at Urbana-Champaign research center based in Singapore, led by Illinois Computer Science and Electrical and Computer Engineering faculty. ADSC focuses on breakthrough innovations in information technology.



This story was published November 2, 2012.