Lab teaches computers to understand human speech

Engineering Professor John Hansen is trying to improve the way computers understand and respond to vocal commands.

The present technological problem, Hansen said, is that voice-activated computer systems, the technology behind services such as automated collect calls, are often most needed when they are most likely to fail -- when the speaker is sick, under stress or surrounded by background noise. Hansen, an assistant professor of electrical engineering, is trying to change that by developing mathematical algorithms that analyze speech and adjust for factors such as stress, noise and accent.

For instance, Hansen said, the spoken and the yelled forms of the word "help" are very different, even when coming from the same speaker. The "h" and the "p" are very difficult to project, while the "el" part of the word is much more pronounced. When combined with the speaker's accent, one syllable can become an intricate puzzle for a computer programmed to identify words under normal conditions.
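
As a rough illustration of the kind of measurement involved -- hypothetical, not Hansen's own code, and with placeholder file names -- the sketch below computes the short-time energy of a recorded word. For a shouted "help," the energy concentrates in the vowel while the "h" and "p" nearly vanish.

```python
# Illustrative sketch: compare how energy is distributed across a word
# under normal and shouted speech, using short-time energy.
# File names are hypothetical placeholders; assumes mono 16-bit WAV input.
import wave
import numpy as np

def short_time_energy(path, frame_ms=25):
    """Return per-frame energy for a mono 16-bit WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        samples = np.frombuffer(wf.readframes(wf.getnframes()), dtype=np.int16)
    samples = samples.astype(np.float64) / 32768.0   # normalize to [-1, 1]
    frame = int(rate * frame_ms / 1000)              # samples per frame
    n = len(samples) // frame
    frames = samples[: n * frame].reshape(n, frame)
    return (frames ** 2).mean(axis=1)                # mean energy per frame

normal = short_time_energy("help_spoken.wav")
shouted = short_time_energy("help_shouted.wav")

# Under stress, energy concentrates in the vowel ("el"), so the peak
# frames stand out more sharply against the rest of the word.
print("peak/mean energy ratio, spoken: ", normal.max() / normal.mean())
print("peak/mean energy ratio, shouted:", shouted.max() / shouted.mean())
```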

Changes to the sound of a word can stem from a variety of factors. Beyond the distortion caused by raising one's voice, background noise can further complicate a sound, Hansen said. Some of the technology Hansen is developing is designed not only to cancel out background noise but also to determine which noise is most distracting to the human ear. Microphone feedback, for instance, is far more annoying than the steadier "white" noise of an air duct, Hansen said.
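
One textbook way to cancel steady background noise -- offered here only as a minimal sketch, not necessarily the method Hansen's lab uses -- is spectral subtraction: estimate the noise's average spectrum from a speech-free stretch of the recording, then subtract that estimate from each frame of the noisy signal.

```python
# A minimal sketch of spectral subtraction, a classic noise-cancellation
# technique: subtract an estimated noise spectrum from each signal frame.
import numpy as np

def spectral_subtraction(signal, noise_sample, frame=512):
    """Return a denoised copy of `signal` (1-D float array).

    `noise_sample` is a speech-free stretch used to estimate the noise.
    """
    # Average magnitude spectrum of the noise-only segment.
    n = len(noise_sample) // frame
    noise_frames = noise_sample[: n * frame].reshape(n, frame)
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    out = np.zeros_like(signal)
    for start in range(0, len(signal) - frame + 1, frame):
        spec = np.fft.rfft(signal[start:start + frame])
        mag = np.abs(spec) - noise_mag          # subtract the noise estimate
        mag = np.maximum(mag, 0.0)              # clamp negative magnitudes
        phase = np.angle(spec)                  # reuse the noisy phase
        out[start:start + frame] = np.fft.irfft(mag * np.exp(1j * phase), n=frame)
    return out
```

A production system would add overlapping windows and some perceptual weighting, so that the residual noise the ear finds most annoying is suppressed first.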

Another of Hansen's current projects is a system that can screen children for speech, hearing and language problems. Such a system would need to be particularly sensitive to regional or inner-city accents, to avoid labeling children who speak non-standard English as having a medical pathology.

To solve these problems, Hansen and his colleagues develop computer simulations, formulate algorithms and explore new signal-processing techniques to capture speech more accurately.

"We don't form a solution for the sake of just generating mathematical equations," he said.

Other advances could help computers discern accents and interpret accented speech for listeners. Such technology would be especially useful to organizations such as the United Nations and NATO, in which speakers from many different language backgrounds must converse in the common medium of American English.
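
In the simplest terms, accent identification can be framed as a scoring problem: fit a statistical model to each accent's acoustic features, then ask which model best explains a new utterance. The toy sketch below is a hypothetical illustration, far simpler than any real system; it uses a single diagonal Gaussian per accent, and the acoustic feature extraction is assumed to happen elsewhere.

```python
# Hypothetical, bare-bones accent identification: model each accent's
# features as a diagonal Gaussian and pick the best-scoring model.
import numpy as np

def fit_gaussian(features):
    """features: (n_frames, n_dims) array of acoustic features."""
    return features.mean(axis=0), features.var(axis=0) + 1e-6

def log_likelihood(features, model):
    """Total log-likelihood of the frames under a diagonal Gaussian."""
    mean, var = model
    return np.sum(-0.5 * (np.log(2 * np.pi * var) + (features - mean) ** 2 / var))

def identify_accent(features, models):
    """models: dict mapping accent name -> fitted Gaussian."""
    return max(models, key=lambda accent: log_likelihood(features, models[accent]))
```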

"The range of accents in American English is unbelievable," Hansen said. In contrast, he said, the Japanese have been able to excel in the field of speech recognition because of the homogeneity of Japanese pronunciation. Reliable dictation systems for English speakers are still a long way off, he said.

Hansen's primary equipment consists of workstations, microphones, stereo equipment and Styrofoam-padded walls that reduce extraneous noise. With these, he can test the accuracy of a cellular phone programmed to dial a preset number at the command "call home" across different environments, or work on a program to process commands given in a noisy airplane cockpit.
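
A test like the "call home" experiment can be approximated in code by mixing a clean recording with background noise at controlled signal-to-noise ratios and counting how often the recognizer still fires. The sketch below is hypothetical: `recognize` stands in for whatever recognition engine is under test, and the noise recording is assumed to be longer than the speech.

```python
# Hypothetical sketch: measure recognition accuracy for "call home"
# as background noise gets louder (i.e., as the SNR drops).
import numpy as np

rng = np.random.default_rng(0)

def mix_at_snr(speech, noise, snr_db):
    """Add a random slice of `noise` to `speech` at the requested SNR.

    Assumes `noise` is longer than `speech`; both are 1-D float arrays.
    """
    start = rng.integers(0, len(noise) - len(speech))
    seg = noise[start:start + len(speech)]
    scale = np.sqrt(np.mean(speech ** 2) / (np.mean(seg ** 2) * 10 ** (snr_db / 10)))
    return speech + scale * seg

def recognition_rate(recognize, speech, noise, snr_db, trials=100):
    """Fraction of noisy trials on which the recognizer still hears the command."""
    hits = sum(recognize(mix_at_snr(speech, noise, snr_db)) == "call home"
               for _ in range(trials))
    return hits / trials

# Example sweep, from a quiet office down to cockpit-level noise:
# for snr in (30, 20, 10, 5, 0):
#     print(snr, "dB:", recognition_rate(my_recognizer, speech, cockpit_noise, snr))
```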

Hansen's research takes place in his office and in a few rooms in the basement.
