IBM speech technology crosses language barriers

07.08.2006
IBM is now testing speech technology that can verify speakers' identities over the telephone, even if they are not speaking their native languages.

On the eve of the SpeechTek trade show, which begins today and runs through Thursday in New York City, Brian Garr, IBM's director for enterprise speech, said the company is likely to release products with the speaker verification feature "within the next 12 months."

Speaker verification, also known as voice-print recognition, can be used by companies to authenticate customers over the telephone and serve as a spoken equivalent to a password. If and when laws adapt, it could also eventually serve as a legal proxy for a written signature.

Before a speaker's identity could be verified, that person would first have to provide a voice sample.

"You could say very little. It could be just a single sentence, like 'My name is my password,'" Garr said. "You could register in English, and then later verify in French or German. As far as I know, we're the only company that can do that."

IBM does not yet boast such technology in its chief voice product, WebSphere Voice Server. But Garr said that the product, used by 87,000 customers as a way to enhance customer service, has important back-end technology that enables companies to integrate it with other WebSphere middleware products such as its WebSphere Application Server.

Garr said IBM recently won German voice systems integrator dtms Solutions as a customer. dtms had previously used technology from market leader Nuance Communications Inc. Nuance merged with ScanSoft Inc. in October. At the time, the combined firm held about three-quarters of the speech recognition market, according to Gartner Group.

Garr declined to reveal Voice Server's exact sales, saying only that it has sold "thousands of ports." He said that the speech recognition market has grown more slowly than expected, but that adoption of Voice-over-Internet-Protocol (VOIP) as well as Service-Oriented Architectures (SOAs) should help spur demand.

One area where IBM appears to be lagging is support for the VoiceXML 2.0 format, a World Wide Web Consortium (W3C) standard supported by several companies, including Nuance, Avaya Inc. and Nortel.

According to Ken Rehor, a Palo Alto, Calif., telecommunications consultant who also chairs the VoiceXML Forum, the group has now certified a total of 18 platforms and services as compliant with VoiceXML 2.0.

Microsoft, a growing player in speech technology, is readying its own Speech Server 2007, which will support VoiceXML 2.0.

But Microsoft suffered an embarrassing moment late last month when, during a demonstration in front of financial analysts of the upcoming Windows Vista operating system's built-in voice recognition capabilities, the software continually spelled out different words than were being spoken.

Asked how IBM's voice products differ from Microsoft's, Garr said, "Primarily because we provide an entire application stack through WebSphere. Speech technology is what it is. We're all pretty good at it at this point."