Search Engines Break the Sound Barrier
Computerworld
Do your telemarketers consistently make legally required disclaimers when selling securities? If your firm records its telemarketing calls, IT could set up audio mining software to let management search audio file archives to quickly find the answer.
Emerging audio mining tools, also called audio indexing or audio search software, offer speech processing and search technologies in a single package. The speech engine creates an index that includes a time and date stamp for each spoken word or phoneme in an audio or video file. The search engine then uses that index to allow rapid identification and playback of specific passages. The software may also apply metatags that identify the speakers or the subject of a given passage.
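To make the indexing idea concrete, here is a minimal sketch of a time-stamped word index and a phrase search over it. It assumes the speech engine has already produced (word, start time) pairs; the function names, the sample transcript and the two-second gap limit are all hypothetical and do not reflect any vendor's actual format or API.

```python
from collections import defaultdict


def build_index(words):
    """Map each lowercased word to the times (in seconds) at which it was spoken."""
    index = defaultdict(list)
    for word, start in words:
        index[word.lower()].append(start)
    return index


def find_phrase(index, phrase, max_gap=2.0):
    """Return start times where the phrase's words occur in order,
    each within max_gap seconds of the previous word (an assumed limit)."""
    terms = phrase.lower().split()
    if not terms:
        return []
    hits = []
    for start in index.get(terms[0], []):
        t = start
        for term in terms[1:]:
            # Candidate occurrences of the next word shortly after the current one.
            following = [s for s in index.get(term, []) if t < s <= t + max_gap]
            if not following:
                break
            t = min(following)
        else:
            hits.append(start)
    return hits


# Hypothetical recognizer output: (word, start time in seconds).
transcript = [
    ("past", 12.0), ("performance", 12.4), ("is", 13.1), ("no", 13.3),
    ("guarantee", 13.6), ("of", 14.2), ("future", 14.4), ("results", 14.9),
]
index = build_index(transcript)
```

A search such as find_phrase(index, "no guarantee") returns the offsets at which playback could begin, which is the kind of lookup a manager checking for a required disclaimer would run.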
The speech-processing accuracy of speech-to-text engines, traditionally used to index high-quality broadcast audio, has advanced to the point where vendors are introducing new packages for indexing more informal conversations, ranging from corporate meetings to training videos and even help desk telephone conversations.
"[The technology] seems to have passed the threshold of usability," says William Meisel, president of TMA Associates, a speech-recognition consulting and market research firm in Tarzana, Calif.
Unlike speech-to-text packages, which can be trained for individual users, audio indexing products are speaker-independent. They also rely on large, language-specific vocabulary dictionaries, as well as domain models that may be optimized for the type of conversation (e.g., telephone) or industry (e.g., health care). While the newest products can process audio in real time or faster with an accuracy sufficient for searching, the output text isn't a readable transcript, cautions Jackie Fenn, an analyst at Stamford, Conn.-based Gartner Inc. And as new companies, products and terms come into use, users must update their systems regularly or face what Francis Kubala, division scientist at Cambridge, Mass.-based BBN Technologies, calls "the out-of-vocabulary problem."
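The out-of-vocabulary problem can be illustrated with a simple check of new terms against the recognizer's dictionary. This is only a sketch: the function name and the sample word lists are hypothetical, and real products maintain their vocabularies in their own ways.

```python
def out_of_vocabulary(terms, vocabulary):
    """Return the terms (lowercased, sorted) that the vocabulary does not contain."""
    known = {word.lower() for word in vocabulary}
    return sorted({term.lower() for term in terms} - known)


# Hypothetical data: the engine's dictionary and terms taken from recent product lists.
vocabulary = ["audio", "index", "search", "video"]
new_terms = ["Audio", "VideoLogger", "Fast-Talk"]
missing = out_of_vocabulary(new_terms, vocabulary)
```

Any term that appears in the result is one the engine cannot recognize until the dictionary is updated.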
Audio mining's most compelling fit may be for applications where a searchable index can replace the need for transcription. In contrast, data mining of audio content for marketing purposes is "a little bit of an evangelistic sell" at this point, Meisel says.
"The call center is a little tougher, because you may or may not discover something [with audio mining]," explains Fenn.
The technology's greatest value may be derived from embedding it in other applications. San Mateo, Calif.-based Virage Inc., for example, offers both Atlanta-based Fast-Talk Communications Inc.'s Fast-Talk and BBN's Audio Indexer as plug-ins to its VideoLogger video indexing system. More advanced applications could eventually integrate call center logs with sales activity and other customer relationship management data, analysts say.
But audio mining hasn't worked in every case. Ted Ryan, manager of collections development at Atlanta-based The Coca-Cola Co., says he wanted to use it to index television commercials last year, but "the voice-overs clashed with the music." With the software achieving an accuracy rate of just 15%, he turned to manual transcription.
Coca-Cola also tried audio indexing for meetings. "Our chief executive [at the time] was Cuban. When we ran it with executive speeches, it came up with gobbledygook," Ryan says.
Nonetheless, he says he's interested in testing the latest tools to index radio advertisements. And accuracy continues to improve, says Kubala, adding that he expects the word error rates for nonbroadcast audio to drop dramatically during the next three years.
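Word error rate is a standard speech-recognition metric: the number of word substitutions, deletions and insertions needed to turn the recognizer's output into a reference transcript, divided by the number of words in the reference. The sketch below computes it with a word-level edit distance. The function name and sample sentences are illustrative, and the article does not say how the vendors or Coca-Cola measured accuracy.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two texts, divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of four.
rate = word_error_rate("the call was recorded", "the call is recorded")
```

Because insertions count as errors, the rate can exceed 1.0 when the recognizer produces far more words than were actually spoken.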