Hierarchical search for large-vocabulary conversational speech recognition: working toward a solution to the decoding problem
Deshmukh, N.; Ganapathiraju, A.; Picone, J.
Signal Processing Magazine, IEEE
Volume 16, Issue 5, Sept. 1999, Page(s): 84-107
Abstract:

Large vocabulary continuous speech recognition (LVCSR) systems have advanced significantly due to the ability to handle extremely large problem spaces in fairly small amounts of memory. The article introduces the search problem, discusses in detail a typical implementation of a search engine, and demonstrates the efficacy of this approach on a range of problems. The approach presented is scalable across a wide range of applications. It is designed to address research needs, where a premium is placed on the flexibility of the system architecture, and the needs of application prototypes, which require near-real-time speed without a great sacrifice in word error rate (WER). One major area of focus for researchers is the development of real-time systems. With only minor degradations in performance (typically, no more than a 25% increase in WER), the systems described in this article can be transformed into systems that operate at 10×RT or less. There are four active areas of research related to this problem. First, more intelligent pruning algorithms that prune the search space more heavily are required. Look-ahead and N-best strategies at all levels of the system are key to achieving such large reductions in the search space. Second, multi-pass systems that perform a quick search using a simple system and then rescore only the N-best resulting hypotheses using better models are very popular for real-time implementation. Third, since much of the computation in these systems is devoted to acoustic model processing, fast-matching strategies within the acoustic model are important. Finally, since Gaussian evaluation at each state in the system is a major consumer of CPU time, vector quantization-like approaches that enable one to compute only a small number of Gaussians per frame have proven successful. In some sense, the Viterbi (1967)-based system presented represents only one path through this continuum of recognition search strategies.
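
The abstract names Viterbi search and pruning as the core of the decoder. As an illustration only, the following is a minimal sketch of time-synchronous Viterbi decoding with beam pruning over a toy hidden Markov model. It is not the authors' implementation: the function name `viterbi_beam`, the `beam` parameter, the two states `s1` and `s2`, the symbols `a`, `b`, `c`, and all probabilities are assumptions made up for the example. A real LVCSR decoder searches a far larger hierarchical network of words, phones, and states.

```python
import math


def viterbi_beam(obs, log_init, log_trans, log_emit, beam=math.inf):
    """Viterbi search in the log domain with beam pruning (toy sketch).

    After each frame, hypotheses scoring more than `beam` below the best
    active hypothesis are dropped. Returns (log score of the best surviving
    path, state sequence, number of active states kept per frame).
    """
    # Frame 0: every state starts as an active hypothesis.
    active = {s: log_init[s] + log_emit[s][obs[0]] for s in log_init}
    backptrs = []       # one {state: best predecessor} dict per frame t >= 1
    active_counts = []  # hypotheses that survive pruning at each frame

    for t, symbol in enumerate(obs):
        if t > 0:
            extended, bp = {}, {}
            # Extend only the hypotheses that survived the previous pruning.
            for prev, score in active.items():
                for nxt, log_p in log_trans[prev].items():
                    cand = score + log_p + log_emit[nxt][symbol]
                    if cand > extended.get(nxt, -math.inf):
                        extended[nxt] = cand
                        bp[nxt] = prev
            active = extended
            backptrs.append(bp)
        # Beam pruning relative to the best hypothesis in this frame.
        best = max(active.values())
        active = {s: v for s, v in active.items() if v >= best - beam}
        active_counts.append(len(active))

    # Trace back from the best surviving end state.
    last = max(active, key=active.get)
    path = [last]
    for bp in reversed(backptrs):
        path.append(bp[path[-1]])
    path.reverse()
    return active[last], path, active_counts


def _logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Hypothetical two-state model; every number here is invented for the demo.
log_init = _logs({"s1": 0.6, "s2": 0.4})
log_trans = {"s1": _logs({"s1": 0.7, "s2": 0.3}),
             "s2": _logs({"s1": 0.4, "s2": 0.6})}
log_emit = {"s1": _logs({"a": 0.5, "b": 0.4, "c": 0.1}),
            "s2": _logs({"a": 0.1, "b": 0.3, "c": 0.6})}
obs = ["a", "b", "c"]

full_score, full_path, full_counts = viterbi_beam(
    obs, log_init, log_trans, log_emit)
tight_score, tight_path, tight_counts = viterbi_beam(
    obs, log_init, log_trans, log_emit, beam=math.log(2.0))
```

With an infinite beam the function reduces to exact Viterbi decoding. A finite beam trades the guarantee of finding the best path for fewer active hypotheses per frame, which is the speed-versus-WER trade-off the abstract describes. On this toy input the tight beam happens to keep the same best path while halving the number of active states.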