TY - BOOK
AU - Hori, Takaaki
AU - Nakamura, Atsushi
TI - Speech recognition algorithms using weighted finite-state transducers
T2 - Synthesis lectures on speech and audio processing
SN - 9781608454730
PY - 2013///
CY - San Rafael, Calif.
PB - Morgan & Claypool
KW - Speech processing systems
KW - Automatic speech recognition
KW - Transducers
N1 - Includes bibliographical references (p. 137-148)
N1 - Preface -- 1. Introduction -- 1.1 Speech recognition and computation -- 1.2 Why WFST? -- 1.3 Purpose of this book -- 1.4 Book organization -- 2. Brief overview of speech recognition -- 2.1 Statistical framework of speech recognition -- 2.2 Speech analysis -- 2.3 Acoustic model -- 2.3.1 Hidden Markov model -- 2.3.2 Computation of acoustic likelihood -- 2.3.3 Output probability distribution -- 2.4 Subword models and pronunciation lexicon -- 2.5 Context-dependent phone models -- 2.6 Language model -- 2.6.1 Finite-state grammar -- 2.6.2 N-gram model -- 2.6.3 Back-off smoothing -- 2.7 Decoder -- 2.7.1 Viterbi algorithm for continuous speech recognition -- 2.7.2 Time-synchronous Viterbi beam search -- 2.7.3 Practical techniques for LVCSR -- 2.7.4 Context-dependent phone search network -- 2.7.5 Lattice generation and N-best search -- 3. Introduction to weighted finite-state transducers -- 3.1 Finite automata -- 3.2 Basic properties of finite automata -- 3.3 Semiring -- 3.4 Basic operations -- 3.5 Transducer composition -- 3.6 Optimization -- 3.6.1 Determinization -- 3.6.2 Weight pushing -- 3.6.3 Minimization -- 3.7 Epsilon removal -- 4. Speech recognition by weighted finite-state transducers -- 4.1 Overview of WFST-based speech recognition -- 4.2 Construction of component WFSTs -- 4.2.1 Acoustic models -- 4.2.2 Phone context dependency -- 4.2.3 Pronunciation lexicon -- 4.2.4 Language models -- 4.3 Composition and optimization -- 4.4 Decoding algorithm using a single WFST -- 4.5 Decoding performance -- 5. Dynamic decoders with on-the-fly WFST operations -- 5.1 Problems in the native WFST approach -- 5.2 On-the-fly composition and optimization -- 5.3 Known problems of on-the-fly composition approach -- 5.4 Look-ahead composition -- 5.4.1 How to obtain prospective output labels -- 5.4.2 Basic principle of look-ahead composition -- 5.4.3 Realization of look-ahead composition using a filter transducer -- 5.4.4 Look-ahead composition with weight pushing -- 5.4.5 Generalized composition -- 5.4.6 Interval representation of label sets -- 5.5 On-the-fly rescoring approach -- 5.5.1 Construction of component WFSTs for on-the-fly rescoring -- 5.5.2 Concept -- 5.5.3 Algorithm -- 5.5.4 Approximation in decoding -- 5.5.5 Comparison with look-ahead composition -- 6. Summary and perspective -- 6.1 Realization of advanced speech recognition techniques using WFSTs -- 6.1.1 WFSTs for extended language models -- 6.1.2 Dynamic grammars based on WFSTs -- 6.1.3 Wide-context-dependent HMMs -- 6.1.4 Extension of WFSTs for multi-modal inputs -- 6.1.5 Use of WFSTs for learning -- 6.2 Integration of speech and language processing -- 6.3 Other speech applications using WFSTs -- 6.4 Conclusion -- Bibliography -- Authors' biographies
UR - http://dx.doi.org/10.2200/S00462ED1V01Y201212SAP010
UR - http://www.morganclaypool.com/doi/abs/10.2200/S00462ED1V01Y201212SAP010
UR - http://oclc-marc.ebrary.com/Doc?id=10649981
UR - http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=503575
UR - http://site.ebrary.com/id/10649981
UR - http://proquest.safaribooksonline.com/?fpi=9781608454730
ER -