
Speech recognition algorithms using weighted finite-state transducers / Takaaki Hori and Atsushi Nakamura.

By: Hori, Takaaki | Nakamura, Atsushi
Series: Synthesis lectures on speech and audio processing ; # 10
Publisher: San Rafael, Calif. : Morgan & Claypool, 2013
Copyright date: ©2013
Description: 1 online resource (xii, 150 p.) : illustrations, digital file ; 24 cm
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9781608454730
Contents:
Preface -- 1. Introduction -- 1.1 Speech recognition and computation -- 1.2 Why WFST? -- 1.3 Purpose of this book -- 1.4 Book organization --
2. Brief overview of speech recognition -- 2.1 Statistical framework of speech recognition -- 2.2 Speech analysis -- 2.3 Acoustic model -- 2.3.1 Hidden Markov model -- 2.3.2 Computation of acoustic likelihood -- 2.3.3 Output probability distribution -- 2.4 Subword models and pronunciation lexicon -- 2.5 Context-dependent phone models -- 2.6 Language model -- 2.6.1 Finite-state grammar -- 2.6.2 N-gram model -- 2.6.3 Back-off smoothing -- 2.7 Decoder -- 2.7.1 Viterbi algorithm for continuous speech recognition -- 2.7.2 Time-synchronous Viterbi beam search -- 2.7.3 Practical techniques for LVCSR -- 2.7.4 Context-dependent phone search network -- 2.7.5 Lattice generation and N-best search --
3. Introduction to weighted finite-state transducers -- 3.1 Finite automata -- 3.2 Basic properties of finite automata -- 3.3 Semiring -- 3.4 Basic operations -- 3.5 Transducer composition -- 3.6 Optimization -- 3.6.1 Determinization -- 3.6.2 Weight pushing -- 3.6.3 Minimization -- 3.7 Epsilon removal --
4. Speech recognition by weighted finite-state transducers -- 4.1 Overview of WFST-based speech recognition -- 4.2 Construction of component WFSTs -- 4.2.1 Acoustic models -- 4.2.2 Phone context dependency -- 4.2.3 Pronunciation lexicon -- 4.2.4 Language models -- 4.3 Composition and optimization -- 4.4 Decoding algorithm using a single WFST -- 4.5 Decoding performance --
5. Dynamic decoders with on-the-fly WFST operations -- 5.1 Problems in the native WFST approach -- 5.2 On-the-fly composition and optimization -- 5.3 Known problems of on-the-fly composition approach -- 5.4 Look-ahead composition -- 5.4.1 How to obtain prospective output labels -- 5.4.2 Basic principle of look-ahead composition -- 5.4.3 Realization of look-ahead composition using a filter transducer -- 5.4.4 Look-ahead composition with weight pushing -- 5.4.5 Generalized composition -- 5.4.6 Interval representation of label sets -- 5.5 On-the-fly rescoring approach -- 5.5.1 Construction of component WFSTs for on-the-fly rescoring -- 5.5.2 Concept -- 5.5.3 Algorithm -- 5.5.4 Approximation in decoding -- 5.5.5 Comparison with look-ahead composition --
6. Summary and perspective -- 6.1 Realization of advanced speech recognition techniques using WFSTs -- 6.1.1 WFSTs for extended language models -- 6.1.2 Dynamic grammars based on WFSTs -- 6.1.3 Wide-context-dependent HMMs -- 6.1.4 Extension of WFSTs for multi-modal inputs -- 6.1.5 Use of WFSTs for learning -- 6.2 Integration of speech and language processing -- 6.3 Other speech applications using WFSTs -- 6.4 Conclusion --
Bibliography -- Authors' biographies.
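
The chapters listed above center on two algorithmic threads: Viterbi decoding over HMMs (Chapter 2) and weighted finite-state transducer operations (Chapters 3-5). As an orientation to the first, below is a minimal Python sketch of time-synchronous Viterbi beam search in the spirit of Sections 2.7.1-2.7.2. It is not taken from the book; the dict-based HMM encoding, the function name viterbi_beam, the beam width, and the toy numbers are all assumptions for illustration.

```python
# Illustrative sketch, not taken from the book: time-synchronous Viterbi
# beam search over a toy HMM (cf. Sections 2.7.1-2.7.2). All names and
# numbers below are invented for illustration.
def viterbi_beam(obs_loglikes, trans_logprobs, init_logprobs, beam=5.0):
    """obs_loglikes: per-frame dicts {state: log P(o_t | state)}.
    trans_logprobs: {(prev_state, next_state): log P(next | prev)}.
    init_logprobs: {state: log P(state at t=0)}.
    beam: prune hypotheses more than `beam` below the frame's best."""
    hyps = {s: init_logprobs[s] + obs_loglikes[0][s]
            for s in init_logprobs if s in obs_loglikes[0]}
    back = [{s: None for s in hyps}]          # backpointers per frame
    for t in range(1, len(obs_loglikes)):
        new_hyps, new_back = {}, {}
        for prev, score in hyps.items():
            for (p, nxt), tp in trans_logprobs.items():
                if p != prev or nxt not in obs_loglikes[t]:
                    continue
                cand = score + tp + obs_loglikes[t][nxt]
                if nxt not in new_hyps or cand > new_hyps[nxt]:
                    new_hyps[nxt], new_back[nxt] = cand, prev
        # Time-synchronous pruning: all hypotheses at frame t compete,
        # and those scoring far below the frame's best are dropped.
        best = max(new_hyps.values())
        hyps = {s: v for s, v in new_hyps.items() if v >= best - beam}
        back.append({s: new_back[s] for s in hyps})
    # Backtrace from the best surviving final state.
    state = max(hyps, key=hyps.get)
    best_score = hyps[state]
    path = [state]
    for t in range(len(back) - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return list(reversed(path)), best_score

# Toy usage (invented numbers); best path is ['a', 'b', 'b'], score ~ -3.7.
obs = [{'a': -1.0, 'b': -2.0}, {'a': -1.5, 'b': -0.5}, {'a': -2.0, 'b': -0.7}]
trans = {('a', 'a'): -0.5, ('a', 'b'): -1.0,
         ('b', 'a'): -1.2, ('b', 'b'): -0.4}
init = {'a': -0.1, 'b': -2.0}
print(viterbi_beam(obs, trans, init))
```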
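
For the transducer thread, here is a similarly hedged sketch of weighted composition in the tropical semiring, the topic of Sections 3.3-3.5, in which arc weights add along a path and the minimum-weight path wins. It covers only the epsilon-free case, and the dict-based transducer encoding, the function name compose, and the toy machines are assumptions for illustration, not the book's algorithm.

```python
# Illustrative sketch, not taken from the book: composition of two
# weighted transducers in the tropical semiring (arc weights add along
# a path; cf. Sections 3.3-3.5). Epsilon-free case only; the dict-based
# encoding and the toy example are invented for illustration.
def compose(T1, T2):
    """Each transducer is {'start': q0, 'finals': set, 'arcs': dict},
    where arcs[q] is a list of (in_label, out_label, weight, next_q)."""
    start = (T1['start'], T2['start'])
    arcs, finals = {}, set()
    stack, seen = [start], {start}
    while stack:
        q1, q2 = q = stack.pop()
        if q1 in T1['finals'] and q2 in T2['finals']:
            finals.add(q)
        out = []
        for (i1, o1, w1, n1) in T1['arcs'].get(q1, []):
            for (i2, o2, w2, n2) in T2['arcs'].get(q2, []):
                if o1 != i2:      # T1's output must match T2's input
                    continue
                nxt = (n1, n2)
                out.append((i1, o2, w1 + w2, nxt))  # tropical: add weights
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        arcs[q] = out
    return {'start': start, 'finals': finals, 'arcs': arcs}

# Toy usage (invented): T1 maps 'x' to 'y', T2 maps 'y' to 'z', so the
# composed machine maps 'x' to 'z' with weight 1.0 + 0.5 = 1.5.
T1 = {'start': 0, 'finals': {1}, 'arcs': {0: [('x', 'y', 1.0, 1)], 1: []}}
T2 = {'start': 0, 'finals': {1}, 'arcs': {0: [('y', 'z', 0.5, 1)], 1: []}}
print(compose(T1, T2))
```

In a full system, the composed recognition network described in Chapter 4 would then be determinized, weight-pushed, and minimized (Section 3.6) before decoding; handling epsilon labels during composition additionally requires a filter transducer (cf. Sections 3.7 and 5.4.3).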
Holdings:
Item type: AM
Current library: PERPUSTAKAAN LINGKUNGAN KEDUA
Home library: PERPUSTAKAAN LINGKUNGAN KEDUA
Collection: KOLEKSI AM
Call number: P. LINGKUNGAN KEDUA - TK7882.S65H647 3
Copy number: 1
Status: Available
Barcode: 00002143038

Includes bibliographical references (p. 137-148).


