Speech Recognition: Theory and C++ Implementation

by Claudio Becchetti; Lucio Prina Ricotti
  • ISBN13: 9780471977308
  • ISBN10: 0471977306

  • Edition: 1st
  • Format: Hardcover
  • Copyright: 1999-05-04
  • Publisher: WILEY

Summary

Automatic Speech Recognition (ASR) is the enabling technology for hands-free dictation and voice-triggered computer menus. It is becoming increasingly prevalent in environments such as private telephone exchanges and real-time information services. Speech Recognition introduces the principles of ASR systems, including the theory and implementation issues behind multi-speaker continuous speech recognition. Focusing on the algorithms employed in commercial and laboratory systems, the treatment enables the reader to devise practical solutions for ASR system problems. It addresses in detail the C++ programming techniques used to develop ASR applications, thus offering skills that will prove useful in any large C++-based software project. Possible extensions of the well-established ASR technology, based on Hidden Markov Models, are highlighted in fields such as the modelling and prediction of econometric series.

Features include:
  • Accompanying CD-ROM containing all C++ source code of a complete laboratory multi-speaker continuous-speech ASR system (initialisation, training, recognition, evaluation, etc.)
  • Detailed theoretical, mathematical and technical explanations of ASR
  • A practical account of the functioning of ASR

A crucial source of information for researchers, developers and project managers involved with ASR systems, Speech Recognition is also structured for use by students of digital signal processing, speech recognition and C++ programming techniques.

Author Biography

Claudio Becchetti graduated with honors in Electronic Engineering in 1994 from the University of Rome, where he earned a Ph.D. in Telecommunications in 1999. From 2002 to 2009, he was an adjunct professor at the University "La Sapienza", Faculty of Telecommunication Engineering, where he taught first a course on Industrial Design and then a course on Signal Theory. Claudio has 7 years of teaching experience working with students studying the ECG, a device well suited as a practical example for signal theory, digital signal processing, electronics and software engineering.

Table of Contents

Preface
Acknowledgments
Statistical Speech Recognition
  Part I: Theory
    Introduction
      The theoretical part
      The implementation part
    ASR: A Simple Explanation
    Statistical Speech Recognition
      The deterministic approach
      The stochastic framework
      Stochastic model simplifications
      Hidden Markov Models: an introduction
      The search algorithm
    ASR Structure and Chapter Organization
    Problems, Applications and Products in ASR
    The ASR as an Information Theory Problem
      The layered model
      The L-language
      The information theory approach
  Part II: Implementation
    Introduction
    The Crisis of Software Development
      Object oriented versus procedural programming
    A Safe C++ Programming Style
      What must not be used in C++
      Syntax and programming style
      Conventions in writing code
      Basic design rules: the modules
    The Program Services
    The Memory Service
      Dynamic memory management: a C++ without pointers
      Object list resizing: the C++ "realloc"
      The root class: a user guide
      Implementation details of the container classes
    Diagnostics
      Defensive programming
      The diagnostic system (User guide)
      Implementation details of the diagnostic system
    Topics Related to Large Projects
      Portability and defined options (files compatib.h, defopt.h, Boolean.h)
      Test routines and version history
      Time efficiency, development speed and code readability
Speech Database
  Part I: Theory
    Introduction
    Statistical Notes
    Speech Variability
    Design of Speech Databases
      Some practical aspects of building a corpus
      Transcriptions of speech
    Speech Database Sources
      TIMIT and ATIS databases
  Part II: Implementation
    Introduction
    The Configuration Manager
      The configuration file
      The option configuration file *.opt
      The configuration class
      Introducing new parameters
    The String Class
      The String class usage
      String class techniques
    The Structure of an ASR-Oriented Database
      The options of the database manager
    The Database Manager Class Hierarchy
      The sound file hierarchy (file soundfil.h)
      The label file hierarchy
      Soundlabelled File and Dbase Voc classes (file soundlab.*)
    Safe Polymorphism
      Implementation
      Adding new database standards
    Hints on Class Design
Speech Signal Analysis
  Part I: Theory
    Introduction
    Analog to Digital Voice Conversion
      Physical features of speech signals
    Feature Extraction
      Signal preprocessing
      Windowing
      Spectral analysis
      Filter bank processing
      Log energy computation
      Mel frequency cepstrum computation
      Delta coefficients and energy
    Cepstrum Analysis
    Robustness in Speech Recognition
      Additive noise and linear distortion model
      Cepstral compensation
      Channel equalization and speech enhancement
      Model adaptation techniques
    Distortion Measures
  Part II: Implementation
    Introduction
    Feature Extraction: a DSP Approach
      Configuration of the feature manager
    Hierarchy Design
      Adding a new block
    Safe Interfacing C-Style Routines
    The Mathematical Classes
      Overview of mathematical classes
      Mathematical vectors in C++
      Advanced techniques for vector implementation
HMMs and Initialization
  Part I: Theory
    Introduction
    Markov Chains
    Hidden Markov Models
      Alternative definitions for HMMs
      Processes suitable for HMM modeling
      Example: two state-HMM
    Application of the HMM to Speech Modeling
      Probability inference in HMMs: forward and backward procedure
      Observation densities
      HMM initialization
    Automatic Segmentation
    Clustering of Feature Vectors
    HMMs versus Probabilistic, Bayesian and Neural Networks
  Part II: Implementation
    Introduction
    HMM Phoneme Models
    HMM Estimation Procedure
    Implementation Details
      HMM model implementation
      Algorithm implementation
      Sectionate
    Module Options
    Array Class
      Array class reference guide
    Diagonal Arrays
      Diagonal Array user guide
      Diagonal array: implementation details
HMM Training
  Part I: Theory
    Introduction
    MLE and MAP Training
    Utterance Probabilities and HMM Parameters
    The Maximization Procedure: MLE Case
      Parameter estimation formulas for MLE
      Summary of MLE procedure
    Maximum a Posteriori (MAP) Estimation
      Explicit formulas for MAP
      Estimate of the hyper-parameters
      Use of MAP procedure
    MAP Applications
  Part II: Implementation
    Introduction
    Single Model Training
      Single model re-estimation algorithm
    Model Simultaneous Training
      Algorithm description
      Algorithm optimizations
    Logarithmic Math
    Class Organization
    Labels: Hints for Further Developments
    Option Description
Language Models
  Part I: Theory
    Introduction
    The Language Models
    Languages as Information Sources
    The M-gram Model
    Perplexity
    Rare Events Estimation: Smoothing
      Good-Turing smoothing
      Linear interpolation model
      Non-linear interpolation
    Smoothing Methods: Examples
      Perplexity: an example
    Word Automatic Clustering
      Estimation accuracy of class-based and word-based probabilities
      Clustering algorithms
      Equivalent word merging algorithm
    Adaptive Language Models
    Appendix
  Part II: Implementation
    Introduction
    Language Training Model Options
    Hierarchy Design
    Database Interface and Counts
      Segmented database (TIMIT)
      Non-segmented databases
      Sorted list and binary search
    Clustering
      Merging algorithm
      Perplexity computation for class-based language models
    Transition Probability Computation
Recognition
  Part I: Theory
    Introduction
    Static Structure of the Recognizer
      Static structure of a phonetic recognizer
      Static structure of a word recognizer
    The Search Algorithms
      The Viterbi algorithm
      Algorithm description
    Dynamic Structure Implementation
    Example of the Viterbi Algorithm
      Use of the logarithms
    Variants of the Static Structures
  Part II: Implementation
    Introduction
    Static Structure
      Graph implementation
      Layered graph
    Dynamic Structure
      Memory management
      Representing trees
      Hypothesis management
    Recognition Options
Evaluation and Parameter Setting
  Part I: Theory
    Introduction
      Evaluation conditions
    Speech Data Retrieval
    Feature Parameters
    Initialization
    Training
    Grammar
      Phonetic grammar
      Word language model
    Recognition
      Phonetic recognition
      Word recognition
  Part II: Implementation
    Evaluation
      Options of the evaluator
    RES Specifications
Econometric Appendix: The Behavior of Financial Time Series
  Introduction
  The state of the art: the homogeneous rational expectation hypothesis approach
  Descriptive evidence from "blue chips" in Italy, USA, UK and Germany
  Autoregressive conditional heteroskedasticity models for stock returns
  Research frontiers on statistical properties of stock returns: the HMM approach
  Conclusions
  Empirical results
  Diagnostics
References
Index
