Speech Processing: A Dynamic and Optimization-Oriented Approach

  • ISBN13: 9780824740405
  • ISBN10: 0824740408

  • Edition: 1st
  • Format: Hardcover
  • Copyright: 2003-06-18
  • Publisher: CRC Press

Summary

Based on years of instruction and field expertise, this volume offers the tools needed to understand all scientific, computational, and technological aspects of speech processing. The book emphasizes mathematical abstraction, the dynamics of the speech process, and the engineering optimization practices that promote effective problem solving in this area of research. It also draws on many years of the authors' own research on speech processing. Speech Processing builds the analytical skills needed to meet future scientific and technological challenges in the field, and it considers the complex transition from human speech processing to computer speech processing.

Table of Contents

Series Introduction
Preface
I ANALYTICAL BACKGROUND AND TECHNIQUES
  Discrete-Time Signals, Systems, and Transforms
    Signal Sampling
      Sampling basics
      Sampling theorem
      Practical cases of sampling
    Discrete-Time Systems and z-Transforms
      Classification of systems
      Fundamentals of linear time-invariant systems
      z-transforms
    Characterizations of Digital Filters
      Filter transfer functions
      Filters described by difference equations
      Poles and zeros in a digital filter
    Frequency Responses of Digital Filters
      Frequency response as related to pole and zero locations
      Digital resonator
      All-pass filter
    Discrete Fourier Transform
      Definition of DFT
      Frequency range and frequency resolution of the DFT
      Zero-padding technique
    Short-Time Fourier transform
      Definition of STFT: two alternative views
      STFT magnitude (spectrogram)
    Summary
  Analysis of Discrete-Time Speech Signals
    Time-Frequency Analysis of Speech
      Time-domain and frequency-domain properties of speech
      Joint time-frequency properties of speech
      Spectrogram reading
    Analysis Based on Linear Predictive Coding
      Least-squares estimate of LPC coefficients
      Autocorrelation and covariance methods
      Spectral estimation via LPC
      Pre-emphasis
      Choice of order of the LPC model
      Summary
    Cepstral Analysis of Speech
      Principles
      Mel-frequency cepstral coefficients
    Automatic Extraction and Tracking of Speech Formants
      Formants and vocal tract resonances
      Formant extraction and tracking methods
    Automatic Extraction of Voicing Pitch
      Basics of pitch estimation methods
      Time-domain F0 estimation
      Short-time spectral techniques for F0 estimation
    Auditory Models for Speech Analysis
      Perceptual linear prediction
      Other methods
    Summary
  Probability and Random Processes
    Random Variables, Distributions, and Summary Statistics
      Random variables and their distributions
      Summary statistics: expectations, moments, and covariances
      Common PDF's
      Common PMF's
    Conditioning, Total Probability Theorem, and Bayes' Rule
      Conditional probability, conditional PDF, and conditional independence
      The total probability theorem
      Bayes' rule and its sequential form
    Conditional Expectations
    Discrete-Time Random Processes
      Summary statistics of a random sequence
      Stationary random sequences
      White sequence, Markov sequence, Gauss-Markov sequence, and Wiener sequence
    Markov Chain and Hidden Markov Sequence
      Markov chain as discrete-state Markov sequence
      From Markov chain to hidden Markov sequence
    Summary
  Linear Model and Dynamic System Model
    Linear Model
      Canonical form of the model
      Examples of the linear model
      Likelihood computation
    Time-Varying Linear Model
      Time-varying linear predictive model
      Markov modulated linear predictive model
      Markov modulated linear regression model
      Speech data and the time-varying linear models
    Linear Dynamic System Model
      State space formulation of the model
      Relationship to high-dimensional linear model
      Likelihood computation
    Time-Varying Linear Dynamic System Model
      From time-invariant model to time-varying model
      Likelihood computation
    Non-Linear Dynamic System Model
      From linear model to nonlinear model
      Nonlinearity and its approximations
    Summary
  Optimization Methods and Estimation Theory
    Classical Optimization Techniques
      Basic definitions and results
      Necessary and sufficient conditions for an optimum
      Lagrange multiplier method for constrained optimization
    Numerical Methods for Optimization
      Methods based on finding roots of equations
      Methods based on gradient descent
    Dynamic Programming Techniques for Optimization
      Principle of optimality
      Dynamic programming for the hidden Markov model
      Dynamic programming for the trended hidden Markov model
    Preliminaries of Estimation Theory
      Cramer-Rao lower bound and minimum variance unbiased estimator
      Example: MVU estimator for generalized linear model
      Sufficient statistic
      Best linear unbiased estimator
      Method of moments
    Least Squares Estimation
      Basic LSE procedure
      Least squares estimator for the linear model
      Order-recursive least squares
      Sequential least squares
      Nonlinear least squares
    Maximum Likelihood Estimation
      Basic MLE procedure for fully observed data
      MLE for the linear model
      EM algorithm: Introduction
      EM algorithm example: Markov modulated Poisson process
    Estimation of Random Parameters
      Minimum mean square error (MMSE) estimator
      Bayesian linear model
      General Bayesian estimators and MAP estimator
      Linear minimum mean square error (LMMSE) estimator
      Sequential LMMSE estimator
    State Estimation
      Generic Kalman filter algorithm
      Kalman filter algorithms for the linear state-space system
      Extended Kalman filter for nonlinear dynamic systems
    Summary
  Statistical Pattern Recognition
    Bayes' Decision Theory
      Bayes' risk and MAP decision rule
      Practical issues
    Minimum Classification Error Criterion for Recognizer Design
      MCE classifier design steps
      Optimization of classifier parameters
    Hypothesis Testing and the Verification Problem
      MAP decision rule and hypothesis testing
      Verification problem in pattern recognition
      Neymann-Pearson approach to verification
      Bayesian approach to verification
    Examples of Applications
      Discriminative training for HMM
      Discriminative training for the trended HMM
      Discriminative feature extraction
      Bayesian approach to verification using the Gaussian mixture model
    Summary
II FUNDAMENTALS OF SPEECH SCIENCE
  Phonetic Process
    Introduction
    Articulatory Phonetics and Speech Generation
      Anatomy and physiology of the vocal tract
      Major features of speech articulation
      Phonemes, coarticulation, and acoustics
      Source-filter description of speech production
    Acoustic Models of Speech Production
      Resonances in a nonuniform vocal tract model
      Two-tube vowel models
      Three-tube consonant modeling
      Speech production involving both poles and zeros
      Transmission line analog of the vocal tract
    Coarticulation: Its Origins and Models
      Effects of coarticulation
      Coarticulation effects for different articulators
      Invariant features
      Effects of coarticulation on duration
      Models for coarticulation
    Acoustic-Phonetics and Characterization of Speech Signals
      Acoustics of vowels
      Diphthongs and diphthongization
      Glides and liquids
      Nasals
      Fricatives
      Stop consonants
    Introduction to Auditory Phonetics
      Outer ear
      Middle ear
      Inner ear
    Sound Perception
      Thresholds
      Just-noticeable differences (JNDs)
      Pitch perception
      Masking
      Critical bands
      Nonsimultaneous (temporal) masking
      Just-noticeable differences (JNDs) in speech
      Timing
    Speech Perception
      Physical aspects of speech important for perception
      Experiments using synthetic speech
      Models of speech perception
      Vowel perception
      Consonant perception
      Duration as a phonemic cue
      Perception of intonational features
    Summary
  Phonological Process
    Introduction
    Phonemes: Minimal Contrastive Units of Speech Sounds
      Phonemes and allophones
      Phoneme identification
    Features: Basic Units of Phonological Representation
      Why a phonemic approach is not adequate
      Feature systems
      Natural classes
    Phonological Rules Expressed by Features
      Formalization of phonological rules
      Common phonological rule types
    Feature Geometry: Internal Organization of Speech Sounds
      Introduction
      From linear phonology to nonlinear phonology
      Feature hierarchy
      Phonological rules in feature geometry
    Articulatory Phonology
      Articulatory gestures and task dynamics
      Gestural scores
      Phonological contrast in articulatory phonology
    Syllables: External Organization of Speech Sounds
      The representation of syllable structure
      Phonological function of the syllable: basic phonotactic unit
    Summary
III COMPUTATIONAL PHONOLOGY AND PHONETICS
  Computational Phonology
    Articulatory Features and a System for Their Specification
    Cross-Tier Overlapping of Articulatory Features
      Major and secondary articulatory features
      Feature assimilation and overlapping examples
      Constraining rules for feature overlapping
    Constructing Discrete Articulatory States
      Motivations from symbolic pronunciation modeling in speech recognition
      Articulatory state construction
    Use of High-Level Linguistic Constraints
      Types of high-level linguistic constraints
      A parser for English syllable structure
    Implementation of Feature Overlapping Using Linguistic Constraints
      Feature specification
      A generator of overlapping feature bundles: Overview and examples of its output
      Demi-syllable as the rule organizational unit
      Phonological rule formulation
    Summary
  Computational Models for Speech Production
    Introduction and Overview of Speech Production Modeling
      Two types of speech production modeling and research
      Multiple levels of dynamics in human speech production
    Modeling Acoustic Dynamics of Speech
      Hidden Markov model viewed as a generative model for acoustic dynamics
      From stationary state to nonstationary state
      ML learning for the trended HMM via the EM algorithm
      Example: Model with state-dependent polynomial trends
      Recursively-defined acoustic trajectory model using a linear dynamic system
      ML learning for linear dynamic system
    Modeling Hidden Dynamics of Speech
      Derivation of discrete-time hidden-dynamic state equation
      Nonlinear state space formulation of hidden dynamic model
      Task dynamics, articulatory dynamics, and vocal-tract resonance dynamics
    Hidden Dynamic Model Implemented Using Piecewise Linear Approximation
      Motivations and a new form of the model formulation
      Parameter estimation algorithm
      Likelihood-scoring algorithm
    A Comprehensive Statistical Generative Model of the Dynamics of Casual Speech
      Overlapping model for multi-tiered phonological construct
      Segmental target model
      Functional model for hidden articulatory dynamics
      Functional model for articulatory-to-acoustic mapping
    Summary
  Computational Models for Auditory Speech Processing
    A Computational Model for the Cochlear Function
      Introduction
      Mathematical formulation of the cochlear model
    Frequency-Domain Solution of the Cochlear Model
    Time-Domain Solution of the Cochlear Model
    Stability Analysis for Time-Domain Solution of the Cochlear Model
      Derivation of the stability condition
      Application of the stability analysis
    Computational Models for Inner Hair Cells and for Synapses to Auditory Nerve Fibers
      The inner hair cell model
      The synapse model
    Interval-Based Speech Feature Extraction from the Cochlear Model Outputs
      Inter-peak interval histogram construction
      Matching neural and modeled IPIHs for tuning BM-model's parameters
    Interval-Histogram Representation for the Speech Sound in Quiet and in Noise
      Inter-peak interval histograms for clean speech
      Inter-peak interval histograms for noisy speech
    Computational Models for Network Structures in the Auditory Pathway
      Introduction
      Modeling action potential generation in the auditory nerve
      Neural-network models central to the auditory nerve
      Model simulation with speech inputs
      Discussion
    Summary
IV SPEECH TECHNOLOGY IN SELECTED AREAS
  Speech Recognition
    Introduction
      The speech recognition problem
      ASR system specifications
      Dimensions of difficulty
      Evaluation measures for speech recognizers
    Mathematical Formulation of Speech Recognition
      A fundamental equation
      Acoustic model, language model, and sequential optimization
      Differentially weighting acoustic and language models
      Word insertion penalty factor
    Acoustic Pre-Processor
      What is acoustic pre-processing
      Some common acoustic pre-processors
    Use of HMMs in Acoustic Modeling
      HMMs in ASR applications
      Relationships between HMM states and speech units
      Construction of context-dependent HMMs
      Some advantages of the HMM formulation for ASR
    Use of Higher-Order Statistical Models in Acoustic Modeling
      Why higher-order models are needed
      Stochastic segment models for speech acoustics
      Super-segmental, hidden dynamic models
      Higher-order pronunciation models
    Case Study I: Speech Recognition Using a Hidden Dynamic Model
      Model overview
      Model formulation
      Learning model parameters
      Likelihood-scoring algorithm
      Experiments on spontaneous speech recognition
    Case Study II: Speech Recognition Using HMMs Structured by Locus Equations
      Model overview
      Model formulation
      Learning locus-HMM parameters
      Phonetic classification experiments
    Robustness of Acoustic Modeling and Recognizer Design
      Introduction
      Model-space robustness by adaptation
      Adaptive training
    Case Study III: MAP Approach to Speaker Adaptation Using Trended HMMs
      Derivation of MAP estimates for the trended HMM
      Speaker adaptation experiments
    Case Study IV: Bayesian Adaptive Training for Compensating Acoustic Variability
      Background
      Overview of the compensation strategy
      Bayesian adaptive training algorithm
      Robust decoding using Bayesian predictive classification
      Experiments on spontaneous speech recognition
    Statistical Language Modeling
      Introduction
      N-gram language modeling
      Decision-tree language modeling
      Context-free grammar as a language model
      Maximum-entropy language modeling
      Adaptive language modeling
    Summary
  Speech Enhancement
    Introduction
    Classification of Basic Techniques for Speech Enhancement
      Classification by what and how information is used
      Classification by waveform or feature as the output
      Classification by single or multiple sensors
      Classification by the general approaches employed
    Spectral Subtraction
    Wiener Filtering
    Use of HMM as the Prior Model for Speech Enhancement
      Training AR-HMMs for clean speech and for noise
      The MAP enhancement technique
      The approximate MAP enhancement technique
      The MMSE enhancement technique
      Noise adaptation
    Case Study I: Implementation and Evaluation of HMM-Based MMSE Enhancement
      Double pruning the MMSE filter weights
      PDF approximation for noisy speech
      Overview of speech enhancement system and experiments
      Enhancement results using SNR as an evaluation measure
      Enhancement results using subjective evaluation
    Case Study II: Use of the Trended HMM for Speech Enhancement
      Formulation of the prior model
      Derivation of the MMSE estimator using the prior model
      Implementation of the MMSE enhancement technique
      Approximate MMSE enhancement technique
      Diagnostic experiments
      Speech waveform enhancement results
    Use of Speech Feature Enhancement for Robust Speech Recognition
      Roles of speech enhancement in feature-space robust ASR
      A statistical model for log-domain acoustic distortion
      Use of prior models for clean speech and for noise
      Use of the MMSE estimator
      MMSE estimator with prior speech model of static features
      Estimation with prior speech model for joint static and dynamic features
      Implementation issues
    Summary
  Speech Synthesis
    Introduction
    Basic Approaches
    Choice of Units
    Synthesis Methods
      Articulatory method for speech synthesis
      Spectral method for speech synthesis
      Waveform methods for speech synthesis
      Phase mismatch
    Databases
    Case Study: Automatic Unit Selection for Waveform Speech Synthesis
    Intonation
    Text Processing
    Evaluation of Speech Synthesis Output
    Summary
References
Index

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.
