Computational Models of Speech Pattern Processing : Proceedings of the NATO Advanced Study Institute on Computational Models of Speech Pattern Processing, Held in St. Helier, Jersey, U. K., July 7-18, 1997

  • ISBN13: 9783540654780
  • ISBN10: 354065478X
  • Format: Hardcover
  • Copyright: 1999-06-01
  • Publisher: Springer Verlag

Summary

Contains tutorial papers from the invited lecturers and a selection of the contributed papers presented at the meeting "Computational Models of Speech Pattern Processing". Library of Congress subject heading: Speech processing systems--Mathematical models.

Table of Contents

Foreword vii
Keith M. Ponting
Insight vs. performance viii
Dangers ix
Hot topics ix
Towards the future x
Integrating knowledge sources xi
A unified theory? xi
References xii
Speech Pattern Processing
1(9)
Roger K. Moore
The State-of-the-Art in Speech
1(1)
Speech Patterning
2(1)
Speech Pattern Processing
3(2)
Whither a Unified Theory?
5(1)
Towards a Theory
5(1)
Practical Issues
5(1)
What We Know
6(1)
Some Things We Don't Know
7(1)
The Way Forward
7(3)
References
8(2)
Psycho-acoustics and Speech Perception
10(8)
Louis C. W. Pols
Introduction
10(1)
Psycho-acoustics
11(2)
Speech Perception
13(2)
Vowel Reduction and Schwa
13(1)
Spectro-temporal Dynamics of Formant Transitions
14(1)
Consonant Reduction
14(1)
Discussion
15(3)
References
16(2)
Acoustic Modelling for Large Vocabulary Continuous Speech Recognition
18(22)
Steve Young
Introduction
18(1)
Overview of LVCSR Architecture
18(3)
Front End Processing
21(1)
Basic Phone Modelling
22(8)
HMM Phone Models
22(2)
HMM Parameter Estimation
24(2)
Context-Dependent Phone Models
26(4)
Adaptation for LVCSR
30(3)
Maximum Likelihood Linear Regression
31(1)
Estimating the MLLR Transforms
31(2)
Progress in LVCSR
33(1)
Discriminative Training for LVCSR
34(3)
Conclusions
37(3)
References
38(2)
Tree-based Dependence Models for Speech Recognition
40(14)
Mari Ostendorf
Ashvin Kannan
Orith Ronen
Introduction
40(1)
Hidden Tree Framework
41(2)
Hidden Dependence Trees
43(4)
The Mathematical Framework
43(1)
Application to Speech
44(1)
Topology Design and Parameter Estimation
44(2)
Experiments
46(1)
Multiscale Tree Processes
47(4)
The Mathematical Framework
47(1)
Application to Speech
48(1)
Topology Design and Parameter Estimation
49(1)
Experiments
50(1)
Discussion
51(3)
References
52(2)
Connectionist and Hybrid Models for Automatic Speech Recognition
54(13)
Jean-Paul Haton
Introduction
54(1)
A Brief Overview of Neural Networks
55(2)
Basic Principles
55(1)
Main Models for ASR
56(1)
Signal Processing and Feature Extraction using ANNs
57(1)
Neural Networks as Static Pattern Classifiers
58(1)
Speech Pattern Classification with Perceptrons
58(1)
Feature Maps
58(1)
Dynamic Aspects
59(2)
Position of the Problem
59(1)
Time Delays
59(1)
Dynamic Classifiers
59(1)
Recurrent NNs
60(1)
Hybrid Models
61(2)
Position of the Problem
61(1)
Proposed Solutions
61(2)
Conclusion
63(4)
References
63(4)
Computational Models for Auditory Speech Processing
67(11)
Li Deng
Introduction
67(1)
A nonlinear computational model for basilar membrane wave motions
67(1)
Frequency-domain and time-domain computational solutions to the BM model
68(2)
Interval analysis of auditory model's outputs for temporal information extraction
70(1)
IPIH representation of clean and noisy speech sounds
71(2)
Speech recognition experiments
73(2)
Summary and discussions
75(3)
References
76(2)
Speaker Adaptation of CDHMMs Using Bayesian Learning
78(6)
Claudio Vair
Luciano Fissore
Introduction
78(1)
Bayesian Estimation of CDHMMs
78(3)
Prior Density Definition
79(1)
Forgetting Mechanism
79(1)
Prior Parameter Estimation and MAP Solution
80(1)
Acoustic Normalization
81(1)
Tasks, Corpus and System
81(1)
Speaker Adaptation Experiments
82(1)
Conclusions
83(1)
References
83(1)
Discriminative Improvement of the Representation Space for Continuous Speech Recognition
84(6)
Angel de la Torre
Antonio M. Peinado
Antonio J. Rubio
Jose C. Segura
Introduction
84(1)
Discriminative Feature Extraction
84(1)
SGDFE Algorithm for CSR
85(1)
Experimental Results
86(3)
Conclusions
89(1)
References
89(1)
Dealing with Loss of Synchronism in Multi-Band Continuous Speech Recognition Systems
90(6)
Christophe Cerisara
Introduction
90(1)
Forcing Synchronism Between the Bands
91(1)
First Approach
91(1)
Experiments
92(1)
Modeling Loss of Synchronism
92(2)
Theoretical Approach
92(1)
Experimental Approach
93(1)
Conclusion
94(2)
References
95(1)
K-Nearest Neighbours Estimator in a HMM-Based Recognition System
96(6)
Fabrice Lefevre
Claude Montacie
Marie-Jose Caraty
Introduction
96(1)
K-NN Assessment
96(1)
K-NN estimator in HMM
97(2)
Adaptation Principle
98(1)
HMM Estimation Improvement
98(1)
Evaluations
99(2)
Recognition rates
99(1)
SNALC Evaluation
100(1)
Perspectives
101(1)
References
101(1)
Robust Speech Recognition
102(10)
Sadaoki Furui
Mismatches between Training and Testing
102(2)
Speech Variation
102(2)
Inter-Speaker Variation
104(1)
Reducing Mismatches to Improve Speech Recognition
104(5)
Principles of Adaptive Speech Recognition
104(2)
Three Principal Adaptation Methods for Reducing Mismatches
106(1)
Important Practical Issues
107(1)
N-Best-Based Unsupervised Adaptation
108(1)
Conclusion
109(3)
References
109(3)
Channel Adaptation
112(10)
Keith M. Ponting
Introduction
112(1)
Matched condition training
112(1)
Robust features
112(1)
Model adaptation
113(1)
Channel adaptation
113(1)
Speech enhancement
113(1)
Models of distortion
113(2)
Minimum mean square error
114(1)
Additive noise estimation
114(1)
Methods for channel adaptation
115(4)
Global transformations
115(1)
Class-specific corrections
116(1)
Empirical methods based on stereo data
117(1)
Model-based compensation
118(1)
Conclusion
119(3)
References
120(2)
Speaker Characterization, Speaker Adaptation and Voice Conversion
122(10)
Sadaoki Furui
Introduction
122(1)
Speaker-Characterization
122(1)
Speaker Recognition
123(1)
Speaker-Adaptation Techniques for Speech Recognition
124(4)
Classification of Speaker-Adaptation/Normalization Methods
124(1)
Speaker Cluster Selection Methods
124(1)
Interpolated Re-Estimation Algorithm
125(1)
Spectral Mapping Algorithm
125(3)
Individuality Problems in Speech Synthesis and Coding
128(1)
Conclusion
129(3)
References
130(2)
Speaker Recognition
132(11)
Sadaoki Furui
Principles of Speaker Recognition
132(1)
Text-Independent Speaker Recognition Methods
133(4)
Long-Term-Statistics-Based Methods
133(2)
VQ-Based Methods
135(1)
Ergodic-HMM-Based Methods
135(1)
Speech-Recognition-Based Methods
136(1)
Text-prompted Speaker Recognition
137(1)
Normalization and Adaptation Techniques
137(3)
Parameter-Domain Normalization
138(1)
Likelihood Normalization
138(1)
HMM Adaptation for Noisy Conditions
139(1)
Updating Models and A Priori Threshold for Speaker Verification
139(1)
Open Questions and Concluding Remarks
140(3)
References
140(3)
Application of Acoustic Discriminative Training in an Ergodic HMM for Speaker Identification
143(6)
Leandro Rodriguez Linares
Carmen Garcia Mateo
Introduction
143(1)
Experimental Conditions
144(1)
System Architecture
145(1)
Acoustic Segmentation
145(1)
The PTE-HMM Model
145(1)
Experimental Results
145(2)
Conclusions
147(2)
References
148(1)
Comparison of Several Compensation Techniques for Robust Speaker Verification
149(8)
Laura Docio-Fernandez
Carmen Garcia-Mateo
Introduction
149(2)
The HMM recognition system
151(1)
Mismatch Compensation Techniques
151(1)
CMS
151(1)
SM1
152(1)
SM2
152(1)
Experiments and Results
152(4)
Discussion and Conclusion
156(1)
References
156(1)
Segmental Acoustic Modeling for Speech Recognition
157(16)
Mari Ostendorf
Introduction
157(1)
Segmental and Hidden Markov Models
158(7)
General Modeling Framework
159(2)
Models of Feature Dynamics
161(4)
Recognition and Training
165(3)
Recognition Algorithms
165(1)
Parameter Estimation Algorithms
166(2)
Segmental Features
168(1)
Summary
169(4)
References
170(3)
Trajectory Representations and Acoustic Descriptions for a Segment-Modelling Approach to Automatic Speech Recognition
173(8)
Wendy J. Holmes
Introduction
173(1)
Modelling Trajectories in Speech
174(1)
Representing an Unobserved Trajectory with Segmental HMMs
175(2)
Calculating segment probabilities
175(1)
Recognition experiment
176(1)
HMM Recognition with Formant Features
177(1)
Modelling trajectories of cepstrum and formant features
178(1)
Conclusions
178(3)
References
179(2)
Suprasegmental Modelling
181(18)
E. Noth
A. Batliner
A. Kiessling
R. Kompe
H. Niemann
Introduction
181(2)
The Verbmobil System
183(1)
Computation of Prosodic Information
183(7)
Extraction of Prosodic Features
185(1)
Prosodic Classes
185(1)
New Boundary Labels: The Syntactic-prosodic M-labels
186(1)
Classification of Prosodic Events
187(1)
Improving the Classification Results with Stochastic Language Models
187(2)
Prosodic scoring of WHGs
189(1)
The Use of Prosodic Information
190(6)
Prosody and Syntax - Interaction with the TUG-Grammar
190(4)
Prosody and the Other Linguistic Modules
194(2)
Concluding Remarks
196(3)
References
196(3)
Computational Models for Speech Production
199(15)
Li Deng
Introduction
199(1)
Speech production models in science/technology literatures
200(2)
Derivation of discrete-time version of statistical task-dynamic model
202(2)
Algorithms for learning task-dynamic model parameters and for likelihood computation
204(6)
Model with deterministic, time-invariant parameters
205(2)
Model with random, time-invariant parameters
207(1)
Model with random, smoothly time-varying parameters
208(2)
Discriminative learning of production models parameters
210(1)
Other types of computational models of speech production
210(2)
Summary and discussions
212(2)
References
212(2)
Articulatory Features and Associated Production Models in Statistical Speech Recognition
214(11)
Li Deng
Introduction
214(1)
Functional description of human speech communication as an encoding-decoding process
214(1)
Overview of theories of speech perception
215(1)
A general framework of statistical speech recognition
216(1)
Brief analysis of weaknesses of current speech recognition technology
217(1)
Phonological model: Overlapping articulatory features and related HMMs
218(1)
Task-dynamic model of speech production
219(1)
Interfacing overlapping features to task-dynamic model and a general architecture for speech recognition
220(1)
Discussions: Machine speech recognition
220(5)
References
223(2)
Talker Normalization with Articulatory Analysis-by-Synthesis
225(8)
Richard S. McGowan
Introduction
225(1)
Normalization Procedure
226(3)
Experiments
229(2)
Conclusion
231(2)
References
231(2)
The Psycholinguistics of Spoken Word Recognition
233(19)
Cynthia M. Connine
Thomas Deelman
Introduction
233(1)
Overview: Models of spoken word recognition
233(2)
Currency of mapping: units and the nature of lexical representations
235(2)
Temporal nature of speech: early vs delayed commitment
237(2)
Delayed commitment
238(1)
Multiple lexical hypotheses, lexical competition and graded activation
239(3)
Language architecture: Lexical and segmental levels
242(2)
Language architecture: Lexical and sentential
244(1)
Contribution of attention
245(7)
References
247(5)
Issues in Using Models for Self Evaluation and Correction of Speech
252(7)
Marie-Christine Haton
Introduction
252(1)
Using models
253(1)
Norm building
254(1)
Matching between the subject's world and the technical world
255(1)
Settlement of the speech education program
256(1)
Management of the education program
257(1)
Conclusion
257(2)
References
257(2)
The Use of the Maximum Likelihood Criterion in Language Modelling
259(21)
Hermann Ney
Introduction
259(1)
Perplexity and Maximum Likelihood
260(3)
Smoothing and Discounting for Sparse Data
263(4)
Modelfree Discounting and Turing-Good Estimates
263(3)
Absolute Discounting
266(1)
Partitioning-Based Models
267(5)
Equivalence Classes of Histories and Decision Trees
267(3)
Two-Sided Partitionings and Word Classes
270(2)
Word Trigger Pairs
272(3)
Maximum Entropy Approach
275(2)
Conclusions
277(3)
References
277(3)
Language Model Adaptation
280(24)
Renato De Mori
Marcello Federico
Introduction
280(1)
Background on Language Models
281(2)
Adaptation paradigms
283(2)
LM adaptation in dialogue systems
284(1)
Basic statistical methods
285(10)
Maximum a-posteriori estimation
285(1)
Linear interpolation
286(2)
Sublanguages mixture adaptation
288(1)
Backing-off
288(2)
Maximum Entropy
290(1)
Minimum Discrimination Information
291(1)
Generalized iterative scaling
292(1)
Cache model and word triggers
293(2)
Practical applications of adaptation paradigms
295(6)
The 1993 ARPA evaluation method
295(1)
Mixture based adaptation
296(2)
Adaptation with a cache model
298(1)
ME and MDI adaptation
299(1)
LM adaptation in interactive systems
299(2)
Conclusion
301(3)
References
301(3)
Using Natural-Language Knowledge Sources in Speech Recognition
304(24)
Robert C. Moore
Introduction
304(1)
Issues in Language Modeling for Speech Recognition
305(2)
Formal Models for Natural Language
307(5)
Finite-State Grammars
307(1)
Context-Free Grammars
308(1)
Augmented Context-Free Grammars
309(1)
Expressive Power of Grammar Formalisms and the Requirements of Natural Language
310(2)
Search Architectures for Natural-Language-Based Language Models
312(2)
Word Lattice Parsing
312(1)
N-best Filtering or Rescoring
312(1)
Dynamic Generation of Partial Grammar Networks
313(1)
Compiling Unification Grammars into Context-Free Grammars
314(4)
Instantiating Unification Grammars
314(2)
Removing Left Recursion from Context-Free Grammars
316(2)
Robust Natural-Language-Based Language Models
318(7)
Combining Linguistics and Statistics in a Language Model
318(2)
Fully Statistical Natural-Language Grammars
320(5)
Summary
325(3)
References
326(2)
How May I Help You?
328(22)
A.L. Gorin
G. Riccardi
J.H. Wright
Introduction
328(1)
A Spoken Dialog System
329(2)
Database
331(2)
Algorithms
333(10)
Salient Fragment Acquisition
335(4)
Recognizing Fragments in Speech
339(2)
Call Classification
341(2)
Experiment Results
343(4)
Conclusions
347(3)
References
348(2)
Introduction of Rules into a Stochastic Approach for Language Modelling
350(6)
Thierry Spriet
Marc El-Beze
Introduction
350(1)
Stack Decoding Strategy
351(2)
The Algorithm
351(1)
The Evaluation Function
351(1)
Peculiar Advantages of the Algorithm
352(1)
Rules
353(1)
Correction of Biases
353(1)
Under-represented Structures and Long Span Dependencies
353(1)
Multi Level Interactions
354(1)
Linguistic and Syntactic
354(1)
Phonology
354(1)
Conclusion
355(1)
References
355(1)
History Integration into Semantic Classification
356(6)
Mauro Cettolo
Anna Corazza
Introduction
356(1)
Classifier
357(1)
Data
357(1)
Dialogue History Integration
358(2)
Discussion
360(2)
References
361(1)
Multilingual Speech Recognition
362(13)
E. Noth
S. Harbeck
H. Niemann
Introduction
362(1)
Architecture of the National SQEL Demonstrators
363(1)
Language Identification with Different Amounts of Knowledge about the Training Data
364(6)
A System with Explicit Language Identification
365(2)
A System with Implicit Language Identification
367(2)
Language Identification Based on Cepstral Feature Vectors
369(1)
Results
370(3)
Conclusions and Future Work
373(2)
References
373(2)
Toward ALISP: A proposal for Automatic Language Independent Speech Processing.
375(14)
Gerard Chollet
Jan Cernocky
Andrei Constantinescu
Sabine Deligne
Frederic Bimbot
Introduction
375(1)
Practical benefit of ALISP
376(1)
Issues specific to ALISP
377(2)
Selecting features
377(1)
Modeling speech units
377(1)
Defining a derivation criterion
378(1)
Building a lexicon
378(1)
Some tools for ALISP
379(2)
Temporal Decomposition
379(1)
The multigram model
380(1)
Experiments
381(5)
Cross-Language Recognition
381(1)
Very low bit rate speech coding
382(2)
Mono-Speaker Continuous Speech Recognition
384(2)
Conclusions
386(3)
References
387(2)
Interactive Translation of Conversational Speech
389(15)
Alex Waibel
Introduction
389(1)
Background
390(2)
The Problem of Spoken Language Translation
390(1)
Research Efforts on Speech Translation
391(1)
JANUS-II--A Conversational Speech Translator
392(8)
Task Domains and Data Collection
392(2)
System Description
394(4)
Performance Evaluation
398(2)
Applications and Forms of Deployment
400(4)
Interactive Dialog Translation
401(1)
Portable Speech Translation Device
402(1)
Passive Simultaneous Dialog Translation
402(1)
References
403(1)
Multimodal Speech Systems
404(27)
Francoise D. Neel
Wolfgang M. Minker
Introduction
404(1)
System Architecture: Knowledge Sources and Controllers
405(9)
Environment Model
406(1)
System Model
406(2)
User Model
408(2)
Task Model
410(1)
Dialogue Model
411(2)
Models Interdependency
413(1)
Role of Speech in Multimodal Applications
413(1)
Information Speech Systems
414(13)
Spontaneous Language Characteristics
414(2)
Case Grammar Formalism used for Task Modelling
416(1)
Different Parsing Methods
417(9)
Task and Dialogue Model Integration
426(1)
Conclusion
427(4)
References
428(3)
Multimodal Interfaces for Multimedia Information Agents
431(9)
Alex Waibel
Bernhard Suhm
Minh Tue Vo
Ji Yang
Introduction
431(1)
Interpretation of Multimodal Input
432(1)
Multimodal Components
432(1)
Joint Interpretation
432(1)
Multimodal Error Correction
433(1)
Multimodal Interactive Error Repair
433(1)
Error Repair for Multimedia Information Agents
433(1)
Evaluating Interactive Error Repair
434(1)
Multimodal Information Agents
434(3)
Information Access
434(1)
Information Creation
435(1)
Information Manipulation
435(1)
Information Dissemination
436(1)
Controlling the Interface
436(1)
The QuickDoc Application
437(1)
Conclusions
437(3)
References
438(2)
Index 440
