Imed Zitouni is a senior research scientist at IBM. He has led IBM’s Arabic information extraction and data resources efforts since 2004. He previously led both DIALOCA’s Speech/NLP group and Bell Labs/Alcatel-Lucent’s language modeling and call routing activities. His work involves machine translation, NLP, and spoken dialog systems.
Acknowledgments
About the Authors
Part I: In Theory
Chapter 1: Finding the Structure of Words
1.1 Words and Their Components
1.2 Issues and Challenges
1.3 Morphological Models
1.4 Summary
Chapter 2: Finding the Structure of Documents
2.1 Introduction
2.2 Methods
2.3 Complexity of the Approaches
2.4 Performances of the Approaches
2.5 Features
2.6 Processing Stages
2.7 Discussion
2.8 Summary
Chapter 3: Syntax
3.1 Parsing Natural Language
3.2 Treebanks: A Data-Driven Approach to Syntax
3.3 Representation of Syntactic Structure
3.4 Parsing Algorithms
3.5 Models for Ambiguity Resolution in Parsing
3.6 Multilingual Issues: What Is a Token?
3.7 Summary
Chapter 4: Semantic Parsing
4.1 Introduction
4.2 Semantic Interpretation
4.3 System Paradigms
4.4 Word Sense
4.5 Predicate-Argument Structure
4.6 Meaning Representation
4.7 Summary
Chapter 5: Language Modeling
5.1 Introduction
5.2 n-Gram Models
5.3 Language Model Evaluation
5.4 Parameter Estimation
5.5 Language Model Adaptation
5.6 Types of Language Models
5.7 Language-Specific Modeling Problems
5.8 Multilingual and Crosslingual Language Modeling
5.9 Summary
Chapter 6: Recognizing Textual Entailment
6.1 Introduction
6.2 The Recognizing Textual Entailment Task
6.3 A Framework for Recognizing Textual Entailment
6.4 Case Studies
6.5 Taking RTE Further
6.6 Useful Resources
6.7 Summary
Chapter 7: Multilingual Sentiment and Subjectivity Analysis
7.1 Introduction
7.2 Definitions
7.3 Sentiment and Subjectivity Analysis on English
7.4 Word- and Phrase-Level Annotations
7.5 Sentence-Level Annotations
7.6 Document-Level Annotations
7.7 What Works, What Doesn’t
7.8 Summary
Part II: In Practice
Chapter 8: Entity Detection and Tracking
8.1 Introduction
8.2 Mention Detection
8.3 Coreference Resolution
8.4 Summary
Chapter 9: Relations and Events
9.1 Introduction
9.2 Relations and Events
9.3 Types of Relations
9.4 Relation Extraction as Classification
9.5 Other Approaches to Relation Extraction
9.6 Events
9.7 Event Extraction Approaches
9.8 Moving Beyond the Sentence
9.9 Event Matching
9.10 Future Directions for Event Extraction
9.11 Summary
Chapter 10: Machine Translation
10.1 Machine Translation Today
10.2 Machine Translation Evaluation
10.3 Word Alignment
10.4 Phrase-Based Models
10.5 Tree-Based Models
10.6 Linguistic Challenges
10.7 Tools and Data Resources
10.8 Future Directions
10.9 Summary
Chapter 11: Multilingual Information Retrieval
11.1 Introduction
11.2 Document Preprocessing
11.3 Monolingual Information Retrieval
11.4 CLIR
11.5 MLIR
11.6 Evaluation in Information Retrieval
11.7 Tools, Software, and Resources
11.8 Summary
Chapter 12: Multilingual Automatic Summarization
12.1 Introduction
12.2 Approaches to Summarization
12.3 Evaluation
12.4 How to Build a Summarizer
12.5 Competitions and Datasets
12.6 Summary
Chapter 13: Question Answering
13.1 Introduction and History
13.2 Architectures
13.3 Source Acquisition and Preprocessing
13.4 Question Analysis
13.5 Search and Candidate Extraction
13.6 Answer Scoring
13.7 Crosslingual Question Answering
13.8 A Case Study
13.9 Evaluation
13.10 Current and Future Challenges
13.11 Summary and Further Reading
Chapter 14: Distillation
14.1 Introduction
14.2 An Example
14.3 Relevance and Redundancy
14.4 The Rosetta Consortium Distillation System
14.5 Other Distillation Approaches
14.6 Evaluation and Metrics
14.7 Summary
Chapter 15: Spoken Dialog Systems
15.1 Introduction
15.2 Spoken Dialog Systems
15.3 Forms of Dialog
15.4 Natural Language Call Routing
15.5 Three Generations of Dialog Applications
15.6 Continuous Improvement Cycle
15.7 Transcription and Annotation of Utterances
15.8 Localization of Spoken Dialog Systems
15.9 Summary
Chapter 16: Combining Natural Language Processing Engines
16.1 Introduction
16.2 Desired Attributes of Architectures for Aggregating Speech and NLP Engines
16.3 Architectures for Aggregation
16.4 Case Studies
16.5 Lessons Learned
16.6 Summary
16.7 Sample UIMA Code
Index