Multilingual Natural Language Processing Applications From Theory to Practice

by Daniel M. Bikel; Imed Zitouni
  • ISBN13: 9780137151448
  • ISBN10: 0137151446

  • Format: Hardcover
  • Copyright: 2012-05-10
  • Publisher: IBM Press
  • List Price: $130.00
  • Digital: $118.11

Summary

Global organizations must quickly and cost-effectively analyze, translate, synthesize, and distill massive amounts of text in multiple languages. The technology needed to automate this process, multilingual natural language processing (NLP), is advancing rapidly. This is the first comprehensive, "one-stop-shop" guide to building robust and accurate multilingual NLP systems. Multilingual Natural Language Processing Applications combines all the essential background and realistic, up-to-date guidance practitioners will need to succeed. Containing new contributions from leading researchers at IBM, Google, Stanford, CMU, Columbia, and ISI, it integrates cutting-edge advances with practical solutions drawn from extensive field experience. Part I focuses primarily on multilingual NLP's core technologies, including technologies for understanding the structure of words and documents; analyzing syntax; modeling language; recognizing entailment; and detecting redundancy. Part II delves into the theoretical and practical considerations involved in using these technologies to construct real-world applications. It contains detailed chapters on information extraction, machine translation, information retrieval and search, summarization, question answering, distillation, and processing pipelines.

Author Biography

Daniel M. Bikel is a senior research scientist at Google, developing new methods for NLP and speech recognition. While at IBM, he architected the distillation system for IBM’s GALE multilingual information extraction and question-answering system. While pursuing his doctorate at Penn, he built the first extensible multilingual syntactic parsing engine.

 

Imed Zitouni is a senior research scientist at IBM. He has led IBM’s Arabic information extraction and data resources efforts since 2004. He previously led both DIALOCA’s Speech/NLP group and Bell Labs/Alcatel-Lucent’s language modeling and call routing activities. His work involves machine translation, NLP, and spoken dialog systems.

Table of Contents

Preface

Acknowledgments

About the Authors


Part I: In Theory

Chapter 1: Finding the Structure of Words

1.1 Words and Their Components

1.2 Issues and Challenges

1.3 Morphological Models

1.4 Summary


Chapter 2: Finding the Structure of Documents

2.1 Introduction

2.2 Methods

2.3 Complexity of the Approaches

2.4 Performances of the Approaches

2.5 Features

2.6 Processing Stages

2.7 Discussion

2.8 Summary


Chapter 3: Syntax

3.1 Parsing Natural Language

3.2 Treebanks: A Data-Driven Approach to Syntax

3.3 Representation of Syntactic Structure

3.4 Parsing Algorithms

3.5 Models for Ambiguity Resolution in Parsing

3.6 Multilingual Issues: What Is a Token?

3.7 Summary


Chapter 4: Semantic Parsing

4.1 Introduction

4.2 Semantic Interpretation

4.3 System Paradigms

4.4 Word Sense

4.5 Predicate-Argument Structure

4.6 Meaning Representation

4.7 Summary


Chapter 5: Language Modeling

5.1 Introduction

5.2 n-Gram Models

5.3 Language Model Evaluation

5.4 Parameter Estimation

5.5 Language Model Adaptation

5.6 Types of Language Models

5.7 Language-Specific Modeling Problems

5.8 Multilingual and Crosslingual Language Modeling

5.9 Summary


Chapter 6: Recognizing Textual Entailment

6.1 Introduction

6.2 The Recognizing Textual Entailment Task

6.3 A Framework for Recognizing Textual Entailment

6.4 Case Studies

6.5 Taking RTE Further

6.6 Useful Resources

6.7 Summary


Chapter 7: Multilingual Sentiment and Subjectivity Analysis

7.1 Introduction

7.2 Definitions

7.3 Sentiment and Subjectivity Analysis on English

7.4 Word- and Phrase-Level Annotations

7.5 Sentence-Level Annotations

7.6 Document-Level Annotations

7.7 What Works, What Doesn’t

7.8 Summary


Part II: In Practice

Chapter 8: Entity Detection and Tracking

8.1 Introduction

8.2 Mention Detection

8.3 Coreference Resolution

8.4 Summary


Chapter 9: Relations and Events

9.1 Introduction

9.2 Relations and Events

9.3 Types of Relations

9.4 Relation Extraction as Classification

9.5 Other Approaches to Relation Extraction

9.6 Events

9.7 Event Extraction Approaches

9.8 Moving Beyond the Sentence

9.9 Event Matching

9.10 Future Directions for Event Extraction

9.11 Summary


Chapter 10: Machine Translation

10.1 Machine Translation Today

10.2 Machine Translation Evaluation

10.3 Word Alignment

10.4 Phrase-Based Models

10.5 Tree-Based Models

10.6 Linguistic Challenges

10.7 Tools and Data Resources

10.8 Future Directions

10.9 Summary


Chapter 11: Multilingual Information Retrieval

11.1 Introduction

11.2 Document Preprocessing

11.3 Monolingual Information Retrieval

11.4 CLIR

11.5 MLIR

11.6 Evaluation in Information Retrieval

11.7 Tools, Software, and Resources

11.8 Summary


Chapter 12: Multilingual Automatic Summarization

12.1 Introduction

12.2 Approaches to Summarization

12.3 Evaluation

12.4 How to Build a Summarizer

12.5 Competitions and Datasets

12.6 Summary


Chapter 13: Question Answering

13.1 Introduction and History

13.2 Architectures

13.3 Source Acquisition and Preprocessing

13.4 Question Analysis

13.5 Search and Candidate Extraction

13.6 Answer Scoring

13.7 Crosslingual Question Answering

13.8 A Case Study

13.9 Evaluation

13.10 Current and Future Challenges

13.11 Summary and Further Reading


Chapter 14: Distillation

14.1 Introduction

14.2 An Example

14.3 Relevance and Redundancy

14.4 The Rosetta Consortium Distillation System

14.5 Other Distillation Approaches

14.6 Evaluation and Metrics

14.7 Summary


Chapter 15: Spoken Dialog Systems

15.1 Introduction

15.2 Spoken Dialog Systems

15.3 Forms of Dialog

15.4 Natural Language Call Routing

15.5 Three Generations of Dialog Applications

15.6 Continuous Improvement Cycle

15.7 Transcription and Annotation of Utterances

15.8 Localization of Spoken Dialog Systems

15.9 Summary


Chapter 16: Combining Natural Language Processing Engines

16.1 Introduction

16.2 Desired Attributes of Architectures for Aggregating Speech and NLP Engines

16.3 Architectures for Aggregation

16.4 Case Studies

16.5 Lessons Learned

16.6 Summary

16.7 Sample UIMA Code


Index

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.
