Preface | p. v |
About the Authors | p. xiii |
Introduction | p. 1 |
Audience and Objective | p. 1 |
Scope | p. 1 |
Structure | p. 2 |
Data Quality: What It Is, Why It Is Important, and How to Achieve It | |
What Is Data Quality and Why Should We Care? | p. 7 |
When Are Data of High Quality? | p. 7 |
Why Care About Data Quality? | p. 10 |
How Do You Obtain High-Quality Data? | p. 11 |
Practical Tips | p. 13 |
Where Are We Now? | p. 13 |
Examples of Entities Using Data to their Advantage/Disadvantage | p. 17 |
Data Quality as a Competitive Advantage | p. 17 |
Data Quality Problems and their Consequences | p. 20 |
How Many People Really Live to 100 and Beyond? Views from the United States, Canada, and the United Kingdom | p. 25 |
Disabled Airplane Pilots - A Successful Application of Record Linkage | p. 26 |
Completeness and Accuracy of a Billing Database: Why It Is Important to the Bottom Line | p. 26 |
Where Are We Now? | p. 27 |
Properties of Data Quality and Metrics for Measuring It | p. 29 |
Desirable Properties of Databases/Lists | p. 29 |
Examples of Merging Two or More Lists and the Issues that May Arise | p. 31 |
Metrics Used when Merging Lists | p. 33 |
Where Are We Now? | p. 35 |
Basic Data Quality Tools | p. 37 |
Data Elements | p. 37 |
Requirements Document | p. 38 |
A Dictionary of Tests | p. 39 |
Deterministic Tests | p. 40 |
Probabilistic Tests | p. 44 |
Exploratory Data Analysis Techniques | p. 44 |
Minimizing Processing Errors | p. 46 |
Practical Tips | p. 46 |
Where Are We Now? | p. 48 |
Specialized Tools for Database Improvement | |
Mathematical Preliminaries for Specialized Data Quality Techniques | p. 51 |
Conditional Independence | p. 51 |
Statistical Paradigms | p. 53 |
Capture-Recapture Procedures and Applications | p. 54 |
Automatic Editing and Imputation of Sample Survey Data | p. 61 |
Introduction | p. 61 |
Early Editing Efforts | p. 63 |
Fellegi-Holt Model for Editing | p. 64 |
Practical Tips | p. 65 |
Imputation | p. 66 |
Constructing a Unified Edit/Imputation Model | p. 71 |
Implicit Edits - A Key Construct of Editing Software | p. 73 |
Editing Software | p. 75 |
Is Automatic Editing Taking Up Too Much Time and Money? | p. 78 |
Selective Editing | p. 79 |
Tips on Automatic Editing and Imputation | p. 79 |
Where Are We Now? | p. 80 |
Record Linkage - Methodology | p. 81 |
Introduction | p. 81 |
Why Did Analysts Begin Linking Records? | p. 82 |
Deterministic Record Linkage | p. 82 |
Probabilistic Record Linkage - A Frequentist Perspective | p. 83 |
Probabilistic Record Linkage - A Bayesian Perspective | p. 91 |
Where Are We Now? | p. 92 |
Estimating the Parameters of the Fellegi-Sunter Record Linkage Model | p. 93 |
Basic Estimation of Parameters Under Simple Agreement/Disagreement Patterns | p. 93 |
Parameter Estimates Obtained via Frequency-Based Matching | p. 94 |
Parameter Estimates Obtained Using Data from Current Files | p. 96 |
Parameter Estimates Obtained via the EM Algorithm | p. 97 |
Advantages and Disadvantages of Using the EM Algorithm to Estimate m- and u-probabilities | p. 101 |
General Parameter Estimation Using the EM Algorithm | p. 103 |
Where Are We Now? | p. 106 |
Standardization and Parsing | p. 107 |
Obtaining and Understanding Computer Files | p. 109 |
Standardization of Terms | p. 110 |
Parsing of Fields | p. 111 |
Where Are We Now? | p. 114 |
Phonetic Coding Systems for Names | p. 115 |
Soundex System of Names | p. 115 |
NYSIIS Phonetic Decoder | p. 119 |
Where Are We Now? | p. 121 |
Blocking | p. 123 |
Independence of Blocking Strategies | p. 124 |
Blocking Variables | p. 125 |
Using Blocking Strategies to Identify Duplicate List Entries | p. 126 |
Using Blocking Strategies to Match Records Between Two Sample Surveys | p. 128 |
Estimating the Number of Matches Missed | p. 130 |
Where Are We Now? | p. 130 |
String Comparator Metrics for Typographical Error | p. 131 |
Jaro String Comparator Metric for Typographical Error | p. 131 |
Adjusting the Matching Weight for the Jaro String Comparator | p. 133 |
Winkler String Comparator Metric for Typographical Error | p. 133 |
Adjusting the Weights for the Winkler Comparator Metric | p. 134 |
Where Are We Now? | p. 135 |
Record Linkage Case Studies | |
Duplicate FHA Single-Family Mortgage Records: A Case Study of Data Problems, Consequences, and Corrective Steps | p. 139 |
Introduction | p. 139 |
FHA Case Numbers on Single-Family Mortgages | p. 141 |
Duplicate Mortgage Records | p. 141 |
Mortgage Records with an Incorrect Termination Status | p. 145 |
Estimating the Number of Duplicate Mortgage Records | p. 148 |
Record Linkage Case Studies in the Medical, Biomedical, and Highway Safety Areas | p. 151 |
Biomedical and Genetic Research Studies | p. 151 |
Who Goes to a Chiropractor? | p. 153 |
National Master Patient Index | p. 154 |
Provider Access to Immunization Register Securely (PAiRS) System | p. 155 |
Studies Required by the Intermodal Surface Transportation Efficiency Act of 1991 | p. 156 |
Crash Outcome Data Evaluation System | p. 157 |
Constructing List Frames and Administrative Lists | p. 159 |
National Address Register of Residences in Canada | p. 160 |
USDA List Frame of Farms in the United States | p. 162 |
List Frame Development for the US Census of Agriculture | p. 165 |
Post-enumeration Studies of US Decennial Census | p. 166 |
Social Security and Related Topics | p. 169 |
Hidden Multiple Issuance of Social Security Numbers | p. 169 |
How Social Security Stops Benefit Payments after Death | p. 173 |
CPS-IRS-SSA Exact Match File | p. 175 |
Record Linkage and Terrorism | p. 177 |
Other Topics | |
Confidentiality: Maximizing Access to Micro-data while Protecting Privacy | p. 181 |
Importance of High Quality of Data in the Original File | p. 182 |
Documenting Public-use Files | p. 183 |
Checking Re-identifiability | p. 183 |
Elementary Masking Methods and Statistical Agencies | p. 186 |
Protecting Confidentiality of Medical Data | p. 193 |
More-advanced Masking Methods - Synthetic Datasets | p. 195 |
Where Are We Now? | p. 198 |
Review of Record Linkage Software | p. 201 |
Government | p. 201 |
Commercial | p. 202 |
Checklist for Evaluating Record Linkage Software | p. 203 |
Summary Chapter | p. 209 |
Bibliography | p. 211 |
Index | p. 221 |
Table of Contents provided by Ingram. All Rights Reserved.