Elements of Statistical Disclosure Control

by Leon Willenborg; Ton de Waal
  • ISBN13: 9780387951218
  • ISBN10: 0387951210

  • Format: Paperback
  • Copyright: 2000-11-01
  • Publisher: Springer Verlag
  • List Price: $149.99

Summary

Statistical disclosure control is the discipline concerned with producing statistical data that are safe enough to be released to external researchers. This book concentrates on the methodology of the area and deals with both microdata (individual data) and tabular (aggregated) data. It attempts to develop the theory from what can be called the paradigm of statistical confidentiality: to modify unsafe data in such a way that safe (enough) data emerge, with minimum information loss. The book discusses what safe data are, how information loss can be measured, and how to modify the data in a (near) optimal way.

Once it has been decided how to measure safety and information loss, the production of safe data from unsafe data is often a matter of solving an optimization problem. Several such problems are discussed in the book, and most of them turn out to be hard problems that can be solved only approximately. The authors present new results that have not been published before. The book does not describe a closed area; on the contrary, the area still has many spots waiting to be explored more fully, and some of these are indicated in the book.

The book will be useful for official, social and medical statisticians and others who are involved in releasing personal or business data for statistical use. Operations researchers may be interested in the optimization problems involved, particularly for the challenges they present.

Leon Willenborg has worked at the Department of Statistical Methods at Statistics Netherlands since 1983, first as a researcher and since 1989 as a senior researcher. Since 1989 his main field of research and consultancy has been statistical disclosure control. From 1996 to 1998 he was the project coordinator of the EU co-funded SDC project.

Table of Contents

Preface
Overview of the Area
  Introduction
  Types of Variables
    Categorical Variable
    Hierarchical Variable
    Continuous/Numerical/Quantitative Variable
    Identifying Variable
    Sensitive Variable
    Weight Variable
    Regional Variable
    Household Variable
    Spanning Variable and Response Variable
    Shadow Variable
  Types of Microdata
    Simple Microdata
    Complex Microdata
  Types of Tabular Data
    Single Tables
    Marginal Tables
    Hierarchical Tables
    Linked Tables
    Semi-linked Tables
    Complex Tables
    Tables from Hierarchical Microdata
  Introduction to SDC for Microdata and Tables
  Intruders and Disclosure Scenarios
  Information Loss
    Information Loss for Microdata
    Information Loss for Tables
  Disclosure Protection Techniques for Microdata
    Local Recoding
    Global Recoding
    Local Suppression
    Local Suppression with Imputation
    Synthetic Microdata and Multiple Imputation
    Subsampling
    Adding Noise
    Rounding
    Microaggregation
    PRAM
    Data Swapping
  Disclosure Protection Techniques for Tables
    Table Redesign
    Cell Suppression
    Adding Noise
    Rounding
    Source Data Perturbation
Disclosure Risks for Microdata
  Introduction
  Microdata
  Disclosure Scenario
  Predictive Disclosure
  Re-identification Risk
  Risk Per Record and Overall Risk
  Population Uniqueness and Unsafe Combinations
  Modeling Risks with Discrete Key Variables
    Direct Approach
    Model Based Approach
  Disclosure Scenarios in Practice
    Researcher Scenario
    Hacker Scenario
  Combinations to Check
    A Priori Specified Combinations
    Data Driven Combinations: Fingerprinting
  Practical Safety Criteria for Perturbative Techniques
Data Analytic Impact of SDC Techniques on Microdata
  Introduction
  The Variance Impact of SDC Procedures
  The Bias Impact of SDC Procedures
  Impact of SDC Procedures on Methods of Estimation
  Information Loss Measures Based on Entropy
    Local Recoding
    Local Suppression
    Global Recoding
    PRAM
    Data Swapping
    Adding Noise
    Rounding
    Microaggregation
  Alternative Information Loss Measures
    Subjective Measures for Non-perturbative SDC Techniques
    Subjective Measures for Perturbative SDC Techniques
    Flow Measure for PRAM
  MSP for Microdata
Application of Non-Perturbative SDC Techniques for Microdata
  Introduction
  Local Suppression
    MINUCs Introduced
    Minimizing the Number of Local Suppressions
    Minimizing the Number of Different Suppressed Categories
    Extended Local Suppression Models
    MINUCs and μ-ARGUS
  Global Recoding
    Free Global Recoding
    Precoded Global Recoding
  Global Recoding and Local Suppression Combined
Application of Perturbative SDC Techniques for Microdata
  Introduction
  Overview
  Adding Noise
  Rounding
    Univariate Deterministic Rounding
    Univariate Stochastic Rounding
    Multivariate Rounding
  Derivation of PRAM Matrices
    Preparations
    Model I: A Two-step Model
    Model II: A One-step Model
    Two-stage PRAM
    Construction of PRAM Matrices
  Some Comments on PRAM
  Data Swapping
  Adjustment Weights
    Disclosing Poststrata
    Disclosure for Multiplicative Weighting
    Disclosure Control for Poststrata
Disclosure Risk for Tabular Data
  Introduction
  Disclosure Risk for Tables of Magnitude Tables
    Linear Sensitivity Measures
    Dominance Rule
    Prior-posterior Rule
    Intruder's Knowledge of the Sensitivity Criterion Used
    Magnitude Tables from a Sample
  Disclosure Risk for Frequency Count Tables
    Frequency Count Tables Based on a Complete Enumeration
    Frequency Count Tables Based on Sample Data
  Linked Tables
  Protection Intervals for Sensitive Cells
  Sensitivity Rules for General Tables
Information Loss in Tabular Data
  Introduction
  Information Loss Based on Cell Weights
    Secondary Cell Suppression
    Rounding
    Table Redesign
  MSP for Tables
    Table Redesign
    Secondary Cell Suppression
    Rounding
  Entropy Considerations
    Some General Remarks
    Tabulation
    Cell Suppression
    Table Redesign
    Rounding
Application of Non-Perturbative Techniques for Tabular Data
  Introduction
  Table Redesign
  Cell Suppression
  Some Additional Cell Suppression Terminology
    The Zero-Extended Table
    Paths, Cycles and Their Cells
    Network Formulation for Two-dimensional Tables
  Hypercube Method
  Secondary Suppression as an LP-Problem
    The Underlying Idea
  Secondary Suppression as a MIP
    Lougee-Heimer's Model
    Kelly's Model
    Geurts' Model
    Fischetti and Salazar's Model
    Partial Cell Suppression
  Cell Suppression in Linked Tables
    Top-Down Approach
    Approach Based on MIP
  Cell Suppression in General Two-Dimensional Tables
  Cell Suppression in General Three-Dimensional Tables
  Comments on Cell Suppression
Application of Perturbative Techniques for Tabular Data
  Introduction
  Adding Noise
  Unrestricted Rounding
    Deterministic Rounding
    Stochastic Rounding
  Controlled Rounding
    Controlled Rounding in One-Dimensional Tables
    Controlled Rounding in Two-dimensional Tables
  Controlled Rounding by Means of Simulated Annealing
    Simulated Annealing
    Applying Simulated Annealing to the Controlled Rounding Problem
  Controlled Rounding as a MIP
    The Controlled Rounding Problem for Two-dimensional Tables
    The Controlled Rounding Problem for Three-dimensional Tables
  Linked Tables
    Rounding in Linked Tables
    Source Data Perturbation
References
