Foreword | p. xi |
Preface | p. xiii |
Introduction | |
Introduction and Problem Formulation | p. 3 |
Machine Learning under Covariate Shift | p. 3 |
Quick Tour of Covariate Shift Adaptation | p. 5 |
Problem Formulation | p. 7 |
Function Learning from Examples | p. 7 |
Loss Functions | p. 8 |
Generalization Error | p. 9 |
Covariate Shift | p. 9 |
Models for Function Learning | p. 10 |
Specification of Models | p. 13 |
Structure of This Book | p. 14 |
Part II: Learning under Covariate Shift | p. 14 |
Part III: Learning Causing Covariate Shift | p. 17 |
Learning under Covariate Shift | |
Function Approximation | p. 21 |
Importance-Weighting Techniques for Covariate Shift Adaptation | p. 22 |
Importance-Weighted ERM | p. 22 |
Adaptive IWERM | p. 23 |
Regularized IWERM | p. 23 |
Examples of Importance-Weighted Regression Methods | p. 25 |
Squared Loss: Least-Squares Regression | p. 26 |
Absolute Loss: Least-Absolute Regression | p. 30 |
Huber Loss: Huber Regression | p. 31 |
Deadzone-Linear Loss: Support Vector Regression | p. 33 |
Examples of Importance-Weighted Classification Methods | p. 35 |
Squared Loss: Fisher Discriminant Analysis | p. 36 |
Logistic Loss: Logistic Regression Classifier | p. 38 |
Hinge Loss: Support Vector Machine | p. 39 |
Exponential Loss: Boosting | p. 40 |
Numerical Examples | p. 40 |
Regression | p. 40 |
Classification | p. 41 |
Summary and Discussion | p. 45 |
Model Selection | p. 47 |
Importance-Weighted Akaike Information Criterion | p. 47 |
Importance-Weighted Subspace Information Criterion | p. 50 |
Input Dependence vs. Input Independence in Generalization Error Analysis | p. 51 |
Approximately Correct Models | p. 53 |
Input-Dependent Analysis of Generalization Error | p. 54 |
Importance-Weighted Cross-Validation | p. 64 |
Numerical Examples | p. 66 |
Regression | p. 66 |
Classification | p. 69 |
Summary and Discussion | p. 70 |
Importance Estimation | p. 73 |
Kernel Density Estimation | p. 73 |
Kernel Mean Matching | p. 75 |
Logistic Regression | p. 76 |
Kullback-Leibler Importance Estimation Procedure | p. 78 |
Algorithm | p. 78 |
Model Selection by Cross-Validation | p. 81 |
Basis Function Design | p. 82 |
Least-Squares Importance Fitting | p. 83 |
Algorithm | p. 83 |
Basis Function Design and Model Selection | p. 84 |
Regularization Path Tracking | p. 85 |
Unconstrained Least-Squares Importance Fitting | p. 87 |
Algorithm | p. 87 |
Analytic Computation of Leave-One-Out Cross-Validation | p. 88 |
Numerical Examples | p. 88 |
Setting | p. 90 |
Importance Estimation by KLIEP | p. 90 |
Covariate Shift Adaptation by IWLS and IWCV | p. 92 |
Experimental Comparison | p. 94 |
Summary | p. 101 |
Direct Density-Ratio Estimation with Dimensionality Reduction | p. 103 |
Density Difference in Hetero-Distributional Subspace | p. 103 |
Characterization of Hetero-Distributional Subspace | p. 104 |
Identifying Hetero-Distributional Subspace | p. 106 |
Basic Idea | p. 106 |
Fisher Discriminant Analysis | p. 108 |
Local Fisher Discriminant Analysis | p. 109 |
Using LFDA for Finding Hetero-Distributional Subspace | p. 112 |
Density-Ratio Estimation in the Hetero-Distributional Subspace | p. 113 |
Numerical Examples | p. 113 |
Illustrative Example | p. 113 |
Performance Comparison Using Artificial Data Sets | p. 117 |
Summary | p. 121 |
Relation to Sample Selection Bias | p. 125 |
Heckman's Sample Selection Model | p. 125 |
Distributional Change and Sample Selection Bias | p. 129 |
The Two-Step Algorithm | p. 131 |
Relation to Covariate Shift Approach | p. 134 |
Applications of Covariate Shift Adaptation | p. 137 |
Brain-Computer Interface | p. 137 |
Background | p. 137 |
Experimental Setup | p. 138 |
Experimental Results | p. 140 |
Speaker Identification | p. 142 |
Background | p. 142 |
Formulation | p. 142 |
Experimental Results | p. 144 |
Natural Language Processing | p. 149 |
Formulation | p. 149 |
Experimental Results | p. 151 |
Perceived Age Prediction from Face Images | p. 152 |
Background | p. 152 |
Formulation | p. 153 |
Incorporating Characteristics of Human Age Perception | p. 153 |
Experimental Results | p. 155 |
Human Activity Recognition from Accelerometric Data | p. 157 |
Background | p. 157 |
Importance-Weighted Least-Squares Probabilistic Classifier | p. 157 |
Experimental Results | p. 160 |
Sample Reuse in Reinforcement Learning | p. 165 |
Markov Decision Problems | p. 165 |
Policy Iteration | p. 166 |
Value Function Approximation | p. 167 |
Sample Reuse by Covariate Shift Adaptation | p. 168 |
On-Policy vs. Off-Policy | p. 169 |
Importance Weighting in Value Function Approximation | p. 170 |
Automatic Selection of the Flattening Parameter | p. 174 |
Sample Reuse Policy Iteration | p. 175 |
Robot Control Experiments | p. 176 |
Learning Causing Covariate Shift | |
Active Learning | p. 183 |
Preliminaries | p. 183 |
Setup | p. 183 |
Decomposition of Generalization Error | p. 185 |
Basic Strategy of Active Learning | p. 188 |
Population-Based Active Learning Methods | p. 188 |
Classical Method of Active Learning for Correct Models | p. 189 |
Limitations of Classical Approach and Countermeasures | p. 190 |
Input-Independent Variance-Only Method | p. 191 |
Input-Dependent Variance-Only Method | p. 193 |
Input-Independent Bias-and-Variance Approach | p. 195 |
Numerical Examples of Population-Based Active Learning Methods | p. 198 |
Setup | p. 198 |
Accuracy of Generalization Error Estimation | p. 200 |
Obtained Generalization Error | p. 202 |
Pool-Based Active Learning Methods | p. 204 |
Classical Active Learning Method for Correct Models and Its Limitations | p. 204 |
Input-Independent Variance-Only Method | p. 205 |
Input-Dependent Variance-Only Method | p. 206 |
Input-Independent Bias-and-Variance Approach | p. 207 |
Numerical Examples of Pool-Based Active Learning Methods | p. 209 |
Summary and Discussion | p. 212 |
Active Learning with Model Selection | p. 215 |
Direct Approach and the Active Learning/Model Selection Dilemma | p. 215 |
Sequential Approach | p. 216 |
Batch Approach | p. 218 |
Ensemble Active Learning | p. 219 |
Numerical Examples | p. 220 |
Setting | p. 220 |
Analysis of Batch Approach | p. 221 |
Analysis of Sequential Approach | p. 222 |
Comparison of Obtained Generalization Error | p. 222 |
Summary and Discussion | p. 223 |
Applications of Active Learning | p. 225 |
Design of Efficient Exploration Strategies in Reinforcement Learning | p. 225 |
Efficient Exploration with Active Learning | p. 225 |
Reinforcement Learning Revisited | p. 226 |
Decomposition of Generalization Error | p. 228 |
Estimating Generalization Error for Active Learning | p. 229 |
Designing Sampling Policies | p. 230 |
Active Learning in Policy Iteration | p. 231 |
Robot Control Experiments | p. 232 |
Wafer Alignment in Semiconductor Exposure Apparatus | p. 234 |
Conclusions | |
Conclusions and Future Prospects | p. 241 |
Conclusions | p. 241 |
Future Prospects | p. 242 |
Appendix: List of Symbols and Abbreviations | p. 243 |
Bibliography | p. 247 |
Index | p. 259 |