Foreword |
Preface |
Introduction / I: |
Introduction and Problem Formulation / 1: |
Machine Learning under Covariate Shift / 1.1: |
Quick Tour of Covariate Shift Adaptation / 1.2: |
Problem Formulation / 1.3: |
Function Learning from Examples / 1.3.1: |
Loss Functions / 1.3.2: |
Generalization Error / 1.3.3: |
Covariate Shift / 1.3.4: |
Models for Function Learning / 1.3.5: |
Specification of Models / 1.3.6: |
Structure of This Book / 1.4: |
Part II: Learning under Covariate Shift / 1.4.1: |
Part III: Learning Causing Covariate Shift / 1.4.2: |
Learning Under Covariate Shift / II: |
Function Approximation / 2: |
Importance-Weighting Techniques for Covariate Shift Adaptation / 2.1: |
Importance-Weighted ERM / 2.1.1: |
Adaptive IWERM / 2.1.2: |
Regularized IWERM / 2.1.3: |
Examples of Importance-Weighted Regression Methods / 2.2: |
Squared Loss: Least-Squares Regression / 2.2.1: |
Absolute Loss: Least-Absolute Regression / 2.2.2: |
Huber Loss: Huber Regression / 2.2.3: |
Deadzone-Linear Loss: Support Vector Regression / 2.2.4: |
Examples of Importance-Weighted Classification Methods / 2.3: |
Squared Loss: Fisher Discriminant Analysis / 2.3.1: |
Logistic Loss: Logistic Regression Classifier / 2.3.2: |
Hinge Loss: Support Vector Machine / 2.3.3: |
Exponential Loss: Boosting / 2.3.4: |
Numerical Examples / 2.4: |
Regression / 2.4.1: |
Classification / 2.4.2: |
Summary and Discussion / 2.5: |
Model Selection / 3: |
Importance-Weighted Akaike Information Criterion / 3.1: |
Importance-Weighted Subspace Information Criterion / 3.2: |
Input Dependence vs. Input Independence in Generalization Error Analysis / 3.2.1: |
Approximately Correct Models / 3.2.2: |
Input-Dependent Analysis of Generalization Error / 3.2.3: |
Importance-Weighted Cross-Validation / 3.3: |
Importance Estimation / 4: |
Kernel Density Estimation / 4.1: |
Kernel Mean Matching / 4.2: |
Logistic Regression / 4.3: |
Kullback-Leibler Importance Estimation Procedure / 4.4: |
Algorithm / 4.4.1: |
Model Selection by Cross-Validation / 4.4.2: |
Basis Function Design / 4.4.3: |
Least-Squares Importance Fitting / 4.5: |
Basis Function Design and Model Selection / 4.5.1: |
Regularization Path Tracking / 4.5.3: |
Unconstrained Least-Squares Importance Fitting / 4.6: |
Analytic Computation of Leave-One-Out Cross-Validation / 4.6.1: |
Setting / 4.7.1: |
Importance Estimation by KLIEP / 4.7.2: |
Covariate Shift Adaptation by IWLS and IWCV / 4.7.3: |
Experimental Comparison / 4.8: |
Summary / 4.9: |
Direct Density-Ratio Estimation with Dimensionality Reduction / 5: |
Density Difference in Hetero-Distributional Subspace / 5.1: |
Characterization of Hetero-Distributional Subspace / 5.2: |
Identifying Hetero-Distributional Subspace / 5.3: |
Basic Idea / 5.3.1: |
Fisher Discriminant Analysis / 5.3.2: |
Local Fisher Discriminant Analysis / 5.3.3: |
Using LFDA for Finding Hetero-Distributional Subspace / 5.4: |
Density-Ratio Estimation in the Hetero-Distributional Subspace / 5.5: |
Illustrative Example / 5.6: |
Performance Comparison Using Artificial Data Sets / 5.6.2: |
Relation to Sample Selection Bias / 6: |
Heckman's Sample Selection Model / 6.1: |
Distributional Change and Sample Selection Bias / 6.2: |
The Two-Step Algorithm / 6.3: |
Relation to Covariate Shift Approach / 6.4: |
Applications of Covariate Shift Adaptation / 7: |
Brain-Computer Interface / 7.1: |
Background / 7.1.1: |
Experimental Setup / 7.1.2: |
Experimental Results / 7.1.3: |
Speaker Identification / 7.2: |
Formulation / 7.2.1: |
Natural Language Processing / 7.2.3: |
Perceived Age Prediction from Face Images / 7.3.1: |
Incorporating Characteristics of Human Age Perception / 7.4.1: |
Human Activity Recognition from Accelerometric Data / 7.4.4: |
Importance-Weighted Least-Squares Probabilistic Classifier / 7.5.1: |
Experimental Results / 7.5.3: |
Sample Reuse in Reinforcement Learning / 7.6: |
Markov Decision Problems / 7.6.1: |
Policy Iteration / 7.6.2: |
Value Function Approximation / 7.6.3: |
Sample Reuse by Covariate Shift Adaptation / 7.6.4: |
On-Policy vs. Off-Policy / 7.6.5: |
Importance Weighting in Value Function Approximation / 7.6.6: |
Automatic Selection of the Flattening Parameter / 7.6.7: |
Sample Reuse Policy Iteration / 7.6.8: |
Robot Control Experiments / 7.6.9: |
Learning Causing Covariate Shift / III: |
Active Learning / 8: |
Preliminaries / 8.1: |
Setup / 8.1.1: |
Decomposition of Generalization Error / 8.1.2: |
Basic Strategy of Active Learning / 8.1.3: |
Population-Based Active Learning Methods / 8.2: |
Classical Method of Active Learning for Correct Models / 8.2.1: |
Limitations of Classical Approach and Countermeasures / 8.2.2: |
Input-Independent Variance-Only Method / 8.2.3: |
Input-Dependent Variance-Only Method / 8.2.4: |
Input-Independent Bias-and-Variance Approach / 8.2.5: |
Numerical Examples of Population-Based Active Learning Methods / 8.3: |
Accuracy of Generalization Error Estimation / 8.3.1: |
Obtained Generalization Error / 8.3.3: |
Pool-Based Active Learning Methods / 8.4: |
Classical Active Learning Method for Correct Models and Its Limitations / 8.4.1: |
Numerical Examples of Pool-Based Active Learning Methods / 8.4.2: |
Active Learning with Model Selection / 9: |
Direct Approach and the Active Learning/Model Selection Dilemma / 9.1: |
Sequential Approach / 9.2: |
Batch Approach / 9.3: |
Ensemble Active Learning / 9.4: |
Analysis of Batch Approach / 9.5: |
Analysis of Sequential Approach / 9.5.3: |
Comparison of Obtained Generalization Error / 9.5.4: |
Applications of Active Learning / 10: |
Design of Efficient Exploration Strategies in Reinforcement Learning / 10.1: |
Efficient Exploration with Active Learning / 10.1.1: |
Reinforcement Learning Revisited / 10.1.2: |
Estimating Generalization Error for Active Learning / 10.1.3: |
Designing Sampling Policies / 10.1.5: |
Active Learning in Policy Iteration / 10.1.6: |
Wafer Alignment in Semiconductor Exposure Apparatus / 10.1.7: |
Conclusions / IV: |
Conclusions and Future Prospects / 11: |
Future Prospects / 11.1: |
Appendix: List of Symbols and Abbreviations |
Bibliography |
Index |