


Gaussian Processes for Machine Learning

by Rasmussen, Carl Edward; Williams, Christopher K. I.
  • ISBN13: 9780262182539
  • ISBN10: 026218253X
  • Format: Hardcover
  • Copyright: 2005-11-23
  • Publisher: The MIT Press

Note: Supplemental materials are not guaranteed with Rental or Used book purchases.

Purchase Benefits

  • Free Shipping On Orders Over $35!
    Your order must be $35 or more to qualify for free economy shipping. Bulk sales, POs, Marketplace items, eBooks and apparel do not qualify for this offer.
  • Get Rewarded for Ordering Your Textbooks!
  • We Buy This Book Back!
    In-Store Credit: $10.76
    Check/Direct Deposit: $10.25
    PayPal: $10.25
List Price: $53.33. Save up to $22.93.
  • Rent Book: $30.40 (Free Shipping)

    TERM | PRICE | DUE
    Usually ships in 3-5 business days.
    *This item is part of an exclusive publisher rental program and requires an additional convenience fee. This fee will be reflected in the shopping cart.

How To: Textbook Rental

Looking to rent a book? Rent Gaussian Processes for Machine Learning [ISBN: 9780262182539] for the semester, quarter, or short term, or search our site for other textbooks by Rasmussen, Carl Edward; Williams, Christopher K. I. Renting a textbook can save you up to 90% compared with the cost of buying.

Summary

Winner, 2009 DeGroot Prize for the best book in statistical science, awarded by the International Society for Bayesian Analysis. Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.
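
For readers who want a concrete sense of the subject, below is a minimal sketch of the kind of Gaussian process regression prediction the book develops (its regression chapter derives the predictive equations and a Cholesky-based algorithm). This is not the book's accompanying code; the squared-exponential kernel, its hyperparameters, the noise level, and the toy data are assumptions chosen only for illustration.

    import numpy as np

    def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
        """Squared-exponential (RBF) covariance function between two input sets."""
        sq_dists = (
            np.sum(X1 ** 2, axis=1)[:, None]
            + np.sum(X2 ** 2, axis=1)[None, :]
            - 2.0 * X1 @ X2.T
        )
        return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

    def gp_predict(X_train, y_train, X_test, noise_var=0.01):
        """Posterior mean and variance of a zero-mean GP regressor (Cholesky-based)."""
        K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
        K_star = rbf_kernel(X_train, X_test)
        L = np.linalg.cholesky(K)                      # K = L L^T
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
        mean = K_star.T @ alpha                        # predictive mean
        v = np.linalg.solve(L, K_star)
        var = rbf_kernel(X_test, X_test).diagonal() - np.sum(v ** 2, axis=0)
        return mean, var                               # variance of the latent function

    # Toy usage on synthetic 1-d data (purely illustrative).
    rng = np.random.default_rng(0)
    X = np.linspace(0.0, 5.0, 20)[:, None]
    y = np.sin(X).ravel() + 0.1 * rng.standard_normal(20)
    X_test = np.linspace(0.0, 5.0, 100)[:, None]
    mu, var = gp_predict(X, y, X_test)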

Author Biography

Carl Edward Rasmussen is a Lecturer at the Department of Engineering, University of Cambridge, and Adjunct Research Scientist at the Max Planck Institute for Biological Cybernetics, Tübingen.

Christopher K. I. Williams is Professor of Machine Learning and Director of the Institute for Adaptive and Neural Computation in the School of Informatics, University of Edinburgh.

Table of Contents

Series Foreword xi
Preface xiii
Symbols and Notation xvii
Introduction 1(6)
A Pictorial Introduction to Bayesian Modelling 3(2)
Roadmap 5(2)
Regression 7(26)
Weight-space View 7(6)
The Standard Linear Model 8(3)
Projections of Inputs into Feature Space 11(2)
Function-space View 13(6)
Varying the Hyperparameters 19(2)
Decision Theory for Regression 21(1)
An Example Application 22(2)
Smoothing, Weight Functions and Equivalent Kernels 24(3)
Incorporating Explicit Basis Functions 27(2)
Marginal Likelihood 29(1)
History and Related Work 29(1)
Exercises 30(3)
Classification 33(46)
Classification Problems 34(3)
Decision Theory for Classification 35(2)
Linear Models for Classification 37(2)
Gaussian Process Classification 39(2)
The Laplace Approximation for the Binary GP Classifier 41(7)
Posterior 42(2)
Predictions 44(1)
Implementation 45(2)
Marginal Likelihood 47(1)
Multi-class Laplace Approximation 48(4)
Implementation 51(1)
Expectation Propagation 52(8)
Predictions 56(1)
Marginal Likelihood 57(1)
Implementation 57(3)
Experiments 60(12)
A Toy Problem 60(2)
One-dimensional Example 62(1)
Binary Handwritten Digit Classification Example 63(7)
10-class Handwritten Digit Classification Example 70(2)
Discussion 72(2)
Appendix: Moment Derivations 74(1)
Exercises 75(4)
Covariance Functions 79(26)
Preliminaries 79(2)
Mean Square Continuity and Differentiability 81(1)
Examples of Covariance Functions 81(15)
Stationary Covariance Functions 82(7)
Dot Product Covariance Functions 89(1)
Other Non-stationary Covariance Functions 90(4)
Making New Kernels from Old 94(2)
Eigenfunction Analysis of Kernels 96(3)
An Analytic Example 97(1)
Numerical Approximation of Eigenfunctions 98(1)
Kernels for Non-vectorial Inputs 99(3)
String Kernels 100(1)
Fisher Kernels 101(1)
Exercises 102(3)
Model Selection and Adaptation of Hyperparameters 105(24)
The Model Selection Problem 106(2)
Bayesian Model Selection 108(3)
Cross-validation 111(1)
Model Selection for GP Regression 112(12)
Marginal Likelihood 112(4)
Cross-validation 116(2)
Examples and Discussion 118(6)
Model Selection for GP Classification 124(4)
Derivatives of the Marginal Likelihood for Laplace's Approximation 125(2)
Derivatives of the Marginal Likelihood for EP 127(1)
Cross-validation 127(1)
Example 128(1)
Exercises 128(1)
Relationships between GPs and Other Models 129(22)
Reproducing Kernel Hilbert Spaces 129(3)
Regularization 132(4)
Regularization Defined by Differential Operators 133(2)
Obtaining the Regularized Solution 135(1)
The Relationship of the Regularization View to Gaussian Process Prediction 135(1)
Spline Models 136(5)
A 1-d Gaussian Process Spline Construction 138(3)
Support Vector Machines 141(5)
Support Vector Classification 141(4)
Support Vector Regression 145(1)
Least-squares Classification 146(3)
Probabilistic Least-squares Classification 147(2)
Relevance Vector Machines 149(1)
Exercises 150(1)
Theoretical Perspectives 151(20)
The Equivalent Kernel 151(4)
Some Specific Examples of Equivalent Kernels 153(2)
Asymptotic Analysis 155(4)
Consistency 155(2)
Equivalence and Orthogonality 157(2)
Average-case Learning Curves 159(2)
PAC-Bayesian Analysis 161(4)
The PAC Framework 162(1)
PAC-Bayesian Analysis 163(1)
PAC-Bayesian Analysis of GP Classification 164(1)
Comparison with Other Supervised Learning Methods 165(3)
Appendix: Learning Curve for the Ornstein-Uhlenbeck Process 168(1)
Exercises 169(2)
Approximation Methods for Large Datasets 171(18)
Reduced-rank Approximations of the Gram Matrix 171(3)
Greedy Approximation 174(1)
Approximations for GPR with Fixed Hyperparameters 175(10)
Subset of Regressors 175(2)
The Nyström Method 177(1)
Subset of Datapoints 177(1)
Projected Process Approximation 178(2)
Bayesian Committee Machine 180(1)
Iterative Solution of Linear Systems 181(1)
Comparison of Approximate GPR Methods 182(3)
Approximations for GPC with Fixed Hyperparameters 185(1)
Approximating the Marginal Likelihood and its Derivatives 185(2)
Appendix: Equivalence of SR and GPR Using the Nyström Approximate Kernel 187(1)
Exercises 187(2)
Further Issues and Conclusions 189(10)
Multiple Outputs 190(1)
Noise Models with Dependencies 190(1)
Non-Gaussian Likelihoods 191(1)
Derivative Observations 191(1)
Prediction with Uncertain Inputs 192(1)
Mixtures of Gaussian Processes 192(1)
Global Optimization 193(1)
Evaluation of Integrals 193(1)
Student's t Process 194(1)
Invariances 194(2)
Latent Variable Models 196(1)
Conclusions and Future Directions 196(3)
Appendix A Mathematical Background 199(8)
Joint, Marginal and Conditional Probability 199(1)
Gaussian Identities 200(1)
Matrix Identities 201(1)
Matrix Derivatives 202(1)
Matrix Norms 202(1)
Cholesky Decomposition 202(1)
Entropy and Kullback-Leibler Divergence 203(1)
Limits 204(1)
Measure and Integration 204(1)
Lp Spaces 205(1)
Fourier Transforms 205(1)
Convexity 206(1)
Appendix B Gaussian Markov Processes 207(14)
Fourier Analysis 208(3)
Sampling and Periodization 209(2)
Continuous-time Gaussian Markov Processes 211(3)
Continuous-time GMPs on R 211(2)
The Solution of the Corresponding SDE on the Circle 213(1)
Discrete-time Gaussian Markov Processes 214(3)
Discrete-time GMPs on Z 214(1)
The Solution of the Corresponding Difference Equation on PN 215(2)
The Relationship Between Discrete-time and Sampled Continuous-time GMPs 217(1)
Markov Processes in Higher Dimensions 218(3)
Appendix C Datasets and Code 221(2)
Bibliography 223(16)
Author Index 239(6)
Subject Index 245

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.
