Amazon no longer offers textbook rentals. We do!

We're the #1 textbook rental company. Let us show you why.

The Elements of Statistical Learning

by Trevor Hastie; Robert Tibshirani; Jerome Friedman
  • ISBN13: 9780387952840
  • ISBN10: 0387952845

  • Format: Hardcover
  • Copyright: 2001-10-01
  • Publisher: Springer Verlag
  • Purchase Benefits
  • Free Shipping On Orders Over $35!
    Your order must be $35 or more to qualify for free economy shipping. Bulk sales, POs, Marketplace items, eBooks and apparel do not qualify for this offer.
  • Get Rewarded for Ordering Your Textbooks! Enroll Now
List Price: $104.00
Save up to $73.96
  • Digital
    $65.08
    Add to Cart

Summary

During the past decade there has been an explosion in computation and information technology. With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry.

The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book.

Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

Table of Contents

Preface vii
Introduction 1(8)
Overview of Supervised Learning 9(32)
Introduction 9(1)
Variable Types and Terminology 9(2)
Two Simple Approaches to Prediction: Least Squares and Nearest Neighbors 11(7)
Linear Models and Least Squares 11(3)
Nearest-Neighbor Methods 14(2)
From Least Squares to Nearest Neighbors 16(2)
Statistical Decision Theory 18(4)
Local Methods in High Dimensions 22(6)
Statistical Models, Supervised Learning and Function Approximation 28(4)
A Statistical Model for the Joint Distribution Pr(X, Y) 28(1)
Supervised Learning 29(1)
Function Approximation 29(3)
Structured Regression Models 32(1)
Difficulty of the Problem 32(1)
Classes of Restricted Estimators 33(4)
Roughness Penalty and Bayesian Methods 34(1)
Kernel Methods and Local Regression 34(1)
Basis Functions and Dictionary Methods 35(2)
Model Selection and the Bias-Variance Tradeoff 37(4)
Bibliographic Notes 39(1)
Exercises 39(2)
Linear Methods for Regression 41(38)
Introduction 41(1)
Linear Regression Models and Least Squares 42(8)
Example: Prostate Cancer 47(2)
The Gauss-Markov Theorem 49(1)
Multiple Regression from Simple Univariate Regression 50(5)
Multiple Outputs 54(1)
Subset Selection and Coefficient Shrinkage 55(20)
Subset Selection 55(2)
Prostate Cancer Data Example (Continued) 57(2)
Shrinkage Methods 59(7)
Methods Using Derived Input Directions 66(2)
Discussion: A Comparison of the Selection and Shrinkage Methods 68(5)
Multiple Outcome Shrinkage and Selection 73(2)
Computational Considerations 75(4)
Bibliographic Notes 75(1)
Exercises 75(4)
Linear Methods for Classification 79(36)
Introduction 79(2)
Linear Regression of an Indicator Matrix 81(3)
Linear Discriminant Analysis 84(11)
Regularized Discriminant Analysis 90(1)
Computations for LDA 91(1)
Reduced-Rank Linear Discriminant Analysis 91(4)
Logistic Regression 95(10)
Fitting Logistic Regression Models 98(2)
Example: South African Heart Disease 100(2)
Quadratic Approximations and Inference 102(1)
Logistic Regression or LDA? 103(2)
Separating Hyperplanes 105(10)
Rosenblatt's Perceptron Learning Algorithm 107(1)
Optimal Separating Hyperplanes 108(3)
Bibliographic Notes 111(1)
Exercises 111(4)
Basis Expansions and Regularization 115(50)
Introduction 115(2)
Piecewise Polynomials and Splines 117(9)
Natural Cubic Splines 120(2)
Example: South African Heart Disease (Continued) 122(2)
Example: Phoneme Recognition 124(2)
Filtering and Feature Extraction 126(1)
Smoothing Splines 127(7)
Degrees of Freedom and Smoother Matrices 129(5)
Automatic Selection of the Smoothing Parameters 134(3)
Fixing the Degrees of Freedom 134(1)
The Bias-Variance Tradeoff 134(3)
Nonparametric Logistic Regression 137(1)
Multidimensional Splines 138(6)
Regularization and Reproducing Kernel Hilbert Spaces 144(4)
Spaces of Functions Generated by Kernels 144(2)
Examples of RKHS 146(2)
Wavelet Smoothing 148(17)
Wavelet Bases and the Wavelet Transform 150(3)
Adaptive Wavelet Filtering 153(2)
Bibliographic Notes 155(1)
Exercises 155(5)
Appendix: Computational Considerations for Splines 160(1)
Appendix: B-splines 160(3)
Appendix: Computations for Smoothing Splines 163(2)
Kernel Methods 165(28)
One-Dimensional Kernel Smoothers 165(7)
Local Linear Regression 168(3)
Local Polynomial Regression 171(1)
Selecting the Width of the Kernel 172(2)
Local Regression in ℝ^p 174(1)
Structured Local Regression Models in ℝ^p 175(4)
Structured Kernels 177(1)
Structured Regression Functions 177(2)
Local Likelihood and Other Models 179(3)
Kernel Density Estimation and Classification 182(4)
Kernel Density Estimation 182(2)
Kernel Density Classification 184(1)
The Naive Bayes Classifier 184(2)
Radial Basis Functions and Kernels 186(2)
Mixture Models for Density Estimation and Classification 188(2)
Computational Considerations 190(3)
Bibliographic Notes 190(1)
Exercises 190(3)
Model Assessment and Selection 193(32)
Introduction 193(1)
Bias, Variance and Model Complexity 193(3)
The Bias-Variance Decomposition 196(4)
Example: Bias-Variance Tradeoff 198(2)
Optimism of the Training Error Rate 200(3)
Estimates of In-Sample Prediction Error 203(2)
The Effective Number of Parameters 205(1)
The Bayesian Approach and BIC 206(2)
Minimum Description Length 208(2)
Vapnik-Chervonenkis Dimension 210(4)
Example (Continued) 212(2)
Cross-Validation 214(3)
Bootstrap Methods 217(8)
Example (Continued) 220(2)
Bibliographic Notes 222(1)
Exercises 222(3)
Model Inference and Averaging 225(32)
Introduction 225(1)
The Bootstrap and Maximum Likelihood Methods 225(6)
A Smoothing Example 225(4)
Maximum Likelihood Inference 229(2)
Bootstrap versus Maximum Likelihood 231(1)
Bayesian Methods 231(4)
Relationship Between the Bootstrap and Bayesian Inference 235(1)
The EM Algorithm 236(7)
Two-Component Mixture Model 236(4)
The EM Algorithm in General 240(1)
EM as a Maximization-Maximization Procedure 241(2)
MCMC for Sampling from the Posterior 243(3)
Bagging 246(4)
Example: Trees with Simulated Data 247(3)
Model Averaging and Stacking 250(3)
Stochastic Search: Bumping 253(4)
Bibliographic Notes 254(1)
Exercises 255(2)
Additive Models, Trees, and Related Methods 257(42)
Generalized Additive Models 257(9)
Fitting Additive Models 259(2)
Example: Additive Logistic Regression 261(5)
Summary 266(1)
Tree-Based Methods 266(13)
Background 266(1)
Regression Trees 267(3)
Classification Trees 270(2)
Other Issues 272(3)
Spam Example (Continued) 275(4)
PRIM: Bump Hunting 279(4)
Spam Example (Continued) 282(1)
MARS: Multivariate Adaptive Regression Splines 283(7)
Spam Example (Continued) 287(1)
Example (Simulated Data) 288(1)
Other Issues 289(1)
Hierarchical Mixtures of Experts 290(3)
Missing Data 293(2)
Computational Considerations 295(4)
Bibliographic Notes 295(1)
Exercises 296(3)
Boosting and Additive Trees 299(48)
Boosting Methods 299(4)
Outline of this Chapter 302(1)
Boosting Fits an Additive Model 303(1)
Forward Stagewise Additive Modeling 304(1)
Exponential Loss and AdaBoost 305(1)
Why Exponential Loss? 306(2)
Loss Functions and Robustness 308(4)
"Off-the-Shelf" Procedures for Data Mining 312(2)
Example: Spam Data 314(2)
Boosting Trees 316(3)
Numerical Optimization 319(4)
Steepest Descent 320(1)
Gradient Boosting 320(2)
MART 322(1)
Right-Sized Trees for Boosting 323(1)
Regularization 324(7)
Shrinkage 326(2)
Penalized Regression 328(2)
Virtues of the L1 Penalty (Lasso) over L2 330(1)
Interpretation 331(4)
Relative Importance of Predictor Variables 331(2)
Partial Dependence Plots 333(2)
Illustrations 335(12)
California Housing 335(4)
Demographics Data 339(1)
Bibliographic Notes 340(4)
Exercises 344(3)
Neural Networks 347(24)
Introduction 347(1)
Projection Pursuit Regression 347(3)
Neural Networks 350(3)
Fitting Neural Networks 353(2)
Some Issues in Training Neural Networks 355(4)
Starting Values 355(1)
Overfitting 356(2)
Scaling of the Inputs 358(1)
Number of Hidden Units and Layers 358(1)
Multiple Minima 359(1)
Example: Simulated Data 359(3)
Example: ZIP Code Data 362(4)
Discussion 366(1)
Computational Considerations 367(4)
Bibliographic Notes 367(2)
Exercises 369(2)
Support Vector Machines and Flexible Discriminants 371(40)
Introduction 371(1)
The Support Vector Classifier 371(6)
Computing the Support Vector Classifier 373(2)
Mixture Example (Continued) 375(2)
Support Vector Machines 377(13)
Computing the SVM for Classification 377(3)
The SVM as a Penalization Method 380(1)
Function Estimation and Reproducing Kernels 381(3)
SVMs and the Curse of Dimensionality 384(1)
Support Vector Machines for Regression 385(2)
Regression and Kernels 387(2)
Discussion 389(1)
Generalizing Linear Discriminant Analysis 390(1)
Flexible Discriminant Analysis 391(6)
Computing the FDA Estimates 394(3)
Penalized Discriminant Analysis 397(2)
Mixture Discriminant Analysis 399(12)
Example: Waveform Data 402(4)
Bibliographic Notes 406(1)
Exercises 406(5)
Prototype Methods and Nearest-Neighbors 411(26)
Introduction 411(1)
Prototype Methods 411(4)
K-means Clustering 412(2)
Learning Vector Quantization 414(1)
Gaussian Mixtures 415(1)
k-Nearest-Neighbor Classifiers 415(12)
Example: A Comparative Study 420(2)
Example: k-Nearest-Neighbors and Image Scene Classification 422(1)
Invariant Metrics and Tangent Distance 423(4)
Adaptive Nearest-Neighbor Methods 427(5)
Example 430(1)
Global Dimension Reduction for Nearest-Neighbors 431(1)
Computational Considerations 432(5)
Bibliographic Notes 433(1)
Exercises 433(4)
Unsupervised Learning 437(72)
Introduction 437(2)
Association Rules 439(14)
Market Basket Analysis 440(1)
The Apriori Algorithm 441(3)
Example: Market Basket Analysis 444(3)
Unsupervised as Supervised Learning 447(2)
Generalized Association Rules 449(2)
Choice of Supervised Learning Method 451(1)
Example: Market Basket Analysis (Continued) 451(2)
Cluster Analysis 453(27)
Proximity Matrices 455(1)
Dissimilarities Based on Attributes 455(2)
Object Dissimilarity 457(2)
Clustering Algorithms 459(1)
Combinatorial Algorithms 460(1)
K-means 461(2)
Gaussian Mixtures as Soft K-means Clustering 463(1)
Example: Human Tumor Microarray Data 463(3)
Vector Quantization 466(2)
K-medoids 468(2)
Practical Issues 470(2)
Hierarchical Clustering 472(8)
Self-Organizing Maps 480(5)
Principal Components, Curves and Surfaces 485(9)
Principal Components 485(6)
Principal Curves and Surfaces 491(3)
Independent Component Analysis and Exploratory Projection Pursuit 494(8)
Latent Variables and Factor Analysis 494(2)
Independent Component Analysis 496(4)
Exploratory Projection Pursuit 500(1)
A Different Approach to ICA 500(2)
Multidimensional Scaling 502(7)
Bibliographic Notes 503(1)
Exercises 504(5)
References 509(14)
Author Index 523(4)
Index 527

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.
