Principles and Theory for Data Mining and Machine Learning

by Bertrand Clarke; Ernest Fokoué; Hao Helen Zhang
  • ISBN13: 9780387981345
  • ISBN10: 0387981349
  • Format: Hardcover
  • Copyright: 2009-07-01
  • Publisher: Springer Nature
List Price: $219.99

Summary

This book is a thorough introduction to the most important topics in data mining and machine learning. It begins with a detailed review of classical function estimation and proceeds with chapters on nonlinear regression, classification, and ensemble methods. The final chapters focus on clustering, dimension reduction, variable selection, and multiple comparisons. All of these topics have undergone extraordinarily rapid development in recent years, and this treatment offers a modern perspective emphasizing the most recent contributions. The presentation of foundational results is detailed and includes many accessible proofs not readily available outside original sources. While the orientation is conceptual and theoretical, the main points are regularly reinforced by computational comparisons.

Intended primarily as a graduate-level textbook for statistics, computer science, and electrical engineering students, the book assumes only a strong foundation in undergraduate statistics and mathematics, and facility with R packages. The text offers a wide variety of problems, many of an exploratory nature, along with numerous computed examples, complete with code, so that further computations can be carried out readily. The book also serves as a handbook for researchers who want a conceptual overview of the central topics in data mining and machine learning.

Author Biography

Bertrand Clarke is a Professor of Statistics in the Department of Medicine, the Department of Epidemiology and Public Health, and the Center for Computational Sciences at the University of Miami. He has served on the editorial boards of the Journal of the American Statistical Association, the Journal of Statistical Planning and Inference, and Statistical Papers. He is co-winner, with Andrew Barron, of the 1990 Browder J. Thompson Prize from the Institute of Electrical and Electronics Engineers.

Ernest Fokoué is an Assistant Professor of Statistics at Kettering University. He has also taught at Ohio State University and was a long-term visitor at the Statistical and Applied Mathematical Sciences Institute (SAMSI), where he was a Postdoctoral Research Fellow in the Data Mining and Machine Learning Program. In 2000, he won the Young Researcher Award from the International Association for Statistical Computing.

Hao Helen Zhang is an Associate Professor of Statistics in the Department of Statistics at North Carolina State University. In 2003-2004 she was a Research Fellow at SAMSI, and in 2007 she won a Faculty Early Career Development Award from the National Science Foundation. She is on the editorial boards of the Journal of the American Statistical Association and Biometrics.

Table of Contents

Preface
Variability, Information, and Prediction
The Curse of Dimensionality
The Two Extremes
Perspectives on the Curse
Sparsity
Exploding Numbers of Models
Multicollinearity and Concurvity
The Effect of Noise
Coping with the Curse
Selecting Design Points
Local Dimension
Parsimony
Two Techniques
The Bootstrap
Cross-Validation
Optimization and Search
Univariate Search
Multivariate Search
General Searches
Constraint Satisfaction and Combinatorial Search
Notes
Hammersley Points
Edgeworth Expansions for the Mean
Bootstrap Asymptotics for the Studentized Mean
Exercises
Local Smoothers
Early Smoothers
Transition to Classical Smoothers
Global Versus Local Approximations
LOESS
Kernel Smoothers
Statistical Function Approximation
The Concept of Kernel Methods and the Discrete Case
Kernels and Stochastic Designs: Density Estimation
Stochastic Designs: Asymptotics for Kernel Smoothers
Convergence Theorems and Rates for Kernel Smoothers
Kernel and Bandwidth Selection
Linear Smoothers
Nearest Neighbors
Applications of Kernel Regression
A Simulated Example
Ethanol Data
Exercises
Spline Smoothing
Interpolating Splines
Natural Cubic Splines
Smoothing Splines for Regression
Model Selection for Spline Smoothing
Spline Smoothing Meets Kernel Smoothing
Asymptotic Bias, Variance, and MISE for Spline Smoothers
Ethanol Data Example - Continued
Splines Redux: Hilbert Space Formulation
Reproducing Kernels
Constructing an RKHS
Direct Sum Construction for Splines
Explicit Forms
Nonparametrics in Data Mining and Machine Learning
Simulated Comparisons
What Happens with Dependent Noise Models?
Higher Dimensions and the Curse of Dimensionality
Notes
Sobolev Spaces: Definition
Exercises
New Wave Nonparametrics
Additive Models
The Backfitting Algorithm
Concurvity and Inference
Nonparametric Optimality
Generalized Additive Models
Projection Pursuit Regression
Neural Networks
Backpropagation and Inference
Barron's Result and the Curse
Approximation Properties
Barron's Theorem: Formal Statement
Recursive Partitioning Regression
Growing Trees
Pruning and Selection
Regression
Bayesian Additive Regression Trees: BART
MARS
Sliced Inverse Regression
ACE and AVAS
Notes
Proof of Barron's Theorem
Exercises
Supervised Learning: Partition Methods
Multiclass Learning
Discriminant Analysis
Distance-Based Discriminant Analysis
Bayes Rules
Probability-Based Discriminant Analysis
Tree-Based Classifiers
Splitting Rules
Logic Trees
Random Forests
Support Vector Machines
Margins and Distances
Binary Classification and Risk
Prediction Bounds for Function Classes
Constructing SVM Classifiers
SVM Classification for Nonlinearly Separable Populations
SVMs in the General Nonlinear Case
Some Kernels Used in SVM Classification
Kernel Choice, SVMs and Model Selection
Support Vector Regression
Multiclass Support Vector Machines
Neural Networks
Notes
Hoeffding's Inequality
VC Dimension
Exercises
Alternative Nonparametrics
Ensemble Methods
Bayes Model Averaging
Bagging
Stacking
Boosting
Other Averaging Methods
Oracle Inequalities
Bayes Nonparametrics
Dirichlet Process Priors
Polya Tree Priors
Gaussian Process Priors
The Relevance Vector Machine
RVM Regression: Formal Description
RVM Classification
Hidden Markov Models - Sequential Classification
Notes
Proof of Yang's Oracle Inequality
Proof of Lecue's Oracle Inequality
Exercises
Computational Comparisons
Computational Results: Classification
Comparison on Fisher's Iris Data
Comparison on Ripley's Data
Computational Results: Regression
Vapnik's sinc Function
Friedman's Function
Conclusions
Systematic Simulation Study
No Free Lunch
Exercises
Unsupervised Learning: Clustering
Centroid-Based Clustering
K-Means Clustering
Variants
Hierarchical Clustering
Agglomerative Hierarchical Clustering
Divisive Hierarchical Clustering
Theory for Hierarchical Clustering
Partitional Clustering
Model-Based Clustering
Graph-Theoretic Clustering
Spectral Clustering
Bayesian Clustering
Probabilistic Clustering
Hypothesis Testing
Computed Examples
Ripley's Data
Iris Data
Cluster Validation
Notes
Derivatives of Functions of a Matrix
Kruskal's Algorithm: Proof
Prim's Algorithm: Proof
Exercises
Learning in High Dimensions
Principal Components
Main Theorem
Key Properties
Extensions
Factor Analysis
Finding Λ and Ψ
Finding K
Estimating Factor Scores
Projection Pursuit
Independent Components Analysis
Main Definitions
Key Results
Computational Approach
Nonlinear PCs and ICA
Nonlinear PCs
Nonlinear ICA
Geometric Summarization
Measuring Distances to an Algebraic Shape
Principal Curves and Surfaces
Supervised Dimension Reduction: Partial Least Squares
Simple PLS
PLS Procedures
Properties of PLS
Supervised Dimension Reduction: Sufficient Dimensions in Regression
Visualization I: Basic Plots
Elementary Visualization
Projections
Time Dependence
Visualization II: Transformations
Chernoff Faces
Multidimensional Scaling
Self-Organizing Maps
Exercises
Variable Selection
Concepts from Linear Regression
Subset Selection
Variable Ranking
Overview
Traditional Criteria
Akaike Information Criterion (AIC)
Bayesian Information Criterion (BIC)
Choices of Information Criteria
Cross-Validation
Shrinkage Methods
Shrinkage Methods for Linear Models
Grouping in Variable Selection
Least Angle Regression
Shrinkage Methods for Model Classes
Cautionary Notes
Bayes Variable Selection
Prior Specification
Posterior Calculation and Exploration
Evaluating Evidence
Connections Between Bayesian and Frequentist Methods
Computational Comparisons
The n > p Case
When p > n
Notes
Code for Generating Data in Section 10.5
Exercises
Multiple Testing
Analyzing the Hypothesis Testing Problem
A Paradigmatic Setting
Counts for Multiple Tests
Measures of Error in Multiple Testing
Aspects of Error Control
Controlling the Familywise Error Rate
One-Step Adjustments
Stepwise p-Value Adjustments
PCER and PFER
Null Domination
Two Procedures
Controlling the Type I Error Rate
Adjusted p-Values for PFER/PCER
Controlling the False Discovery Rate
FDR and Other Measures of Error
The Benjamini-Hochberg Procedure
A BH Theorem for a Dependent Setting
Variations on BH
Controlling the Positive False Discovery Rate
Bayesian Interpretations
Aspects of Implementation
Bayesian Multiple Testing
Fully Bayes: Hierarchical
Fully Bayes: Decision Theory
Notes
Proof of the Benjamini-Hochberg Theorem
Proof of the Benjamini-Yekutieli Theorem
References
Index

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine whether it should include any access cards, study guides, lab manuals, CDs, etc.

Used, Rental, and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states that it includes access cards, study guides, lab manuals, CDs, etc.
