Introduction to Machine Learning

  • Edition: 2nd
  • Format: Hardcover
  • Copyright: 2/26/2010
  • Publisher: MIT Press
  • List Price: $60.00




A new edition of an introductory text in machine learning that gives a unified treatment of machine learning problems and solutions.

The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Many successful applications of machine learning exist already, including systems that analyze past sales data to predict customer behavior, optimize robot behavior so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data.

The second edition of Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts. In order to present a unified treatment of machine learning problems and solutions, it discusses many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining. All learning algorithms are explained so that the student can easily move from the equations in the book to a computer program.

The text covers such topics as supervised learning, Bayesian decision theory, parametric methods, multivariate methods, multilayer perceptrons, local models, hidden Markov models, assessing and comparing classification algorithms, and reinforcement learning.

New to the second edition are chapters on kernel machines, graphical models, and Bayesian estimation; expanded coverage of statistical tests in a chapter on design and analysis of machine learning experiments; case studies available on the Web (with downloadable results for instructors); and many additional exercises.

All chapters have been revised and updated. Introduction to Machine Learning can be used by advanced undergraduates and graduate students who have completed courses in computer programming, probability, calculus, and linear algebra. It will also be of interest to engineers in the field who are concerned with the application of machine learning methods.

"This volume offers a very accessible introduction to the field of machine learning. Ethem Alpaydin gives a comprehensive exposition of the kinds of modeling and prediction problems addressed by machine learning, as well as an overview of the most common families of paradigms, algorithms, and techniques in the field. The volume will be particularly useful to the newcomer eager to quickly get a grasp of the elements that compose this relatively new and rapidly evolving field." -- Joaquin Quiñonero-Candela, coeditor, Dataset Shift in Machine Learning

Table of Contents

Series Foreword p. xvii
Figures p. xix
Tables p. xxix
Preface p. xxxi
Acknowledgments p. xxxiii
Notes for the Second Edition p. xxxv
Notations p. xxxix
Introduction p. 1
What Is Machine Learning? p. 1
Examples of Machine Learning Applications p. 4
Learning Associations p. 4
Classification p. 5
Regression p. 9
Unsupervised Learning p. 11
Reinforcement Learning p. 13
Notes p. 14
Relevant Resources p. 16
Exercises p. 18
References p. 19
Supervised Learning p. 21
Learning a Class from Examples p. 21
Vapnik-Chervonenkis (VC) Dimension p. 27
Probably Approximately Correct (PAC) Learning p. 29
Noise p. 30
Learning Multiple Classes p. 32
Regression p. 34
Model Selection and Generalization p. 37
Dimensions of a Supervised Machine Learning Algorithm p. 41
Notes p. 42
Exercises p. 43
References p. 44
Bayesian Decision Theory p. 47
Introduction p. 47
Classification p. 49
Losses and Risks p. 51
Discriminant Functions p. 53
Utility Theory p. 54
Association Rules p. 55
Notes p. 58
Exercises p. 58
References p. 59
Parametric Methods p. 61
Introduction p. 61
Maximum Likelihood Estimation p. 62
Bernoulli Density p. 63
Multinomial Density p. 64
Gaussian (Normal) Density p. 64
Evaluating an Estimator: Bias and Variance p. 65
The Bayes' Estimator p. 66
Parametric Classification p. 69
Regression p. 73
Tuning Model Complexity: Bias/Variance Dilemma p. 76
Model Selection Procedures p. 80
Notes p. 84
Exercises p. 84
References p. 85
Multivariate Methods p. 87
Multivariate Data p. 87
Parameter Estimation p. 88
Estimation of Missing Values p. 89
Multivariate Normal Distribution p. 90
Multivariate Classification p. 94
Tuning Complexity p. 99
Discrete Features p. 102
Multivariate Regression p. 103
Notes p. 105
Exercises p. 106
References p. 107
Dimensionality Reduction p. 109
Introduction p. 109
Subset Selection p. 110
Principal Components Analysis p. 113
Factor Analysis p. 120
Multidimensional Scaling p. 125
Linear Discriminant Analysis p. 128
Isomap p. 133
Locally Linear Embedding p. 135
Notes p. 138
Exercises p. 139
References p. 140
Clustering p. 143
Introduction p. 143
Mixture Densities p. 144
k-Means Clustering p. 145
Expectation-Maximization Algorithm p. 149
Mixtures of Latent Variable Models p. 154
Supervised Learning after Clustering p. 155
Hierarchical Clustering p. 157
Choosing the Number of Clusters p. 158
Notes p. 160
Exercises p. 160
References p. 161
Nonparametric Methods p. 163
Introduction p. 163
Nonparametric Density Estimation p. 165
Histogram Estimator p. 165
Kernel Estimator p. 167
k-Nearest Neighbor Estimator p. 168
Generalization to Multivariate Data p. 170
Nonparametric Classification p. 171
Condensed Nearest Neighbor p. 172
Nonparametric Regression: Smoothing Models p. 174
Running Mean Smoother p. 175
Kernel Smoother p. 176
Running Line Smoother p. 177
How to Choose the Smoothing Parameter p. 178
Notes p. 180
Exercises p. 181
References p. 182
Decision Trees p. 185
Introduction p. 185
Univariate Trees p. 187
Classification Trees p. 188
Regression Trees p. 192
Pruning p. 194
Rule Extraction from Trees p. 197
Learning Rules from Data p. 198
Multivariate Trees p. 202
Notes p. 204
Exercises p. 207
References p. 207
Linear Discrimination p. 209
Introduction p. 209
Generalizing the Linear Model p. 211
Geometry of the Linear Discriminant p. 212
Two Classes p. 212
Multiple Classes p. 214
Pairwise Separation p. 216
Parametric Discrimination Revisited p. 217
Gradient Descent p. 218
Logistic Discrimination p. 220
Two Classes p. 220
Multiple Classes p. 224
Discrimination by Regression p. 228
Notes p. 230
Exercises p. 230
References p. 231
Multilayer Perceptrons p. 233
Introduction p. 233
Understanding the Brain p. 234
Neural Networks as a Paradigm for Parallel Processing p. 235
The Perceptron p. 237
Training a Perceptron p. 240
Learning Boolean Functions p. 243
Multilayer Perceptrons p. 245
MLP as a Universal Approximator p. 248
Backpropagation Algorithm p. 249
Nonlinear Regression p. 250
Two-Class Discrimination p. 252
Multiclass Discrimination p. 254
Multiple Hidden Layers p. 256
Training Procedures p. 256
Improving Convergence p. 256
Overtraining p. 257
Structuring the Network p. 258
Hints p. 261
Tuning the Network Size p. 263
Bayesian View of Learning p. 266
Dimensionality Reduction p. 267
Learning Time p. 270
Time Delay Neural Networks p. 270
Recurrent Networks p. 271
Notes p. 272
Exercises p. 274
References p. 275
Local Models p. 279
Introduction p. 279
Competitive Learning p. 280
Online k-Means p. 280
Adaptive Resonance Theory p. 285
Self-Organizing Maps p. 286
Radial Basis Functions p. 288
Incorporating Rule-Based Knowledge p. 294
Normalized Basis Functions p. 295
Competitive Basis Functions p. 297
Learning Vector Quantization p. 300
Mixture of Experts p. 300
Cooperative Experts p. 303
Competitive Experts p. 304
Hierarchical Mixture of Experts p. 304
Notes p. 305
Exercises p. 306
References p. 307
Kernel Machines p. 309
Introduction p. 309
Optimal Separating Hyperplane p. 311
The Nonseparable Case: Soft Margin Hyperplane p. 315
ν-SVM p. 318
Kernel Trick p. 319
Vectorial Kernels p. 321
Defining Kernels p. 324
Multiple Kernel Learning p. 325
Multiclass Kernel Machines p. 327
Kernel Machines for Regression p. 328
One-Class Kernel Machines p. 333
Kernel Dimensionality Reduction p. 335
Notes p. 337
Exercises p. 338
References p. 339
Bayesian Estimation p. 341
Introduction p. 341
Estimating the Parameter of a Distribution p. 343
Discrete Variables p. 343
Continuous Variables p. 345
Bayesian Estimation of the Parameters of a Function p. 348
Regression p. 348
The Use of Basis/Kernel Functions p. 352
Bayesian Classification p. 353
Gaussian Processes p. 356
Notes p. 359
Exercises p. 360
References p. 361
Hidden Markov Models p. 363
Introduction p. 363
Discrete Markov Processes p. 364
Hidden Markov Models p. 367
Three Basic Problems of HMMs p. 369
Evaluation Problem p. 369
Finding the State Sequence p. 373
Learning Model Parameters p. 375
Continuous Observations p. 378
The HMM with Input p. 379
Model Selection in HMM p. 380
Notes p. 382
Exercises p. 383
References p. 384
Graphical Models p. 387
Introduction p. 387
Canonical Cases for Conditional Independence p. 389
Example Graphical Models p. 396
Naive Bayes' Classifier p. 396
Hidden Markov Model p. 398
Linear Regression p. 401
d-Separation p. 402
Belief Propagation p. 402
Chains p. 403
Trees p. 405
Polytrees p. 407
Junction Trees p. 409
Undirected Graphs: Markov Random Fields p. 410
Learning the Structure of a Graphical Model p. 413
Influence Diagrams p. 414
Notes p. 414
Exercises p. 417
References p. 417
Combining Multiple Learners p. 419
Rationale p. 419
Generating Diverse Learners p. 420
Model Combination Schemes p. 423
Voting p. 424
Error-Correcting Output Codes p. 427
Bagging p. 430
Boosting p. 431
Mixture of Experts Revisited p. 434
Stacked Generalization p. 435
Fine-Tuning an Ensemble p. 437
Cascading p. 438
Notes p. 440
Exercises p. 442
References p. 443
Reinforcement Learning p. 447
Introduction p. 447
Single State Case: K-Armed Bandit p. 449
Elements of Reinforcement Learning p. 450
Model-Based Learning p. 453
Value Iteration p. 453
Policy Iteration p. 454
Temporal Difference Learning p. 454
Exploration Strategies p. 455
Deterministic Rewards and Actions p. 456
Nondeterministic Rewards and Actions p. 457
Eligibility Traces p. 459
Generalization p. 461
Partially Observable States p. 464
The Setting p. 464
Example: The Tiger Problem p. 465
Notes p. 470
Exercises p. 472
References p. 473
Design and Analysis of Machine Learning Experiments p. 475
Introduction p. 475
Factors, Response, and Strategy of Experimentation p. 478
Response Surface Design p. 481
Randomization, Replication, and Blocking p. 482
Guidelines for Machine Learning Experiments p. 483
Cross-Validation and Resampling Methods p. 486
K-Fold Cross-Validation p. 487
5 × 2 Cross-Validation p. 488
Bootstrapping p. 489
Measuring Classifier Performance p. 489
Interval Estimation p. 493
Hypothesis Testing p. 496
Assessing a Classification Algorithm's Performance p. 498
Binomial Test p. 499
Approximate Normal Test p. 500
t Test p. 500
Comparing Two Classification Algorithms p. 501
McNemar's Test p. 501
K-Fold Cross-Validated Paired t Test p. 501
5 × 2 cv Paired t Test p. 502
5 × 2 cv Paired F Test p. 503
Comparing Multiple Algorithms: Analysis of Variance p. 504
Comparison over Multiple Datasets p. 508
Comparing Two Algorithms p. 509
Multiple Algorithms p. 511
Notes p. 512
Exercises p. 513
References p. 514
Probability p. 517
Elements of Probability p. 517
Axioms of Probability p. 518
Conditional Probability p. 518
Random Variables p. 519
Probability Distribution and Density Functions p. 519
Joint Distribution and Density Functions p. 520
Conditional Distributions p. 520
Bayes' Rule p. 521
Expectation p. 521
Variance p. 522
Weak Law of Large Numbers p. 523
Special Random Variables p. 523
Bernoulli Distribution p. 523
Binomial Distribution p. 524
Multinomial Distribution p. 524
Uniform Distribution p. 524
Normal (Gaussian) Distribution p. 525
Chi-Square Distribution p. 526
t Distribution p. 527
F Distribution p. 527
References p. 527
Index p. 529
Table of Contents provided by Publisher. All Rights Reserved.


Customer Reviews

A very good introduction to machine learning (March 17, 2011)
This is a very good introduction to Machine Learning. Each chapter has a notes section which I found particularly useful, since it gives a brief overview of the field with good references. I would recommend this textbook to anyone moving into the machine learning area. The price is very reasonable and the book is very practical and useful. Highly recommended.
Introduction to Machine Learning: 4 out of 5 stars based on 1 user review.
