Foundations of Predictive Analytics

  • Edition: 1st
  • Format: Hardcover
  • Copyright: 2012-02-15
  • Publisher: Chapman & Hall

Written by industry experts, this book introduces the various concepts, theorems, and algorithms widely used in statistical data analysis and data mining. It covers important topics in data mining, machine learning, and statistical pattern recognition, including linear and nonlinear regression models, time series analysis, and variable selection. The text also explores key topics that are not extensively covered in similar books, such as copula functions, incremental regression, censored data models, Dempster-Shafer theory, survival data analysis, and GARCH.

Table of Contents

List of Figures, p. xv
List of Tables, p. xvii
Preface, p. xix
Introduction, p. 1
What Is a Model?, p. 1
What Is a Statistical Model?, p. 2
The Modeling Process, p. 3
Modeling Pitfalls, p. 4
Characteristics of Good Modelers, p. 5
The Future of Predictive Analytics, p. 7
Properties of Statistical Distributions, p. 9
Fundamental Distributions, p. 9
Uniform Distribution, p. 9
Details of the Normal (Gaussian) Distribution, p. 10
Lognormal Distribution, p. 19
Distribution, p. 20
Chi-Squared Distribution, p. 22
Non-Central Chi-Squared Distribution, p. 25
Student's t-Distribution, p. 28
Multivariate t-Distribution, p. 29
F-Distribution, p. 31
Binomial Distribution, p. 31
Poisson Distribution, p. 32
Exponential Distribution, p. 32
Geometric Distribution, p. 33
Hypergeometric Distribution, p. 33
Negative Binomial Distribution, p. 34
Inverse Gaussian (IG) Distribution, p. 35
Normal Inverse Gaussian (NIG) Distribution, p. 36
Central Limit Theorem, p. 38
Estimate of Mean, Variance, Skewness, and Kurtosis from Sample Data, p. 40
Estimate of the Standard Deviation of the Sample Mean, p. 40
(Pseudo) Random Number Generators, p. 41
Mersenne Twister Pseudorandom Number Generator, p. 42
Box-Muller Transform for Generating a Normal Distribution, p. 42
Transformation of a Distribution Function, p. 43
Distribution of a Function of Random Variables, p. 43
Z = X + Y, p. 44
Z = XY, p. 44
(Z1,Z2,…,Zn) = (X1,X2,…,Xn) Y, p. 44
Z = X/Y, p. 45
Z = max(X,Y), p. 45
Z = min(X,Y), p. 45
Moment Generating Function, p. 46
Moment Generating Function of Binomial Distribution, p. 46
Moment Generating Function of Normal Distribution, p. 47
Moment Generating Function of the Distribution, p. 47
Moment Generating Function of Chi-Square Distribution, p. 47
Moment Generating Function of the Poisson Distribution, p. 48
Cumulant Generating Function, p. 48
Characteristic Function, p. 50
Relationship between Cumulative Function and Characteristic Function, p. 51
Characteristic Function of Normal Distribution, p. 52
Characteristic Function of Distribution, p. 52
Chebyshev's Inequality, p. 53
Markov's Inequality, p. 54
Gram-Charlier Series, p. 54
Edgeworth Expansion, p. 55
Cornish-Fisher Expansion, p. 56
Lagrange Inversion Theorem, p. 56
Cornish-Fisher Expansion, p. 57
Copula Functions, p. 58
Gaussian Copula, p. 60
t-Copula, p. 61
Archimedean Copula, p. 62
Important Matrix Relationships, p. 63
Pseudo-Inverse of a Matrix, p. 63
A Lemma of Matrix Inversion, p. 64
Identity for a Matrix Determinant, p. 66
Inversion of Partitioned Matrix, p. 66
Determinant of Partitioned Matrix, p. 67
Matrix Sweep and Partial Correlation, p. 67
Singular Value Decomposition (SVD), p. 69
Diagonalization of a Matrix, p. 71
Spectral Decomposition of a Positive Semi-Definite Matrix, p. 75
Normalization in Vector Space, p. 76
Conjugate Decomposition of a Symmetric Definite Matrix, p. 77
Cholesky Decomposition, p. 77
Cauchy-Schwarz Inequality, p. 80
Relationship of Correlation among Three Variables, p. 81
Linear Modeling and Regression, p. 83
Properties of Maximum Likelihood Estimators, p. 84
Likelihood Ratio Test, p. 87
Wald Test, p. 87
Lagrange Multiplier Statistic, p. 88
Linear Regression, p. 88
Ordinary Least Squares (OLS) Regression, p. 89
Interpretation of the Coefficients of Linear Regression, p. 95
Regression on Weighted Data, p. 97
Incrementally Updating a Regression Model with Additional Data, p. 100
Partitioned Regression, p. 101
How Does the Regression Change When Adding One More Variable?, p. 101
Linearly Restricted Least Squares Regression, p. 103
Significance of the Correlation Coefficient, p. 105
Partial Correlation, p. 105
Ridge Regression, p. 105
Fisher's Linear Discriminant Analysis, p. 106
Principal Component Regression (PCR), p. 109
Factor Analysis, p. 110
Partial Least Squares Regression (PLSR), p. 111
Generalized Linear Model (GLM), p. 113
Logistic Regression: Binary, p. 116
Logistic Regression: Multiple Nominal, p. 119
Logistic Regression: Proportional Multiple Ordinal, p. 121
Fisher Scoring Method for Logistic Regression, p. 123
Tobit Model: A Censored Regression Model, p. 125
Some Properties of the Normal Distribution, p. 125
Formulation of the Tobit Model, p. 126
Nonlinear Modeling, p. 129
Naive Bayesian Classifier, p. 129
Neural Network, p. 131
Back Propagation Neural Network, p. 131
Segmentation and Tree Models, p. 137
Segmentation, p. 137
Tree Models, p. 138
Sweeping to Find the Best Cutpoint, p. 140
Impurity Measure of a Population: Entropy and Gini Index, p. 143
Chi-Square Splitting Rule, p. 147
Implementation of Decision Trees, p. 148
Additive Models, p. 151
Boosted Tree, p. 153
Least Squares Regression Boosting Tree, p. 154
Binary Logistic Regression Boosting Tree, p. 155
Support Vector Machine (SVM), p. 158
Wolfe Dual, p. 158
Linearly Separable Problem, p. 159
Linearly Inseparable Problem, p. 161
Constructing Higher-Dimensional Space and Kernel, p. 162
Model Output, p. 163
C-Support Vector Classification (C-SVC) for Classification, p. 164
-Support Vector Regression (-SVR) for Regression, p. 164
The Probability Estimate, p. 167
Fuzzy Logic System, p. 168
A Simple Fuzzy Logic System, p. 168
Clustering, p. 169
K Means, Fuzzy C Means, p. 170
Nearest Neighbor, K Nearest Neighbor (KNN), p. 171
Comments on Clustering Methods, p. 171
Time Series Analysis, p. 173
Fundamentals of Forecasting, p. 173
Box-Cox Transformation, p. 174
Smoothing Algorithms, p. 175
Convolution of Linear Filters, p. 176
Linear Difference Equation, p. 177
The Autocovariance Function and Autocorrelation Function, p. 178
The Partial Autocorrelation Function, p. 179
ARIMA Models, p. 181
MA(q) Process, p. 182
AR(p) Process, p. 184
ARMA(p, q) Process, p. 186
Survival Data Analysis, p. 187
Sampling Method, p. 190
Exponentially Weighted Moving Average (EWMA) and GARCH(1, 1), p. 191
Exponentially Weighted Moving Average (EWMA), p. 191
ARCH and GARCH Models, p. 192
Data Preparation and Variable Selection, p. 195
Data Quality and Exploration, p. 196
Variable Scaling and Transformation, p. 197
How to Bin Variables, p. 197
Equal Interval, p. 198
Equal Population, p. 198
Tree Algorithms, p. 199
Interpolation in One and Two Dimensions, p. 199
Weight of Evidence (WOE) Transformation, p. 200
Variable Selection Overview, p. 204
Missing Data Imputation, p. 206
Stepwise Selection Methods, p. 207
Forward Selection in Linear Regression, p. 208
Forward Selection in Logistic Regression, p. 208
Mutual Information, KL Distance, p. 209
Detection of Multicollinearity, p. 210
Model Goodness Measures, p. 213
Training, Testing, Validation, p. 213
Continuous Dependent Variable, p. 215
Example: Linear Regression, p. 217
Binary Dependent Variable (Two-Group Classification), p. 218
Kolmogorov-Smirnov (KS) Statistic, p. 218
Confusion Matrix, p. 220
Concordant and Discordant, p. 221
R² for Logistic Regression, p. 223
AIC and SBC, p. 224
Hosmer-Lemeshow Goodness-of-Fit Test, p. 224
Example: Logistic Regression, p. 225
Population Stability Index Using Relative Entropy, p. 227
Optimization Methods, p. 231
Lagrange Multiplier, p. 232
Gradient Descent Method, p. 234
Newton-Raphson Method, p. 236
Conjugate Gradient Method, p. 238
Quasi-Newton Method, p. 240
Genetic Algorithms (GA), p. 242
Simulated Annealing, p. 242
Linear Programming, p. 243
Nonlinear Programming (NLP), p. 247
General Nonlinear Programming (GNLP), p. 248
Lagrange Dual Problem, p. 249
Quadratic Programming (QP), p. 250
Linear Complementarity Programming (LCP), p. 254
Sequential Quadratic Programming (SQP), p. 256
Nonlinear Equations, p. 263
Expectation-Maximization (EM) Algorithm, p. 264
Optimal Design of Experiment, p. 268
Miscellaneous Topics, p. 271
Multidimensional Scaling, p. 271
Simulation, p. 274
Odds Normalization and Score Transformation, p. 278
Reject Inference, p. 280
Dempster-Shafer Theory of Evidence, p. 281
Some Properties in Set Theory, p. 281
Basic Probability Assignment, Belief Function, and Plausibility Function, p. 282
Dempster-Shafer's Rule of Combination, p. 285
Applications of Dempster-Shafer Theory of Evidence: Multiple Classifier Function, p. 287
Useful Mathematical Relations, p. 291
Information Inequality, p. 291
Relative Entropy, p. 291
Saddle-Point Method, p. 292
Stirling's Formula, p. 293
Convex Function and Jensen's Inequality, p. 294
DataMinerXL - Microsoft Excel Add-In for Building Predictive Models, p. 299
Overview, p. 299
Utility Functions, p. 299
Data Manipulation Functions, p. 300
Basic Statistical Functions, p. 300
Modeling Functions for All Models, p. 301
Weight of Evidence Transformation Functions, p. 301
Linear Regression Functions, p. 302
Partial Least Squares Regression Functions, p. 302
Logistic Regression Functions, p. 303
Time Series Analysis Functions, p. 303
Naive Bayes Classifier Functions, p. 303
Tree-Based Model Functions, p. 304
Clustering and Segmentation Functions, p. 304
Neural Network Functions, p. 304
Support Vector Machine Functions, p. 304
Optimization Functions, p. 305
Matrix Operation Functions, p. 305
Numerical Integration Functions, p. 306
Excel Built-in Statistical Distribution Functions, p. 306
Bibliography, p. 309
Index, p. 313
Table of Contents provided by Ingram. All Rights Reserved.
