List of Figures | p. xv |

List of Tables | p. xvii |

Preface | p. xix |

Introduction | p. 1 |

What Is a Model? | p. 1 |

What Is a Statistical Model? | p. 2 |

The Modeling Process | p. 3 |

Modeling Pitfalls | p. 4 |

Characteristics of Good Modelers | p. 5 |

The Future of Predictive Analytics | p. 7 |

Properties of Statistical Distributions | p. 9 |

Fundamental Distributions | p. 9 |

Uniform Distribution | p. 9 |

Details of the Normal (Gaussian) Distribution | p. 10 |

Lognormal Distribution | p. 19 |

Γ Distribution | p. 20 |

Chi-Squared Distribution | p. 22 |

Non-Central Chi-Squared Distribution | p. 25 |

Student's t-Distribution | p. 28 |

Multivariate t-Distribution | p. 29 |

F-Distribution | p. 31 |

Binomial Distribution | p. 31 |

Poisson Distribution | p. 32 |

Exponential Distribution | p. 32 |

Geometric Distribution | p. 33 |

Hypergeometric Distribution | p. 33 |

Negative Binomial Distribution | p. 34 |

Inverse Gaussian (IG) Distribution | p. 35 |

Normal Inverse Gaussian (NIG) Distribution | p. 36 |

Central Limit Theorem | p. 38 |

Estimate of Mean, Variance, Skewness, and Kurtosis from Sample Data | p. 40 |

Estimate of the Standard Deviation of the Sample Mean | p. 40 |

(Pseudo) Random Number Generators | p. 41 |

Mersenne Twister Pseudorandom Number Generator | p. 42 |

Box-Muller Transform for Generating a Normal Distribution | p. 42 |

Transformation of a Distribution Function | p. 43 |

Distribution of a Function of Random Variables | p. 43 |

Z = X + Y | p. 44 |

Z = XY | p. 44 |

(Z_{1},Z_{2},…,Z_{n}) = (X_{1},X_{2},…,X_{n}) Y | p. 44 |

Z = X/Y | p. 45 |

Z = max(X,Y) | p. 45 |

Z = min(X,Y) | p. 45 |

Moment Generating Function | p. 46 |

Moment Generating Function of Binomial Distribution | p. 46 |

Moment Generating Function of Normal Distribution | p. 47 |

Moment Generating Function of the Γ Distribution | p. 47 |

Moment Generating Function of Chi-Square Distribution | p. 47 |

Moment Generating Function of the Poisson Distribution | p. 48 |

Cumulant Generating Function | p. 48 |

Characteristic Function | p. 50 |

Relationship between Cumulative Function and Characteristic Function | p. 51 |

Characteristic Function of Normal Distribution | p. 52 |

Characteristic Function of Γ Distribution | p. 52 |

Chebyshev's Inequality | p. 53 |

Markov's Inequality | p. 54 |

Gram-Charlier Series | p. 54 |

Edgeworth Expansion | p. 55 |

Cornish-Fisher Expansion | p. 56 |

Lagrange Inversion Theorem | p. 56 |

Cornish-Fisher Expansion | p. 57 |

Copula Functions | p. 58 |

Gaussian Copula | p. 60 |

t-Copula | p. 61 |

Archimedean Copula | p. 62 |

Important Matrix Relationships | p. 63 |

Pseudo-Inverse of a Matrix | p. 63 |

A Lemma of Matrix Inversion | p. 64 |

Identity for a Matrix Determinant | p. 66 |

Inversion of Partitioned Matrix | p. 66 |

Determinant of Partitioned Matrix | p. 67 |

Matrix Sweep and Partial Correlation | p. 67 |

Singular Value Decomposition (SVD) | p. 69 |

Diagonalization of a Matrix | p. 71 |

Spectral Decomposition of a Positive Semi-Definite Matrix | p. 75 |

Normalization in Vector Space | p. 76 |

Conjugate Decomposition of a Symmetric Definite Matrix | p. 77 |

Cholesky Decomposition | p. 77 |

Cauchy-Schwarz Inequality | p. 80 |

Relationship of Correlation among Three Variables | p. 81 |

Linear Modeling and Regression | p. 83 |

Properties of Maximum Likelihood Estimators | p. 84 |

Likelihood Ratio Test | p. 87 |

Wald Test | p. 87 |

Lagrange Multiplier Statistic | p. 88 |

Linear Regression | p. 88 |

Ordinary Least Squares (OLS) Regression | p. 89 |

Interpretation of the Coefficients of Linear Regression | p. 95 |

Regression on Weighted Data | p. 97 |

Incrementally Updating a Regression Model with Additional Data | p. 100 |

Partitioned Regression | p. 101 |

How Does the Regression Change When Adding One More Variable? | p. 101 |

Linearly Restricted Least Squares Regression | p. 103 |

Significance of the Correlation Coefficient | p. 105 |

Partial Correlation | p. 105 |

Ridge Regression | p. 105 |

Fisher's Linear Discriminant Analysis | p. 106 |

Principal Component Regression (PCR) | p. 109 |

Factor Analysis | p. 110 |

Partial Least Squares Regression (PLSR) | p. 111 |

Generalized Linear Model (GLM) | p. 113 |

Logistic Regression: Binary | p. 116 |

Logistic Regression: Multiple Nominal | p. 119 |

Logistic Regression: Proportional Multiple Ordinal | p. 121 |

Fisher Scoring Method for Logistic Regression | p. 123 |

Tobit Model: A Censored Regression Model | p. 125 |

Some Properties of the Normal Distribution | p. 125 |

Formulation of the Tobit Model | p. 126 |

Nonlinear Modeling | p. 129 |

Naive Bayesian Classifier | p. 129 |

Neural Network | p. 131 |

Back Propagation Neural Network | p. 131 |

Segmentation and Tree Models | p. 137 |

Segmentation | p. 137 |

Tree Models | p. 138 |

Sweeping to Find the Best Cutpoint | p. 140 |

Impurity Measure of a Population: Entropy and Gini Index | p. 143 |

Chi-Square Splitting Rule | p. 147 |

Implementation of Decision Trees | p. 148 |

Additive Models | p. 151 |

Boosted Tree | p. 153 |

Least Squares Regression Boosting Tree | p. 154 |

Binary Logistic Regression Boosting Tree | p. 155 |

Support Vector Machine (SVM) | p. 158 |

Wolfe Dual | p. 158 |

Linearly Separable Problem | p. 159 |

Linearly Inseparable Problem | p. 161 |

Constructing Higher-Dimensional Space and Kernel | p. 162 |

Model Output | p. 163 |

C-Support Vector Classification (C-SVC) for Classification | p. 164 |

ε-Support Vector Regression (ε-SVR) for Regression | p. 164 |

The Probability Estimate | p. 167 |

Fuzzy Logic System | p. 168 |

A Simple Fuzzy Logic System | p. 168 |

Clustering | p. 169 |

K Means, Fuzzy C Means | p. 170 |

Nearest Neighbor, K Nearest Neighbor (KNN) | p. 171 |

Comments on Clustering Methods | p. 171 |

Time Series Analysis | p. 173 |

Fundamentals of Forecasting | p. 173 |

Box-Cox Transformation | p. 174 |

Smoothing Algorithms | p. 175 |

Convolution of Linear Filters | p. 176 |

Linear Difference Equation | p. 177 |

The Autocovariance Function and Autocorrelation Function | p. 178 |

The Partial Autocorrelation Function | p. 179 |

ARIMA Models | p. 181 |

MA(q) Process | p. 182 |

AR(p) Process | p. 184 |

ARMA(p, q) Process | p. 186 |

Survival Data Analysis | p. 187 |

Sampling Method | p. 190 |

Exponentially Weighted Moving Average (EWMA) and GARCH(1, 1) | p. 191 |

Exponentially Weighted Moving Average (EWMA) | p. 191 |

ARCH and GARCH Models | p. 192 |

Data Preparation and Variable Selection | p. 195 |

Data Quality and Exploration | p. 196 |

Variable Scaling and Transformation | p. 197 |

How to Bin Variables | p. 197 |

Equal Interval | p. 198 |

Equal Population | p. 198 |

Tree Algorithms | p. 199 |

Interpolation in One and Two Dimensions | p. 199 |

Weight of Evidence (WOE) Transformation | p. 200 |

Variable Selection Overview | p. 204 |

Missing Data Imputation | p. 206 |

Stepwise Selection Methods | p. 207 |

Forward Selection in Linear Regression | p. 208 |

Forward Selection in Logistic Regression | p. 208 |

Mutual Information, KL Distance | p. 209 |

Detection of Multicollinearity | p. 210 |

Model Goodness Measures | p. 213 |

Training, Testing, Validation | p. 213 |

Continuous Dependent Variable | p. 215 |

Example: Linear Regression | p. 217 |

Binary Dependent Variable (Two-Group Classification) | p. 218 |

Kolmogorov-Smirnov (KS) Statistic | p. 218 |

Confusion Matrix | p. 220 |

Concordant and Discordant | p. 221 |

R^{2} for Logistic Regression | p. 223 |

AIC and SBC | p. 224 |

Hosmer-Lemeshow Goodness-of-Fit Test | p. 224 |

Example: Logistic Regression | p. 225 |

Population Stability Index Using Relative Entropy | p. 227 |

Optimization Methods | p. 231 |

Lagrange Multiplier | p. 232 |

Gradient Descent Method | p. 234 |

Newton-Raphson Method | p. 236 |

Conjugate Gradient Method | p. 238 |

Quasi-Newton Method | p. 240 |

Genetic Algorithms (GA) | p. 242 |

Simulated Annealing | p. 242 |

Linear Programming | p. 243 |

Nonlinear Programming (NLP) | p. 247 |

General Nonlinear Programming (GNLP) | p. 248 |

Lagrange Dual Problem | p. 249 |

Quadratic Programming (QP) | p. 250 |

Linear Complementarity Programming (LCP) | p. 254 |

Sequential Quadratic Programming (SQP) | p. 256 |

Nonlinear Equations | p. 263 |

Expectation-Maximization (EM) Algorithm | p. 264 |

Optimal Design of Experiment | p. 268 |

Miscellaneous Topics | p. 271 |

Multidimensional Scaling | p. 271 |

Simulation | p. 274 |

Odds Normalization and Score Transformation | p. 278 |

Reject Inference | p. 280 |

Dempster-Shafer Theory of Evidence | p. 281 |

Some Properties in Set Theory | p. 281 |

Basic Probability Assignment, Belief Function, and Plausibility Function | p. 282 |

Dempster-Shafer's Rule of Combination | p. 285 |

Applications of Dempster-Shafer Theory of Evidence: Multiple Classifier Function | p. 287 |

Useful Mathematical Relations | p. 291 |

Information Inequality | p. 291 |

Relative Entropy | p. 291 |

Saddle-Point Method | p. 292 |

Stirling's Formula | p. 293 |

Convex Function and Jensen's Inequality | p. 294 |

DataMinerXL - Microsoft Excel Add-In for Building Predictive Models | p. 299 |

Overview | p. 299 |

Utility Functions | p. 299 |

Data Manipulation Functions | p. 300 |

Basic Statistical Functions | p. 300 |

Modeling Functions for All Models | p. 301 |

Weight of Evidence Transformation Functions | p. 301 |

Linear Regression Functions | p. 302 |

Partial Least Squares Regression Functions | p. 302 |

Logistic Regression Functions | p. 303 |

Time Series Analysis Functions | p. 303 |

Naive Bayes Classifier Functions | p. 303 |

Tree-Based Model Functions | p. 304 |

Clustering and Segmentation Functions | p. 304 |

Neural Network Functions | p. 304 |

Support Vector Machine Functions | p. 304 |

Optimization Functions | p. 305 |

Matrix Operation Functions | p. 305 |

Numerical Integration Functions | p. 306 |

Excel Built-in Statistical Distribution Functions | p. 306 |

Bibliography | p. 309 |

Index | p. 313 |
