Data Mining for Business...

In today's world, businesses are becoming more capable of accessing their ideal consumers, and an understanding of data mining contributes to this success. Data Mining for Business Intelligence, developed from a course taught at the Massachusetts Institute of Technology's Sloan School of Management and the University of Maryland's Smith School of Business, uses real data and actual cases to illustrate how data mining applies to the development of successful business models. Featuring XLMiner, the Microsoft Office Excel add-in, the book lets readers follow along and implement algorithms at their own speed, with a minimal learning curve. Students and practitioners of data mining techniques are also presented with hands-on, business-oriented applications, and abundant exercises and examples are provided to motivate learning and understanding.

Data Mining for Business Intelligence:

* Provides both a theoretical and practical understanding of the key methods of classification, prediction, reduction, exploration, and affinity analysis
* Features a business decision-making context for these key methods
* Illustrates the application and interpretation of these methods using real business cases and data

This book helps readers understand the beneficial relationship that can be established between data mining and smart business practices, and is an excellent learning tool for creating valuable strategies and making wiser business decisions.

GALIT SHMUELI, PHD, is Assistant Professor of Statistics in the Decision and Information Technologies Department of the Robert H. Smith School of Business at the University of Maryland.

NITIN R. PATEL, PHD, is Chairman, Founder, and Chief Technology Officer of Cambridge-based Cytel Incorporated and a Visiting Professor in the Engineering Systems Division at the Massachusetts Institute of Technology.

PETER C. BRUCE is President and owner of statistics.com, the leading provider of professional development courses in statistics.

Foreword
Preface
Acknowledgments

Introduction
    What Is Data Mining?
    Where Is Data Mining Used?
    The Origins of Data Mining
    The Rapid Growth of Data Mining
    Why Are There So Many Different Methods?
    Terminology and Notation
    Road Maps to This Book

Overview of the Data Mining Process
    Introduction
    Core Ideas in Data Mining
    Supervised and Unsupervised Learning
    The Steps in Data Mining
    Preliminary Steps
    Building a Model: Example with Linear Regression
    Using Excel for Data Mining
    Problems

Data Exploration and Dimension Reduction
    Introduction
    Practical Considerations
    Example 1: House Prices in Boston
    Data Summaries
    Data Visualization
    Correlation Analysis
    Reducing the Number of Categories in Categorical Variables
    Principal Components Analysis
    Example 2: Breakfast Cereals
    Principal Components
    Normalizing the Data
    Using Principal Components for Classification and Prediction
    Problems

Evaluating Classification and Predictive Performance
    Introduction
    Judging Classification Performance
    Accuracy Measures
    Cutoff for Classification
    Performance in Unequal Importance of Classes
    Asymmetric Misclassification Costs
    Oversampling and Asymmetric Costs
    Classification Using a Triage Strategy
    Evaluating Predictive Performance
    Problems

Multiple Linear Regression
    Introduction
    Explanatory vs. Predictive Modeling
    Estimating the Regression Equation and Prediction
    Example: Predicting the Price of Used Toyota Corolla Automobiles
    Variable Selection in Linear Regression
    Reducing the Number of Predictors
    How to Reduce the Number of Predictors
    Problems

Three Simple Classification Methods
    Introduction
    Example 1: Predicting Fraudulent Financial Reporting
    Example 2: Predicting Delayed Flights
    The Naive Rule
    Naive Bayes
    Conditional Probabilities and Pivot Tables
    A Practical Difficulty
    A Solution: Naive Bayes
    Advantages and Shortcomings of the naive Bayes Classifier
    k-Nearest Neighbors
    Example 3: Riding Mowers
    Choosing k
    k-NN for a Quantitative Response
    Advantages and Shortcomings of k-NN Algorithms
    Problems

Classification and Regression Trees
    Introduction
    Classification Trees
    Recursive Partitioning
    Example 1: Riding Mowers
    Measures of Impurity
    Evaluating the Performance of a Classification Tree
    Example 2: Acceptance of Personal Loan
    Avoiding Overfitting
    Stopping Tree Growth: CHAID
    Pruning the Tree
    Classification Rules from Trees
    Regression Trees
    Prediction
    Measuring Impurity
    Evaluating Performance
    Advantages, Weaknesses, and Extensions
    Problems

Logistic Regression
    Introduction
    The Logistic Regression Model
    Example: Acceptance of Personal Loan
    Model with a Single Predictor
    Estimating the Logistic Model from Data: Computing Parameter Estimates
    Interpreting Results in Terms of Odds
    Why Linear Regression Is Inappropriate for a Categorical Response
    Evaluating Classification Performance
    Variable Selection
    Evaluating Goodness of Fit
    Example of Complete Analysis: Predicting Delayed Flights
    Data Preprocessing
    Model Fitting and Estimation
    Model Interpretation
    Model Performance
    Goodness of Fit
    Variable Selection
    Logistic Regression for More Than Two Classes
    Ordinal Classes
    Nominal Classes
    Problems

Neural Nets
    Introduction
    Concept and Structure of a Neural Network
    Fitting a Network to Data
    Example 1: Tiny Dataset
    Computing Output of Nodes
    Preprocessing the Data
    Training the Model
    Example 2: Classifying Accident Severity
    Avoiding Overfitting
    Using the Output for Prediction and Classification
    Required User Input
    Exploring the Relationship Between Predictors and Response
    Advantages and Weaknesses of Neural Networks
    Problems

Discriminant Analysis
    Introduction
    Example 1: Riding Mowers
    Example 2: Personal Loan Acceptance
    Distance of an Observation from a Class
    Fisher's Linear Classification Functions
    Classification Performance of Discriminant Analysis
    Prior Probabilities
    Unequal Misclassification Costs
    Classifying More Than Two Classes
    Example 3: Medical Dispatch to Accident Scenes
    Advantages and Weaknesses
    Problems

Association Rules
    Introduction
    Discovering Association Rules in Transaction Databases
    Example 1: Synthetic Data on Purchases of Phone Faceplates
    Generating Candidate Rules
    The Apriori Algorithm
    Selecting Strong Rules
    Support and Confidence
    Lift Ratio
    Data Format
    The Process of Rule Selection
    Interpreting the Results
    Statistical Significance of Rules
    Example 2: Rules for Similar Book Purchases
    Summary
    Problems

Cluster Analysis
    Introduction
    Example: Public Utilities
    Measuring Distance Between Two Records
    Euclidean Distance
    Normalizing Numerical Measurements
    Other Distance Measures for Numerical Data
    Distance Measures for Categorical Data
    Distance Measures for Mixed Data
    Measuring Distance Between Two Clusters
    Hierarchical (Agglomerative) Clustering
    Minimum Distance (Single Linkage)
    Maximum Distance (Complete Linkage)
    Group Average (Average Linkage)
    Dendrograms: Displaying Clustering Process and Results
    Validating Clusters
    Limitations of Hierarchical Clustering
    Nonhierarchical Clustering: The k-Means Algorithm
    Initial Partition into k Clusters
    Problems

Cases
    Charles Book Club
    German Credit
    Tayko Software Cataloger
    Segmenting Consumers of Bath Soap
    Direct-Mail Fundraising
    Catalog Cross-Selling
    Predicting Bankruptcy

References
Index
