Amazon no longer offers textbook rentals. We do!

We're the #1 textbook rental company. Let us show you why.

Machine Learning for Business Analytics Concepts, Techniques and Applications in RapidMiner

by Galit Shmueli; Peter C. Bruce; Amit V. Deokar; Nitin R. Patel
  • ISBN13: 9781119828792
  • ISBN10: 1119828791
  • Edition: 1st
  • Format: Hardcover
  • Copyright: 2023-03-21
  • Publisher: Wiley

Note: Supplemental materials are not guaranteed with Rental or Used book purchases.

Purchase Benefits

List Price: $150.34 (save up to $39.84)
  • Rent Book: $110.50 (free shipping; usually ships in 3-4 business days)

    *This item is part of an exclusive publisher rental program and requires an additional convenience fee. This fee will be reflected in the shopping cart.

Summary

Machine Learning for Business Analytics

Machine learning—also known as data mining or data analytics—is a fundamental part of data science. It is used by organizations in a wide variety of arenas to turn raw data into actionable information.

Machine Learning for Business Analytics: Concepts, Techniques and Applications in RapidMiner provides a comprehensive introduction to and overview of this methodology. This best-selling textbook covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, rule mining, recommendations, clustering, text mining, experimentation, and network analytics. Along with hands-on exercises and real-life case studies, it also discusses the managerial and ethical issues surrounding the responsible use of machine learning techniques.

This is the seventh edition of Machine Learning for Business Analytics, and the first using RapidMiner software. This edition also includes:

  • A new co-author, Amit Deokar, who brings experience teaching business analytics courses using RapidMiner
  • Integrated use of RapidMiner, an open-source machine learning platform that has become commercially popular in recent years
  • An expanded chapter on deep learning techniques
  • A new chapter on experimental feedback techniques, including A/B testing, uplift modeling, and reinforcement learning
  • A new chapter on responsible data science
  • Updates and new material based on feedback from instructors teaching MBA, Master's in Business Analytics, undergraduate, diploma, and executive courses, and from their students
  • A full chapter devoted to relevant case studies, with more than a dozen cases demonstrating applications of the machine learning techniques
  • End-of-chapter exercises that help readers gauge and expand their comprehension of, and competency with, the material presented
  • A companion website with more than two dozen data sets and instructor materials, including exercise solutions, slides, and case solutions

This textbook is an ideal resource for upper-level undergraduate and graduate-level courses in data science, predictive analytics, and business analytics. It is also an excellent reference for analysts, researchers, and data science practitioners working with quantitative data in management, finance, marketing, operations management, information systems, computer science, and information technology.

Author Biography

Galit Shmueli, PhD, is Distinguished Professor and Institute Director at National Tsing Hua University’s Institute of Service Science. She has designed and taught business analytics courses since 2004 at the University of Maryland, Statistics.com, the Indian School of Business, and National Tsing Hua University, Taiwan.

Peter C. Bruce is Founder of the Institute for Statistics Education at Statistics.com and Chief Learning Officer at Elder Research, Inc.

Amit V. Deokar, PhD, is Chair of the Operations & Information Systems Department and an Associate Professor of Management Information Systems at the Manning School of Business at the University of Massachusetts Lowell. Since 2006, he has developed and taught courses in business analytics, with expertise in using the RapidMiner platform. He is an Association for Information Systems Distinguished Member Cum Laude.

Nitin R. Patel, PhD, is co-founder and lead researcher at Cytel Inc. He was also a co-founder of Tata Consultancy Services. A Fellow of the American Statistical Association, Dr. Patel has served as a visiting professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years.

Table of Contents

Foreword by Ravi Bapna xxi

Preface to the RapidMiner Edition xxiii

Acknowledgments xxvii

PART I PRELIMINARIES

CHAPTER 1 Introduction 3

1.1 What Is Business Analytics? . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 What Is Machine Learning? . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 Machine Learning, AI, and Related Terms . . . . . . . . . . . . . . . . . . . . 5

Statistical Modeling vs. Machine Learning . . . . . . . . . . . . . . . . . . . . 6

1.4 Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.5 Data Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.6 Why Are There So Many Different Methods? . . . . . . . . . . . . . . . . . . . 9

1.7 Terminology and Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.8 Road Maps to This Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

Order of Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.9 Using RapidMiner Studio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Importing and Loading Data in RapidMiner . . . . . . . . . . . . . . . . . . . 16

RapidMiner Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

CHAPTER 2 Overview of the Machine Learning Process 19

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.2 Core Ideas in Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . 20

Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

Association Rules and Recommendation Systems . . . . . . . . . . . . . . . . . 20

Predictive Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Data Reduction and Dimension Reduction . . . . . . . . . . . . . . . . . . . . 21

Data Exploration and Visualization . . . . . . . . . . . . . . . . . . . . . . . . 21

Supervised and Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . 22

2.3 The Steps in a Machine Learning Project . . . . . . . . . . . . . . . . . . . . . 23

2.4 Preliminary Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Organization of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Sampling from a Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Oversampling Rare Events in Classification Tasks . . . . . . . . . . . . . . . . . 26

Preprocessing and Cleaning the Data . . . . . . . . . . . . . . . . . . . . . . . 26

2.5 Predictive Power and Overfitting . . . . . . . . . . . . . . . . . . . . . . . . . 32

Overfitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

Creation and Use of Data Partitions . . . . . . . . . . . . . . . . . . . . . . . 34

2.6 Building a Predictive Model with RapidMiner . . . . . . . . . . . . . . . . . . . 37

Predicting Home Values in the West Roxbury Neighborhood . . . . . . . . . . . 39

Modeling Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2.7 Using RapidMiner for Machine Learning . . . . . . . . . . . . . . . . . . . . . 45

2.8 Automating Machine Learning Solutions . . . . . . . . . . . . . . . . . . . . . 47

Predicting Power Generator Failure . . . . . . . . . . . . . . . . . . . . . . . . 48

Uber’s Michelangelo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

2.9 Ethical Practice in Machine Learning . . . . . . . . . . . . . . . . . . . . . . . 52

Machine Learning Software Tools: The State of the Market by Herb Edelstein . . . 53

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization 63

3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.2 Data Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

Example 1: Boston Housing Data . . . . . . . . . . . . . . . . . . . . . . . . 65

Example 2: Ridership on Amtrak Trains . . . . . . . . . . . . . . . . . . . . . . 66

3.3 Basic Charts: Bar Charts, Line Charts, and Scatter Plots . . . . . . . . . . . . . 66

Distribution Plots: Boxplots and Histograms . . . . . . . . . . . . . . . . . . . 69

Heatmaps: Visualizing Correlations and Missing Values . . . . . . . . . . . . . . 72

3.4 Multidimensional Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Adding Attributes: Color, Size, Shape, Multiple Panels, and Animation . . . . . . 75

Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, and Filtering . . 78

Reference: Trend Lines and Labels . . . . . . . . . . . . . . . . . . . . . . . . 81

Scaling Up to Large Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Multivariate Plot: Parallel Coordinates Plot . . . . . . . . . . . . . . . . . . . . 83

Interactive Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

3.5 Specialized Visualizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Visualizing Networked Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Visualizing Hierarchical Data: Treemaps . . . . . . . . . . . . . . . . . . . . . 89

Visualizing Geographical Data: Map Charts . . . . . . . . . . . . . . . . . . . . 90

3.6 Summary: Major Visualizations and Operations, by Machine Learning Goal . . . . 92

Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

Time Series Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

CHAPTER 4 Dimension Reduction 97

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

4.2 Curse of Dimensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

4.3 Practical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Example 1: House Prices in Boston . . . . . . . . . . . . . . . . . . . . . . . 99

4.4 Data Summaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

Summary Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

Aggregation and Pivot Tables . . . . . . . . . . . . . . . . . . . . . . . . . . 102

4.5 Correlation Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

4.6 Reducing the Number of Categories in Categorical Attributes . . . . . . . . . . . 105

4.7 Converting a Categorical Attribute to a Numerical Attribute . . . . . . . . . . . 107

4.8 Principal Component Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 107

Example 2: Breakfast Cereals . . . . . . . . . . . . . . . . . . . . . . . . . . 107

Principal Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

Normalizing the Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

Using Principal Components for Classification and Prediction . . . . . . . . . . . 117

4.9 Dimension Reduction Using Regression Models . . . . . . . . . . . . . . . . . . 117

4.10 Dimension Reduction Using Classification and Regression Trees . . . . . . . . . . 119

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance 125

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

5.2 Evaluating Predictive Performance . . . . . . . . . . . . . . . . . . . . . . . . 126

Naive Benchmark: The Average . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Prediction Accuracy Measures . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Comparing Training and Holdout Performance . . . . . . . . . . . . . . . . . . 130

Lift Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

5.3 Judging Classifier Performance . . . . . . . . . . . . . . . . . . . . . . . . . . 131

Benchmark: The Naive Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

Class Separation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

The Confusion (Classification) Matrix . . . . . . . . . . . . . . . . . . . . . . . 133

Using the Holdout Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

Accuracy Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

Propensities and Threshold for Classification . . . . . . . . . . . . . . . . . . . 136

Performance in Case of Unequal Importance of Classes . . . . . . . . . . . . . . 139

Asymmetric Misclassification Costs . . . . . . . . . . . . . . . . . . . . . . . . 143

Generalization to More Than Two Classes . . . . . . . . . . . . . . . . . . . . . 146

5.4 Judging Ranking Performance . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Lift Charts for Binary Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Decile Lift Charts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

Beyond Two Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

Lift Charts Incorporating Costs and Benefits . . . . . . . . . . . . . . . . . . . 150

Lift as a Function of Threshold . . . . . . . . . . . . . . . . . . . . . . . . . . 150

5.5 Oversampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

Creating an Over-sampled Validation (or Holdout) Set . . . . . . . . . . . . . . 154

Evaluating Model Performance Using a Non-oversampled Holdout Set . . . . . . . 155

Evaluating Model Performance if Only Oversampled Holdout Set Exists . . . . . . 155

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression 163

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

6.2 Explanatory vs. Predictive Modeling . . . . . . . . . . . . . . . . . . . . . . . 164

6.3 Estimating the Regression Equation and Prediction . . . . . . . . . . . . . . . . 166

Example: Predicting the Price of Used Toyota Corolla Cars . . . . . . . . . . . . 167

6.4 Variable Selection in Linear Regression . . . . . . . . . . . . . . . . . . . . . 171

Reducing the Number of Predictors . . . . . . . . . . . . . . . . . . . . . . . 171

How to Reduce the Number of Predictors . . . . . . . . . . . . . . . . . . . . . 174

Regularization (Shrinkage Models) . . . . . . . . . . . . . . . . . . . . . . . . 180

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

CHAPTER 7 k-Nearest Neighbors (k-NN) 189

7.1 The k-NN Classifier (Categorical Label) . . . . . . . . . . . . . . . . . . . . . . 189

Determining Neighbors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

Classification Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

Example: Riding Mowers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

Choosing Parameter k . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

Setting the Threshold Value . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

Weighted k-NN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

k-NN with More Than Two Classes . . . . . . . . . . . . . . . . . . . . . . . . 199

Working with Categorical Attributes . . . . . . . . . . . . . . . . . . . . . . . 199

7.2 k-NN for a Numerical Label . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

7.3 Advantages and Shortcomings of k-NN Algorithms . . . . . . . . . . . . . . . . 202

Appendix: Computing Distances Between Records in RapidMiner . . . . . . . . . . . . 204

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

CHAPTER 8 The Naive Bayes Classifier 209

8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

Threshold Probability Method . . . . . . . . . . . . . . . . . . . . . . . . . . 210

Conditional Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

Example 1: Predicting Fraudulent Financial Reporting . . . . . . . . . . . . . . 210

8.2 Applying the Full (Exact) Bayesian Classifier . . . . . . . . . . . . . . . . . . . 211

Using the “Assign to the Most Probable Class” Method . . . . . . . . . . . . . . 212

Using the Threshold Probability Method . . . . . . . . . . . . . . . . . . . . . 212

Practical Difficulty with the Complete (Exact) Bayes Procedure . . . . . . . . . . 212

8.3 Solution: Naive Bayes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

The Naive Bayes Assumption of Conditional Independence . . . . . . . . . . . . 214

Using the Threshold Probability Method . . . . . . . . . . . . . . . . . . . . . 214

Example 2: Predicting Fraudulent Financial Reports, Two Predictors . . . . . . . 215

Example 3: Predicting Delayed Flights . . . . . . . . . . . . . . . . . . . . . . 216

Working with Continuous Attributes . . . . . . . . . . . . . . . . . . . . . . . 222

8.4 Advantages and Shortcomings of the Naive Bayes Classifier . . . . . . . . . . . 223

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226

CHAPTER 9 Classification and Regression Trees 229

9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

Tree Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230

Decision Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

Classifying a New Record . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

9.2 Classification Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

Recursive Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

Example 1: Riding Mowers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

Measures of Impurity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

9.3 Evaluating the Performance of a Classification Tree . . . . . . . . . . . . . . . . 240

Example 2: Acceptance of Personal Loan . . . . . . . . . . . . . . . . . . . . . 240

Sensitivity Analysis Using Cross Validation . . . . . . . . . . . . . . . . . . . . 243

9.4 Avoiding Overfitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

Stopping Tree Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

Stopping Tree Growth: Grid Search for Parameter Tuning . . . . . . . . . . . . . 247

Stopping Tree Growth: CHAID . . . . . . . . . . . . . . . . . . . . . . . . . . 249

Pruning the Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252

9.5 Classification Rules from Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 255

9.6 Classification Trees for More Than Two Classes . . . . . . . . . . . . . . . . . . 256

9.7 Regression Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256

Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

Measuring Impurity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

Evaluating Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

9.8 Improving Prediction: Random Forests and Boosted Trees . . . . . . . . . . . . 259

Random Forests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259

Boosted Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260

9.9 Advantages and Weaknesses of a Tree . . . . . . . . . . . . . . . . . . . . . . 261

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265

CHAPTER 10 Logistic Regression 269

10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

10.2 The Logistic Regression Model . . . . . . . . . . . . . . . . . . . . . . . . . . 271

10.3 Example: Acceptance of Personal Loan . . . . . . . . . . . . . . . . . . . . . . 272

Model with a Single Predictor . . . . . . . . . . . . . . . . . . . . . . . . . . 273

Estimating the Logistic Model from Data: Computing Parameter Estimates . . . . 275

Interpreting Results in Terms of Odds (for a Profiling Goal) . . . . . . . . . . . . 278

Evaluating Classification Performance . . . . . . . . . . . . . . . . . . . . . . 280

Variable Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282

10.4 Logistic Regression for Multi-class Classification . . . . . . . . . . . . . . . . . 283

Example: Accidents Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

10.5 Example of Complete Analysis: Predicting Delayed Flights . . . . . . . . . . . . 286

Data Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

Model Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292

Model Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292

Model Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292

Variable Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295

Appendix: Logistic Regression for Ordinal Classes . . . . . . . . . . . . . . . . . . . . 299

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301

CHAPTER 11 Neural Networks 305

RapidMiner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305

11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306

11.2 Concept and Structure of a Neural Network . . . . . . . . . . . . . . . . . . . . 306

11.3 Fitting a Network to Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

Example 1: Tiny Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308

Computing Output of Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . 308

Preprocessing the Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311

Training the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

Example 2: Classifying Accident Severity . . . . . . . . . . . . . . . . . . . . . 316

Avoiding Overfitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318

Using the Output for Prediction and Classification . . . . . . . . . . . . . . . . 320

11.4 Required User Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321

11.5 Exploring the Relationship Between Predictors and Target Attribute . . . . . . . 322

11.6 Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

Convolutional Neural Networks (CNNs) . . . . . . . . . . . . . . . . . . . . . . 324

Local Feature Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

A Hierarchy of Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

The Learning Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326

Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326

Example: Classification of Fashion Images . . . . . . . . . . . . . . . . . . . . 327

Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333

11.7 Advantages and Weaknesses of Neural Networks . . . . . . . . . . . . . . . . . 334

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335

CHAPTER 12 Discriminant Analysis 337

12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337

Example 1: Riding Mowers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338

Example 2: Personal Loan Acceptance . . . . . . . . . . . . . . . . . . . . . . 338

12.2 Distance of a Record from a Class . . . . . . . . . . . . . . . . . . . . . . . . 340

12.3 Fisher’s Linear Classification Functions . . . . . . . . . . . . . . . . . . . . . . 341

12.4 Classification Performance of Discriminant Analysis . . . . . . . . . . . . . . . 346

12.5 Prior Probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348

12.6 Unequal Misclassification Costs . . . . . . . . . . . . . . . . . . . . . . . . . 348

12.7 Classifying More Than Two Classes . . . . . . . . . . . . . . . . . . . . . . . . 349

Example 3: Medical Dispatch to Accident Scenes . . . . . . . . . . . . . . . . . 349

12.8 Advantages and Weaknesses . . . . . . . . . . . . . . . . . . . . . . . . . . . 351

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355

CHAPTER 13 Generating, Comparing, and Combining Multiple Models 359

13.1 Automated Machine Learning (AutoML) . . . . . . . . . . . . . . . . . . . . . 359

AutoML: Explore and Clean Data . . . . . . . . . . . . . . . . . . . . . . . . . 360

AutoML: Determine Machine Learning Task . . . . . . . . . . . . . . . . . . . . 361

AutoML: Choose Attributes and Machine Learning Methods . . . . . . . . . . . . 361

AutoML: Evaluate Model Performance . . . . . . . . . . . . . . . . . . . . . . 363

AutoML: Model Deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . 365

Advantages and Weaknesses of Automated Machine Learning . . . . . . . . . . . 365

13.2 Explaining Model Predictions . . . . . . . . . . . . . . . . . . . . . . . . . . 367

Explaining Model Predictions: LIME . . . . . . . . . . . . . . . . . . . . . . . 368

Counterfactual Explanations of Predictions: What-If Scenarios . . . . . . . . . . 369

13.3 Ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

Why Ensembles Can Improve Predictive Power . . . . . . . . . . . . . . . . . . 373

Simple Averaging or Voting . . . . . . . . . . . . . . . . . . . . . . . . . . . 375

Bagging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376

Boosting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378

Bagging and Boosting in RapidMiner . . . . . . . . . . . . . . . . . . . . . . . 378

Stacking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380

Advantages and Weaknesses of Ensembles . . . . . . . . . . . . . . . . . . . . 381

13.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383

PART V INTERVENTION AND USER FEEDBACK

CHAPTER 14 Interventions: Experiments, Uplift Models, and Reinforcement Learning 387

14.1 A/B Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387

Example: Testing a New Feature in a Photo Sharing App . . . . . . . . . . . . . 389

The Statistical Test for Comparing Two Groups (T-Test) . . . . . . . . . . . . . . 389

Multiple Treatment Groups: A/B/n Tests . . . . . . . . . . . . . . . . . . . . . 392

Multiple A/B Tests and the Danger of Multiple Testing . . . . . . . . . . . . . . 392

14.2 Uplift (Persuasion) Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 393

Gathering the Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

A Simple Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396

Modeling Individual Uplift . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396

Computing Uplift with RapidMiner . . . . . . . . . . . . . . . . . . . . . . . . 398

Using the Results of an Uplift Model . . . . . . . . . . . . . . . . . . . . . . . 398

14.3 Reinforcement Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400

Explore-Exploit: Multi-Armed Bandits . . . . . . . . . . . . . . . . . . . . . . 400

Markov Decision Process (MDP) . . . . . . . . . . . . . . . . . . . . . . . . . 402

14.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406

PART VI MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 15 Association Rules and Collaborative Filtering 409

15.1 Association Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409

Discovering Association Rules in Transaction Databases . . . . . . . . . . . . . 410

Example 1: Synthetic Data on Purchases of Phone Faceplates . . . . . . . . . . 410

Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411

Generating Candidate Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 412

The Apriori Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413

FP-Growth Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414

Selecting Strong Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415

The Process of Rule Selection . . . . . . . . . . . . . . . . . . . . . . . . . . 418

Interpreting the Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420

Rules and Chance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422

Example 2: Rules for Similar Book Purchases . . . . . . . . . . . . . . . . . . . 424

15.2 Collaborative Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424

Data Type and Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426

Example 3: Netflix Prize Contest . . . . . . . . . . . . . . . . . . . . . . . . . 427

User-Based Collaborative Filtering: “People Like You” . . . . . . . . . . . . . . 428

Item-Based Collaborative Filtering . . . . . . . . . . . . . . . . . . . . . . . . 430

Evaluating Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431

Example 4: Predicting Movie Ratings with MovieLens Data . . . . . . . . . . . . 432

Advantages and Weaknesses of Collaborative Filtering . . . . . . . . . . . . . . 434

Collaborative Filtering vs. Association Rules . . . . . . . . . . . . . . . . . . . 437

15.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440

CHAPTER 16 Cluster Analysis 445

16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445

Example: Public Utilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447

16.2 Measuring Distance Between Two Records . . . . . . . . . . . . . . . . . . . . 449

Euclidean Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449

Normalizing Numerical Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 450

Other Distance Measures for Numerical Data . . . . . . . . . . . . . . . . . . . 451

Distance Measures for Categorical Data . . . . . . . . . . . . . . . . . . . . . . 454

Distance Measures for Mixed Data . . . . . . . . . . . . . . . . . . . . . . . . 454

16.3 Measuring Distance Between Two Clusters . . . . . . . . . . . . . . . . . . . . 455

Minimum Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455

Maximum Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455

Average Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455

Centroid Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455

16.4 Hierarchical (Agglomerative) Clustering . . . . . . . . . . . . . . . . . . . . . 457

Single Linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458

Complete Linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458

Average Linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459

Centroid Linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459

Ward’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459

Dendrograms: Displaying Clustering Process and Results . . . . . . . . . . . . . 460

Validating Clusters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463

Limitations of Hierarchical Clustering . . . . . . . . . . . . . . . . . . . . . . 464

16.5 Non-Hierarchical Clustering: The k-Means Algorithm . . . . . . . . . . . . . . . 466

Choosing the Number of Clusters (k) . . . . . . . . . . . . . . . . . . . . . . . 467

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473

PART VII FORECASTING TIME SERIES

CHAPTER 17 Handling Time Series 479

RapidMiner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479

17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480

17.2 Descriptive vs. Predictive Modeling . . . . . . . . . . . . . . . . . . . . . . . 481

17.3 Popular Forecasting Methods in Business . . . . . . . . . . . . . . . . . . . . . 481

Combining Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482

17.4 Time Series Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482

Example: Ridership on Amtrak Trains . . . . . . . . . . . . . . . . . . . . . . . 483

17.5 Data Partitioning and Performance Evaluation . . . . . . . . . . . . . . . . . . 486

Benchmark Performance: Naive Forecasts . . . . . . . . . . . . . . . . . . . . . 489

Generating Future Forecasts . . . . . . . . . . . . . . . . . . . . . . . . . . . 490

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493

CHAPTER 18 Regression-Based Forecasting 497

RapidMiner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497

18.1 A Model with Trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498

Linear Trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498

Exponential Trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502

Polynomial Trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503

18.2 A Model with Seasonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504

Additive vs. Multiplicative Seasonality . . . . . . . . . . . . . . . . . . . . . . 507

18.3 A Model with Trend and Seasonality . . . . . . . . . . . . . . . . . . . . . . . 508

18.4 Autocorrelation and ARIMA Models . . . . . . . . . . . . . . . . . . . . . . . . 509

Computing Autocorrelation . . . . . . . . . . . . . . . . . . . . . . . . . . . 510

Improving Forecasts by Integrating Autocorrelation Information . . . . . . . . . 514

Evaluating Predictability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521

CHAPTER 19 Smoothing and Deep Learning Methods for Forecasting 533

RapidMiner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533

19.1 Smoothing Methods: Introduction . . . . . . . . . . . . . . . . . . . . . . . . 534

19.2 Moving Average . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534

Centered Moving Average for Visualization . . . . . . . . . . . . . . . . . . . . 534

Trailing Moving Average for Forecasting . . . . . . . . . . . . . . . . . . . . . 535

Choosing Window Width (w) . . . . . . . . . . . . . . . . . . . . . . . . . . . 541

19.3 Simple Exponential Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . 541

Choosing Smoothing Parameter α . . . . . . . . . . . . . . . . . . . . . . . . 542

Relation Between Moving Average and Simple Exponential Smoothing . . . . . . 543

19.4 Advanced Exponential Smoothing . . . . . . . . . . . . . . . . . . . . . . . . 545

Series with a Trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545

Series with a Trend and Seasonality . . . . . . . . . . . . . . . . . . . . . . . 546

Series with Seasonality (No Trend) . . . . . . . . . . . . . . . . . . . . . . . . 547

19.5 Deep Learning for Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . 549

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553

PART VIII DATA ANALYTICS

CHAPTER 20 Social Network Analytics 563

20.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563

20.2 Directed vs. Undirected Networks . . . . . . . . . . . . . . . . . . . . . . . . 564

20.3 Visualizing and Analyzing Networks . . . . . . . . . . . . . . . . . . . . . . . 567

Plot Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567

Edge List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570

Adjacency Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571

Using Network Data in Classification and Prediction . . . . . . . . . . . . . . . 571

20.4 Social Data Metrics and Taxonomy . . . . . . . . . . . . . . . . . . . . . . . . 571

Node-Level Centrality Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . 572

Egocentric Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573

Network Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573

20.5 Using Network Metrics in Prediction and Classification . . . . . . . . . . . . . . 577

Link Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577

Entity Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579

Collaborative Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581

20.6 Collecting Social Network Data with RapidMiner . . . . . . . . . . . . . . . . . 584

20.7 Advantages and Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . 584

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587

CHAPTER 21 Text Mining 589

RapidMiner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589

21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589

21.2 The Tabular Representation of Text: Term–Document Matrix and “Bag-of-Words” . 590

21.3 Bag-of-Words vs. Meaning Extraction at Document Level . . . . . . . . . . . . . 592

21.4 Preprocessing the Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593

Tokenization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593

Text Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595

Presence/Absence vs. Frequency (Occurrences) . . . . . . . . . . . . . . . . . . 597

Term Frequency–Inverse Document Frequency (TF-IDF) . . . . . . . . . . . . . . 598

From Terms to Concepts: Latent Semantic Indexing . . . . . . . . . . . . . . . . 600

Extracting Meaning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601

From Terms to High-Dimensional Word Vectors: Word2Vec . . . . . . . . . . . . 601

21.5 Implementing Machine Learning Methods . . . . . . . . . . . . . . . . . . . . 602

21.6 Example: Online Discussions on Autos and Electronics . . . . . . . . . . . . . . 602

Importing the Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603

Data Preparation and Labeling the Records . . . . . . . . . . . . . . . . . . . . 603

Text Preprocessing in RapidMiner . . . . . . . . . . . . . . . . . . . . . . . . 605

Producing a Concept Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 605

Fitting a Predictive Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606

Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607

21.7 Example: Sentiment Analysis of Movie Reviews . . . . . . . . . . . . . . . . . . 607

Data Loading, Preparation, and Partitioning . . . . . . . . . . . . . . . . . . . 607

Generating and Applying Word2vec Model . . . . . . . . . . . . . . . . . . . . 609

Fitting a Predictive Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611

Using a Pretrained Word2vec Model . . . . . . . . . . . . . . . . . . . . . . . 611

21.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615

CHAPTER 22 Responsible Data Science 617

22.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617

Example: Predicting Recidivism . . . . . . . . . . . . . . . . . . . . . . . . . 618

22.2 Unintentional Harm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618

22.3 Legal Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620

The General Data Protection Regulation (GDPR) . . . . . . . . . . . . . . . . . 620

Protected Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620

22.4 Principles of Responsible Data Science . . . . . . . . . . . . . . . . . . . . . . 621

Non-maleficence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 621

Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622

Transparency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623

Accountability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624

Data Privacy and Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624

22.5 A Responsible Data Science Framework . . . . . . . . . . . . . . . . . . . . . . 624

Justification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625

Assembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625

Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626

Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627

Auditing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627

22.6 Documentation Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628

Impact Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628

Model Cards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629

Datasheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630

Audit Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630

22.7 Example: Applying the RDS Framework to the COMPAS Example . . . . . . . . . . 631

Unanticipated Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632

Ethical Concerns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632

Protected Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632

Data Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633

Fitting the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633

Auditing the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634

Bias Mitigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640

22.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641

Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643

PART IX CASES

CHAPTER 23 Cases 647

23.1 Charles Book Club . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647

The Book Industry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647

Database Marketing at Charles . . . . . . . . . . . . . . . . . . . . . . . . . . 648

Machine Learning Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . 650

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651

23.2 German Credit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653

Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654

23.3 Tayko Software Cataloger . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658

The Mailing Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659

Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660

23.4 Political Persuasion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 662

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 662

Predictive Analytics Arrives in US Politics . . . . . . . . . . . . . . . . . . . . 662

Political Targeting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 662

Uplift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663

Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 664

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 664

23.5 Taxi Cancellations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665

Business Situation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666

23.6 Segmenting Consumers of Bath Soap . . . . . . . . . . . . . . . . . . . . . . . 667

Business Situation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667

Key Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667

Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668

Measuring Brand Loyalty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668

23.7 Direct-Mail Fundraising . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670

Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670

23.8 Catalog Cross-Selling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673

23.9 Time Series Case: Forecasting Public Transportation Demand . . . . . . . . . . . 673

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673

Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674

Available Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674

Assignment Goal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674

Tips and Suggested Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675

23.10 Loan Approval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675

Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675

Regulatory Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676

Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676

Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676

References 679

Data Files Used in the Book 683

Index 685

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.
