Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner

by Galit Shmueli; Nitin R. Patel; Peter C. Bruce
  • Edition: 2nd
  • Format: Hardcover
  • Copyright: 10/26/2010
  • Publisher: Wiley

Data Mining for Business Intelligence, Second Edition uses real data and actual cases to illustrate how data mining (DM) can be applied in developing successful business models. Featuring complimentary access to XLMiner®, the Microsoft Office Excel® add-in, this book allows readers to follow along and implement algorithms at their own speed, with a minimal learning curve. In addition, students and practitioners of DM techniques are presented with hands-on, business-oriented applications. An abundance of exercises and examples, now doubled in number in the second edition, is provided to motivate learning and understanding.

This book helps readers understand the beneficial relationship that can be established between DM and smart business practices, and is an excellent learning tool for creating valuable strategies and making wiser business decisions. New topics include detailed coverage of visualization enhanced by Spotfire subroutines and time series forecasting, among other subjects.

The Second Edition now features:

-Three new chapters on time series forecasting that introduce popular business forecasting methods, including moving average and exponential smoothing methods and regression-based models, along with topics such as explanatory vs. predictive modeling, two-level models, and ensembles

-A revised chapter on data visualization that now features interactive visualization principles and added assignments that demonstrate interactive visualization in practice

-Separate chapters that each treat k-nearest neighbors and Naïve Bayes methods

-Summaries at the start of each chapter that supply an outline of key topics

Author Biography

GALIT SHMUELI, PhD, is Associate Professor of Statistics and Director of the eMarkets Research Lab in the Robert H. Smith School of Business at the University of Maryland. Dr. Shmueli is the coauthor of Statistical Methods in e-Commerce Research and Modeling Online Auctions, both published by Wiley.

NITIN R. PATEL, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology for over ten years.

PETER C. BRUCE is President and owner of statistics.com, the leading provider of online education in statistics.

Table of Contents

What Is Data Mining?
Where Is Data Mining Used?
The Origins of Data Mining
The Rapid Growth of Data Mining
Why Are There So Many Different Methods?
Terminology and Notation
Road Maps to This Book
Overview of the Data Mining Process
Core Ideas in Data Mining
Supervised and Unsupervised Learning
The Steps in Data Mining
Preliminary Steps
Building a Model: Example with Linear Regression
Using Excel for Data Mining
Data Exploration and Dimension Reduction
Data Visualization
Uses of Data Visualization
Data Examples
Boston Housing Data
Ridership on Amtrak Trains
Basic Charts: bar charts, line graphs, and scatterplots
Distribution Plots
Heatmaps: visualizing correlations and missing values
Multidimensional Visualization
Adding Variables: color, hue, size, shape, multiple panels, animation
Manipulations: rescaling, aggregation and hierarchies, zooming and panning, filtering
Reference: trend line and labels
Scaling up: large datasets
Multivariate plot: parallel coordinates plot
Interactive visualization
Specialized Visualizations
Visualizing networked data
Visualizing hierarchical data: treemaps
Visualizing geographical data: maps
Summary of major visualizations and operations, according to data mining goal
Time series forecasting
Unsupervised learning
Dimension Reduction
Practical Considerations
House Prices in Boston
Data Summaries
Correlation Analysis
Reducing the Number of Categories in Categorical Variables
Converting A Categorical Variable to A Numerical Variable
Principal Components Analysis
Breakfast Cereals
Principal Components
Normalizing the Data
Using Principal Components for Classification and Prediction
Dimension Reduction Using Regression Models
Dimension Reduction Using Classification and Regression Trees
Performance Evaluation
Evaluating Classification and Predictive Performance
Judging Classification Performance
Benchmark: The Naive Rule
Class Separation
The Classification Matrix
Using the Validation Data
Accuracy Measures
Cutoff for Classification
Performance in Unequal Importance of Classes
Asymmetric Misclassification Costs
Oversampling and Asymmetric Costs
Classification Using a Triage Strategy
Evaluating Predictive Performance
Benchmark: The Average
Prediction Accuracy Measures
Prediction and Classification Methods
Multiple Linear Regression
Explanatory vs. Predictive Modeling
Estimating the Regression Equation and Prediction
Example: Predicting the Price of Used Toyota Corolla Automobiles
Variable Selection in Linear Regression
Reducing the Number of Predictors
How to Reduce the Number of Predictors
k-Nearest Neighbors (kNN)
The kNN Classifier
Determining Neighbors
Classification Rule
Example: Riding Mowers
Choosing k
Setting the Cutoff Value
kNN with More Than 2 Classes
kNN for a Numerical Response
Advantages and Shortcomings of kNN
Naive Bayes
Predicting Fraudulent Financial Reporting
The Practical Difficulty with the Complete (Exact) Bayes Procedure
The Solution: Naïve Bayes
Predicting Fraudulent Financial Reports, 2 Predictors
Predicting Delayed Flights
Advantages and Shortcomings of the Naive Bayes Classifier
Classification and Regression Trees
Classification Trees
Recursive Partitioning
Riding Mowers
Measures of Impurity
Evaluating the Performance of a Classification Tree
Acceptance of Personal Loan
Avoiding Overfitting
Stopping Tree Growth: CHAID
Pruning the Tree
Classification Rules from Trees
Classification Trees for More Than 2 Classes
Regression Trees
Measuring Impurity
Evaluating Performance
Advantages, Weaknesses, and Extensions
Logistic Regression
The Logistic Regression Model
Example: Acceptance of Personal Loan
Model with a Single Predictor
Estimating the Logistic Model from Data: Computing Parameter Estimates
Interpreting Results in Terms of Odds
Evaluating Classification Performance
Variable Selection
Example of Complete Analysis: Predicting Delayed Flights
Data Preprocessing
Model Fitting and Estimation
Model Interpretation
Model Performance
Variable Selection
Appendix: Logistic Regression for Profiling
Appendix B: Evaluating Goodness of Fit
Appendix C: Logistic Regression for More Than Two Classes
Neural Nets
Concept and Structure of a Neural Network
Fitting a Network to Data
Tiny Dataset
Computing Output of Nodes
Preprocessing the Data
Training the Model
Classifying Accident Severity
Avoiding Overfitting
Using the Output for Prediction and Classification
Required User Input
Exploring the Relationship Between Predictors and Response
Advantages and Weaknesses of Neural Networks
Discriminant Analysis
Riding Mowers
Personal Loan Acceptance
Distance of an Observation from a Class
Fisher's Linear Classification Functions
Classification Performance of Discriminant Analysis
Prior Probabilities
Unequal Misclassification Costs
Classifying More Than Two Classes
Medical Dispatch to Accident Scenes
Advantages and Weaknesses
Mining Relationships Among Records
Association Rules
Discovering Association Rules in Transaction Databases
Synthetic Data on Purchases of Phone Faceplates
Generating Candidate Rules
The Apriori Algorithm
Selecting Strong Rules
Support and Confidence
Lift Ratio
Data Format
The Process of Rule Selection
Interpreting the Results
Statistical Significance of Rules
Rules for Similar Book Purchases
Cluster Analysis
Example: Public Utilities
Measuring Distance Between Two Records
Euclidean Distance
Normalizing Numerical Measurements
Other Distance Measures for Numerical Data
Distance Measures for Categorical Data
Distance Measures for Mixed Data
Measuring Distance Between Two Clusters
Hierarchical (Agglomerative) Clustering
Minimum Distance (Single Linkage)
Maximum Distance (Complete Linkage)
Average Distance (Average Linkage)
Dendrograms: Displaying Clustering Process and Results
Validating Clusters
Limitations of Hierarchical Clustering
Nonhierarchical Clustering: The k-Means Algorithm
Initial Partition into k Clusters
Forecasting Time Series
Handling Time Series
Explanatory vs. Predictive Modeling
Popular Forecasting Methods in Business
Combining Methods
Time Series Components
Example: Ridership on Amtrak Trains
Data Partitioning
Regression Based Forecasting
A Model with Trend
Linear Trend
Exponential Trend
Polynomial Trend
A Model with Seasonality
A Model with Trend and Seasonality
Autocorrelation and ARIMA Models
Computing Autocorrelation
Improving Forecasts by Integrating Autocorrelation Information
Evaluating Predictability
Smoothing Methods
Moving Average
Centered Moving Average for Visualization
Trailing Moving Average for Forecasting
Choosing Window Width
Simple Exponential Smoothing
Choosing Smoothing Parameter
Relation Between Moving Average and Simple Exponential
Advanced Exponential Smoothing
Series with a Trend
Series with a Trend and Seasonality
Series with Seasonality
Charles Book Club
German Credit
Tayko Software Cataloger
Segmenting Consumers of Bath Soap
Direct-Mail Fundraising
Catalog Cross-Selling
Predicting Bankruptcy
Time Series Case: Forecasting Public Transportation Demand
Table of Contents provided by Publisher. All Rights Reserved.

Customer Reviews

Technically accurate and enjoyable to read! July 4, 2011
This textbook is designed for business students who have already had a course or two in statistics and want a good introduction to statistical methods for data mining. Intended for MBA-level students, it gives an excellent introduction to the subject. Overall, I think this is a good combination of the practical application and theory of data mining.
Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner: 5 out of 5 stars based on 1 user review.
