Amazon no longer offers textbook rentals. We do!

We're the #1 textbook rental company. Let us show you why.

Data Mining : Practical Machine Learning Tools and Techniques

by Ian H. Witten and Eibe Frank
  • ISBN13: 9780120884070
  • ISBN10: 0120884070
  • Edition: 2nd
  • Format: Paperback
  • Copyright: 6/8/2005
  • Publisher: Elsevier Science

Note: Supplemental materials are not guaranteed with Rental or Used book purchases.

Purchase Benefits

  • Free Shipping On Orders Over $35!
    Your order must be $35 or more to qualify for free economy shipping. Bulk sales, POs, Marketplace items, eBooks, and apparel do not qualify for this offer.
  • Get Rewarded for Ordering Your Textbooks! Enroll Now.
  • Complimentary 7-Day eTextbook Access
    When you rent or buy this book, you will receive complimentary 7-day online access to the eTextbook version from your PC, Mac, tablet, or smartphone. This feature is not included on Marketplace items.
List Price: $75.95 (save up to $26.58)
  • Rent Book: $49.37 (Free Shipping)

    Includes complimentary 7-Day eTextbook Access.
    Usually ships in 24-48 hours.
    *This item is part of an exclusive publisher rental program and requires an additional convenience fee. This fee will be reflected in the shopping cart.

Summary

As with any burgeoning technology that enjoys commercial attention, the use of data mining is surrounded by a great deal of hype. Exaggerated reports tell of secrets that can be uncovered by setting algorithms loose on oceans of data. But there is no magic in machine learning, no hidden power, no alchemy. Instead, there is an identifiable body of practical techniques that can extract useful information from raw data. This book describes these techniques and shows how they work.

The book is a major revision of the first edition, which appeared in 1999. While the basic core remains the same, it has been updated to reflect the changes that have taken place over five years, and it now has nearly double the references. Highlights of the new edition include thirty new technique sections; an enhanced Weka machine learning workbench, which now features an interactive interface; comprehensive information on neural networks; a new section on Bayesian networks; and much more.

  • Algorithmic methods at the heart of successful data mining, including tried-and-true techniques as well as leading-edge methods
  • Performance improvement techniques that work by transforming the input or output
  • Downloadable Weka, a collection of machine learning algorithms for data mining tasks, including tools for data pre-processing, classification, regression, clustering, association rules, and visualization, in a new, interactive interface

Table of Contents

Foreword v
Preface xxiii
Updated and revised content xxvii
Acknowledgments xxix
Part I Machine learning tools and techniques 1
What's it all about? 3
Data mining and machine learning 4
Describing structural patterns 6
Machine learning 7
Data mining 9
Simple examples: The weather problem and others 9
The weather problem 10
Contact lenses: An idealized problem 13
Irises: A classic numeric dataset 15
CPU performance: Introducing numeric prediction 16
Labor negotiations: A more realistic example 17
Soybean classification: A classic machine learning success 18
Fielded applications 22
Decisions involving judgment 22
Screening images 23
Load forecasting 24
Diagnosis 25
Marketing and sales 26
Other applications 28
Machine learning and statistics 29
Generalization as search 30
Enumerating the concept space 31
Bias 32
Data mining and ethics 35
Further reading 37
Input: Concepts, instances, and attributes 41
What's a concept? 42
What's in an example? 45
What's in an attribute? 49
Preparing the input 52
Gathering the data together 52
ARFF format 53
Sparse data 55
Attribute types 56
Missing values 58
Inaccurate values 59
Getting to know your data 60
Further reading 60
Output: Knowledge representation 61
Decision tables 62
Decision trees 62
Classification rules 65
Association rules 69
Rules with exceptions 70
Rules involving relations 73
Trees for numeric prediction 76
Instance-based representation 76
Clusters 81
Further reading 82
Algorithms: The basic methods 83
Inferring rudimentary rules 84
Missing values and numeric attributes 86
Discussion 88
Statistical modeling 88
Missing values and numeric attributes 92
Bayesian models for document classification 94
Discussion 96
Divide-and-conquer: Constructing decision trees 97
Calculating information 100
Highly branching attributes 102
Discussion 105
Covering algorithms: Constructing rules 105
Rules versus trees 107
A simple covering algorithm 107
Rules versus decision lists 111
Mining association rules 112
Item sets 113
Association rules 113
Generating rules efficiently 117
Discussion 118
Linear models 119
Numeric prediction: Linear regression 119
Linear classification: Logistic regression 121
Linear classification using the perceptron 124
Linear classification using Winnow 126
Instance-based learning 128
The distance function 128
Finding nearest neighbors efficiently 129
Discussion 135
Clustering 136
Iterative distance-based clustering 137
Faster distance calculations 138
Discussion 139
Further reading 139
Credibility: Evaluating what's been learned 143
Training and testing 144
Predicting performance 146
Cross-validation 149
Other estimates 151
Leave-one-out 151
The bootstrap 152
Comparing data mining methods 153
Predicting probabilities 157
Quadratic loss function 158
Informational loss function 159
Discussion 160
Counting the cost 161
Cost-sensitive classification 164
Cost-sensitive learning 165
Lift charts 166
ROC curves 168
Recall-precision curves 171
Discussion 172
Cost curves 173
Evaluating numeric prediction 176
The minimum description length principle 179
Applying the MDL principle to clustering 183
Further reading 184
Implementations: Real machine learning schemes 187
Decision trees 189
Numeric attributes 189
Missing values 191
Pruning 192
Estimating error rates 193
Complexity of decision tree induction 196
From trees to rules 198
C4.5: Choices and options 198
Discussion 199
Classification rules 200
Criteria for choosing tests 200
Missing values, numeric attributes 201
Generating good rules 202
Using global optimization 205
Obtaining rules from partial decision trees 207
Rules with exceptions 210
Discussion 213
Extending linear models 214
The maximum margin hyperplane 215
Nonlinear class boundaries 217
Support vector regression 219
The kernel perceptron 222
Multilayer perceptrons 223
Discussion 235
Instance-based learning 235
Reducing the number of exemplars 236
Pruning noisy exemplars 236
Weighting attributes 237
Generalizing exemplars 238
Distance functions for generalized exemplars 239
Generalized distance functions 241
Discussion 242
Numeric prediction 243
Model trees 244
Building the tree 245
Pruning the tree 245
Nominal attributes 246
Missing values 246
Pseudocode for model tree induction 247
Rules from model trees 250
Locally weighted linear regression 251
Discussion 253
Clustering 254
Choosing the number of clusters 254
Incremental clustering 255
Category utility 260
Probability-based clustering 262
The EM algorithm 265
Extending the mixture model 266
Bayesian clustering 268
Discussion 270
Bayesian networks 271
Making predictions 272
Learning Bayesian networks 276
Specific algorithms 278
Data structures for fast learning 280
Discussion 283
Transformations: Engineering the input and output 285
Attribute selection 288
Scheme-independent selection 290
Searching the attribute space 292
Scheme-specific selection 294
Discretizing numeric attributes 296
Unsupervised discretization 297
Entropy-based discretization 298
Other discretization methods 302
Entropy-based versus error-based discretization 302
Converting discrete to numeric attributes 304
Some useful transformations 305
Principal components analysis 306
Random projections 309
Text to attribute vectors 309
Time series 311
Automatic data cleansing 312
Improving decision trees 312
Robust regression 313
Detecting anomalies 314
Combining multiple models 315
Bagging 316
Bagging with costs 319
Randomization 320
Boosting 321
Additive regression 325
Additive logistic regression 327
Option trees 328
Logistic model trees 331
Stacking 332
Error-correcting output codes 334
Using unlabeled data 337
Clustering for classification 337
Co-training 339
EM and co-training 340
Further reading 341
Moving on: Extensions and applications 345
Learning from massive datasets 346
Incorporating domain knowledge 349
Text and Web mining 351
Adversarial situations 356
Ubiquitous data mining 358
Further reading 361
Part II The Weka machine learning workbench 363
Introduction to Weka 365
What's in Weka? 366
How do you use it? 367
What else can you do? 368
How do you get it? 368
The Explorer 369
Getting started 369
Preparing the data 370
Loading the data into the Explorer 370
Building a decision tree 373
Examining the output 373
Doing it again 377
Working with models 377
When things go wrong 378
Exploring the Explorer 380
Loading and filtering files 380
Training and testing learning schemes 384
Do it yourself: The User Classifier 388
Using a metalearner 389
Clustering and association rules 391
Attribute selection 392
Visualization 393
Filtering algorithms 393
Unsupervised attribute filters 395
Unsupervised instance filters 400
Supervised filters 401
Learning algorithms 403
Bayesian classifiers 403
Trees 406
Rules 408
Functions 409
Lazy classifiers 413
Miscellaneous classifiers 414
Metalearning algorithms 414
Bagging and randomization 414
Boosting 416
Combining classifiers 417
Cost-sensitive learning 417
Optimizing performance 417
Retargeting classifiers for different tasks 418
Clustering algorithms 418
Association-rule learners 419
Attribute selection 420
Attribute subset evaluators 422
Single-attribute evaluators 422
Search methods 423
The Knowledge Flow interface 427
Getting started 427
The Knowledge Flow components 430
Configuring and connecting the components 431
Incremental learning 433
The Experimenter 437
Getting started 438
Running an experiment 439
Analyzing the results 440
Simple setup 441
Advanced setup 442
The Analyze panel 443
Distributing processing over several machines 445
The command-line interface 449
Getting started 449
The structure of Weka 450
Classes, instances, and packages 450
The weka.core package 451
The weka.classifiers package 453
Other packages 455
Javadoc indices 456
Command-line options 456
Generic options 456
Scheme-specific options 458
Embedded machine learning 461
A simple data mining application 461
Going through the code 462
Main() 462
MessageClassifier() 462
updateData() 468
classifyMessage() 468
Writing new learning schemes 471
An example classifier 471
buildClassifier() 472
makeTree() 472
computeInfoGain() 480
classifyInstance() 480
main() 481
Conventions for implementing classifiers 483
References 485
Index 505
About the authors 525

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.
