
Data Mining : Practical Machine Learning Tools and Techniques with Java Implementations

by Ian H. Witten; Eibe Frank
  • ISBN13: 9781558605527
  • ISBN10: 1558605525
  • Format: Paperback
  • Copyright: 1999-10-11
  • Publisher: Elsevier Science

Note: Supplemental materials are not guaranteed with Rental or Used book purchases.

Purchase Benefits

  • Free Shipping On Orders Over $35!
    Your order must be $35 or more to qualify for free economy shipping. Bulk sales, POs, Marketplace items, eBooks, and apparel do not qualify for this offer.
  • Get Rewarded for Ordering Your Textbooks! Enroll Now
  • Complimentary 7-Day eTextbook Access
    When you rent or buy this book, you will receive complimentary 7-day online access to the eTextbook version from your PC, Mac, tablet, or smartphone. This feature is not included on Marketplace items.
List Price: $60.95 (save up to $15.24)

  • Buy Used: $45.71
    Free shipping. Usually ships in 2-4 business days. Includes 7-day eTextbook access.

Summary

This book offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining, including both tried-and-true techniques of the past and Java-based methods at the leading edge of contemporary research. If you're involved at any level in the work of extracting usable knowledge from large collections of data, this clearly written and effectively illustrated book will prove an invaluable resource. Complementing the authors' instruction is a fully functional, platform-independent Java software system for machine learning, available for download. Apply it to the sample datasets provided to refine your data mining skills, apply it to your own data to discern meaningful patterns and generate valuable insights, adapt it for your specialized data mining applications, or use it to develop your own machine learning schemes.

  • Helps you select appropriate approaches to particular problems and compare and evaluate the results of different techniques.
  • Covers performance improvement techniques, including input preprocessing and combining output from different methods.
  • Comes with downloadable machine learning software: use it to master the techniques covered inside, apply it to your own projects, or customize it to meet special needs.
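The downloadable software is the authors' Weka workbench, written in Java (the table of contents below covers its weka.core and weka.classifiers packages and the ARFF data format). As a rough illustration of how it is driven from code, here is a minimal sketch, not taken from the book, that loads a sample ARFF dataset, induces a decision tree, and cross-validates it. It assumes the later Weka 3 package layout (the edition listed here shipped older paths such as weka.classifiers.j48.J48) and a weather.arff file on disk.

    // Minimal sketch (not from the book): load an ARFF dataset, build a
    // C4.5-style decision tree, and estimate accuracy by cross-validation.
    // Package paths assume the Weka 3 API and may differ in older releases.
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.Random;

    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;

    public class WeatherDemo {
        public static void main(String[] args) throws Exception {
            // weather.arff is one of the small sample datasets shipped with Weka.
            Instances data = new Instances(
                    new BufferedReader(new FileReader("weather.arff")));
            data.setClassIndex(data.numAttributes() - 1); // last attribute = class

            J48 tree = new J48();       // decision tree learner (C4.5 descendant)
            tree.buildClassifier(data); // induce the tree from all instances
            System.out.println(tree);   // print the tree in textual form

            // Estimate predictive performance with 10-fold cross-validation.
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(new J48(), data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }

The book's "Nuts and bolts" chapter walks through the same ground in depth: running the schemes from the command line, embedding them in your own Java programs, and writing new classifiers and filters.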

Author Biography

Ian H. Witten is professor of computer science at the University of Waikato in New Zealand. He is a fellow of the ACM and the Royal Society of New Zealand and a member of professional computing, information retrieval, and engineering associations in the U.K., U.S., Canada, and New Zealand. Eibe Frank is a researcher in the Machine Learning group at the University of Waikato. He holds a degree in computer science from the University of Karlsruhe in Germany and is the author of several papers presented at machine learning conferences and published in machine learning journals.

Table of Contents

Foreword v
Preface xix
What's it all about? 1(36)
Data mining and machine learning 2(6)
Describing structural patterns 4(1)
Machine learning 5(2)
Data mining 7(1)
Simple examples: The weather problem and others 8(12)
The weather problem 8(3)
Contact lenses: An idealized problem 11(2)
Irises: A classic numeric dataset 13(2)
CPU performance: Introducing numeric prediction 15(1)
Labor negotiations: A more realistic example 16(1)
Soybean classification: A classic machine learning success 17(3)
Fielded applications 20(6)
Decisions involving judgment 21(1)
Screening images 22(1)
Load forecasting 23(1)
Diagnosis 24(1)
Marketing and sales 25(1)
Machine learning and statistics 26(1)
Generalization as search 27(5)
Enumerating the concept space 28(1)
Bias 29(3)
Data mining and ethics 32(2)
Further reading 34(3)
Input: Concepts, instances, attributes 37(20)
What's a concept? 38(3)
What's in an example? 41(4)
What's in an attribute? 45(3)
Preparing the input 48(7)
Gathering the data together 48(1)
Arff format 49(2)
Attribute types 51(1)
Missing values 52(1)
Inaccurate values 53(1)
Getting to know your data 54(1)
Further reading 55(2)
Output: Knowledge representation 57(20)
Decision tables 58(1)
Decision trees 58(1)
Classification rules 59(4)
Association rules 63(1)
Rules with exceptions 64(3)
Rules involving relations 67(3)
Trees for numeric prediction 70(2)
Instance-based representation 72(3)
Clusters 75(1)
Further reading 76(1)
Algorithms: The basic methods 77(42)
Inferring rudimentary rules 78(4)
Missing values and numeric attributes 80(1)
Discussion 81(1)
Statistical modeling 82(7)
Missing values and numeric attributes 85(3)
Discussion 88(1)
Divide and conquer: Constructing decision trees 89(8)
Calculating information 93(1)
Highly branching attributes 94(3)
Discussion 97(1)
Covering algorithms: Constructing rules 97(7)
Rules versus trees 98(1)
A simple covering algorithm 98(5)
Rules versus decision lists 103(1)
Mining association rules 104(8)
Item sets 105(1)
Association rules 105(3)
Generating rules efficiently 108(3)
Discussion 111(1)
Linear models 112(2)
Numeric prediction 112(1)
Classification 113(1)
Discussion 113(1)
Instance-based learning 114(2)
The distance function 114(1)
Discussion 115(1)
Further reading 116(3)
Credibility: Evaluating what's been learned 119(38)
Training and testing 120(3)
Predicting performance 123(2)
Cross-validation 125(2)
Other estimates 127(2)
Leave-one-out 127(1)
The bootstrap 128(1)
Comparing data mining schemes 129(4)
Predicting probabilities 133(4)
Quadratic loss function 134(1)
Informational loss function 135(1)
Discussion 136(1)
Counting the cost 137(10)
Lift charts 139(2)
ROC curves 141(3)
Cost-sensitive learning 144(1)
Discussion 145(2)
Evaluating numeric prediction 147(3)
The minimum description length principle 150(4)
Applying MDL to clustering 154(1)
Further reading 155(2)
Implementations: Real machine learning schemes 157(72)
Decision trees 159(11)
Numeric attributes 159(2)
Missing values 161(1)
Pruning 162(2)
Estimating error rates 164(3)
Complexity of decision tree induction 167(1)
From trees to rules 168(1)
C4.5: Choices and options 169(1)
Discussion 169(1)
Classification rules 170(18)
Criteria for choosing tests 171(1)
Missing values, numeric attributes 172(1)
Good rules and bad rules 173(1)
Generating good rules 174(1)
Generating good decision lists 175(2)
Probability measure for rule evaluation 177(1)
Evaluating rules using a test set 178(3)
Obtaining rules from partial decision trees 181(3)
Rules with exceptions 184(3)
Discussion 187(1)
Extending linear classification: Support vector machines 188(5)
The maximum margin hyperplane 189(2)
Nonlinear class boundaries 191(2)
Discussion 193(1)
Instance-based learning 193(8)
Reducing the number of exemplars 194(1)
Pruning noisy exemplars 194(1)
Weighting attributes 195(1)
Generalizing exemplars 196(1)
Distance functions for generalized exemplars 197(2)
Generalized distance functions 199(1)
Discussion 200(1)
Numeric prediction 201(9)
Model trees 202(1)
Building the tree 202(1)
Pruning the tree 203(1)
Nominal attributes 204(1)
Missing values 204(1)
Pseudo-code for model tree induction 205(3)
Locally weighted linear regression 208(1)
Discussion 209(1)
Clustering 210(19)
Iterative distance-based clustering 211(1)
Incremental clustering 212(5)
Category utility 217(1)
Probability-based clustering 218(3)
The EM algorithm 221(2)
Extending the mixture model 223(2)
Bayesian clustering 225(1)
Discussion 226(3)
Moving on: Engineering the input and output 229(36)
Attribute selection 232(6)
Scheme-independent selection 233(2)
Searching the attribute space 235(1)
Scheme-specific selection 236(2)
Discretizing numeric attributes 238(9)
Unsupervised discretization 239(1)
Entropy-based discretization 240(3)
Other discretization methods 243(1)
Entropy-based versus error-based discretization 244(2)
Converting discrete to numeric attributes 246(1)
Automatic data cleansing 247(3)
Improving decision trees 247(1)
Robust regression 248(1)
Detecting anomalies 249(1)
Combining multiple models 250(13)
Bagging 251(3)
Boosting 254(4)
Stacking 258(2)
Error-correcting output codes 260(3)
Further reading 263(2)
Nuts and bolts: Machine learning algorithms in Java 265(56)
Getting started 267(4)
Javadoc and the class library 271(6)
Classes, instances, and packages 272(1)
The weka.core package 272(2)
The weka.classifiers package 274(2)
Other packages 276(1)
Indexes 277(1)
Processing datasets using the machine learning programs 277(20)
Using M5' 277(2)
Generic options 279(3)
Scheme-specific options 282(1)
Classifiers 283(3)
Meta-learning schemes 286(3)
Filters 289(5)
Association rules 294(2)
Clustering 296(1)
Embedded machine learning 297(9)
A simple message classifier 299(7)
Writing new learning schemes 306(15)
An example classifier 306(8)
Conventions for implementing classifiers 314(1)
Writing filters 314(2)
An example filter 316(1)
Conventions for writing filters 317(4)
Looking forward 321(18)
Learning from massive datasets 322(3)
Visualizing machine learning 325(4)
Visualizing the input 325(2)
Visualizing the output 327(2)
Incorporating domain knowledge 329(2)
Text mining 331(4)
Finding key phrases for documents 331(2)
Finding information in running text 333(1)
Soft parsing 334(1)
Mining the World Wide Web 335(1)
Further reading 336(3)
References 339(12)
Index 351(20)
About the authors 371

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.
