Guide to Intelligent Data Analysis

by Michael R. Berthold; Christian Borgelt; Frank Höppner; Frank Klawonn
  • ISBN13: 9781848822597
  • ISBN10: 1848822596

  • Format: Hardcover
  • Copyright: 2010-07-30
  • Publisher: Springer-Nature New York Inc

Summary

This book provides a systematic overview and classification of tasks in data analysis, methods to solve them, and typical problems encountered. Different views from classical and non-classical statistics, such as Bayesian inference and robust statistics, exploratory data analysis, data mining, and machine learning, are combined to provide a better understanding of the methods, their potential, and their limitations.

Features:
  • Focuses on validation and pitfalls related to real-world applications of these techniques
  • Presents different approaches, analysing their advantages and disadvantages for certain types of tasks, including exploratory data analysis, data mining, classical statistics, and robust statistics
  • Contains case studies and examples to enhance understanding
  • A supplementary website provides numerous hands-on examples

This collective view of data analysis problems and methods, their potential, and their limitations is an indispensable learning tool for graduate and advanced undergraduate students.

Author Biography

Dr. Michael R. Berthold is Nycomed-Professor of Bioinformatics and Information Mining at the University of Konstanz, Germany. Dr. Christian Borgelt is Principal Researcher at the Intelligent Data Analysis and Graphical Models Research Unit of the European Centre for Soft Computing, Spain. Dr. Frank Höppner is Professor of Information Systems at Ostfalia University of Applied Sciences, Germany. Dr. Frank Klawonn is a Professor in the Department of Computer Science and Head of the Data Analysis and Pattern Recognition Laboratory at Ostfalia University of Applied Sciences, Germany. He is also Head of the Bioinformatics and Statistics group at the Helmholtz Centre for Infection Research, Braunschweig, Germany.

Table of Contents

Introduction
Motivation
Data and Knowledge
Tycho Brahe and Johannes Kepler
Intelligent Data Analysis
The Data Analysis Process
Methods, Tasks, and Tools
How to Read This Book
References
Practical Data Analysis: An Example
The Setup
Data Understanding and Pattern Finding
Explanation Finding
Predicting the Future
Concluding Remarks
Project Understanding
Determine the Project Objective
Assess the Situation
Determine Analysis Goals
Further Reading
References
Data Understanding
Attribute Understanding
Data Quality
Data Visualization
Methods for One and Two Attributes
Methods for Higher-Dimensional Data
Correlation Analysis
Outlier Detection
Outlier Detection for Single Attributes
Outlier Detection for Multidimensional Data
Missing Values
A Checklist for Data Understanding
Data Understanding in Practice
Data Understanding in KNIME
Data Understanding in R
References
Principles of Modeling
Model Classes
Fitting Criteria and Score Functions
Error Functions for Classification Problems
Measures of Interestingness
Algorithms for Model Fitting
Closed Form Solutions
Gradient Method
Combinatorial Optimization
Random Search, Greedy Strategies, and Other Heuristics
Types of Errors
Experimental Error
Sample Error
Model Error
Algorithmic Error
Machine Learning Bias and Variance
Learning Without Bias?
Model Validation
Training and Test Data
Cross-Validation
Bootstrapping
Measures for Model Complexity
Model Errors and Validation in Practice
Errors and Validation in KNIME
Validation in R
Further Reading
References
Data Preparation
Select Data
Feature Selection
Dimensionality Reduction
Record Selection
Clean Data
Improve Data Quality
Missing Values
Construct Data
Provide Operability
Assure Impartiality
Maximize Efficiency
Complex Data Types
Data Integration
Vertical Data Integration
Horizontal Data Integration
Data Preparation in Practice
Data Preparation in KNIME
Data Preparation in R
References
Finding Patterns
Hierarchical Clustering
Overview
Construction
Variations and Issues
Notion of (Dis-)Similarity
Prototype- and Model-Based Clustering
Overview
Construction
Variations and Issues
Density-Based Clustering
Overview
Construction
Variations and Issues
Self-organizing Maps
Overview
Construction
Frequent Pattern Mining and Association Rules
Overview
Construction
Variations and Issues
Deviation Analysis
Overview
Construction
Variations and Issues
Finding Patterns in Practice
Finding Patterns with KNIME
Finding Patterns in R
Further Reading
References
Finding Explanations
Decision Trees
Overview
Construction
Variations and Issues
Bayes Classifiers
Overview
Construction
Variations and Issues
Regression
Overview
Construction
Variations and Issues
Two Class Problems
Rule Learning
Propositional Rules
Inductive Logic Programming or First-Order Rules
Finding Explanations in Practice
Finding Explanations with KNIME
Using Explanations with R
Further Reading
References
Finding Predictors
Nearest-Neighbor Predictors
Overview
Construction
Variations and Issues
Artificial Neural Networks
Overview
Construction
Variations and Issues
Support Vector Machines
Overview
Construction
Variations and Issues
Ensemble Methods
Overview
Construction
Further Reading
Finding Predictors in Practice
Finding Predictors with KNIME
Using Predictors in R
References
Evaluation and Deployment
Evaluation
Deployment and Monitoring
References
Statistics
Terms and Notation
Descriptive Statistics
Tabular Representations
Graphical Representations
Characteristic Measures for One-Dimensional Data
Characteristic Measures for Multidimensional Data
Principal Component Analysis
Probability Theory
Probability
Basic Methods and Theorems
Random Variables
Characteristic Measures of Random Variables
Some Special Distributions
Inferential Statistics
Random Samples
Parameter Estimation
Hypothesis Testing
The R Project
Installation and Overview
Reading Files and R Objects
R Functions and Commands
Libraries/Packages
R Workspace
Finding Help
Further Reading
KNIME
Installation and Overview
Building Workflows
Example Flow
R Integration
References
Index
Table of Contents provided by Ingram. All Rights Reserved.

