Understanding High-Dimensional Spaces

  • ISBN13: 9783642333972
  • ISBN10: 3642333974

  • Format: Paperback
  • Copyright: 2012-09-27
  • Publisher: Springer-Nature New York Inc

Summary

High-dimensional spaces arise as a way of modelling datasets with many attributes. Such a dataset can be represented directly in the space spanned by its attributes: each record becomes a point whose position is determined by its attribute values. These spaces are hard to work with because of their high dimensionality: our intuition about space is unreliable, and measures such as distance are less informative than we might expect.

Large, high-dimensional datasets arise naturally in three main areas: data collected by online retailers, preference sites, social media sites, and customer relationship databases, where large but sparse records are available for each individual; data derived from text and speech, where the attributes are words and so the datasets are wide and sparse; and data collected for security, defense, law enforcement, and intelligence purposes, where the datasets are large and wide.

Such datasets are usually understood either by finding the set of clusters they contain or by looking for the outliers, but these strategies conceal subtleties that are often ignored. In this book the author suggests new ways of thinking about high-dimensional spaces using two models: a skeleton that relates the clusters to one another, and boundaries in the empty space between clusters that provide new perspectives on outliers and on outlying regions. The book will be of value to practitioners, graduate students and researchers.
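The summary's remark that distance becomes less informative in high-dimensional spaces can be illustrated with a short sketch. This is not taken from the book: the function name `distance_contrast`, the uniformly random records, and the "relative contrast" measure are all assumptions chosen for illustration. Each record is a point with one coordinate per attribute, and the sketch compares how much farther the farthest record is from a query record than the nearest one.

```python
import math
import random


def distance_contrast(n_records, n_attributes, seed=0):
    """Return (farthest - nearest) / nearest for one query record.

    Hypothetical helper: records are drawn uniformly at random from the
    unit hypercube, which is an assumption, not a model of real data.
    """
    rng = random.Random(seed)
    # Each record is a point with one coordinate per attribute.
    records = [
        [rng.random() for _ in range(n_attributes)]
        for _ in range(n_records)
    ]
    query, others = records[0], records[1:]
    # Euclidean distance from the query record to every other record.
    distances = [math.dist(query, other) for other in others]
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    for n_attributes in (2, 10, 100, 1000):
        contrast = distance_contrast(n_records=200, n_attributes=n_attributes)
        print(f"{n_attributes:>5} attributes: relative contrast = {contrast:.3f}")
```

Under these assumptions the relative contrast tends to shrink as the number of attributes grows, so the nearest and farthest records end up at nearly the same distance from the query. Real datasets with cluster structure may behave differently.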

Table of Contents

Introduction
A Natural Representation of Data Similarity
Goals
Outline
Basic Structure of High-Dimensional Spaces
Comparing Attributes
Comparing Records
Similarity
High-Dimensional Spaces
Summary
Algorithms
Improving the Natural Geometry
Projection
Singular Value Decompositions
Random Projections
Algorithms that Find Standalone Clusters
Clusters Based on Density
Parallel Coordinates
Independent Component Analysis
Latent Dirichlet Allocation
Algorithms that Find Clusters and Their Relationships
Clusters Based on Distance
Clusters Based on Distribution
Semidiscrete Decomposition
Hierarchical Clustering
Minimum Spanning Tree with Collapsing
Overall Process for Constructing a Skeleton
Algorithms that Wrap Clusters
Distance-Based
Distribution-Based
1-Class Support Vector Machines
Autoassociative Neural Networks
Covers
Algorithms to Place Boundaries Between Clusters
Support Vector Machines
Random Forests
Overall Process for Constructing Empty Space
Summary
Spaces with a Single Center
Using Distance
Using Density
Understanding the Skeleton
Understanding Empty Space
Summary
Spaces with Multiple Centers
What is a Cluster?
Identifying Clusters
Clusters Known Already
Finding Clusters
Finding the Skeleton
Empty Space
An Outer Boundary and Novel Data
Interesting Data
One-Cluster Boundaries
One-Cluster-Against-the-Rest Boundaries
Summary
Representation by Graphs
Building a Graph from Records
Local Similarities
Embedding Choices
Using the Embedding for Clustering
Summary
Using Models of High-Dimensional Spaces
Understanding Clusters
Structure in the Set of Clusters
Semantic Stratified Sampling
Ranking Using the Skeleton
Ranking Using Empty Space
Applications to Streaming Data
Concealment
Summary
Including Contextual Information
What is Context?
Changing Data
Changing Analyst and Organizational Properties
Changing Algorithmic Properties
Letting Context Change the Models
Recomputing the View
Recomputing Derived Structures
Recomputing the Clustering
Summary
Conclusions
References
Index
Table of Contents provided by Ingram. All Rights Reserved.
