

Introduction to Clustering Large and High-Dimensional Data

by Jacob Kogan
  • ISBN13: 9780521852678
  • ISBN10: 0521852676
  • Format: Hardcover
  • Copyright: 2006-11-13
  • Publisher: Cambridge University Press

Summary

There is a growing need for automated systems that partition data sets into groups, or clusters. For example, as digital libraries and the World Wide Web continue to grow exponentially, the ability to find useful information increasingly depends on the indexing infrastructure or search engine. Clustering techniques can be used to discover natural groups in data sets and to identify abstract structures that might reside there, without any background knowledge of the characteristics of the data. Clustering has been used in a variety of areas, including computer vision, VLSI design, data mining, bioinformatics (gene expression analysis), and information retrieval, to name just a few. This book focuses on a few of the most important clustering algorithms, providing a detailed account of these major models in an information retrieval context. The early chapters introduce the classic algorithms in detail, while the later chapters describe clustering through divergences and present recent research for more advanced audiences.
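The summary mentions the classic algorithms introduced in the book's early chapters; the most basic of these is batch k-means, which alternates between assigning each point to its nearest centroid and recomputing each centroid as its cluster mean. As a rough illustration only (not code from the book; the deterministic "first k points" initialization is a simplifying assumption), a minimal sketch might look like:

```python
def batch_kmeans(points, k, max_iters=100):
    # Initialize centroids with the first k points (a simple, deterministic choice;
    # real implementations use more careful seeding).
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid
        # under squared Euclidean (quadratic) distance.
        new_assignment = []
        for p in points:
            dists = [sum((x - c) ** 2 for x, c in zip(p, cent)) for cent in centroids]
            new_assignment.append(dists.index(min(dists)))
        if new_assignment == assignment:
            break  # partition is stable: the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [sum(xs) / len(members) for xs in zip(*members)]
    return centroids, assignment
```

Each iteration can only decrease the total within-cluster quadratic cost, which is why the alternation terminates; the book's later chapters replace the quadratic distance with cosine similarity and entropy-like divergences while keeping this same two-step skeleton.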

Author Biography

Jacob Kogan is an Associate Professor in the Department of Mathematics and Statistics at the University of Maryland, Baltimore County.

Table of Contents

Foreword
Preface
Introduction and motivation
A way to embed ASCII documents into a finite dimensional Euclidean space
Clustering and this book
Bibliographic notes
Quadratic k-means algorithm
Classical batch k-means algorithm
Quadratic distance and centroids
Batch k-means clustering algorithm
Batch k-means: advantages and deficiencies
Incremental algorithm
Quadratic functions
Incremental k-means algorithm
Quadratic k-means: summary
Numerical experiments with quadratic k-means
Stable partitions
Quadratic k-means
Spectral relaxation
Bibliographic notes
BIRCH
Balanced iterative reducing and clustering algorithm
BIRCH-like k-means
Bibliographic notes
Spherical k-means algorithm
Spherical batch k-means algorithm
Spherical batch k-means: advantages and deficiencies
Computational considerations
Spherical two-cluster partition of one-dimensional data
One-dimensional line vs. the unit circle
Optimal two-cluster partition on the unit circle
Spherical batch and incremental clustering algorithms
First variation for spherical k-means
Spherical incremental iterations: computational complexity
The "ping-pong" algorithm
Quadratic and spherical k-means
Bibliographic notes
Linear algebra techniques
Two approximation problems
Nearest line
Principal directions divisive partitioning
Principal direction divisive partitioning (PDDP)
Spherical principal directions divisive partitioning (sPDDP)
Clustering with PDDP and sPDDP
Largest eigenvector
Power method
An application: hubs and authorities
Bibliographic notes
Information theoretic clustering
Kullback-Leibler divergence
k-means with Kullback-Leibler divergence
Numerical experiments
Distance between partitions
Bibliographic notes
Clustering with optimization techniques
Optimization framework
Smoothing k-means algorithm
Convergence
Numerical experiments
Bibliographic notes
k-means clustering with divergences
Bregman distance
ϕ-divergences
Clustering with entropy-like distances
BIRCH-type clustering with entropy-like distances
Numerical experiments with (ν, μ) k-means
Smoothing with entropy-like distances
Numerical experiments with (ν, μ) smoka
Bibliographic notes
Assessment of clustering results
Internal criteria
External criteria
Bibliographic notes
Appendix: Optimization and linear algebra background
Eigenvalues of a symmetric matrix
Lagrange multipliers
Elements of convex analysis
Conjugate functions
Asymptotic cones
Asymptotic functions
Smoothing
Bibliographic notes
Solutions to selected problems
Bibliography
Index
Table of Contents provided by Ingram. All Rights Reserved.
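A central variant in the table of contents above is the spherical k-means algorithm, which clusters unit-normalized document vectors by cosine similarity rather than Euclidean distance. As a rough sketch of the idea only (hypothetical code, not from the book; the "first k vectors" initialization is a simplifying assumption):

```python
import math

def spherical_kmeans(vectors, k, max_iters=100):
    # Spherical k-means works on the unit sphere: documents are normalized,
    # similarity is the dot product (cosine similarity for unit vectors),
    # and each "concept" centroid is the normalized mean of its cluster.
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v] if n > 0 else list(v)

    data = [normalize(v) for v in vectors]
    concepts = [data[i] for i in range(k)]  # deterministic init: first k vectors
    assignment = None
    for _ in range(max_iters):
        # Assignment step: each vector joins the concept it is most similar to.
        new_assignment = []
        for v in data:
            sims = [sum(x * c for x, c in zip(v, concept)) for concept in concepts]
            new_assignment.append(sims.index(max(sims)))
        if new_assignment == assignment:
            break  # stable partition
        assignment = new_assignment
        # Update step: each concept vector is the normalized cluster mean.
        for j in range(k):
            members = [v for v, a in zip(data, assignment) if a == j]
            if members:
                concepts[j] = normalize([sum(xs) for xs in zip(*members)])
    return concepts, assignment
```

The structure mirrors batch k-means exactly; only the distance changes, which is the pattern the book then pushes further with Kullback-Leibler and other entropy-like divergences.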
