Summary

Explores regular structures in graphs and contingency tables by spectral theory and statistical methods

This book bridges the gap between graph theory and statistics by answering the demanding questions that arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, and communication networks, or to microarrays, are presented together with their theoretical background and proofs.

This book is suitable for a one-semester course for graduate students in data mining, multivariate statistics, or applied graph theory. Specialists who only want to retrieve information from their data when analysing communication, social, or biological networks can skip the proofs and use the algorithms directly.

Spectral Clustering and Biclustering:

Provides a unified treatment for edge-weighted graphs and contingency tables via methods of multivariate statistical analysis (factoring, clustering, and biclustering).
Uses spectral embedding and relaxation to estimate multiway cuts of edge-weighted graphs and bicuts of contingency tables.
Goes beyond the expanders by describing the structure of dense graphs with a small spectral gap via the structural eigenvalues and eigen-subspaces of the normalized modularity matrix.
Treats graphs like statistical data by combining methods of graph theory and statistics.
Establishes a common outline structure for the contents of each algorithm, applicable to networks and microarrays, with unified notions and principles.

Author Biography

She graduated from the Eötvös University of Budapest and holds a PhD (1984) and, further, a CSc degree (1993) from the Hungarian Academy of Sciences. Currently, she is a professor at the Institute of Mathematics, Budapest University of Technology and Economics, and an adjunct professor at the Central European University of Budapest. She also leads an undergraduate research course on Spectral Clustering in the Budapest Semester of Mathematics.

Her fields of expertise are multivariate statistics, applied graph theory, and data mining of social, biological, and communication networks. She has worked on various national and European research projects related to networks and data analysis.

She has published research papers in the Journal of Multivariate Analysis, Linear Algebra and Its Applications, Discrete Mathematics, Discrete Applied Mathematics, European Journal of Combinatorics, and Physical Review E, among others.

She is the coauthor of a textbook in Hungarian (Bolla, M., Krámli, A., Theory of statistical inference, Typotex, Budapest; first ed. 2005, second ed. 2012) and of another Hungarian book on multivariate statistical analysis. She was the managing editor of the book Contests in Higher Mathematics (ed. G. J. Székely), Springer, 1996.

Table of Contents

Dedication

Preface

Acknowledgements

List of Abbreviations

Introduction

1 Multivariate analysis techniques for representing graphs and contingency tables

1.1 Quadratic placement problems for weighted graphs and hypergraphs

1.1.1 Representation of edge-weighted graphs

1.1.2 Representation of hypergraphs

1.1.3 Examples for spectra and representation of simple graphs

1.2 SVD of contingency tables and correspondence matrices

1.3 Normalized Laplacian and modularity spectra

1.4 Representation of joint distributions

1.4.1 General setup

1.4.2 Integral operators between L2 spaces

1.4.3 When the kernel is the joint distribution itself

1.4.4 Maximal correlation and optimal representations

1.5 Treating nonlinearities via reproducing kernel Hilbert spaces

1.5.1 Notion of the reproducing kernel

1.5.2 RKHS corresponding to a kernel

1.5.3 Two examples of an RKHS

1.5.4 Kernel – based on a sample – and the empirical feature map

References

2 Multiway cut problems

2.1 Estimating multiway cuts via spectral relaxation

2.1.1 Maximum, minimum, and ratio cuts of edge-weighted graphs

2.1.2 Multiway cuts of hypergraphs

2.2 Normalized cuts

2.3 The isoperimetric number and sparse cuts

2.4 The Newman–Girvan modularity

2.4.1 Maximizing the balanced Newman–Girvan modularity

2.4.2 Maximizing the normalized Newman–Girvan modularity

2.4.3 Anti-community structure and some examples

2.5 Normalized bicuts of contingency tables

References

3 Large networks, perturbation of block structures

3.1 Symmetric block structures burdened with random noise

3.1.1 General blown-up structures

3.1.2 Blown-up multipartite structures

3.1.3 Weak links between disjoint components

3.1.4 Recognizing the structure

3.1.5 Random power law graphs and the extended planted partition model

3.2 Noisy contingency tables

3.2.1 Singular values of a noisy contingency table

3.2.2 Clustering the rows and columns via singular vector pairs

3.2.3 Perturbation results for correspondence matrices

3.2.4 Finding the blown-up skeleton

3.3 Regular cluster pairs

3.3.1 Normalized modularity and volume regularity of edge-weighted graphs

3.3.2 Correspondence matrices and volume regularity of contingency tables

3.3.3 Directed graphs

References

4 Testable graph and contingency table parameters

4.1 Convergent graph sequences

4.2 Testability of weighted graph parameters

4.3 Testability of minimum balanced multiway cuts

4.4 Balanced cuts and fuzzy clustering

4.5 Noisy graph sequences

4.6 Convergence of the spectra and spectral subspaces

4.7 Convergence of contingency tables

References

5 Statistical learning of networks

5.1 Parameter estimation in random graph models

5.1.1 EM algorithm for estimating the parameters of the block model

5.1.2 Parameter estimation in the α and β models

5.2 Nonparametric methods for clustering networks

5.2.1 Spectral clustering of graphs and biclustering of contingency tables

5.2.2 Clustering of hypergraphs

5.3 Supervised learning

References

A Linear algebra and some functional analysis

A.1 Metric, normed vector, and Euclidean spaces

A.2 Hilbert spaces

A.3 Matrices

References

B Random vectors and matrices

B.1 Random vectors

B.2 Random matrices

References

C Multivariate statistical methods

C.1 Principal Component Analysis

C.2 Canonical Correlation Analysis

C.3 Correspondence Analysis

C.4 Multivariate Regression and Analysis of Variance

C.5 The k-means clustering

C.6 Multidimensional Scaling

C.7 Discriminant Analysis

References

Index

Spectral Clustering and Biclustering: Learning Large Graphs and Contingency Tables

ISBN-13: 9781118344927

ISBN-10: 1118344928
