Amazon no longer offers textbook rentals. We do!

We're the #1 textbook rental company. Let us show you why.

Independent Component Analysis

by Aapo Hyvärinen; Juha Karhunen; Erkki Oja
  • ISBN13: 9780471405405
  • ISBN10: 047140540X
  • Edition: 1st
  • Format: Hardcover
  • Copyright: 2001-06-01
  • Publisher: Wiley-Interscience
  • Purchase Benefits
  • Free Shipping On Orders Over $35!
    Your order must be $35 or more to qualify for free economy shipping. Bulk sales, POs, Marketplace items, eBooks, and apparel do not qualify for this offer.
  • Get Rewarded for Ordering Your Textbooks! Enroll Now
List Price: $223.94 Save up to $0.12
  • Buy New: $223.82 (Free Shipping)

    PRINT ON DEMAND: 2-4 WEEKS. THIS ITEM CANNOT BE CANCELLED OR RETURNED.

Summary

A comprehensive introduction to ICA for students and practitioners

Independent Component Analysis (ICA) is one of the most exciting new topics in fields such as neural networks, advanced statistics, and signal processing. This is the first book to provide a comprehensive introduction to this new technique, complete with the fundamental mathematical background needed to understand and utilize it. It offers a general overview of the basics of ICA, important solutions and algorithms, and in-depth coverage of new applications in image processing, telecommunications, audio signal processing, and more.

Independent Component Analysis is divided into four sections that cover:

  • General mathematical concepts utilized in the book
  • The basic ICA model and its solution
  • Various extensions of the basic ICA model
  • Real-world applications for ICA models

Authors Hyvärinen, Karhunen, and Oja are well known for their contributions to the development of ICA and here cover all the relevant theory, new algorithms, and applications in various fields. Researchers, students, and practitioners from a variety of disciplines will find this accessible volume both helpful and informative.
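
The basic ICA model the summary refers to is x = As: an observed vector x is an unknown linear mixture A of statistically independent, nongaussian source signals s, and both A and s must be estimated from observations of x alone. As a minimal illustrative sketch (not code from the book), the following uses the FastICA implementation in scikit-learn, which is based on the fast fixed-point algorithm the book covers in Part II; the sources and mixing matrix here are arbitrary example choices.

    import numpy as np
    from sklearn.decomposition import FastICA

    # Two independent, nongaussian sources (ICA cannot separate
    # gaussian sources, one of the restrictions discussed in Part II).
    rng = np.random.default_rng(0)
    t = np.linspace(0, 8, 2000)
    S = np.c_[np.sign(np.sin(3 * t)),    # square wave (subgaussian)
              rng.laplace(size=t.size)]  # Laplacian noise (supergaussian)

    # Mix the sources linearly: x = As, with an arbitrary example matrix A.
    A = np.array([[1.0, 0.5],
                  [0.7, 1.2]])
    X = S @ A.T

    # Recover the independent components blindly from the mixtures alone.
    ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
    S_est = ica.fit_transform(X)  # estimated sources
    A_est = ica.mixing_           # estimated mixing matrix

As the book's section on the ambiguities of ICA notes, the components are recovered only up to permutation, sign, and scale, so S_est matches S only after reordering and rescaling its columns.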

Author Biography

AAPO HYVÄRINEN, PhD, is Senior Fellow of the Academy of Finland and works at the Neural Networks Research Center of Helsinki University of Technology in Finland.

Table of Contents

Preface xvii
Introduction 1
Linear representation of multivariate data 1
The general statistical setting 1
Dimension reduction methods 2
Independence as a guiding principle 3
Blind source separation 3
Observing mixtures of unknown signals 4
Source separation based on independence 5
Independent component analysis 6
Definition 6
Applications 7
How to find the independent components 7
History of ICA 11
Part I MATHEMATICAL PRELIMINARIES
Random Vectors and Independence 15
Probability distributions and densities 15
Distribution of a random variable 15
Distribution of a random vector 17
Joint and marginal distributions 18
Expectations and moments 19
Definition and general properties 19
Mean vector and correlation matrix 20
Covariances and joint moments 22
Estimation of expectations 24
Uncorrelatedness and independence 24
Uncorrelatedness and whiteness 24
Statistical independence 27
Conditional densities and Bayes' rule 28
The multivariate gaussian density 31
Properties of the gaussian density 32
Central limit theorem 34
Density of a transformation 35
Higher-order statistics 36
Kurtosis and classification of densities 37
Cumulants, moments, and their properties 40
Stochastic processes* 43
Introduction and definition 43
Stationarity, mean, and autocorrelation 45
Wide-sense stationary processes 46
Time averages and ergodicity 48
Power spectrum 49
Stochastic signal models 50
Concluding remarks and references 51
Problems 52
Gradients and Optimization Methods 57
Vector and matrix gradients 57
Vector gradient 57
Matrix gradient 59
Examples of gradients 59
Taylor series expansions 62
Learning rules for unconstrained optimization 63
Gradient descent 63
Second-order learning 65
The natural gradient and relative gradient 67
Stochastic gradient descent 68
Convergence of stochastic on-line algorithms* 71
Learning rules for constrained optimization 73
The Lagrange method 73
Projection methods 73
Concluding remarks and references 75
Problems 75
Estimation Theory 77
Basic concepts 78
Properties of estimators 80
Method of moments 84
Least-squares estimation 86
Linear least-squares method 86
Nonlinear and generalized least squares* 88
Maximum likelihood method 90
Bayesian estimation* 94
Minimum mean-square error estimator 94
Wiener filtering 96
Maximum a posteriori (MAP) estimator 97
Concluding remarks and references 99
Problems 101
Information Theory 105
Entropy 105
Definition of entropy 105
Entropy and coding length 107
Differential entropy 108
Entropy of a transformation 109
Mutual information 110
Definition using entropy 110
Definition using Kullback-Leibler divergence 110
Maximum entropy 111
Maximum entropy distributions 111
Maximality property of gaussian distribution 112
Negentropy 112
Approximation of entropy by cumulants 113
Polynomial density expansions 113
Using expansions for entropy approximation 114
Approximation of entropy by nonpolynomial functions 115
Approximating the maximum entropy 116
Choosing the nonpolynomial functions 117
Simple special cases 118
Illustration 119
Concluding remarks and references 120
Problems 121
Appendix proofs 122
Principal Component Analysis and Whitening 125
Principal components 125
PCA by variance maximization 127
PCA by minimum MSE compression 128
Choosing the number of principal components 129
Closed-form computation of PCA 131
PCA by on-line learning 132
The stochastic gradient ascent algorithm 133
The subspace learning algorithm 134
The PAST algorithm* 135
PCA and back-propagation learning* 136
Extensions of PCA to nonquadratic criteria* 137
Factor analysis 138
Whitening 140
Orthogonalization 141
Concluding remarks and references 143
Problems 144
Part II BASIC INDEPENDENT COMPONENT ANALYSIS
What is Independent Component Analysis? 147
Motivation 147
Definition of independent component analysis 151
ICA as estimation of a generative model 151
Restrictions in ICA 152
Ambiguities of ICA 154
Centering the variables 154
Illustration of ICA 155
ICA is stronger than whitening 158
Uncorrelatedness and whitening 158
Whitening is only half ICA 160
Why gaussian variables are forbidden 161
Concluding remarks and references 163
Problems 164
ICA by Maximization of Nongaussianity 165
"Nongaussian is independent" 166
Measuring nongaussianity by kurtosis 171
Extrema give independent components 171
Gradient algorithm using kurtosis 175
A fast fixed-point algorithm using kurtosis 178
Examples 179
Measuring nongaussianity by negentropy 182
Critique of kurtosis 182
Negentropy as nongaussianity measure 182
Approximating negentropy 183
Gradient algorithm using negentropy 185
A fast fixed-point algorithm using negentropy 188
Estimating several independent components 192
Constraint of uncorrelatedness 192
Deflationary orthogonalization 194
Symmetric orthogonalization 194
ICA and projection pursuit 197
Searching for interesting directions 197
Nongaussian is interesting 197
Concluding remarks and references 198
Problems 199
Appendix proofs 201
ICA by Maximum Likelihood Estimation 203
The likelihood of the ICA model 203
Deriving the likelihood 203
Estimation of the densities 204
Algorithms for maximum likelihood estimation 207
Gradient algorithms 207
A fast fixed-point algorithm 209
The infomax principle 211
Examples 213
Concluding remarks and references 214
Problems 218
Appendix proofs 219
ICA by Minimization of Mutual Information 221
Defining ICA by mutual information 221
Information-theoretic concepts 221
Mutual information as measure of dependence 222
Mutual information and nongaussianity 223
Mutual information and likelihood 224
Algorithms for minimization of mutual information 224
Examples 225
Concluding remarks and references 225
Problems 227
ICA by Tensorial Methods 229
Definition of cumulant tensor 229
Tensor eigenvalues give independent components 230
Tensor decomposition by a power method 232
Joint approximate diagonalization of eigenmatrices 234
Weighted correlation matrix approach 235
The FOBI algorithm 235
From FOBI to JADE 235
Concluding remarks and references 236
Problems 237
ICA by Nonlinear Decorrelation and Nonlinear PCA 239
Nonlinear correlations and independence 240
The Herault-Jutten algorithm 242
The Cichocki-Unbehauen algorithm 243
The estimating functions approach* 245
Equivariant adaptive separation via independence 247
Nonlinear principal components 249
The nonlinear PCA criterion and ICA 251
Learning rules for the nonlinear PCA criterion 254
The nonlinear subspace rule 254
Convergence of the nonlinear subspace rule* 255
Nonlinear recursive least-squares rule 258
Concluding remarks and references 261
Problems 262
Practical Considerations 263
Preprocessing by time filtering 263
Why time filtering is possible 264
Low-pass filtering 265
High-pass filtering and innovations 265
Optimal filtering 266
Preprocessing by PCA 267
Making the mixing matrix square 267
Reducing noise and preventing overlearning 268
How many components should be estimated? 269
Choice of algorithm 271
Concluding remarks and references 272
Problems 272
Overview and Comparison of Basic ICA Methods 273
Objective functions vs. algorithms 273
Connections between ICA estimation principles 274
Similarities between estimation principles 274
Differences between estimation principles 275
Statistically optimal nonlinearities 276
Comparison of asymptotic variance* 276
Comparison of robustness* 277
Practical choice of nonlinearity 279
Experimental comparison of ICA algorithms 280
Experimental set-up and algorithms 281
Results for simulated data 282
Comparisons with real-world data 286
References 287
Summary of basic ICA 287
Appendix Proofs 289
Part III EXTENSIONS AND RELATED METHODS
Noisy ICA 293
Definition 293
Sensor noise vs. source noise 294
Few noise sources 295
Estimation of the mixing matrix 295
Bias removal techniques 296
Higher-order cumulant methods 298
Maximum likelihood methods 299
Estimation of the noise-free independent components 299
Maximum a posteriori estimation 299
Special case of shrinkage estimation 300
Denoising by sparse code shrinkage 303
Concluding remarks 304
ICA with Overcomplete Bases 305
Estimation of the independent components 306
Maximum likelihood estimation 306
The case of supergaussian components 307
Estimation of the mixing matrix 307
Maximizing joint likelihood 307
Maximizing likelihood approximations 308
Approximate estimation by quasiorthogonality 309
Other approaches 311
Concluding remarks 313
Nonlinear ICA 315
Nonlinear ICA and BSS 315
The nonlinear ICA and BSS problems 315
Existence and uniqueness of nonlinear ICA 317
Separation of post-nonlinear mixtures 319
Nonlinear BSS using self-organizing maps 320
A generative topographic mapping approach* 322
Background 322
The modified GTM method 323
An experiment 326
An ensemble learning approach to nonlinear BSS 328
Ensemble learning 328
Model structure 329
Computing Kullback-Leibler cost function* 330
Learning procedure* 332
Experimental results 333
Other approaches 337
Concluding remarks 339
Methods using Time Structure 341
Separation by autocovariances 342
An alternative to nongaussianity 342
Using one time lag 343
Extension to several time lags 344
Separation by nonstationarity of variances 346
Using local autocorrelations 347
Using cross-cumulants 349
Separation principles unified 351
Comparison of separation principles 351
Kolmogoroff complexity as unifying framework 352
Concluding remarks 354
Convolutive Mixtures and Blind Deconvolution 355
Blind deconvolution 356
Problem definition 356
Bussgang methods 357
Cumulant-based methods 358
Blind deconvolution using linear ICA 360
Blind separation of convolutive mixtures 361
The convolutive BSS problem 361
Reformulation as ordinary ICA 363
Natural gradient methods 364
Fourier transform methods 365
Spatiotemporal decorrelation methods 367
Other methods for convolutive mixtures 367
Concluding remarks 368
Appendix Discrete-time filters and the z-transform 369
Other Extensions 371
Priors on the mixing matrix 371
Motivation for prior information 371
Classic priors 372
Sparse priors 374
Spatiotemporal ICA 377
Relaxing the independence assumption 378
Multidimensional ICA 379
Independent subspace analysis 380
Topographic ICA 382
Complex-valued data 383
Basic concepts of complex random variables 383
Indeterminacy of the independent components 384
Choice of the nongaussianity measure 385
Consistency of estimator 386
Fixed-point algorithm 386
Relation to independent subspaces 387
Concluding remarks 387
Part IV APPLICATIONS OF ICA
Feature Extraction by ICA 391
Linear representations 392
Definition 392
Gabor analysis 392
Wavelets 394
ICA and Sparse Coding 396
Estimating ICA bases from images 398
Image denoising by sparse code shrinkage 398
Component statistics 399
Remarks on windowing 400
Denoising results 401
Independent subspaces and topographic ICA 401
Neurophysiological connections 403
Concluding remarks 405
Brain Imaging Applications 407
Electro- and magnetoencephalography 407
Classes of brain imaging techniques 407
Measuring electric activity in the brain 408
Validity of the basic ICA model 409
Artifact identification from EEG and MEG 410
Analysis of evoked magnetic fields 411
ICA applied on other measurement techniques 413
Concluding remarks 414
Telecommunications 417
Multiuser detection and CDMA communications 417
CDMA signal model and ICA 422
Estimating fading channels 424
Minimization of complexity 424
Channel estimation* 426
Comparisons and discussion 428
Blind separation of convolved CDMA mixtures* 430
Feedback architecture 430
Semiblind separation method 431
Simulations and discussion 432
Improving multiuser detection using complex ICA* 434
Data model 435
ICA based receivers 436
Simulation results 438
Concluding remarks and references 439
Other Applications 441
Financial applications 441
Finding hidden factors in financial data 441
Time series prediction by ICA 443
Audio separation 446
Further applications 448
References 449
Index 476

Supplemental Materials

What is included with this book?

The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.

The Used, Rental, and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states that it includes access cards, study guides, lab manuals, CDs, etc.
