
Introduction

A central problem in neural network research, as well as in statistics and signal processing, is finding a suitable representation or transformation of the data. For computational and conceptual simplicity, the representation is often sought as a linear transformation of the original data. Let us denote by ${\bf x}=(x_1,x_2,...,x_m)^T$ a zero-mean m-dimensional random variable that can be observed, and by ${\bf s}=(s_1,s_2,...,s_n)^T$ its n-dimensional transform. Then the problem is to determine a constant (weight) matrix ${\bf W}$ so that the linear transformation of the observed variables

\begin{displaymath}
{\bf s}={\bf W}{\bf x} \qquad (1)
\end{displaymath}

has some suitable properties. Several principles and methods have been developed to find such a linear representation, including principal component analysis [30], factor analysis [15], projection pursuit [12,16], independent component analysis [27], etc. The transformation may be defined using such criteria as optimal dimension reduction, statistical 'interestingness' of the resulting components $s_i$, simplicity of the transformation, or other criteria, including application-oriented ones.
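As a concrete instance of such a linear representation, the following minimal NumPy sketch (illustrative, not from the paper) applies a transform ${\bf s}={\bf W}{\bf x}$ where ${\bf W}$ is chosen as a PCA-whitening matrix, so that the resulting components are decorrelated with unit variance:

```python
import numpy as np

# Illustrative sketch: a linear transform s = W x of zero-mean data,
# with W chosen as a PCA-whitening matrix (one example of "suitable
# properties": decorrelated, unit-variance components).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1000))          # m = 2 observed variables
x = np.vstack([x[0], 0.5 * x[0] + x[1]])    # introduce correlation
x -= x.mean(axis=1, keepdims=True)          # zero mean, as assumed above

C = np.cov(x)                               # sample covariance of x
d, E = np.linalg.eigh(C)                    # C = E diag(d) E^T
W = np.diag(d ** -0.5) @ E.T                # whitening matrix
s = W @ x                                   # the transform s = W x

print(np.cov(s))                            # approximately the identity
```

Whitening only decorrelates the components; ICA, discussed next, imposes the stronger requirement of statistical independence.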

In this paper, we treat the problem of estimating the transformation given by (linear) independent component analysis (ICA) [7,27]. As the name implies, the basic goal in determining the transformation is to find a representation in which the transformed components $s_i$ are as statistically independent of each other as possible. Thus this method is a special case of redundancy reduction [2].

Two promising applications of ICA are blind source separation and feature extraction. In blind source separation [27], the observed values of ${\bf x}$ correspond to a realization of an m-dimensional discrete-time signal ${\bf x}(t)$, t=1,2,.... Then the components $s_i(t)$ are called source signals, which are usually original, uncorrupted signals or noise sources. Often such sources are statistically independent of each other, and thus the signals can be recovered from the linear mixtures $x_i$ by finding a transformation in which the transformed signals are as independent as possible, as in ICA. In feature extraction [4,25], $s_i$ is the coefficient of the i-th feature in the observed data vector ${\bf x}$. The use of ICA for feature extraction is motivated by results in neuroscience suggesting that the same principle of redundancy reduction [2,32] explains some aspects of the early processing of sensory data by the brain. ICA also has applications in exploratory data analysis, in the same way as the closely related method of projection pursuit [16,12].
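The blind source separation setting can be sketched as follows, using scikit-learn's FastICA implementation (the library, the mixing matrix, and the toy signals here are illustrative assumptions, not taken from the paper): two independent sources are linearly mixed, and ICA recovers them from the mixtures alone, up to the usual sign, scale, and ordering indeterminacies.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Illustrative blind source separation sketch: mix two independent
# toy sources with an (assumed unknown) matrix A, then estimate them
# from the observed mixtures X alone.
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
s1 = np.sign(np.sin(3 * t))         # source 1: square wave
s2 = np.sin(5 * t)                  # source 2: sinusoid
S = np.c_[s1, s2]                   # true sources, shape (2000, 2)

A = np.array([[1.0, 0.5],           # mixing matrix (unknown in practice)
              [0.5, 1.0]])
X = S @ A.T                         # observed mixtures x = A s

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)        # recovered sources (up to sign/scale/order)
```

Each recovered column matches one true source up to sign and scale, which is all that can be identified from the mixtures.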

In this paper, new objective (contrast) functions and algorithms for ICA are introduced. Starting from an information-theoretic viewpoint, the ICA problem is formulated as minimization of mutual information between the transformed variables $s_i$, and a new family of contrast functions for ICA is introduced (Section 2). These contrast functions can also be interpreted from the viewpoint of projection pursuit, and enable the sequential (one-by-one) extraction of independent components. The behavior of the resulting estimators is then evaluated in the framework of the linear mixture model, obtaining guidelines for choosing among the many contrast functions contained in the introduced family. Practical choice of the contrast function is discussed as well, based on the statistical criteria together with some numerical and pragmatic criteria (Section 3). For practical maximization of the contrast functions, we introduce a novel family of fixed-point algorithms (Section 4). These algorithms are shown to have very appealing convergence properties. Simulations confirming the usefulness of the novel contrast functions and algorithms are reported in Section 5, together with references to real-life experiments using these methods. Some conclusions are drawn in Section 6.
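As a preview of the fixed-point algorithms of Section 4, the one-unit iteration can be sketched as below (a minimal NumPy sketch under assumptions of our own choosing: whitened input data, the tanh nonlinearity as the derivative of the contrast function, and hypothetical variable names):

```python
import numpy as np

# Preview sketch of the one-unit fixed-point iteration (Section 4),
# run on whitened data z (shape: dimensions x samples) with g = tanh.
# Update: w <- E{z g(w^T z)} - E{g'(w^T z)} w, then renormalize.
def fastica_one_unit(z, n_iter=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(z.shape[0])
    w /= np.linalg.norm(w)                   # start from a random unit vector
    for _ in range(n_iter):
        wz = w @ z                           # projections w^T z
        g = np.tanh(wz)                      # nonlinearity g
        g_prime = 1.0 - g ** 2               # its derivative g'
        w_new = (z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < tol:  # converged, up to sign
            return w_new
        w = w_new
    return w
```

Run on whitened mixtures of independent non-Gaussian sources, the returned weight vector extracts one independent component; the convergence analysis and the full family of contrast functions are developed in Sections 2 through 4.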


Aapo Hyvärinen
1999-04-23