To rigorously define ICA [28,7], we can use a
statistical ``latent variables'' model.
Assume that we observe $n$ linear mixtures
$x_1, \ldots, x_n$ of $n$ independent components
\[ x_j = a_{j1} s_1 + a_{j2} s_2 + \cdots + a_{jn} s_n, \quad \text{for all } j. \tag{4} \]
We have now dropped the time index $t$; in the ICA model, we assume that each mixture $x_j$ as well as each independent component $s_k$ is a random variable, instead of a proper time signal. The observed values $x_j(t)$, e.g., the microphone signals in the cocktail party problem, are then a sample of this random variable. Without loss of generality, we can assume that both the mixture variables and the independent components have zero mean: if this is not true, the observable variables $x_i$ can always be centered by subtracting the sample mean, which makes the model zero-mean.
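As a minimal sketch in Python/NumPy of the centering step just described (the function name and the data layout, mixtures as rows, are our own conventions):

\begin{verbatim}
import numpy as np

def center(x):
    """Subtract the sample mean of each mixture (row) so the data are zero-mean.

    x: array of shape (n, T), n mixtures observed at T time points.
    Returns the centered data and the removed mean.
    """
    mean = x.mean(axis=1, keepdims=True)
    return x - mean, mean
\end{verbatim}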
It is convenient to use vector-matrix notation instead of the sums
like in the previous equation. Let us denote by $\mathbf{x}$ the random vector
whose elements are the mixtures
$x_1, \ldots, x_n$, and likewise by $\mathbf{s}$ the random vector with elements
$s_1, \ldots, s_n$. Let us denote by $\mathbf{A}$
the matrix with elements $a_{ij}$. Generally, bold lower-case
letters indicate vectors and bold upper-case letters denote
matrices. All vectors are understood as column vectors; thus $\mathbf{x}^T$,
or the transpose of $\mathbf{x}$,
is a row vector. Using this vector-matrix
notation, the above mixing model is written as
\[ \mathbf{x} = \mathbf{A}\mathbf{s}. \tag{5} \]
Sometimes we need the columns of matrix $\mathbf{A}$; denoting them by $\mathbf{a}_i$, the model can also be written as
\[ \mathbf{x} = \sum_{i=1}^{n} \mathbf{a}_i s_i. \tag{6} \]
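To make the generative model concrete, here is a minimal sketch in Python/NumPy that draws two independent, zero-mean, nongaussian (uniform) components and mixes them according to Eq. 5; the particular matrix $\mathbf{A}$ and the uniform distribution are our own illustrative choices, not prescribed by the model:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

n, T = 2, 10_000
# Independent, zero-mean, nongaussian components:
# uniform on [-sqrt(3), sqrt(3)], which gives unit variance.
s = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, T))

# An arbitrary square mixing matrix A.
A = np.array([[1.0, 0.5],
              [0.7, 1.2]])

# The observed mixtures, x = A s (Eq. 5).
x = A @ s
\end{verbatim}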
The statistical model in Eq. 4 is called independent component analysis, or ICA model. The ICA model is a generative model, which means that it describes how the observed data are generated by a process of mixing the components $s_i$. The independent components are latent variables, meaning that they cannot be directly observed. Also the mixing matrix $\mathbf{A}$ is assumed to be unknown. All we observe is the random vector $\mathbf{x}$, and we must estimate both $\mathbf{A}$ and $\mathbf{s}$ using it. This must be done under as general assumptions as possible.
The starting point for ICA is the very simple assumption that the
components $s_i$ are statistically independent. Statistical
independence will be rigorously defined in Section 3. It
will be seen below that we must also assume that the independent
components have nongaussian distributions. However, in the
basic model we do not assume these distributions known (if they
are known, the problem is considerably simplified). For simplicity, we
also assume that the unknown
mixing matrix is square, but this assumption can sometimes be relaxed,
as explained in Section 4.5. Then, after estimating the
matrix $\mathbf{A}$,
we can compute its inverse, say $\mathbf{W}$,
and obtain the
independent components simply by
\[ \mathbf{s} = \mathbf{W}\mathbf{x}. \tag{7} \]
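Continuing the sketch above: if $\mathbf{A}$ were known, Eq. 7 would recover the components exactly; in the blind setting, $\mathbf{A}$ must be estimated from $\mathbf{x}$ alone, for which we show scikit-learn's FastICA purely as one possible estimator:

\begin{verbatim}
import numpy as np
from sklearn.decomposition import FastICA

# If A were known, W = A^{-1} would recover the sources exactly (Eq. 7).
W = np.linalg.inv(A)
s_hat = W @ x
assert np.allclose(s_hat, s)

# Blind setting: estimate the components from x alone.
# FastICA expects samples as rows, hence the transposes; the recovered
# components match the true ones only up to permutation, sign and scale.
ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
s_est = ica.fit_transform(x.T).T
\end{verbatim}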
ICA is very closely related to the method called blind source separation (BSS) or blind signal separation. A ``source'' means here an original signal, i.e., an independent component, like a speaker in the cocktail party problem. ``Blind'' means that we know very little, if anything, about the mixing matrix, and make few assumptions about the source signals. ICA is one method, perhaps the most widely used, for performing blind source separation.
In many applications, it would be more realistic to assume that there is some noise in the measurements (see e.g. [17,21]), which would mean adding a noise term to the model. For simplicity, we omit any noise terms, since the estimation of the noise-free model is difficult enough in itself, and seems to be sufficient for many applications.
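As a sketch of what that noisy variant would look like, $\mathbf{x} = \mathbf{A}\mathbf{s} + \mathbf{n}$, reusing $\mathbf{A}$ and $\mathbf{s}$ from the earlier sketch (the Gaussian noise and its level are our own illustrative choices):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)

# Noisy ICA model: x = A s + n, with additive sensor noise n.
# sigma is an arbitrary noise level chosen for illustration.
sigma = 0.1
x_noisy = A @ s + sigma * rng.standard_normal(s.shape)
\end{verbatim}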