
   
Preprocessing of the data

Some ICA algorithms require a preliminary sphering or whitening of the data ${\bf x}$, and even those algorithms that do not strictly need sphering often converge better with sphered data. (Recall that the data has also been assumed to be centered, i.e., made zero-mean.) Sphering means that the observed variable ${\bf x}$ of Eq. (11) is linearly transformed to a variable ${\bf v}$

 \begin{displaymath}
{\bf v}={\bf Q}{\bf x}
\end{displaymath} (30)

such that the covariance matrix of ${\bf v}$ equals the identity matrix: $E\{{\bf v}{\bf v}^T\}={\bf I}$. This transformation is always possible. Indeed, it can be accomplished by classical PCA [36,112,110,114]. In addition to sphering, PCA may allow us to determine the number of independent components (if $m>n$): if the noise level is low, the energy of ${\bf x}$ is essentially concentrated on the subspace spanned by the $n$ first principal components, with $n$ the number of independent components in the model (11). Several methods exist for estimating the number of signals (here, independent components); see [146,94,144,11]. Thus this reduction of dimension partially justifies the assumption $m=n$ that was made in Section 3, and will be retained here. After sphering we have from (11) and (30):

 \begin{displaymath}
{\bf v}={\bf B}{\bf s}
\end{displaymath} (31)

where ${\bf B}={\bf Q}{\bf A}$ is an orthogonal matrix, because

\begin{displaymath}E\{{\bf v}{\bf v}^T\}={\bf B}E\{{\bf s}{\bf s}^T\} {\bf B}^T = {\bf B}{\bf B}^T={\bf I}
\end{displaymath}

Recall that we assumed that the independent components $s_i$ have unit variance, so that $E\{{\bf s}{\bf s}^T\}={\bf I}$. We have thus reduced the problem of finding an arbitrary matrix ${\bf A}$ in model (11) to the simpler problem of finding an orthogonal matrix ${\bf B}$. Once ${\bf B}$ is found, Eq. (31) is used to solve for the independent components from the observed ${\bf v}$ by

 \begin{displaymath}
\hat{{\bf s}} = {\bf B}^T{\bf v}
\end{displaymath} (32)
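
As an illustration of Eqs. (30)-(32), the following is a minimal NumPy sketch of sphering by an eigenvalue decomposition of the sample covariance matrix, which is one way of carrying out the PCA-based sphering mentioned above. The function name `sphere`, the uniformly distributed sources, and the mixing matrix `a` are assumptions made only for this example. The sketch also assumes $m=n$ and a full-rank covariance matrix. The matrix `b` is available here only because the mixing matrix is known; in practice it has to be estimated by an ICA algorithm.

```python
import numpy as np


def sphere(x):
    """Sphere data x of shape (m, T): m observed variables, T samples.

    Returns the sphered data v = Q x (after centering) and the matrix Q.
    Assumes the sample covariance matrix has full rank.
    """
    x = x - x.mean(axis=1, keepdims=True)        # centering (zero mean)
    cov = x @ x.T / x.shape[1]                   # sample estimate of E{x x^T}
    eigvals, eigvecs = np.linalg.eigh(cov)       # cov = E D E^T
    q = np.diag(eigvals ** -0.5) @ eigvecs.T     # Q = D^{-1/2} E^T
    return q @ x, q


rng = np.random.default_rng(0)
T = 50000

# Hypothetical independent sources with zero mean and unit variance.
s = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, T))
s_centered = s - s.mean(axis=1, keepdims=True)

# Hypothetical mixing matrix A of model (11), chosen only for illustration.
a = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = a @ s

v, q = sphere(x)
cov_v = v @ v.T / T          # sample covariance of v, identity up to rounding

b = q @ a                    # B = Q A, approximately orthogonal
s_hat = b.T @ v              # Eq. (32), using the known B

max_recovery_error = np.max(np.abs(s_hat - s_centered))
```

With a finite sample, `b @ b.T` equals the identity only approximately, because the sample covariance of the sources is itself only approximately the identity.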

It is also worthwhile to reflect on why sphering alone does not solve the separation problem. This is because sphering is only defined up to an additional rotation: if ${\bf Q}_1$ is a sphering matrix, then ${\bf Q}_2={\bf U}{\bf Q}_1$ is also a sphering matrix if and only if ${\bf U}$ is an orthogonal matrix. Therefore, we have to find the particular sphering matrix that also separates the independent components. This is done by first finding any sphering matrix ${\bf Q}$, and later determining the appropriate orthogonal transformation from a suitable non-quadratic criterion.
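
The rotation ambiguity can be checked directly from the definition of sphering. As a short worked step, assume that ${\bf Q}_1$ is a sphering matrix, so that ${\bf Q}_1E\{{\bf x}{\bf x}^T\}{\bf Q}_1^T={\bf I}$. Then for ${\bf Q}_2={\bf U}{\bf Q}_1$ we have

\begin{displaymath}
E\{({\bf Q}_2{\bf x})({\bf Q}_2{\bf x})^T\}={\bf U}{\bf Q}_1E\{{\bf x}{\bf x}^T\}{\bf Q}_1^T{\bf U}^T={\bf U}{\bf U}^T
\end{displaymath}

which equals ${\bf I}$ exactly when ${\bf U}$ is orthogonal. Second-order statistics therefore cannot distinguish between the sphering matrices ${\bf Q}_1$ and ${\bf U}{\bf Q}_1$.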

In the following, we shall thus assume in certain sections that the data is sphered. For simplicity, the sphered data will be denoted by ${\bf x}$, and the transformed mixing matrix by ${\bf A}$, as in the definitions of Section 3. If an algorithm needs preliminary sphering, this is mentioned in the corresponding section. If no mention of sphering is made, none is needed.


Aapo Hyvarinen
1999-04-23