Next: FastICA and maximum likelihood Up: The FastICA Algorithm Previous: FastICA for one unit

FastICA for several units

The one-unit algorithm of the preceding subsection estimates just one of the independent components, or one projection pursuit direction. To estimate several independent components, we need to run the one-unit FastICA algorithm using several units (e.g. neurons) with weight vectors ${\bf w}_1,...,{\bf w}_n$.

To prevent different vectors from converging to the same maxima we must decorrelate the outputs ${\bf w}_1^T{\bf x},...,{\bf w}_n^T{\bf x}$ after every iteration. We present here three methods for achieving this.

A simple way of achieving decorrelation is a deflation scheme based on a Gram-Schmidt-like decorrelation. This means that we estimate the independent components one by one. When we have estimated p independent components, or p vectors ${\bf w}_1,...,{\bf w}_p$, we run the one-unit fixed-point algorithm for ${\bf w}_{p+1}$, and after every iteration step subtract from ${\bf w}_{p+1}$ the ``projections'' ${\bf w}_{p+1}^T{\bf w}_j {\bf w}_j$, $j=1,...,p$, of the previously estimated p vectors, and then renormalize ${\bf w}_{p+1}$:

\begin{displaymath}
\begin{array}{l}
\mbox{1. Let } {\bf w}_{p+1}={\bf w}_{p+1}-\sum_{j=1}^{p}{\bf w}_{p+1}^T{\bf w}_j{\bf w}_j\\
\mbox{2. Let } {\bf w}_{p+1}={\bf w}_{p+1}/\sqrt{{\bf w}_{p+1}^T{\bf w}_{p+1}}
\end{array}\end{displaymath} (40)
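In NumPy terms, one deflation step can be sketched as follows (the function name and the convention that previously estimated vectors are rows of an array are illustrative, not from the text):

```python
import numpy as np

def deflate(w_new, W_prev):
    """One Gram-Schmidt-like deflation step (Eq. 40): subtract from w_new
    its projections onto each previously estimated vector (rows of W_prev),
    then renormalize w_new to unit length."""
    for w_j in W_prev:
        w_new = w_new - (w_new @ w_j) * w_j       # step 1: remove projection
    return w_new / np.sqrt(w_new @ w_new)         # step 2: renormalize
```

In the full deflation scheme, this function would be called on ${\bf w}_{p+1}$ after every iteration of the one-unit fixed-point update.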

In certain applications, however, it may be desired to use a symmetric decorrelation, in which no vectors are ``privileged'' over others [29]. This can be accomplished, e.g., by the classical method involving matrix square roots,

 \begin{displaymath}
\mbox{Let }{\bf W}= ({\bf W}{\bf W}^T)^{-1/2} {\bf W}
\end{displaymath} (41)

where ${\bf W}$ is the matrix $({\bf w}_1,...,{\bf w}_n)^T$ of the vectors, and the inverse square root $({\bf W}{\bf W}^T)^{-1/2}$ is obtained from the eigenvalue decomposition of ${\bf W}{\bf W}^T = {\bf F}{\bf\Lambda}{\bf F}^T$ as $({\bf W}{\bf W}^T)^{-1/2}={\bf F}{\bf\Lambda}^{-1/2}{\bf F}^T$. A simpler alternative is the following iterative algorithm [19],
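The eigendecomposition route can be sketched in NumPy as follows (the function name is illustrative; `eigh` is used because ${\bf W}{\bf W}^T$ is symmetric):

```python
import numpy as np

def symmetric_decorrelate(W):
    """Symmetric decorrelation W <- (W W^T)^{-1/2} W (Eq. 41), computing the
    inverse square root from the eigendecomposition W W^T = F Lambda F^T."""
    lam, F = np.linalg.eigh(W @ W.T)                    # Lambda, F
    inv_sqrt = F @ np.diag(1.0 / np.sqrt(lam)) @ F.T    # (W W^T)^{-1/2}
    return inv_sqrt @ W
```

After this step the rows of ${\bf W}$ are orthonormal, i.e. ${\bf W}{\bf W}^T = {\bf I}$, and no vector is privileged over the others.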

\begin{displaymath}
\begin{array}{l}
\mbox{1. Let } {\bf W}={\bf W}/\sqrt{\Vert{\bf W}{\bf W}^T\Vert}\\
\mbox{Repeat 2. until convergence:}\\
\mbox{2. Let } {\bf W}=\frac{3}{2}{\bf W}-\frac{1}{2}{\bf W}{\bf W}^T{\bf W}
\end{array}\end{displaymath} (42)

The norm in step 1 can be almost any ordinary matrix norm, e.g., the 2-norm or the largest absolute row (or column) sum (but not the Frobenius norm).
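A sketch of this iteration in NumPy, using the 2-norm for the initial normalization; the stopping criterion, `tol`, and `max_iter` are illustrative choices not specified in the text:

```python
import numpy as np

def symmetric_decorrelate_iter(W, tol=1e-10, max_iter=100):
    """Iterative symmetric decorrelation (Eq. 42).
    tol and max_iter are illustrative; the text only says to repeat
    step 2 until convergence."""
    # Step 1: normalize by the square root of the matrix 2-norm of W W^T
    W = W / np.sqrt(np.linalg.norm(W @ W.T, 2))
    # Step 2: repeat until W W^T is (numerically) the identity
    for _ in range(max_iter):
        W = 1.5 * W - 0.5 * W @ W.T @ W
        if np.max(np.abs(W @ W.T - np.eye(W.shape[0]))) < tol:
            break
    return W
```

Each iteration pushes every singular value $\sigma$ of ${\bf W}$ through the map $\sigma \mapsto \frac{3}{2}\sigma - \frac{1}{2}\sigma^3$, which drives them toward 1, so ${\bf W}$ converges to a matrix with orthonormal rows without any matrix inversion or eigendecomposition.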


Aapo Hyvärinen
2000-04-19