
## Preprocessing of the data

Some ICA algorithms require a preliminary sphering (whitening) of the data $\mathbf{x}$, and even those algorithms that do not strictly need sphering often converge better with sphered data. (Recall that the data has also been assumed to be centered, i.e., made zero-mean.) Sphering means that the observed variable $\mathbf{x}$ of Eq. (11) is linearly transformed to a variable

$$\mathbf{v} = \mathbf{Q}\mathbf{x} \qquad (30)$$

such that the covariance matrix of $\mathbf{v}$ equals the identity matrix: $E\{\mathbf{v}\mathbf{v}^T\} = \mathbf{I}$. This transformation is always possible. Indeed, it can be accomplished by classical PCA [36,112,110,114]. In addition to sphering, PCA may allow us to determine the number of independent components (if $m > n$): if the noise level is low, the energy of $\mathbf{x}$ is essentially concentrated on the subspace spanned by the first $n$ principal components, with $n$ the number of independent components in the model (11). Several methods exist for estimating the number of signals (here, independent components); see [146,94,144,11]. This reduction of dimension thus partially justifies the assumption $m = n$ that was made in Section 3 and that will be retained here. After sphering we have from (11) and (30):
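The PCA-based sphering described above can be sketched as follows. This is a minimal illustration, not the survey's own code: the function name `sphere`, the optional argument `n` (the number of principal components to keep when $m > n$), and the simulated mixing setup are all assumptions made for the example.

```python
import numpy as np

def sphere(x, n=None):
    """Sphere (whiten) data with PCA.

    x has shape (m, T): m observed variables, T samples.
    If n is given, only the n leading principal components are kept.
    Returns the sphered data and the sphering matrix.
    """
    x = x - x.mean(axis=1, keepdims=True)        # center the data
    cov = x @ x.T / x.shape[1]                   # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    if n is not None:
        eigvals, eigvecs = eigvals[-n:], eigvecs[:, -n:]
    q = np.diag(eigvals ** -0.5) @ eigvecs.T     # sphering matrix D^{-1/2} E^T
    return q @ x, q

# Hypothetical test data: 3 unit-variance sources observed through 5 mixtures.
rng = np.random.default_rng(0)
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(3, 10_000))
a = rng.normal(size=(5, 3))
x = a @ s

v, q = sphere(x, n=3)
cov_v = v @ v.T / v.shape[1]                     # should be close to the identity
```

Keeping only the `n` leading eigenvectors both reduces the dimension from $m$ to $n$ and avoids dividing by the near-zero eigenvalues of the noise-free subspace.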

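$$\mathbf{v} = \mathbf{B}\mathbf{s} \qquad (31)$$

where $\mathbf{B} = \mathbf{Q}\mathbf{A}$ is an orthogonal matrix, because

$$E\{\mathbf{v}\mathbf{v}^T\} = \mathbf{B}\,E\{\mathbf{s}\mathbf{s}^T\}\,\mathbf{B}^T = \mathbf{B}\mathbf{B}^T = \mathbf{I}.$$

Recall that we assumed that the independent components $s_i$ have unit variance. We have thus reduced the problem of finding an arbitrary matrix $\mathbf{A}$ in model (11) to the simpler problem of finding an orthogonal matrix $\mathbf{B}$. Once $\mathbf{B}$ is found, Eq. (31) is used to solve for the independent components from the observed $\mathbf{v}$ by

$$\hat{\mathbf{s}} = \mathbf{B}^T\mathbf{v} \qquad (32)$$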
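As a numerical sanity check of Eqs. (31) and (32), the sketch below builds a simulated mixture, spheres it, and verifies that the product of the sphering matrix and the mixing matrix is approximately orthogonal. It assumes the true mixing matrix is known, which is only the case in a simulation; in practice the orthogonal matrix has to be estimated by an ICA algorithm. All variable names and the choice of uniform sources are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 100_000

# Independent, zero-mean, unit-variance sources (uniform on [-sqrt(3), sqrt(3)]).
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(3, T))
a = rng.normal(size=(3, 3))              # mixing matrix, known only in simulation
x = a @ s
x = x - x.mean(axis=1, keepdims=True)    # center the observations

# Sphering matrix from the eigendecomposition of the sample covariance.
d, e = np.linalg.eigh(x @ x.T / T)
q = np.diag(d ** -0.5) @ e.T
v = q @ x

# Transformed mixing matrix: orthogonal up to sampling error.
b = q @ a
orth_err = np.abs(b @ b.T - np.eye(3)).max()

# Recover the sources with the transpose instead of a general matrix inverse.
s_hat = b.T @ v
rec_err = np.abs(s_hat - s).max()
```

Both `orth_err` and `rec_err` are small but nonzero, because the sample covariance of a finite number of source samples is only approximately the identity.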

It is also worthwhile to reflect on why sphering alone does not solve the separation problem. This is because sphering is only defined up to an additional rotation: if $\mathbf{Q}_1$ is a sphering matrix, then $\mathbf{Q}_2 = \mathbf{U}\mathbf{Q}_1$ is also a sphering matrix if and only if $\mathbf{U}$ is an orthogonal matrix. Therefore, we have to find the particular sphering matrix that also separates the independent components. This is done by first finding any sphering matrix $\mathbf{Q}$, and later determining the appropriate orthogonal transformation from a suitable non-quadratic criterion.
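The rotation ambiguity can be illustrated numerically. In the sketch below, two independent unit-variance sources play the role of already-sphered data, and a rotation by 45 degrees is applied. The covariance (a quadratic statistic) stays at the identity for both versions, so it cannot tell them apart, whereas a non-quadratic statistic does. The sum of absolute kurtoses is used here purely as an assumed example of such a criterion; the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 100_000

# Two independent unit-variance sources; treated here as sphered data.
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, T))

def rotation(theta):
    """2x2 orthogonal rotation matrix."""
    c, sn = np.cos(theta), np.sin(theta)
    return np.array([[c, -sn], [sn, c]])

def abs_kurtosis_sum(y):
    """Sum over components of |E{y^4} - 3 (E{y^2})^2| (zero-mean data assumed)."""
    kurt = np.mean(y ** 4, axis=1) - 3 * np.mean(y ** 2, axis=1) ** 2
    return np.abs(kurt).sum()

y_sep = rotation(0.0) @ s            # separated components
y_mix = rotation(np.pi / 4) @ s      # still sphered, but components are mixed

# Second-order statistics cannot distinguish the two rotations ...
cov_dev_sep = np.abs(y_sep @ y_sep.T / T - np.eye(2)).max()
cov_dev_mix = np.abs(y_mix @ y_mix.T / T - np.eye(2)).max()

# ... while the non-quadratic criterion is clearly larger at the separating one.
crit_sep = abs_kurtosis_sum(y_sep)
crit_mix = abs_kurtosis_sum(y_mix)
```

For uniform sources the criterion is roughly twice as large at the separating rotation as at 45 degrees, which is the kind of contrast an ICA algorithm exploits when searching over orthogonal matrices.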

In the following, we shall thus assume in certain sections that the data is sphered. For simplicity, the sphered data will be denoted by $\mathbf{x}$, and the transformed mixing matrix by $\mathbf{A}$, as in the definitions of Section 3. If an algorithm needs preliminary sphering, this is mentioned in the corresponding section. If no mention of sphering is made, none is needed.

Aapo Hyvarinen
1999-04-23