To rigorously define ICA [28,7], we can use a
statistical ``latent variables'' model.
Assume that we observe *n* linear mixtures
*x*_{1},...,*x*_{n} of *n* independent components

$$x_j = a_{j1}s_1 + a_{j2}s_2 + \cdots + a_{jn}s_n, \quad \text{for all } j. \qquad (1)$$

We have now dropped the time index *t*; in the ICA model, we assume
that each mixture *x*_{j} as well as each independent component *s*_{k} is
a random variable, instead of a proper time signal. The observed values
*x*_{j}(*t*), e.g., the microphone
signals in the cocktail party problem, are then a sample of this
random variable. Without loss of generality, we can assume that both
the mixture variables and the independent components have zero mean:
If this is not true, then the observable variables *x*_{i} can always
be centered by subtracting the sample mean, which makes the model
zero-mean.
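
The centering step can be illustrated with a minimal sketch. The data values and the helper name `center` are hypothetical, chosen only for illustration; each observation is assumed to hold one sample of every mixture.

```python
from statistics import fmean


def center(samples):
    """Subtract the per-mixture sample mean from every observation.

    `samples` is a list of observations, each a list of n mixture values.
    Returns the centered observations and the list of sample means.
    """
    n = len(samples[0])
    means = [fmean(obs[j] for obs in samples) for j in range(n)]
    centered = [[obs[j] - means[j] for j in range(n)] for obs in samples]
    return centered, means


# Hypothetical observations of two mixtures x_1, x_2 at four time points.
observations = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [6.0, 16.0]]
centered, means = center(observations)
```

After centering, each mixture has a sample mean of zero, so the zero-mean assumption holds for the data actually used in estimation.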

It is convenient to use vector-matrix notation instead of sums
like the one in the previous equation. Let us denote by **x**
the random vector whose elements are the mixtures
*x*_{1}, ..., *x*_{n}, and likewise by **s** the random vector with elements
*s*_{1}, ..., *s*_{n}. Let us denote by **A**
the matrix with elements *a*_{ij}. Generally, bold lower-case
letters indicate vectors and bold upper-case letters denote
matrices. All vectors are understood as column vectors; thus **x**^{T},
or the transpose of **x**,
is a row vector. Using this vector-matrix
notation, the above mixing model is written as

$$\mathbf{x} = \mathbf{A}\mathbf{s}. \qquad (4)$$

Sometimes we need the columns of matrix **A**; denoting them by **a**_{i}, the model can also be written as

$$\mathbf{x} = \sum_{i=1}^{n} \mathbf{a}_i s_i.$$

The statistical model in Eq. 4 is called independent component
analysis, or ICA model. The ICA model is a generative model, which means
that it describes how the observed data are generated by a process of
mixing the components *s*_{i}. The independent components
are latent variables, meaning that they cannot be directly observed.
Also the mixing matrix is assumed to be unknown.
All we observe is the random vector **x**,
and we must estimate both **A** and **s**
using it. This must be done under as general assumptions
as possible.
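
To make the generative reading concrete, here is a minimal sketch of how data could be produced under the model for *n* = 2. The mixing matrix values, the uniform source distribution, and the function names `mix` and `mix_by_columns` are assumptions made for illustration; in an actual ICA problem only the mixtures would be available.

```python
import random


def mix(A, s):
    """Return x = A s for a square matrix A (list of rows) and a vector s."""
    return [sum(a_jk * s_k for a_jk, s_k in zip(row, s)) for row in A]


def mix_by_columns(A, s):
    """Return x as the sum over i of (i-th column of A) * s_i."""
    n = len(A)
    x = [0.0] * n
    for i, s_i in enumerate(s):
        for j in range(n):
            x[j] += A[j][i] * s_i
    return x


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical mixing matrix; unknown to the observer in a real problem.
A = [[1.0, 0.5],
     [0.3, 2.0]]

# Latent components: independent, zero-mean, nongaussian (uniform) samples.
sources = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
           for _ in range(1000)]

# Observed data: each observation is one sample of the random vector x.
mixtures = [mix(A, s) for s in sources]
```

The two functions compute the same vector, once as a matrix-vector product and once as a weighted sum of the columns of the mixing matrix. A numerical library would normally replace the explicit loops; plain lists are used here only to keep the sketch dependency-free.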

The starting point for ICA is the very simple assumption that the
components *s*_{i} are statistically *independent*. Statistical
independence will be rigorously defined in Section 3. It
will be seen below that we must also assume that the independent
components have *nongaussian* distributions. However, in the
basic model we do *not* assume these distributions known (if they
are known, the problem is considerably simplified.) For simplicity, we are
also assuming that the unknown
mixing matrix is square, but this assumption can sometimes be relaxed,
as explained in Section 4.5. Then, after estimating the
matrix **A**, we can compute its inverse, say **W**, and obtain the
independent components simply by:

$$\mathbf{s} = \mathbf{W}\mathbf{x}.$$
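
The final unmixing step alone can be sketched as follows for a 2 x 2 case. The sketch assumes the estimate of the mixing matrix is exact, which is an idealization: it does not perform ICA estimation itself, and the matrix values and helper names `inverse_2x2` and `apply_matrix` are hypothetical.

```python
def inverse_2x2(A):
    """Invert a 2 x 2 matrix given as a list of two rows."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular and cannot be inverted")
    return [[d / det, -b / det],
            [-c / det, a / det]]


def apply_matrix(M, v):
    """Return the matrix-vector product M v."""
    return [sum(m * v_k for m, v_k in zip(row, v)) for row in M]


# Hypothetical mixing matrix, standing in for a perfect estimate.
A = [[1.0, 0.5],
     [0.3, 2.0]]
W = inverse_2x2(A)

s_true = [0.4, -0.7]            # one sample of the latent components
x = apply_matrix(A, s_true)     # the corresponding observed mixtures
s_rec = apply_matrix(W, x)      # recovered components, s = W x
```

With an exact inverse the recovered vector matches the original components up to floating-point rounding. With an estimated matrix, the quality of the recovery depends on the quality of the estimate.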

ICA is very closely related to the method called *blind source
separation* (BSS) or blind signal separation. A ``source'' means here an
original signal, i.e. an independent component, like the speaker in the
cocktail party problem. ``Blind'' means that we know very little, if
anything, about the mixing matrix, and make few assumptions about the
source signals. ICA is one method, perhaps the most widely used, for
performing blind source separation.

In many applications, it would be more realistic to assume that there is some noise in the measurements (see e.g. [17,21]), which would mean adding a noise term in the model. For simplicity, we omit any noise terms, since the estimation of the noise-free model is difficult enough in itself, and seems to be sufficient for many applications.