Nonlinear extensions of the well-known neural PCA algorithms
[110,114,111]
were developed in [115].
For example, in [115], the following nonlinear
version of a hierarchical PCA
learning rule was introduced:

$\Delta \mathbf{w}_i \propto g(\mathbf{w}_i^T \mathbf{x}) \left[ \mathbf{x} - \sum_{j=1}^{i} g(\mathbf{w}_j^T \mathbf{x})\, \mathbf{w}_j \right]$   (38)
where g is a suitable nonlinear scalar function. The
symmetric versions of the learning rules in [114,111] can be
extended for the nonlinear case in the same manner.
In [82], a connection between these algorithms and
nonlinear versions of PCA criteria (see Section 4.3.4) was
proven. In general, the introduction of
nonlinearities means that the learning rule uses higher-order
information in the learning. Thus, the learning rules may perform
something closer to the higher-order representation techniques
(projection pursuit, blind deconvolution, ICA).
In [84,112], it was proven that for well-chosen
nonlinearities,
nonlinearities, the learning rule in (38) does indeed perform
ICA, if the data is sphered (whitened). Algorithms for
exactly maximizing the nonlinear PCA criteria were introduced in
[113].
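As a concrete illustration, the following is a minimal Python/NumPy sketch of one stochastic step of the learning rule (38), assuming whitened data; the nonlinearity g = tanh, the step size, and the function name are illustrative choices, not prescribed by [115]:

    import numpy as np

    def nonlinear_pca_update(W, x, mu=0.01, g=np.tanh):
        # One stochastic step of the hierarchical nonlinear PCA rule (38).
        # W: (m, n) array whose rows are the weight vectors w_i.
        # x: (n,) whitened (sphered) data sample.
        # g, mu: illustrative choices (tanh nonlinearity, fixed step size).
        y = g(W @ x)                            # y_i = g(w_i^T x)
        W_new = W.copy()
        for i in range(W.shape[0]):
            # feedback term of (38): sum over j <= i of g(w_j^T x) w_j
            feedback = y[: i + 1] @ W[: i + 1]
            W_new[i] = W[i] + mu * y[i] * (x - feedback)
        return W_new

In practice the update would be iterated over many whitened samples with a decreasing step size; the hierarchical character of the rule lies in the sum over j <= i in the feedback term.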
An interesting simplification of the nonlinear PCA algorithms is the
bigradient algorithm [145]. The feedback term in the learning
rule (38) is here
replaced by a much simpler one, giving

$\mathbf{W}(t+1) = \mathbf{W}(t) + \mu(t)\, g(\mathbf{W}(t)\mathbf{x}(t))\, \mathbf{x}(t)^T + \gamma\, \mathbf{W}(t)\left(\mathbf{I} - \mathbf{W}(t)^T \mathbf{W}(t)\right)$   (39)

where $\mu(t)$ is the learning rate (step size) sequence, $\gamma$ is a
constant in the range $[0.5, 1]$, the function g is applied
separately on every component of the vector $\mathbf{W}(t)\mathbf{x}(t)$,
and the data is
assumed to be sphered. A hierarchical version of the
bigradient algorithm is also possible. Due to the simplicity of the
bigradient algorithm, its properties can be analyzed in more
detail, as in [145] and [73].
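As with rule (38), a minimal Python/NumPy sketch of one bigradient step (39) may help; again the data is assumed whitened, and g = tanh, mu, and gamma are illustrative values only:

    import numpy as np

    def bigradient_update(W, x, mu=0.01, gamma=0.9, g=np.tanh):
        # One step of the bigradient rule (39) on a whitened sample x.
        # W: (m, n) weight matrix; mu: step size mu(t); gamma: constant in [0.5, 1].
        y = g(W @ x)                  # g applied separately to each component of W x
        grad = np.outer(y, x)         # g(W x) x^T
        feedback = W - W @ W.T @ W    # W (I - W^T W), the simplified feedback term
        return W + mu * grad + gamma * feedback

Note that iterating the feedback term alone, W <- W + gamma * W (I - W^T W) with gamma in [0.5, 1], tends to drive the rows of W toward orthonormality for suitable starting points; this simple term is what replaces the hierarchical feedback sum of (38).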