Next: About this document ... Up: Fast and Robust Fixed-Point Previous: Proof of convergence of

   
Appendix: Adaptive neural algorithms

Let us consider sphered data only. Taking the instantaneous gradient of the approximation of negentropy in (7) with respect to ${\bf w}$, and taking the normalization $\Vert{\bf w}\Vert^2=1$ into account, one obtains the following Hebbian-like learning rule

 \begin{displaymath}
\Delta {\bf w}\propto r {\bf x}g({\bf w}^T{\bf x}),\: \mbox{normalize}\: {\bf w}
\end{displaymath} (38)

where $r=E\{G({\bf w}^T{\bf x})\}-E\{G(\nu)\}$, with $\nu$ a standardized Gaussian variable. This is equivalent to the learning rule in [24], except that the self-adaptation constant $r$ is different.
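The one-unit rule (38) can be sketched numerically. In this sketch, $G(u)=\log\cosh u$ (so $g=\tanh$) is a common but assumed choice; the synthetic sources, the step sizes, and the running average used to track $r$ are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Nonlinearity: G(u) = log cosh u, so g = G' = tanh (assumed choice of G).
def G(u):
    return np.log(np.cosh(u))
g = np.tanh

# E{G(nu)} for a standardized Gaussian nu, estimated by Monte Carlo here.
E_G_nu = G(rng.standard_normal(200_000)).mean()

# Toy sphered data: whitened mixture of a uniform and a Laplacian source
# (sources and mixing matrix are illustrative assumptions).
T = 20_000
S = np.column_stack([rng.uniform(-np.sqrt(3), np.sqrt(3), T),
                     rng.laplace(0.0, 1 / np.sqrt(2), T)])
X = S @ rng.standard_normal((2, 2)).T
X -= X.mean(axis=0)
d, E = np.linalg.eigh(np.cov(X.T))
X = X @ E @ np.diag(d ** -0.5) @ E.T          # sphering (ZCA whitening)

# One-unit Hebbian-like rule (38), with a running estimate of r.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
r, mu, lam = 0.0, 0.05, 0.01
for x in X:
    y = w @ x
    r += lam * (G(y) - E_G_nu - r)            # track r = E{G(w^T x)} - E{G(nu)}
    w += mu * r * x * g(y)                    # Delta w  proportional to  r x g(w^T x)
    w /= np.linalg.norm(w)                    # normalization step
```

The sign of $r$ selects Hebbian or anti-Hebbian behavior automatically, which is what makes the rule work for both sub- and super-Gaussian sources.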

To find the whole n-dimensional transform ${\bf s}={\bf W}{\bf x}$, one can then use a network of n neurons, each of which learns according to eq. (38). Of course, some kind of feedback is then necessary. In [24], it was shown how to add a bigradient feedback to the learning rule. Denoting by ${\bf W}=({\bf w}_1,...,{\bf w}_n)^T$ the weight matrix whose rows are the weight vectors ${\bf w}_i$ of the neurons, we obtain:

 \begin{displaymath}
{\bf W}(t+1)={\bf W}(t)+\mu(t)\left[\mbox{diag}(r_i)\,g({\bf W}(t){\bf x}(t))\,{\bf x}(t)^T
+\frac{1}{2}({\bf I}-{\bf W}(t){\bf W}(t)^T){\bf W}(t)\right]
\end{displaymath} (39)

where $\mu(t)$ is the learning rate sequence, and the function $g(\cdot)=G'(\cdot)$ is applied separately to every component of the vector ${\bf W}(t){\bf x}(t)$. In this most general version of the learning rule, the $r_i$, $i=1,\ldots,n$, are estimated separately for each neuron, as given above (see also [24]). They may also be fixed using prior knowledge.
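The full matrix rule (39) can be sketched along the same lines. Again $G(u)=\log\cosh u$ and the synthetic sphered data are illustrative assumptions; each $r_i$ is tracked by a running average as in the one-unit case. The bigradient term $\frac{1}{2}({\bf I}-{\bf W}{\bf W}^T){\bf W}$ supplies the feedback that keeps the rows of ${\bf W}$ orthonormal.

```python
import numpy as np

rng = np.random.default_rng(1)

def G(u):
    return np.log(np.cosh(u))
g = np.tanh
E_G_nu = G(rng.standard_normal(200_000)).mean()   # Monte Carlo E{G(nu)}

# Sphered toy data, as in the one-unit sketch (illustrative assumption).
n, T = 2, 20_000
S = np.column_stack([rng.uniform(-np.sqrt(3), np.sqrt(3), T),
                     rng.laplace(0.0, 1 / np.sqrt(2), T)])
X = S @ rng.standard_normal((n, n)).T
X -= X.mean(axis=0)
d, E = np.linalg.eigh(np.cov(X.T))
X = X @ E @ np.diag(d ** -0.5) @ E.T

W = np.linalg.qr(rng.standard_normal((n, n)))[0]  # random orthogonal start
r = np.zeros(n)                                   # one r_i per neuron
mu, lam = 0.01, 0.01
for x in X:
    y = W @ x
    r += lam * (G(y) - E_G_nu - r)                # per-neuron self-adaptation
    # rule (39): Hebbian term plus bigradient feedback toward orthogonality
    W += mu * (np.diag(r) @ np.outer(g(y), x)
               + 0.5 * (np.eye(n) - W @ W.T) @ W)
```

After training, ${\bf W}{\bf W}^T$ stays close to the identity, since any deviation decays under the feedback term at rate $\mu(t)$.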


Aapo Hyvarinen
1999-04-23