Next: About this document ... Up: Fast and Robust Fixed-Point Previous: Proof of convergence of

   
Appendix: Adaptive neural algorithms

Let us consider sphered data only. Taking the instantaneous gradient of the approximation of negentropy in (7) with respect to ${\bf w}$, and taking the normalization $\Vert{\bf w}\Vert^2=1$ into account, one obtains the following Hebbian-like learning rule

 \begin{displaymath}
\Delta {\bf w}\propto r {\bf x}g({\bf w}^T{\bf x}),\: \mbox{normalize}\: {\bf w}
\end{displaymath} (38)

where $r=E\{G({\bf w}^T{\bf x})\}-E\{G(\nu)\}$, with $\nu$ a standardized Gaussian variable. This is equivalent to the learning rule in [24], except that the self-adaptation constant $r$ is different.
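The one-unit rule (38) can be sketched numerically. In this sketch, $G(u)=\log\cosh u$ (so $g=\tanh$) is a common but assumed choice; the synthetic sources, the step sizes, and the running average used to track $r$ are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Nonlinearity: G(u) = log cosh u, so g = G' = tanh (assumed choice of G).
def G(u):
    return np.log(np.cosh(u))
g = np.tanh

# E{G(nu)} for a standardized Gaussian nu, estimated by Monte Carlo here.
E_G_nu = G(rng.standard_normal(200_000)).mean()

# Toy sphered data: whitened mixture of a uniform and a Laplacian source
# (sources and mixing matrix are illustrative assumptions).
T = 20_000
S = np.column_stack([rng.uniform(-np.sqrt(3), np.sqrt(3), T),
                     rng.laplace(0.0, 1 / np.sqrt(2), T)])
X = S @ rng.standard_normal((2, 2)).T
X -= X.mean(axis=0)
d, E = np.linalg.eigh(np.cov(X.T))
X = X @ E @ np.diag(d ** -0.5) @ E.T          # sphering (ZCA whitening)

# One-unit Hebbian-like rule (38), with a running estimate of r.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
r, mu, lam = 0.0, 0.05, 0.01
for x in X:
    y = w @ x
    r += lam * (G(y) - E_G_nu - r)            # track r = E{G(w^T x)} - E{G(nu)}
    w += mu * r * x * g(y)                    # Delta w  proportional to  r x g(w^T x)
    w /= np.linalg.norm(w)                    # normalization step
```

The sign of $r$ selects Hebbian or anti-Hebbian behavior automatically, which is what makes the rule work for both sub- and super-Gaussian sources.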

To find the whole n-dimensional transform ${\bf s}={\bf W}{\bf x}$, one can then use a network of n neurons, each of which learns according to eq. (38). Of course, some kind of feedback is then necessary. In [24], it was shown how to add a bigradient feedback to the learning rule. Denoting by ${\bf W}=({\bf w}_1,...,{\bf w}_n)^T$ the weight matrix whose rows are the weight vectors ${\bf w}_i$ of the neurons, we obtain:

 \begin{displaymath}
{\bf W}(t+1)={\bf W}(t)+\mu(t)\left[\mbox{diag}(r_i)\,g({\bf W}(t){\bf x}(t))\,{\bf x}(t)^T
+\frac{1}{2}({\bf I}-{\bf W}(t){\bf W}(t)^T){\bf W}(t)\right]
\end{displaymath} (39)

where $\mu(t)$ is the learning rate sequence, and the function $g(\cdot)=G'(\cdot)$ is applied separately to every component of the vector ${\bf W}(t){\bf x}(t)$. In this most general version of the learning rule, the $r_i$, $i=1,\ldots,n$, are estimated separately for each neuron, as given above (see also [24]). They may also be fixed using prior knowledge.
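The full matrix rule (39) can be sketched along the same lines. Again $G(u)=\log\cosh u$ and the synthetic sphered data are illustrative assumptions; each $r_i$ is tracked by a running average as in the one-unit case. The bigradient term $\frac{1}{2}({\bf I}-{\bf W}{\bf W}^T){\bf W}$ supplies the feedback that keeps the rows of ${\bf W}$ orthonormal.

```python
import numpy as np

rng = np.random.default_rng(1)

def G(u):
    return np.log(np.cosh(u))
g = np.tanh
E_G_nu = G(rng.standard_normal(200_000)).mean()   # Monte Carlo E{G(nu)}

# Sphered toy data, as in the one-unit sketch (illustrative assumption).
n, T = 2, 20_000
S = np.column_stack([rng.uniform(-np.sqrt(3), np.sqrt(3), T),
                     rng.laplace(0.0, 1 / np.sqrt(2), T)])
X = S @ rng.standard_normal((n, n)).T
X -= X.mean(axis=0)
d, E = np.linalg.eigh(np.cov(X.T))
X = X @ E @ np.diag(d ** -0.5) @ E.T

W = np.linalg.qr(rng.standard_normal((n, n)))[0]  # random orthogonal start
r = np.zeros(n)                                   # one r_i per neuron
mu, lam = 0.01, 0.01
for x in X:
    y = W @ x
    r += lam * (G(y) - E_G_nu - r)                # per-neuron self-adaptation
    # rule (39): Hebbian term plus bigradient feedback toward orthogonality
    W += mu * (np.diag(r) @ np.outer(g(y), x)
               + 0.5 * (np.eye(n) - W @ W.T) @ W)
```

After training, ${\bf W}{\bf W}^T$ stays close to the identity, since any deviation decays under the feedback term at rate $\mu(t)$.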


Aapo Hyvarinen
1999-04-23