Another related contrast function was derived from a neural network
viewpoint in [3,35]. This was based on maximizing the
output entropy (or information flow) of a neural network with
nonlinear outputs. Assume
that $\mathbf{x}$ is the input to the neural network whose outputs are of the
form $y_i = g_i(\mathbf{w}_i^T \mathbf{x})$,
where the $g_i$ are some nonlinear scalar functions, and the $\mathbf{w}_i$ are the weight vectors of the neurons.
One then wants to maximize the entropy of the outputs:

$$H(\mathbf{y}) = H(g_1(\mathbf{w}_1^T \mathbf{x}), \ldots, g_n(\mathbf{w}_n^T \mathbf{x})) \qquad (28)$$
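The connection between this entropy objective and the likelihood can be made explicit by a standard change-of-variables argument. Writing $\mathbf{W}$ for the matrix with rows $\mathbf{w}_i^T$ (a notational assumption consistent with the rest of this chapter), the entropy of the invertibly transformed outputs decomposes as

$$H(\mathbf{y}) = H(\mathbf{x}) + E\left\{\log\left|\det\frac{\partial \mathbf{y}}{\partial \mathbf{x}}\right|\right\} = H(\mathbf{x}) + \sum_i E\{\log g_i'(\mathbf{w}_i^T \mathbf{x})\} + \log|\det \mathbf{W}|.$$

Since $H(\mathbf{x})$ does not depend on the weights, maximizing $H(\mathbf{y})$ is equivalent to maximizing $\sum_i E\{\log g_i'(\mathbf{w}_i^T \mathbf{x})\} + \log|\det \mathbf{W}|$, which is the expected log-likelihood of the ICA model when $g_i' = f_i$.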
If the $g_i$ are well chosen, this framework also enables the
estimation of the ICA model.
Indeed,
several authors, e.g., [4,37], proved the
surprising result that the principle of network entropy maximization,
or ``infomax'', is equivalent to
maximum likelihood estimation. This
equivalence requires that the nonlinearities $g_i$ used in the
neural network be chosen as the cumulative distribution functions
corresponding to the densities $f_i$, i.e.,
$g_i'(\cdot) = f_i(\cdot)$.
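As an illustration of this equivalence in practice, the following sketch applies the natural-gradient form of the infomax/ML update with logistic nonlinearities $g_i$, for which $g_i'$ is the logistic density. The sources, mixing matrix, step size, and iteration count are illustrative assumptions, not taken from the text; the useful identity is that $1 - 2g_i(u) = -\tanh(u/2)$ for the logistic function, which keeps the update numerically stable.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 2, 5000
S = rng.laplace(size=(n, N))             # assumed super-Gaussian sources
M = np.array([[1.0, 0.6],                # hypothetical mixing matrix
              [0.4, 1.0]])
X = M @ S                                # observed mixtures

W = np.eye(n)                            # unmixing estimate; rows play the role of w_i
lr = 0.05
I = np.eye(n)
for _ in range(4000):
    U = W @ X                            # u_i = w_i^T x for every sample
    # Natural-gradient infomax/ML update with logistic g_i:
    # since 1 - 2*sigmoid(u) = -tanh(u/2), the batch update is
    # W <- W + lr * (I - E{tanh(U/2) U^T}) W
    W += lr * (I - np.tanh(U / 2.0) @ U.T / N) @ W

P = W @ M                                # should approach a scaled permutation matrix
```

After convergence, each row of `W @ M` is dominated by a single entry, i.e., the network recovers the independent components up to permutation and scaling, exactly as the maximum-likelihood view predicts.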
Aapo Hyvarinen
20000419