next up previous
Next: Telecommunications Up: Applications of ICA Previous: Finding Hidden Factors in

Reducing Noise in Natural Images

The third example deals with finding ICA filters for natural images and, based on the ICA decomposition, removing noise from images corrupted with additive Gaussian noise.

A set of digitized natural images were used. Denote the vector of pixel gray levels in an image window by ${\bf x}$. Note that, contrary to the other two applications in the previous sections, we are not this time considering multivalued time series or images changing with time; instead the elements of ${\bf x}$ are indexed by the location in the image window or patch. The sample windows were taken at random locations. The 2-D structure of the windows is of no significance here: row by row scanning was used to turn a square image window into a vector of pixel values. The independent components of such image windows are represented in Fig. 4. Each window in this Figure corresponds to one of the columns ${\bf a}_i$ of the mixing matrix ${\bf A}$. Thus an observed image window is a superposition of these windows as in (5), with independent coefficients.

Now, suppose a noisy image model holds:

{\bf z}= {\bf x}+ {\bf n}
\end{displaymath} (45)

where ${\bf n}$ is uncorrelated noise, with elements indexed in the image window in the same way as ${\bf x}$, and ${\bf z}$ is the measured image window corrupted with noise. Let us further assume that ${\bf n}$ is Gaussian and ${\bf x}$ is non-Gaussian. There are many ways to clean the noise; one example is to make a transformation to spatial frequency space by DFT, do low-pass filtering, and return to the image space by IDFT [15]. This is not very efficient, however. A better method is the recently introduced Wavelet Shrinkage method [10] in which a transform based on wavelets is used, or methods based on median filtering [15]. None of these methods is explicitly taking advantage of the image statistics, however.

We have recently introduced another, statistically principled method called Sparse Code Shrinkage [22]. It is very closely related to independent component analysis. Briefly, if we model the density of ${\bf x}$ by ICA, and assume ${\bf n}$ Gaussian, then the Maximum Likelihood (ML) solution for ${\bf x}$ given the measurement ${\bf z}$ can be developed in the signal model (48).

The ML solution can be simply computed, albeit approximately, by using a decomposition that is an orthogonalized version of ICA. The transform is given by

\begin{displaymath}{\bf W}{\bf z}= {\bf W}{\bf x}+ {\bf W}{\bf n}= {\bf s}+ {\bf W}{\bf n},
\end{displaymath} (46)

where ${\bf W}$ is here an orthogonal matrix that is the best orthognal approximation of the inverse of the ICA mixing matrix. The noise term ${\bf W}{\bf n}$ is still Gaussian and white. With a suitably chosen orthogonal transform ${\bf W}$, however, the density of ${\bf W}{\bf x}= {\bf s}$ becomes highly non-Gaussian, e.g., super-Gaussian with a high positive kurtosis. This depends of course on the original ${\bf x}$signals, as we are assuming in fact that there exists a model ${\bf x}=
{\bf W}^T{\bf s}$ for the signal, such that the ``source signals'' or elements of ${\bf s}$ have a positive kurtotic density, in which case the ICA transform gives highly supergaussian components. This seems to hold at least for image windows of natural scenes [34].

It was shown in [22] that, assuming a Laplacian density for si, the ML solution for si is given by a ``shrinkage function'' $\hat{s_i} = g([{\bf W}{\bf z}]_i)$, or in vector form, $\hat{{\bf s}} =
g({\bf W}{\bf z})$. Function g(.) has a characteristic shape: it is zero close to the origin and then linear after a cutting value depending on the parameters of the Laplacian density and the Gaussian noise density. Assuming other forms for the densities, other optimal shrinkage functions can be derived [22].

In the Sparse Code Shrinkage method, the shrinkage operation is performed in the rotated space, after which the estimate for the signal in the original space is given by rotating back:

\begin{displaymath}\hat{{\bf x}} = {\bf W}^T \hat{{\bf s}} = {\bf W}^T g({\bf W}{\bf z}).
\end{displaymath} (47)

Thus we get the Maximum Likelihood estimate for the image window ${\bf x}$in which much of the noise has been removed.

The rotation operator ${\bf W}$ is such that the sparsity of the components ${\bf s}= {\bf W}{\bf x}$ is maximized. This operator can be learned with a modification of the FastICA algorithm; see [22] for details.

A noise cleaning result is shown in Fig. 15. A noiseless image and a noisy version, in which the noise level is 50 % of the signal level, are shown. The results of the Sparse Code Shrinkage method and classic wiener filtering are given, indicating that Sparse Code Shrinkage may be a promising approach. The noise is reduced without blurring edges or other sharp features as much as in wiener filtering. This is largely due to the strongly nonlinear nature of the shrinkage operator, that is optimally adapted to the inherent statistics of natural images.

Figure: (from [22]). An experiment in denoising. Upper left: original image. Upper right: original image corrupted with noise; the noise level is 50 %. Lower left: the recovered image after applying sparse code shrinkage. Lower right: for comparison, a wiener filtered image.
\resizebox{6cm}{!}{\includegraphics{/home/info/phoyer/research/thesis/tex/figs/denoise/jarmoN5clean.eps}} \resizebox{6cm}{!}{\includegraphics{/home/info/phoyer/research/thesis/tex/figs/denoise/jarmoN5.eps}}
\resizebox{6cm}{!}{\includegraphics{/home/info/phoyer/research/thesis/tex/figs/denoise/jarmoN5scs.eps}} \resizebox{6cm}{!}{\includegraphics{/home/info/phoyer/research/thesis/tex/figs/denoise/jarmoN5wiener.eps}}

next up previous
Next: Telecommunications Up: Applications of ICA Previous: Finding Hidden Factors in
Aapo Hyvarinen