We introduce a mixture of probabilistic canonical correlation
analyzers model for analyzing local correlations, or more generally
mutual statistical dependencies, in co-occurring data pairs. The
model extends the traditional canonical correlation analysis and its
probabilistic interpretation in three main ways. First, a full
Bayesian treatment enables analysis of small samples (the "large p,
small n" setting, a crucial problem in bioinformatics, for instance)
and rigorous estimation of the degree of dependence and
independence. Second, the mixture formulation generalizes the method
from global linearity to the more reasonable assumption of different
kinds of dependencies for different kinds of data. Third, as a novel
extension, the method decomposes the variation in
the data into shared and data set-specific components.

This work was supported by the Academy of Finland, decision number
207467, and in part by the IST Programme of the European Community,
under the PASCAL Network of Excellence, IST-2002-506778. This
publication only reflects the authors' views. All rights are reserved
because of other commitments.