STATISTICAL MACHINE LEARNING AND BIOINFORMATICS
The research group is based at Aalto University School of Science and
Technology (Department of Information and Computer Science), and partly
at the University of Helsinki (Department of Computer Science). We are
members of AIRC (Adaptive Informatics Research Centre, national CoE),
HIIT (Helsinki Institute for Information Technology), and PASCAL
(Pattern Analysis, Statistical Modelling and Computational Learning,
EU FP6 NoE).
The group develops machine learning methods for statistical data
mining, information visualization, exploratory data analysis, and in
general for probabilistic modeling of data. By machine learning we
mean flexible statistical models usable in several applications.
The methods are being developed in bioinformatics and information
retrieval projects, where we collaborate with groups from the
application areas. The idea is to use the applications as test benches
for the methods, and the methods for solving research problems
in the application areas.
Our current methodological foci include discriminative generative
modeling, data fusion by modeling dependencies between data sets,
supervised unsupervised learning, and models for defining and
extracting "relevant" signals from data.
- STATISTICAL MACHINE LEARNING AND DATA MINING
Basic research on the methods
- BIOINFORMATICS
Methods for genomics, functional genomics, and systems biology
- PROACTIVE INFORMATION RETRIEVAL
Implicit user feedback for information retrieval.
UPCOMING EVENTS
- 14 - 17 June 2011, ICANN 2011, Espoo (Finland):
The Twentieth Anniversary ICANN is back at its roots: Machine learning re-inspired by brain and cognition
JOBS
For positions in the group, please follow job adverts on the pages of
HIIT and the Department of Information and Computer Science, or contact
Samuel Kaski or one of the postdocs directly. Contact info below.
PERSONNEL
JOURNAL CLUB
Journal club on machine learning and bioinformatics
SOFTWARE
Released packages (stable)
pint - Pairwise INTegration of functional genomics data (R/BioConductor)
RPA - probe reliability and differential gene expression analysis for short oligonucleotide arrays (R/BioConductor)
NetResponse - Functional network analysis. Applicable to global modeling of context-specific transcriptional responses in genome-scale interaction networks (R/Matlab)
Development versions (useful and functioning alpha/beta)
DCA - discriminative component analysis (Python/C, beta-version)
drCCA - data fusion package (R, BMC Bioinformatics)
dredviz - dimensionality reduction for visualization
ICMg - Interaction component models for gene modules
multiWayCCA - Multivariate multi-way analysis of multi-source data (R, ISMB'10)
Probabilistic retrieval and visualization of biologically relevant microarray experiments (ISMB'09, supplementary material)
Treebic - Hierarchical biclustering (C++, RECOMB'10)
vbmcca - Variational Bayesian mixture of robust CCAs (MATLAB)
Experimental code
Dependency modeling toolbox. Ongoing project to integrate various dependency modeling approaches into a unified framework.
FUN BLOG
News:
13.10.2010 We are involved in organizing ICANN 2011 (Call for papers). The 20th Anniversary ICANN links Machine Learning with its Neural and Cognitive inspirations.
28.09.2010 Bahman Khanloo joins the group as a PhD student
21.09.2010 Talks at ECML/PKDD 2010: Variational Bayesian mixture of robust CCA models (Jaakko Viinikanoja), Graphical multi-way models (Ilkka Huopaniemi)
09.09.2010 Special course on Learning from multiple sources starts (T-61.6040)
04.09.2010 Samuel Kaski gives an invited talk at the Cancer Bioinformatics Workshop, Cambridge UK, on Learning and retrieval from multiple sources.
01.09.2010 We helped organize MLSP 2010
30.08.2010 Mehmet Gönen joins the group as a postdoc
20.08.2010 We took part in organizing the first actual Education event of the ICT Labs of EIT, COMP-IT, and FICS summer school
13.08.2010 José Caldas presents Hierarchical generative biclustering for microRNA expression analysis at RECOMB 2010
25.07.2010 Juuso Parkkinen presents the paper Graph visualization with latent variable models at MLG 2010
12.07.2010 Ilkka Huopaniemi gives a talk on Multivariate multi-way analysis of multi-source data at ISMB 2010
09.07.2010 Arto Klami presents Bayesian exponential family projections for coupled data sources at UAI 2010
18.06.2010 Samuel Kaski has been appointed as the new Director of the Helsinki Institute for Information Technology (HIIT) for a fixed five-year term, starting August 1st
18.06.2010 Doctoral defense of Eerika Savia: "Mutual dependency-based modeling of relevance in co-occurrence data". The opponent is Dr. Michal Rosen-Zvi, IBM Research Centre, Haifa, Israel.
12.06.2010 Upcoming workshop presentations at ICML 2010 in Haifa, Israel, on the 25th of June:
"Dependency modeling toolbox", Workshop on Machine Learning Open Source Software (MLOSS)
"Pinview: implicit feedback in content-based image retrieval", Reinforcement Learning and Search in Very Large Spaces
27.05.2010 Samuel Kaski gives an invited talk about "Networks, visualization and retrieval" at the Statistical Mechanics of Learning and Inference workshop
17.03.2010 Submission to MLSP 2010 is open (Deadline: 11.04.2010)
11.03.2010 Special Course in Bioinformatics on Information Retrieval from Biological Databases (T-61.6070) starts on Monday 15.03. at 12:15
08.03.2010 Zak Hussain, Alex Leung, and Kitsuchart Pasupa visit the group for two weeks
28.02.2010 Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization recently published in JMLR
22.01.2010 Researcher positions and student summer jobs:
Postdoc positions at the ICS Department (closed 15.03.)
Postdoc and senior researcher positions at HIIT (closed 15.03.)
Student summer jobs at the ICS Department (closed 29.01.)
Student summer jobs at HIIT (closed 19.02.)
Old news...