Laboratory of Computer and Information Science / Neural Networks Research Centre CIS Lab Helsinki University of Technology


The research group is based at Aalto University School of Science and Technology (Department of Information and Computer Science), and partly at the University of Helsinki (Department of Computer Science). We are members of AIRC (Adaptive Informatics Research Centre, national CoE), HIIT (Helsinki Institute for Information Technology), and PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning, EU FP6 NoE).

The group develops machine learning methods for statistical data mining, information visualization, exploratory data analysis, and in general for probabilistic modeling of data. By machine learning we mean flexible statistical models usable in several applications.

The methods are developed in bioinformatics and information retrieval projects, in which we collaborate with groups from the application areas. The idea is to use the applications as test benches for the methods, and the methods to solve research problems in the application areas.

Our current methodological foci include discriminative generative modeling, data fusion by modeling dependencies between data sets, supervised unsupervised learning, and models for defining and extracting "relevant" signals from data.

    Basic research on the methods
    Methods for genomics, functional genomics, and systems biology
    Implicit user feedback for information retrieval.


  • 14 - 17 June 2011, ICANN 2011, Espoo (Finland):
    The Twentieth Anniversary ICANN is back at its roots: Machine learning re-inspired by brain and cognition


For positions in the group, please follow the job adverts on the pages of HIIT and the Department of Information and Computer Science, or contact Samuel Kaski or one of the postdocs directly. Contact information is below.


Group Leader
Samuel Kaski
Elisabeth Georgii
Mehmet Gönen
Arto Klami
Gayle Leen
Jaakko Peltonen

Doctoral Students
Antti Ajanki
José Caldas
Ali Faisal
Ilkka Huopaniemi
Melih Kandemir
Suleiman Ali Khan
Bahman Khanloo
Leo Lahti
Maija Nevala
Kristian Nybo
Juuso Parkkinen
Tommi Suvitaival

Jussi Gillberg
Eemeli Leppäaho
Maria Osmala
Jaakko Viinikanoja
Seppo Virtanen

Alumni (Doctors, Postdocs)

Janne Sinkkonen

Janne Nikkilä

Merja Oja

Jarkko Salojärvi

Eerika Savia

Hasan Ogul

Jarkko Venna

Jussi Kujala

Visitors and Former Students

Yusuf Yaslan

Nils Gehlenborg

Sourangshu Bhattacharya

Sounak Chakraborty

Alba Martinez-Ruiz

Indrė Žliobaitė

Abhishek Tripathi

László Kozma

Helena Aidos

Andrey Ermolov

Pejman Mohammadi


Journal club on machine learning and bioinformatics


Released packages (stable)

pint - Pairwise INTegration of functional genomics data (R/BioConductor)

RPA - probe reliability and differential gene expression analysis for short oligonucleotide arrays (R/BioConductor)

NetResponse - functional network analysis, applicable to global modeling of context-specific transcriptional responses in genome-scale interaction networks (R / Matlab)

Development versions (useful and functioning alpha/beta)

DCA - discriminative component analysis (Python/C, beta version)

drCCA - data fusion package (R, BMC Bioinformatics)

dredviz - dimensionality reduction for visualization

ICMg - Interaction component models for gene modules

multiWayCCA - Multivariate multi-way analysis of multi-source data (R, ISMB'10)

Probabilistic retrieval and visualization of biologically relevant microarray experiments (ISMB'09, supplementary material)

Treebic - Hierarchical biclustering (C++, RECOMB'10)

vbmcca - Variational Bayesian mixture of robust CCAs (MATLAB)

Experimental code

Dependency modeling toolbox - an ongoing project to integrate various dependency modeling approaches into a unified framework.



13.10.2010 We are involved in organizing ICANN 2011 (Call for papers). The 20th Anniversary ICANN links Machine Learning with its Neural and Cognitive inspirations.

28.09.2010 Bahman Khanloo joins the group as a PhD student

21.09.2010 Talks at ECML/PKDD 2010: Variational Bayesian mixture of robust CCA models (Jaakko Viinikanoja), Graphical multi-way models (Ilkka Huopaniemi)

09.09.2010 Special course on Learning from multiple sources starts (T-61.6040)

04.09.2010 Samuel Kaski gives an invited talk at Cancer Bioinformatics Workshop, Cambridge UK, on Learning and retrieval from multiple sources.

01.09.2010 We helped organize MLSP 2010

30.08.2010 Mehmet Gönen joins the group as a postdoc

20.08.2010 We took part in organizing the first actual Education event of the ICT Labs of EIT, COMP-IT, and FICS summer school

13.08.2010 José Caldas presents Hierarchical generative biclustering for microRNA expression analysis at RECOMB 2010

25.07.2010 Juuso Parkkinen presents the paper Graph visualization with latent variable models at MLG 2010

12.07.2010 Ilkka Huopaniemi gives a talk on Multivariate multi-way analysis of multi-source data at ISMB 2010

09.07.2010 Arto Klami presents Bayesian exponential family projections for coupled data sources at UAI 2010

18.06.2010 Samuel Kaski has been appointed as the new Director of Helsinki Institute for Information Technology (HIIT) for a fixed five-year term, starting from August 1st

18.06.2010 Doctoral defense of Eerika Savia: "Mutual dependency-based modeling of relevance in co-occurrence data". The opponent is Dr. Michal Rosen-Zvi, IBM Research Centre, Haifa, Israel.

12.06.2010 Upcoming workshop presentations at ICML 2010 in Haifa, Israel, on the 25th of June:
"Dependency modeling toolbox", Workshop on Machine Learning Open Source Software (MLOSS)
"Pinview: implicit feedback in content-based image retrieval", Reinforcement Learning and Search in Very Large Spaces

27.05.2010 Samuel Kaski gives an invited talk about "Networks, visualization and retrieval" at the Statistical Mechanics of Learning and Inference workshop

17.03.2010 Submission to MLSP 2010 is open (Deadline: 11.04.2010)

11.03.2010 Special Course in Bioinformatics on Information Retrieval from Biological Databases (T-61.6070) starts on Monday 15.03. 12:15

08.03.2010 Zak Hussain, Alex Leung, and Kitsuchart Pasupa visit the group for two weeks

28.02.2010 Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization recently published in JMLR

22.01.2010 Researcher positions and student summer jobs:
Postdoc positions at the ICS Department (closed 15.03.)
Postdoc and senior researcher positions at HIIT (closed 15.03.)
Student summer jobs at the ICS Department (closed 29.01.)
Student summer jobs at HIIT (closed 19.02.)



Page maintained by the webmaster, last updated Thursday, 14-Oct-2010 15:20:36 EEST