Merja Oja, Jonas Blomberg, and Samuel Kaski. Class discovery and visualization for human endogenous retroviruses by bootstrapping Median Self-organizing Maps. In Bioinformatics 2004, Linköping, Sweden, June 3-6, 2004. A poster. (postscript (A4 size), gzipped postscript)

About eight percent of the human genome consists of human endogenous retrovirus sequences. Human endogenous retroviruses (HERVs) are remnants of ancient retroviral infections. The HERVs are mutated and deficient, but they may still give rise to transcripts or affect the expression of human genes.

The HERVs stem from several kinds of retroviruses, and any function a HERV sequence still has may reflect its origin. Hence, classifying the diverse HERV sequences is a natural starting point when investigating the effects of HERVs in humans. The current HERV taxonomy is incomplete: some sequences cannot be assigned to any class, and the classification of others is ambiguous.

A Median Self-Organizing Map (SOM), a SOM variant that operates on pairwise distances between samples, can be used to group all the HERVs found in the human genome. The Median SOM visualizes the HERV collection on a two-dimensional display. The visualization represents the similarity relationships between individual sequences, as well as the cluster structure and the similarities between clusters.
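The idea of a SOM driven only by pairwise distances can be illustrated with a minimal batch sketch. This is not the implementation used for the poster: the function name `train_median_som`, the fixed Gaussian neighbourhood width `sigma`, the random initialization, and the toy data are all assumptions made for illustration. Each map unit's prototype is constrained to be one of the data items, chosen as the item that minimizes the neighbourhood-weighted sum of distances to the samples.

```python
import numpy as np

def train_median_som(dist, grid_shape=(3, 3), n_iter=10, sigma=1.0, seed=0):
    """Batch Median SOM sketch on a symmetric pairwise distance matrix (n x n).

    Returns (protos, bmu): the data index used as prototype for each map unit,
    and the best-matching unit of each sample.
    """
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    rows, cols = grid_shape
    n_units = rows * cols
    # Grid coordinates of the map units and squared distances on the grid.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    grid_d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    # Gaussian neighbourhood weights between units (fixed width: an assumption).
    h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
    # Each prototype is the index of a data item, initialized at random.
    protos = rng.choice(n, size=n_units, replace=n_units > n)
    for _ in range(n_iter):
        # Assign each sample to the unit whose prototype is closest.
        bmu = dist[:, protos].argmin(axis=1)
        # weights[u, j]: neighbourhood weight between unit u and the unit of sample j.
        weights = h[:, bmu]
        # cost[u, i]: weighted sum of distances from candidate item i to all samples.
        cost = weights @ dist.T
        # New prototype of each unit: the item with the smallest weighted cost.
        protos = cost.argmin(axis=1)
    bmu = dist[:, protos].argmin(axis=1)
    return protos, bmu

# Toy usage: six points on a line forming two well-separated groups.
x = np.array([0.0, 0.1, 0.25, 10.0, 10.1, 10.25])
toy_dist = np.abs(x[:, None] - x[None, :])
protos, bmu = train_median_som(toy_dist, grid_shape=(1, 2), sigma=0.5)
```

Because only the distance matrix is used, the same sketch applies to sequence data for which no vector representation exists, provided some pairwise distance between sequences is available.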

In this work the Median SOM is used to cluster and visualize a collection of 3661 HERV sequences picked from the human genome by the RetroTector system. The trustworthiness of the visualization in representing the similarity relationships between the HERV sequences is evaluated, and confidence in the found groupings is estimated with resampling techniques.
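One common way to estimate confidence in groupings by resampling is to recluster bootstrap samples and record how often each pair of items ends up in the same group. The sketch below shows that general idea only; it is not claimed to be the procedure used for the poster. The names `bootstrap_coassignment` and `two_medoid_clusters` are hypothetical, and the toy clustering function stands in for whatever clustering method is actually being assessed.

```python
import numpy as np

def bootstrap_coassignment(dist, cluster_fn, n_boot=50, seed=0):
    """Fraction of bootstrap rounds in which each pair of items shares a cluster.

    dist is an n x n pairwise distance matrix; cluster_fn maps a distance
    matrix to an array of cluster labels. Pairs never drawn together get NaN.
    """
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    together = np.zeros((n, n))
    seen = np.zeros((n, n))
    for _ in range(n_boot):
        # Bootstrap sample of the items; duplicates are collapsed.
        idx = np.unique(rng.integers(0, n, size=n))
        labels = cluster_fn(dist[np.ix_(idx, idx)])
        same = labels[:, None] == labels[None, :]
        seen[np.ix_(idx, idx)] += 1
        together[np.ix_(idx, idx)] += same
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(seen > 0, together / seen, np.nan)

def two_medoid_clusters(d):
    """Toy clustering (an assumption): split items by the two mutually farthest ones."""
    i, j = np.unravel_index(d.argmax(), d.shape)
    return d[:, [i, j]].argmin(axis=1)

# Toy usage: twenty points on a line forming two well-separated groups.
x = np.concatenate([np.linspace(0.0, 1.0, 10), np.linspace(50.0, 51.0, 10)])
toy_dist = np.abs(x[:, None] - x[None, :])
conf = bootstrap_coassignment(toy_dist, two_medoid_clusters, n_boot=30)
```

Pairs with a co-assignment frequency near one belong to a grouping that is stable under resampling, while intermediate values flag items whose group membership is uncertain.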


Last modified: Wed Mar 9 08:27:49 EET 2005