Merja Oja, Jaakko Peltonen, and Samuel Kaski. Estimation of human endogenous retrovirus activities from expressed sequence databases. In Juho Rousu, Samuel Kaski, and Esko Ukkonen, editors, Probabilistic Modeling and Machine Learning in Structural and Systems Biology (PMSB 2006), workshop proceedings, pages 50-54, Helsinki University Printing House, 2006.

Human endogenous retroviruses (HERVs) are remnants of ancient retrovirus infections that now reside within human DNA. Recently, HERV expression has been detected in both normal tissues and diseased patients. However, the activities (expression levels) of individual HERV sequences are mostly unknown. In this work, we introduce a generative mixture model, based on Hidden Markov Models, for estimating the activities of individual HERV sequences from databases of expressed sequences. We determine the relative activities of 91 HERVs; the majority of these activities were previously unknown. We also empirically justify a faster heuristic method for HERV activity estimation.
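To make the idea of estimating relative activities with a mixture model concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the likelihood of each expressed sequence under each HERV model has already been computed (in the paper these would come from Hidden Markov Models; here they are made-up toy numbers) and fits only the mixture weights by expectation-maximization. The function names `estimate_activities` and `best_match_activities` are hypothetical, and the best-match counting baseline is a generic illustration that may differ from the heuristic evaluated in the paper.

```python
def estimate_activities(likelihoods, n_iter=200, tol=1e-10):
    """Fit mixture weights by EM, keeping the component likelihoods fixed.

    likelihoods[i][k] is p(sequence i | source k), assumed precomputed.
    Assumes every row has at least one positive entry.
    Returns weights summing to 1, read here as relative activities.
    """
    n_seqs = len(likelihoods)
    n_sources = len(likelihoods[0])
    weights = [1.0 / n_sources] * n_sources  # uniform starting point

    for _ in range(n_iter):
        # E-step: accumulate posterior responsibilities per source.
        counts = [0.0] * n_sources
        for row in likelihoods:
            joint = [w * p for w, p in zip(weights, row)]
            total = sum(joint)
            for k in range(n_sources):
                counts[k] += joint[k] / total

        # M-step: new weight = average responsibility.
        new_weights = [c / n_seqs for c in counts]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    return weights


def best_match_activities(likelihoods):
    """Baseline: assign each sequence to its single best-scoring source
    (ties go to the lowest index) and report the fraction per source."""
    n_sources = len(likelihoods[0])
    counts = [0] * n_sources
    for row in likelihoods:
        counts[row.index(max(row))] += 1
    return [c / len(likelihoods) for c in counts]


# Toy likelihoods: 5 expressed sequences, 3 candidate sources (illustrative only).
toy = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.5, 0.5, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 0.9],
]

if __name__ == "__main__":
    print("EM mixture weights:", estimate_activities(toy))
    print("Best-match fractions:", best_match_activities(toy))
```

The soft assignment in the E-step is what lets an ambiguous sequence (such as the third toy row) contribute fractionally to several sources, whereas the counting baseline commits it to one source.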