Automated recognition of retroviral sequences in genomic data - RetroTector©
2007 (English). In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 35, no. 15, pp. 4964-4976. Article in journal (Refereed), Published.
Eukaryotic genomes contain many endogenous retroviral sequences (ERVs). ERVs are often severely mutated and therefore difficult to detect. A platform-independent (Java) program package, RetroTector© (ReTe), was constructed. It has three basic modules: (i) detection of candidate long terminal repeats (LTRs), (ii) detection of chains of conserved retroviral motifs fulfilling distance constraints and (iii) attempted reconstruction of original retroviral protein sequences, combining alignment, codon statistics and properties of protein ends. Other features are prediction of additional open reading frames, automated database collection, graphical presentation and automatic classification. ReTe favors elements >1000 bp long due to its dependence on the order of, and distances between, retroviral fragments. It detects single or low-copy-number elements. ReTe assigned a ‘retroviral’ score of 890–2827 to 10 exogenous retroviruses from seven genera, and accurately predicted their genes. In a simulated model, ReTe was robust against mutational decay. The human genome was analyzed in 1–2 days on a Linux cluster. Retroviral sequences were detected in divergent vertebrate genomes. Most ReTe-detected chains were coincident with RepeatMasker output and the HERVd database. ReTe did not report most of the evolutionarily old HERV-L-related and MalR sequences, and is not yet tailored for single-LTR detection. Nevertheless, ReTe rationally detects and annotates many retroviral sequences.
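As a rough illustration of module (ii), the idea of chaining motif hits under order and distance constraints can be sketched as a small dynamic program. This is not ReTe's actual algorithm or code: the motif labels, the `Hit` type, the `ORDER` list, the `MIN_GAP`/`MAX_GAP` limits and the additive scoring are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    motif: str    # hypothetical motif label, e.g. "GAG", "PRO", "POL", "ENV"
    pos: int      # 0-based start position of the hit in the sequence
    score: float  # hypothetical per-hit match score


# Assumed 5'-to-3' motif order and allowed distance (bp) between chained hits.
ORDER = ["GAG", "PRO", "POL", "ENV"]
MIN_GAP = 100
MAX_GAP = 4000


def best_chain(hits):
    """Return (score, chain) for the highest-scoring chain of hits that
    respects the motif order and the distance limits. Motifs may be
    skipped, since mutated elements often lack some of them."""
    rank = {m: i for i, m in enumerate(ORDER)}
    hits = sorted((h for h in hits if h.motif in rank), key=lambda h: h.pos)
    best = []  # best[i] = (score, chain) of the best chain ending at hits[i]
    for i, h in enumerate(hits):
        top = (h.score, [h])
        for j in range(i):
            p = hits[j]
            gap = h.pos - p.pos
            if rank[p.motif] < rank[h.motif] and MIN_GAP <= gap <= MAX_GAP:
                cand = (best[j][0] + h.score, best[j][1] + [h])
                if cand[0] > top[0]:
                    top = cand
        best.append(top)
    return max(best, key=lambda t: t[0]) if best else (0.0, [])


hits = [
    Hit("GAG", 1000, 5.0),
    Hit("POL", 3500, 8.0),
    Hit("ENV", 6000, 4.0),
    Hit("POL", 50000, 9.0),  # isolated hit, too far away to join the chain
]
score, chain = best_chain(hits)
print(score, [h.motif for h in chain])
```

Because a chain is scored on several ordered hits rather than on one strong match, a sketch like this naturally favors longer elements, which is consistent with the abstract's remark about elements >1000 bp.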
Algorithms, Animals, Endogenous Retroviruses/*genetics, Genome; Human, Genomics/*methods, Humans, Mutation, Reproducibility of Results, Retroviridae Proteins/genetics, Software, Terminal Repeat Sequences
Medical and Health Sciences
Identifiers: URN: urn:nbn:se:uu:diva-11878; DOI: 10.1093/nar/gkm515; ISI: 000249612300004; PubMedID: 17636050; OAI: oai:DiVA.org:uu-11878; DiVA: diva2:39647