Universal structure and phylogeny of Long Terminal Repeats (LTRs)
(English) Manuscript (preprint) (Other academic)
Long terminal repeats (LTRs) are important sequence elements of retroviruses and related retrotransposons. They are, however, difficult to analyse because of their variability.
The aim of this work was to construct models of LTRs from all known groups of LTR retrotransposons and retroviruses, representative of the entire LTR diversity, making it possible for the first time to comprehensively study their phylogeny and to compare it to phylogenies of other retrotransposon genes.
A general HMM describing all LTRs was built. Its associated Viterbi alignment showed a consistent basic structure with inverted repeats, starting with TGTT at the 5′ end and ending with AACA at the 3′ end, plus two conserved AT-rich areas, the first often containing the TATA box and the second containing the polyadenylation signal AATAAA. A less conserved AT-rich stretch was apparent in the likely U3 portion. R was harder to delineate. The polyadenylation signal was followed by a T-rich area characteristic of U5. The modular LTR structure previously reported by us, with modules separated by clusters of insert states, was also observed in this pan-LTR setting. The result attests to the highly conserved basic structure of LTRs, which must date back more than a billion years.
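The structural landmarks listed above can be checked on a single candidate sequence with a few string operations. The following is a minimal sketch only, not the HMM-based method of this work: the function names, the 20-base window after the polyadenylation signal, and the use of a plain "TATA" substring as a stand-in for the TATA box are all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def ltr_features(seq, window=20):
    """Report simple landmarks of a candidate LTR sequence.

    Hypothetical helper: 'window' is the number of bases inspected
    downstream of the first AATAAA (an arbitrary illustrative choice).
    """
    seq = seq.upper()
    polya = seq.find("AATAAA")   # -1 if the signal is absent
    tata = seq.find("TATA")      # crude stand-in for a TATA box
    downstream = seq[polya + 6 : polya + 6 + window] if polya >= 0 else ""
    t_fraction = downstream.count("T") / len(downstream) if downstream else 0.0
    return {
        "starts_with_TGTT": seq.startswith("TGTT"),
        "ends_with_AACA": seq.endswith("AACA"),
        # TGTT and AACA are reverse complements, i.e. a short inverted repeat
        "ends_are_inverted_repeat": seq[:4] == reverse_complement(seq[-4:]),
        "tata_pos": tata,
        "polya_pos": polya,
        "t_fraction_after_polya": t_fraction,
    }


# Toy sequence built by hand for illustration; not a real LTR.
toy = "TGTT" + "GCGCGC" + "TATAAA" + "GCGC" + "AATAAA" + "TTTTGTTT" + "AACA"
features = ltr_features(toy)
```

A real LTR is far more variable than this toy sequence, which is why the work relies on probabilistic models rather than exact motif matching.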
Hidden Markov models (HMMs) were also created for 14 subgroups of LTRs. The HMMs yielded consensus sequences, which were aligned to a "Superviterbi" alignment. The Superviterbi alignment yielded a phylogenetic tree which was consistent with a tree based on an alignment of concatenated RT, RNase H and INT proteins. In particular, it gave further support for the monophyly of retroviral LTRs.
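For readers unfamiliar with Viterbi decoding, the idea of finding the most probable state path through an HMM can be sketched with a toy model. The example below is an assumption-laden illustration, not the profile HMMs used in this work: it has only two hypothetical states ("AT" for AT-rich, "BG" for background), and all probabilities are invented for the demonstration.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for 'obs' (log-space Viterbi)."""
    # Scores for the first observation
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            col[s] = scores[-1][prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][o])
            ptr[s] = prev
        scores.append(col)
        back.append(ptr)
    # Trace back from the best final state
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Invented toy parameters (assumptions, not estimated from any data)
states = ["AT", "BG"]
start_p = {"AT": 0.5, "BG": 0.5}
trans_p = {"AT": {"AT": 0.9, "BG": 0.1}, "BG": {"AT": 0.1, "BG": 0.9}}
emit_p = {
    "AT": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "BG": {"A": 0.25, "T": 0.25, "C": 0.25, "G": 0.25},
}

seq = "GCGCGCGCGC" + "ATATATATATATATAT" + "GCGCGCGCGC"
path = viterbi(seq, states, start_p, trans_p, emit_p)
```

A profile HMM of the kind described here additionally has match, insert and delete states per alignment column, so its Viterbi path doubles as an alignment of the sequence to the model; the toy above only labels positions.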
The phylogenetic reconstruction now allows inferences regarding the origin of LTR retrotransposons.
Long terminal repeats, hidden Markov models, phylogeny, alignment, detection
Microbiology in the medical area
Research subject Clinical Virology
Identifiers
URN: urn:nbn:se:uu:diva-119977
OAI: oai:DiVA.org:uu-119977
DiVA: diva2:302046