Uppsala University Publications
Universal structure and phylogeny of Long Terminal Repeats (LTRs)
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Virology.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Virology.
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Long terminal repeats (LTRs) are important sequence elements of retroviruses and related retrotransposons. They are, however, difficult to analyse because of their variability.

The aim of this work was to construct models of LTRs from all known groups of LTR retrotransposons and retroviruses, representative of the entire LTR diversity, making it possible for the first time to comprehensively study their phylogeny and to compare it to phylogenies of other retrotransposon genes.

A general hidden Markov model (HMM) describing all LTRs was built. Its associated Viterbi alignment showed a consistent basic structure: inverted repeats starting with TGTT at the 5′ end and ending with AACA at the 3′ end, plus two conserved AT-rich areas, the first often containing the TATA box and the second containing the polyadenylation signal AATAAA. A less conserved AT-rich stretch was apparent in the likely U3 portion. R was harder to delineate. The polyadenylation signal was followed by a T-rich area characteristic of U5. The modular LTR structure previously reported by us, with modules separated by clusters of insert states, was also observed in this pan-LTR setting. The result attests to the highly conserved basic structure of LTRs, which must date back more than a billion years.
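The structural landmarks named above (TGTT at the 5′ end, AACA at the 3′ end, a TATA box, and the AATAAA polyadenylation signal) can be illustrated with a simple motif scan. The following Python sketch is only an illustration and is not the HMM-based method of the manuscript: the function name `scan_ltr_features`, the simplified TATA pattern `TATA[AT]A`, and the toy sequence are all assumptions made for this example.

```python
import re

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_ltr_features(seq: str) -> dict:
    """Report the landmarks described in the abstract for one sequence.

    Hypothetical helper: exact string matching only, no HMM involved.
    """
    seq = seq.upper()
    five_prime, three_prime = seq[:4], seq[-4:]
    return {
        "starts_with_TGTT": five_prime == "TGTT",
        "ends_with_AACA": three_prime == "AACA",
        # The termini are inverted repeats if one is the reverse
        # complement of the other (TGTT <-> AACA).
        "termini_inverted": reverse_complement(five_prime) == three_prime,
        # Lookahead patterns so that overlapping hits are also reported.
        "polya_signal_positions": [
            m.start() for m in re.finditer(r"(?=AATAAA)", seq)
        ],
        # Simplified TATA-box pattern (an assumption of this sketch).
        "tata_like_positions": [
            m.start() for m in re.finditer(r"(?=TATA[AT]A)", seq)
        ],
    }


# Invented toy sequence, not a real LTR.
toy = "TGTTGGC" + "TATAAA" + "GCGCGC" + "AATAAA" + "TTTTGTTT" + "GGAACA"
features = scan_ltr_features(toy)
print(features)
```

Exact matching of this kind fails on diverged sequences, which is the variability problem the abstract refers to and the reason probabilistic models are used instead.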

Hidden Markov models (HMMs) were also created for 14 subgroups of LTRs. The HMMs yielded consensus sequences, which were aligned into a "Superviterbi" alignment. The Superviterbi alignment yielded a phylogenetic tree that was consistent with a tree based on an alignment of concatenated RT, RNase H and INT proteins. In particular, it gave further support for the monophyly of retroviral LTRs.

The phylogenetic reconstruction now allows inferences regarding the origin of LTR retrotransposons.

Keyword [en]
Long terminal repeats, hidden Markov models, phylogeny, alignment, detection
National Category
Microbiology in the medical area
Research subject
Clinical Virology
Identifiers
URN: urn:nbn:se:uu:diva-119977; OAI: oai:DiVA.org:uu-119977; DiVA: diva2:302046
Available from: 2010-03-04 Created: 2010-03-04 Last updated: 2010-03-04
In thesis
1. Retroviral long Terminal Repeats; Structure, Detection and Phylogeny
2010 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Long terminal repeats (LTRs) are non-coding repeats flanking the protein-coding genes of LTR retrotransposons. The variability of LTRs poses a challenge in studying them. Hidden Markov models (HMMs), probabilistic models widely used in pattern recognition, are useful in dealing with this variability. The aim of this work was mainly to study LTRs of retroviruses and LTR retrotransposons using HMMs.
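To make concrete how an HMM copes with variable sequence, the sketch below runs Viterbi decoding on a toy two-state model that labels each base as `background` or `at_rich`. It is a minimal illustration under stated assumptions, not the profile HMMs built in the thesis: the state names, all probabilities, and the example sequence are invented for this example.

```python
import math

# Toy two-state HMM; every number here is an illustrative assumption.
STATES = ("background", "at_rich")
START = {"background": 0.5, "at_rich": 0.5}
TRANS = {
    "background": {"background": 0.9, "at_rich": 0.1},
    "at_rich": {"background": 0.1, "at_rich": 0.9},
}
EMIT = {
    "background": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq: str) -> list:
    """Return the most probable state path for seq (log-space Viterbi)."""
    # scores[s]: best log-probability of any path that ends in state s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES
    }
    backpointers = []
    for base in seq[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in s at this position.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][base])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Invented sequence: GC-only flanks around an AT-only core.
path = viterbi("GCGCGCGC" + "AATAAATATTTA" + "GCGCGCGC")
print(path)
```

Because the model scores probabilities rather than exact strings, a few substitutions inside the AT-rich core would lower the path score without necessarily changing the labelling.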

Paper I describes the methodology of HMM modelling applied to different groups of LTRs from exogenous retroviruses (XRVs) and endogenous retroviruses (ERVs). The detection capabilities of HMMs were assessed and found to be high for homogeneous groups of LTRs. The alignments generated by the HMMs displayed conserved motifs, some of which could be related to known functions of XRVs. The common features of the different groups of retroviral LTRs were investigated by combining them into a single alignment. These features were the short inverted terminal repeats TG and CA and three AT-rich stretches, which provide retroviruses with TATA boxes and AATAAA polyadenylation signals.

In Paper II, phylogenetic trees of three groups of retroviral LTRs were constructed using HMM-based alignments. The LTR trees were consistent with trees based on other retroviral genes, suggesting co-evolution between LTRs and these genes.

In Paper III, the methods of Papers I and II were extended to LTRs from other retrotransposon groups, covering much of the diversity of all known LTRs. For the first time, an LTR phylogeny could be achieved. There was no major disagreement between the LTR tree and trees based on three different domains of the Pol gene. The conserved LTR structure of Paper I was found to apply to all LTRs. Putative integrase recognition motifs extended up to 12 bp beyond the short inverted repeats TG/CA.

Paper IV is a review article describing the use of sequence similarity and structural markers for the taxonomy of ERVs. ERVs were originally classified into three classes according to the length of the target site duplication. While this classification is useful, it does not include all ERVs. A naming convention based on previous ERV and XRV nomenclature, but taking newer information into account, is advocated in order to provide a practical yet coherent scheme for dealing with new unclassified ERV sequences.

Paper V gives an overview of bioinformatics tools for studies of ERVs and of retroviral evolution before and after endogenization. It gives some examples of recent integrations in vertebrate genomes and discusses pathogenicity of human ERVs including their possible relation to cancers.

In conclusion, HMMs were able to detect and align LTRs successfully. Progress was made in understanding their conserved structure and phylogeny. The methods developed in this thesis could be applied to other kinds of non-coding DNA sequence elements.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2010. viii, 26 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine, ISSN 1651-6206 ; 531
Keyword
Retrovirus, long terminal repeats, hidden Markov models, phylogeny, alignment, conserved motif, stem-loop
National Category
Microbiology in the medical area
Research subject
Clinical Virology
Identifiers
urn:nbn:se:uu:diva-120028 (URN), 978-91-554-7740-0 (ISBN)
Public defence
2010-04-16, Hörsalen, mikrobiologen, Dag Hammarskjölds väg 17, 75185 Uppsala, Uppsala, 09:00 (English)
Available from: 2010-03-24 Created: 2010-03-04 Last updated: 2010-08-16 Bibliographically approved
