uu.seUppsala University Publications
An automatable method for high throughput analysis of evolutionary patterns in slightly complex indels and its application to the deep phylogeny of Metazoa
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Systematic Biology.
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Systematic Biology.
2014 (English). Article in journal (Refereed). Submitted.
Abstract [en]

Insertions/deletions (indels) in protein sequences are potentially powerful evolutionary markers. However, these characters have rarely been explored systematically at deep phylogenetic levels. Previous analyses of simple (2-state) clade defining indels (CDIs) in universal eukaryotic proteins found none supporting any major animal clade. We hypothesized that CDIs might still be found in the remaining population of indels, which we term complex indels. Here, we propose a method for analyzing the simplest class of complex indels, the “slightly complex indels”, and use these to investigate deep branches in animal phylogeny. Complex indels with two states, called bi-state indels, show evolutionary patterns similar to those of singleton simple indels and confirm that insertion mutations are more common than deletions. Exploration of CDIs in 2- to 9-state complex indels shows strong support for all examined branches of Fungi and Archaeplastida. Surprisingly, we also found CDIs supporting major branches in animals, particularly in vertebrates. We then expanded the search to non-bilaterian animals (Porifera, Cnidaria and Ctenophora). The phylogenetic tree reconstructed from CDIs places the ctenophore Mnemiopsis leidyi as the deepest branch of animals, supported by 6 CDIs. Trichoplax adhaerens is closely related to the Bilateria. Moreover, in the indel phylogeny Nematostella vectensis and Hydra magnipapillata form a paraphyletic group, and the position of the cnidarian branches appears problematic because of homoplasy. This might be resolved by discovering CDIs in animal-specific proteins, which emerged after the universal orthologous proteins.

Evolutionary Patterns in Slightly Complex Protein Insertions/Deletions (Indels) and Their Application to the Study of Deep Phylogeny in Metazoa

Place, publisher, year, edition, pages
2014.
National Category
Other Biological Topics
Identifiers
URN: urn:nbn:se:uu:diva-216842
OAI: oai:DiVA.org:uu-216842
DiVA: diva2:691069
Available from: 2014-01-27 Created: 2014-01-27 Last updated: 2014-04-17 (Bibliographically approved)
In thesis
1. Mine the Gaps: Evolution of Eukaryotic Protein Indels and their Application for Testing Deep Phylogeny
2014 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Insertions/deletions (indels) are potentially powerful evolutionary markers, but little is known about their evolution and few tools exist to effectively study them. To address this, I developed SeqFIRE, a tool for automated identification and extraction of indels from protein multiple sequence alignments. The program also extracts conserved alignment blocks, thus covering all major steps in preparing multiple sequence alignments for phylogenetic analysis.
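
As a rough illustration of what locating indels in a protein multiple sequence alignment involves, the following minimal sketch scans a toy alignment for contiguous runs of columns containing at least one gap. It is not SeqFIRE's algorithm, which is not described in this record; the function name `find_indel_regions`, the taxon labels, and the toy sequences are all hypothetical.

```python
def find_indel_regions(alignment):
    """Return half-open (start, end) column ranges in which at least
    one aligned sequence contains a gap character ('-').

    `alignment` maps a taxon label to its aligned sequence string.
    This is an illustrative assumption, not SeqFIRE's actual method.
    """
    if not alignment:
        return []
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    regions = []
    start = None
    for col in range(length):
        has_gap = any(seq[col] == "-" for seq in sequences)
        if has_gap and start is None:
            start = col                    # a gapped run begins here
        elif not has_gap and start is not None:
            regions.append((start, col))   # the run ended at the previous column
            start = None
    if start is not None:                  # run extends to the alignment end
        regions.append((start, length))
    return regions


# Toy alignment (made-up sequences, for illustration only).
toy_alignment = {
    "taxon_A": "MKT--LLVAG",
    "taxon_B": "MKTAALLVAG",
    "taxon_C": "MKT--LLV-G",
}
regions = find_indel_regions(toy_alignment)
print(regions)
```

A real tool would also need criteria for conserved flanking regions and for distinguishing true indels from alignment noise; those are deliberately left out here.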

I then used SeqFIRE to build an indel database, using 299 single copy proteins from a broad taxonomic sampling of mainly multicellular eukaryotes. A total of 4,707 indels were extracted, of which 901 are simple (one genetic event) and 3,806 are complex (multiple events). The most abundant indels are single amino acid simple indels. Indel frequency decreases exponentially with length and shows a linear relationship with host protein size. Singleton indels reveal a strong bias towards insertions (on average 2.31 times as frequent as deletions). These analyses also identify 43 indels marking major clades in Plantae and Fungi (clade defining indels or CDIs), but none for Metazoa.

To study the 3,806 complex indels, I first classified them by number of states. Analysis of the 2-state complex and simple indels combined (“bi-state indels”) confirms that insertions are over 2.5 times as frequent as deletions. Three-quarters of the complex indels had three to nine states (“slightly complex indels”). I developed a tree-assisted search method, which allowed me to identify 1,010 potential CDIs supporting all examined major branches of Plantae and Fungi.
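
The classification by number of states can be sketched as follows. This is a minimal illustration under a stated assumption: a "state" is taken to be a distinct length variant (number of residues) observed in the indel region, which this record does not confirm. The function names, the "more complex" label, and the toy sequences are hypothetical; only the "bi-state" and "slightly complex" (three to nine states) categories come from the abstract.

```python
def count_states(alignment, start, end):
    """Count distinct length variants within columns [start, end).

    Assumption: one state = one distinct number of non-gap residues.
    """
    lengths = {
        len(seq[start:end].replace("-", ""))
        for seq in alignment.values()
    }
    return len(lengths)


def classify_indel(n_states):
    """Map a state count to the categories named in the abstract."""
    if n_states < 2:
        return "no indel"
    if n_states == 2:
        return "bi-state"
    if n_states <= 9:
        return "slightly complex"
    return "more complex"  # hypothetical label for more than nine states


# Toy alignment (made-up sequences); the indel region spans columns 3 to 6.
toy = {
    "taxon_A": "MKT----LLV",
    "taxon_B": "MKTAA--LLV",
    "taxon_C": "MKTAAGGLLV",
    "taxon_D": "MKT----LLV",
}
n = count_states(toy, 3, 7)
label = classify_indel(n)
print(n, label)
```

Here the region has lengths 0, 2 and 4 across the four toy taxa, so it counts as a three-state indel.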

Forty-two proteins were also found to host complex indel CDIs for the deepest branches of Metazoa. After expanding the taxon set for these proteins, I identified a total of 49 non-bilaterian-specific CDIs. Parsimony analysis of these indels places Ctenophora as the sister taxon to all other Metazoa, including Porifera. Six CDIs were also found placing Placozoa as sister to Bilateria. I conclude that slightly complex indels are a rich source of CDIs, and that my tree-assisted search strategy could be automated and implemented in the program SeqFIRE to facilitate their discovery. This will have important implications for mining the phylogenomic content of the vast resource of protist genome data soon to become available.
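
The idea of a clade defining indel can be made concrete with a small check: a state defines a clade if every sampled member of the clade shares it and no taxon outside the clade has it. The sketch below implements only that simplified criterion; it is not the tree-assisted search method described above, and the function name, taxon labels, and state values are hypothetical.

```python
def is_clade_defining(states, clade):
    """Return True if all sampled clade members share one indel state
    that is absent from every taxon outside the clade.

    `states` maps taxon label -> indel state (any hashable value).
    Simplified criterion for illustration; homoplasy and missing data
    are not handled.
    """
    clade = set(clade)
    inside = {state for taxon, state in states.items() if taxon in clade}
    outside = {state for taxon, state in states.items() if taxon not in clade}
    return len(inside) == 1 and inside.isdisjoint(outside)


# Toy states (made-up values): the length of the indel region per taxon.
toy_states = {
    "taxon_1": 4,
    "taxon_2": 4,
    "taxon_3": 0,
    "taxon_4": 2,
}
supported = is_clade_defining(toy_states, {"taxon_1", "taxon_2"})
unsupported = is_clade_defining(toy_states, {"taxon_2", "taxon_3"})
print(supported, unsupported)
```

Repeating such a check for every clade of a reference tree and every indel would give a crude version of a tree-guided search, although the actual strategy may differ substantially.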

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2014. 58 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1131
Keyword
indel, insertion/deletion, protein evolution, bioinformatics, non-bilateria, eukaryotes, phylogeny
National Category
Bioinformatics and Systems Biology; Biological Systematics
Research subject
Biology with specialization in Systematics; Biology with specialization in Molecular Evolution
Identifiers
URN: urn:nbn:se:uu:diva-220727
ISBN: 978-91-554-8904-5
Public defence
2014-05-07, Lindahlsalen, Norbyvägen 18, Uppsala, 10:00 (English)
Opponent
Supervisors
Available from: 2014-04-15 Created: 2014-03-19 Last updated: 2014-04-29 (Bibliographically approved)

Open Access in DiVA

No full text

Authority records BETA

Ajawatanawong, Pravech; Baldauf, Sandra
