Examining the Root of the Eukaryotic Tree of Life
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Biology Education Centre.
2017 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Identifying the evolutionary root of the eukaryotic tree of life (eToL) is a central problem in systematic biology that has been receiving growing attention. This task has been aided by the development of advanced phylogenetic methods and the availability of large amounts of genomic data from across the tree. Recently, two studies tried a novel approach to defining the eToL root, using euBacteria (instead of the more distantly related Archaea) as the outgroup. The two studies used partially overlapping datasets, yet they produced contradictory results. One study, using mixed eubacterial data (euBac), makes the case for a neozoan-excavate root, while the other, using alpha-proteobacterial (aP) data, supports the traditional unikont-bikont root. These two results imply different theories of early eukaryote evolution. However, there is also evidence of substantial artefacts in these datasets, as well as traces of horizontal gene transfer (HGT), the exchange of DNA between unrelated organisms. This project aims to re-examine the datasets of both publications (61 protein markers in total). The work started with updating both datasets with solid new phylogenomic data from the supervisor's lab and with new publicly available data. I then used these data to systematically investigate the phylogenetic signals of the 61 protein markers across 88 taxa (68 eukaryotes and 20 Bacteria). The markers were first subjected to preliminary phylogenetic analyses to sort orthologues from paralogues. All orthologues were then combined into a single dataset and subjected to in-depth phylogenetic analyses to evaluate the support for the various hypotheses. I also investigated potential sources of artefact in the data using both traditional methods and novel methods that I devised myself, including computer scripts written specifically for this work.
I created a pipeline for the data curation process to make it fast and efficient by automating various parts of the workflow, including concatenating the multigene dataset into a supermatrix. I estimated the level of incongruence in each dataset, excluded the protein markers that showed a strong phylogenetic bias, and reconstructed new datasets. I conclude that the data at hand (protein markers and taxa) contain conflicting and inconsistent phylogenetic signal, and that a few proteins can have a very strong effect on the results of the analyses. However, a third possible hypothesis is clearly rejected. This suggests that there are specific artefacts in the data, favouring one or the other of the two remaining hypotheses.
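
The thesis scripts are not reproduced in this record, so the following is only a minimal sketch of what the supermatrix concatenation step mentioned above might look like. It assumes each protein marker has already been aligned and is held as a mapping from taxon name to aligned sequence. The function name `concatenate_alignments`, the gap character used for missing taxa, and the toy sequences are all illustrative assumptions, not the author's actual implementation.

```python
from typing import Dict, List

# Hypothetical representation: one alignment per protein marker,
# mapping taxon name -> aligned sequence (all the same length).
Alignment = Dict[str, str]


def concatenate_alignments(alignments: List[Alignment], missing: str = "-") -> Dict[str, str]:
    """Join per-marker alignments into one supermatrix.

    Taxa absent from a marker are padded with `missing` characters so
    that every row of the supermatrix has the same total length.
    """
    # Union of all taxa seen in any marker, in a stable order.
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for index, aln in enumerate(alignments):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"marker {index} is empty or not a proper alignment")
        width = lengths.pop()
        for taxon in taxa:
            # Pad with gaps when this taxon lacks the marker.
            parts[taxon].append(aln.get(taxon, missing * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input (made-up sequences, for illustration only).
marker_a = {"Homo": "MKV", "Naegleria": "MRV"}
marker_b = {"Homo": "LL", "Rickettsia": "LI"}
supermatrix = concatenate_alignments([marker_a, marker_b])
for taxon, row in supermatrix.items():
    print(f"{taxon:<12}{row}")
```

Padding missing taxa with gaps keeps every row the same length, which is what most phylogenetic software expects from a concatenated matrix. A real pipeline would also record where each marker starts and ends, so that markers with a strong bias can later be excluded.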

Abstract [en]

The first attempt to portray the relatedness between species in the form of a tree was sketched by Charles Darwin in his revolutionary book ‘On the Origin of Species’ in 1859. From then until the discovery of the molecular structure of DNA and proteins, gauging the degree of kinship between species relied solely on morphological similarities defined by observable traits. A phylogenetic tree is a diagram describing a hypothetical course of evolution among different species or organisms. Each tip of the tree represents a species, and the branch lengths represent the degree of divergence between species. The branches are connected by internal nodes, each representing a speciation event at which two species, or groups of species, share a common ancestor. Phylogenetic analysis has developed substantially with the emergence of molecular data. The current classification of all living species into three main domains of life, Bacteria, Archaea, and Eukarya, is the result of contributions from many scientists. The eukaryotic domain constitutes most of the discovered diversity of life and includes animals, plants, fungi, and various other organisms. The evolutionary relationships among eukaryotic organisms are continuously being revised: more lineages are added, and several nodes of the tree are constantly being corrected. However, much uncertainty remains about the positions of ancient nodes, especially the location of the last common ancestor of eukaryotes. This speciation event dates back a couple of billion years, to the episode that gave rise to all eukaryotic life. As of now, the location of this root is considered an unresolved problem. We based our work on the conflicting data of two previous publications. We incorporated a large amount of proteomic data from a diversity of species and examined them carefully, attempting to discern the sources of conflict in locating this root.
Throughout the project, we present an exploration of the advanced methods used while pushing to resolve the problem. Using statistical approaches, we examined several theories and investigated their strengths and weaknesses. Our work converged back on the two main alternative placements for the root, and we conclude that further evidence is needed to accept or reject either of them.

Place, publisher, year, edition, pages
2017. p. 48
Keywords [en]
Eukaryotic root phylogenomics
National Category
Bioinformatics and Systems Biology; Biological Systematics
Identifiers
URN: urn:nbn:se:uu:diva-328303
OAI: oai:DiVA.org:uu-328303
DiVA, id: diva2:1134751
Educational program
Master Programme in Bioinformatics
Presentation
2017-08-18, C8:321, BMC, Uppsala, 10:00 (English)
Supervisors
Examiners
Available from: 2017-08-21. Created: 2017-08-21. Last updated: 2017-08-21. Bibliographically approved.
