Publications from Uppsala University (uu.se)
Conflict over the Eukaryote Root Resides in Strong Outliers, Mosaics and Missing Data Sensitivity of Site-Specific (CAT) Mixture Models
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Systematic Biology (Baldauf Lab). ORCID iD: 0000-0002-0868-0384
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Organismal Biology, Systematic Biology (Baldauf Lab). ORCID iD: 0000-0003-4485-6671
2022 (English)
In: Systematic Biology, ISSN 1063-5157, E-ISSN 1076-836X
Article in journal (Refereed) Epub ahead of print
Abstract [en]

Phylogenetic reconstruction using concatenated loci ("phylogenomics" or "supermatrix phylogeny") is a powerful tool for solving evolutionary splits that are poorly resolved in single gene/protein trees. However, recent phylogenomic attempts to resolve the eukaryote root have yielded conflicting results, along with claims of various artifacts hidden in the data. We have investigated these conflicts using two new methods for assessing phylogenetic conflict. ConJak uses whole marker (gene or protein) jackknifing to assess deviation from a central mean for each individual sequence, whereas ConWin uses a sliding window to screen for incongruent protein fragments (mosaics). Both methods allow selective masking of individual sequences or sequence fragments in order to minimize missing data, an important consideration for resolving deep splits with limited data. Analyses focused on a set of 76 eukaryotic proteins of bacterial ancestry previously used in various combinations to assess the branching order among the three major divisions of eukaryotes: Amorphea (mainly animals, fungi, and Amoebozoa), Diaphoretickes (most other well-known eukaryotes and nearly all algae) and Excavata, represented here by Discoba (Jakobida, Heterolobosea, and Euglenozoa). ConJak analyses found strong outliers to be concentrated in undersampled lineages, whereas ConWin analyses of Discoba, the most undersampled of the major lineages, detected potentially incongruent fragments scattered throughout. Phylogenetic analyses of the full data using an LG-gamma model support a Discoba sister scenario (neozoan-excavate root), which rises to 99-100% bootstrap support with data masked according to either protocol. However, analyses with two site-specific (CAT) mixture models yielded widely inconsistent results and a striking sensitivity to missing data. 
The neozoan-excavate root places Amorphea and Diaphoretickes as more closely related to each other than either is to Discoba, a fundamental relationship that should remain unaffected by additional taxa.
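
The jackknife idea behind ConJak (scoring each marker against a central mean computed from the others) can be illustrated with a small sketch. This is not the authors' implementation: the per-marker statistic, the function name `jackknife_outliers`, and the z-score threshold are all hypothetical, and the real protocol works on sequence data and trees rather than on a ready-made table of numbers.

```python
from statistics import mean, stdev

def jackknife_outliers(scores, threshold=2.0):
    """Flag markers whose score deviates strongly from the other markers.

    `scores` maps marker name -> a per-marker statistic for one taxon
    (hypothetical; for example, some distance measure). Each marker is
    left out in turn and compared with the mean and standard deviation
    of the remaining markers (leave-one-out jackknife).
    """
    flagged = []
    markers = list(scores)
    for left_out in markers:
        rest = [scores[m] for m in markers if m != left_out]
        if len(rest) < 2:
            continue  # stdev needs at least two values
        centre = mean(rest)
        spread = stdev(rest)
        if spread == 0:
            continue  # all other markers identical; z-score undefined
        z = (scores[left_out] - centre) / spread
        if abs(z) > threshold:
            flagged.append((left_out, z))
    return flagged

# Hypothetical scores for one taxon across five markers.
example = {"m1": 1.0, "m2": 1.1, "m3": 0.9, "m4": 1.0, "m5": 5.0}
outliers = [name for name, _ in jackknife_outliers(example)]
print(outliers)
```

Sequences flagged this way could then be masked individually rather than discarding the whole marker, which is the point the abstract makes about minimizing missing data.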

Place, publisher, year, edition, pages
2022.
National Category
Biological Systematics
Identifiers
URN: urn:nbn:se:uu:diva-484535
DOI: 10.1093/sysbio/syac029
ISI: 000804041500001
PubMedID: 35412616
OAI: oai:DiVA.org:uu-484535
DiVA, id: diva2:1695381
Funder
Swedish Research Council, 2017-04351
Available from: 2022-09-13 Created: 2022-09-13 Last updated: 2022-10-18
In thesis
1. Resolving deep nodes of eukaryote phylogeny
2022 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

My thesis aims to resolve deep nodes in the eukaryote tree of life (eToL) by developing new data sets and new approaches to analysing them. In Paper I, I describe a dataset of 76 universal eukaryotic proteins of bacterial descent (euBacs), assembled to test the relationships among the three main divisions of mitochondriate eukaryotes (Amorphea, Diaphoretickes and Discoba). I developed two protocols to identify problematic data. The ConJak protocol analyzes data by jackknifing to detect outlier sequences, while ConWin uses a sliding window to find sequence fragments of potentially foreign origin. Phylogenetic analyses of the 76 euBacs, with and without ConWin or ConJak filtering, place Discoba as the sister group to Amorphea and Diaphoretickes. The results are largely consistent and highly supported under various evolutionary models, except for highly complex CAT models. In Paper II, I describe a dataset of 198 universal eukaryote proteins of archaeal ancestry (euArcs), which includes the remaining eukaryotes, informally referred to as amitochondriate excavates. These were excluded from the previous study because they lack euBacs. Phylogenetic analyses of the euArc dataset place the amitochondriate excavates as the first three branches of eToL, followed by Discoba, the only mitochondriate excavates, which appear as the sister group to the remaining eukaryotes. I also developed a protocol that uses predicted protein structures to improve model fit without inflating the parameter space, allowing me to conduct a series of control analyses that further support the multi-excavate root. In Paper III, I describe a new application of reciprocal rooting using concatenated sequences, which I then use to test the euArc root. I also developed two sampling protocols unique to this kind of data. The protocols confirm the multi-excavate euArc root, which indicates that eukaryotes arose from an excavate ancestor. Paper IV describes a follow-up on the ConWin results from Paper I. These show moderate to strong support for mosaicism in 16 euBac proteins from diverse metabolic pathways and donor lineages. In summary, this thesis presents a novel root for the eukaryote tree of life. The new root requires revision of fundamental theories of eukaryote evolution, including the source and timing of mitochondrial origins. The methods I have developed are applicable to many different kinds of phylogenetic studies, and the new protein structure model should make these analyses faster, more flexible, and more widely available.
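
The sliding-window screening described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis protocol: the per-site scores, the window size, the cutoff, and the function names `flag_fragments` and `mask_fragments` are hypothetical. The sketch assumes each alignment site already carries a numeric score (positive when the site agrees with the expected placement, negative when it conflicts), flags windows whose mean score falls below a cutoff, merges overlapping windows into fragments, and masks only those fragments.

```python
def flag_fragments(site_scores, window=4, step=1, cutoff=0.0):
    """Return merged half-open (start, end) ranges of low-scoring windows."""
    fragments = []
    for start in range(0, len(site_scores) - window + 1, step):
        end = start + window
        window_mean = sum(site_scores[start:end]) / window
        if window_mean < cutoff:
            if fragments and start <= fragments[-1][1]:
                # Overlaps or touches the previous fragment: extend it.
                fragments[-1] = (fragments[-1][0], end)
            else:
                fragments.append((start, end))
    return fragments

def mask_fragments(sequence, fragments, gap="-"):
    """Replace flagged positions with a gap character, keeping the rest."""
    chars = list(sequence)
    for start, end in fragments:
        for i in range(start, end):
            chars[i] = gap
    return "".join(chars)

# Hypothetical per-site scores with a conflicting stretch in the middle.
scores = [1, 1, 1, 1, -1, -1, -1, -1, 1, 1, 1, 1]
fragments = flag_fragments(scores, window=4, cutoff=0.0)
masked = mask_fragments("ABCDEFGHIJKL", fragments)
print(fragments, masked)
```

Masking only the flagged fragment keeps the rest of the protein in the alignment. Note that a window-based flag also covers a few flanking sites on each side of the conflicting stretch, so the window size trades sensitivity against how much data is masked.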

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2022. p. 53
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2191
Keywords
Eukaryote Tree of Life, Excavata, phylogenetics, phylogenomics, Mitochondria
National Category
Biological Systematics
Research subject
Biology with specialization in Systematics
Identifiers
urn:nbn:se:uu:diva-484580 (URN)
978-91-513-1599-7 (ISBN)
Public defence
2022-11-02, Lindahlsalen, Evolutionsbiologiskt centrum, Norbyv. 18D, Uppsala, 13:15 (English)
Funder
Swedish Research Council, 2017-04351
Available from: 2022-10-10 Created: 2022-09-13 Last updated: 2022-10-11

Authority records

Al Jewari, Caesar
Baldauf, Sandra L.
