uu.seUppsala University Publications
Genetic algorithm for large-scale maximum parsimony phylogenetic analysis of proteins
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Neuroscience.
2005 (English) In: Biochimica et Biophysica Acta - General Subjects, ISSN 0304-4165, E-ISSN 1872-8006, Vol. 1725, no 1, 19-29 p. Article in journal (Refereed) Published
Abstract [en]

Inferring phylogeny is a difficult computational problem. For example, for only 13 taxa, there are more than 13 billion possible unrooted phylogenetic trees. Heuristics are necessary to minimize the time spent evaluating non-optimal trees. We describe here an approach for heuristic searching, using a genetic algorithm, that can reduce the time required for weighted maximum parsimony phylogenetic inference, especially for data sets involving a large number of taxa. It is the first implementation of a weighted maximum parsimony criterion using amino acid sequences. To validate the weighted criterion, we used an artificial data set and compared it to a number of other phylogenetic methods. Genetic algorithms mimic the ability of natural selection to solve complex problems. We have identified several parameters affecting the genetic algorithm. Methods were developed to validate these parameters, ensuring optimal performance. This approach allows the construction of phylogenetic trees with over 200 taxa in practical time on a regular PC.
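
The figure quoted for 13 taxa follows from the standard count of unrooted, fully bifurcating trees on n labelled taxa, (2n - 5)!!. The short sketch below reproduces it; the function name is illustrative and is not taken from the paper.

```python
def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted binary trees on n labelled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


print(count_unrooted_trees(13))  # 13749310575, i.e. more than 13 billion
```

Because the count grows super-exponentially with the number of taxa, exhaustive evaluation is out of reach well before data sets approach 200 taxa, which is why a heuristic search is used.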

Place, publisher, year, edition, pages
2005. Vol. 1725, no 1, 19-29 p.
National Category
Medical and Health Sciences
Identifiers
URN: urn:nbn:se:uu:diva-95458
DOI: 10.1016/j.bbagen.2005.04.027
OAI: oai:DiVA.org:uu-95458
DiVA: diva2:169674
Available from: 2007-02-16 Created: 2007-02-16 Last updated: 2014-01-24 Bibliographically approved
In thesis
1. Development of New Methods for Inferring and Evaluating Phylogenetic Trees
2007 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Inferring phylogeny is a difficult computational problem. Heuristics are necessary to minimize the time spent evaluating non-optimal trees. In paper I, we developed an approach for heuristic searching, using a genetic algorithm. Genetic algorithms mimic the ability of natural selection to solve complex problems. The algorithm can reduce the time required for weighted maximum parsimony phylogenetic inference using protein sequences, especially for data sets involving a large number of taxa.
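
To make the weighted maximum parsimony criterion concrete, the sketch below scores one alignment column on one fixed tree with the textbook Sankoff dynamic programme. It is not the implementation from paper I: the three-residue alphabet, the substitution costs and the example tree are all hypothetical values chosen for illustration, and a heuristic search such as a genetic algorithm would call a scoring function of this kind on many candidate trees.

```python
from math import inf


def sankoff(tree, states, cost):
    """Return {state: minimum cost of the subtree if its root has that state}.

    A tree is either a leaf (a one-letter string) or a pair (left, right).
    """
    if isinstance(tree, str):
        return {s: (0 if s == tree else inf) for s in states}
    left, right = (sankoff(child, states, cost) for child in tree)
    return {
        s: min(cost[s][t] + left[t] for t in states)
        + min(cost[s][t] + right[t] for t in states)
        for s in states
    }


def parsimony_score(tree, states, cost):
    """Weighted parsimony score of one site: best cost over root states."""
    return min(sankoff(tree, states, cost).values())


# Hypothetical symmetric costs: I <-> L is cheap, changes to or from K are expensive.
STATES = "ILK"
WEIGHTED = {
    "I": {"I": 0, "L": 1, "K": 3},
    "L": {"I": 1, "L": 0, "K": 3},
    "K": {"I": 3, "L": 3, "K": 0},
}
# Unit costs give ordinary (unweighted) parsimony for comparison.
UNIT = {a: {b: int(a != b) for b in STATES} for a in STATES}

site_tree = (("I", "L"), ("K", "K"))  # residues observed at the four leaves
print(parsimony_score(site_tree, STATES, WEIGHTED))  # 4
print(parsimony_score(site_tree, STATES, UNIT))      # 2
```

The score of a full alignment is the sum of the per-site scores, so the tree that minimizes that sum is the one the search tries to find.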

Evaluating and comparing the ability of phylogenetic methods to infer the correct topology is complex. In paper II, we developed software that determines the minimum subtree prune and regraft (SPR) distance between binary trees to ease the process. The minimum SPR distance can be used to measure the incongruence between trees inferred using different methods. Given a known topology the methods could be evaluated on their ability to infer the correct phylogeny given specific data.
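
A single subtree prune and regraft operation can be sketched as follows. This only illustrates one SPR move on a rooted binary tree; it does not compute the minimum SPR distance and is not the software from paper II. The nested-tuple representation and the function names are assumptions made for this example, leaf labels are assumed to be unique, and the attachment point is assumed to lie outside the pruned subtree.

```python
def prune(tree, target):
    """Remove subtree `target` and collapse the node that was its parent."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if left == target:
        return right
    if right == target:
        return left
    return (prune(left, target), prune(right, target))


def regraft(tree, attach_at, subtree):
    """Insert `subtree` on the edge leading to the node `attach_at`."""
    if tree == attach_at:
        return (tree, subtree)
    if isinstance(tree, str):
        return tree
    left, right = tree
    return (regraft(left, attach_at, subtree), regraft(right, attach_at, subtree))


def spr(tree, pruned, attach_at):
    """One SPR move: cut out `pruned`, then reattach it next to `attach_at`."""
    return regraft(prune(tree, pruned), attach_at, pruned)


start = ((("A", "B"), "C"), ("D", "E"))
moved = spr(start, "A", "D")
print(moved)  # (('B', 'C'), (('D', 'A'), 'E'))
```

The minimum SPR distance between two trees is the smallest number of such moves needed to turn one into the other.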

The minimum SPR software also reports the intermediate trees that separate two binary trees. In paper III, we developed software that, given a set of incongruent trees, determines the median SPR consensus tree, i.e. the tree that explains the input trees with a minimum of SPR operations. We investigated the median SPR consensus tree and its possible interpretation as a species tree given a set of gene trees. We used a set of α-proteobacteria gene trees to test the ability of the algorithm to infer a species tree and compared it to previous studies. The results show that the algorithm can successfully reconstruct a species tree.

Expressed sequence tag (EST) data is important in determining intron-exon boundaries, single nucleotide polymorphisms and the coding sequence of genes. In paper IV, we aligned ESTs to the genome to evaluate the quality of EST data. The results show that many ESTs are contaminated by vector sequences and low-quality regions. The reliability of EST data is largely determined by the clustering of the ESTs and the association of the clusters with the correct portion of the genome. We investigated the performance of EST clustering using the genome as a template compared to previously existing methods using pair-wise alignments. The results show that using the genome as guidance improves the resulting EST clusters with respect to the extent to which ESTs originating from the same transcriptional unit are separated into disjoint clusters.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2007. 39 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine, ISSN 1651-6206 ; 225
Keyword
Pharmacology, Evolution, Phylogeny, SPR, Genetic Algorithm, Tree metrics, Farmakologi
Identifiers
urn:nbn:se:uu:diva-7501 (URN)
978-91-554-6799-7 (ISBN)
Public defence
2007-03-09, B7:101a, BMC, Husargatan 3, Uppsala, 09:15
Available from: 2007-02-16 Created: 2007-02-16 Bibliographically approved

Authority records BETA

Schiöth, Helgi B.
