Combining Markers into Haplotypes Can Improve Population Structure Inference
Gattepaille, Lucie M.: Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology. Uppsala University, Science for Life Laboratory, SciLifeLab.
Jakobsson, Mattias: Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology. Uppsala University, Science for Life Laboratory, SciLifeLab.
2012 (English). In: Genetics, ISSN 0016-6731, E-ISSN 1943-2631, Vol. 190, no 1, p. 159-174. Article in journal (Refereed), Published
Abstract [en]

High-throughput genotyping and sequencing technologies can generate dense sets of genetic markers for large numbers of individuals. For most species, these data will contain many markers in linkage disequilibrium (LD). To utilize such data for population structure inference, we investigate the use of haplotypes constructed by combining the alleles at single-nucleotide polymorphisms (SNPs). We introduce a statistic derived from information theory, the gain of informativeness for assignment (GIA), which quantifies the additional information for assigning individuals to populations using haplotype data compared to using individual loci separately. Using a two-locus, two-allele model, we demonstrate that combining markers in linkage equilibrium into haplotypes always leads to non-positive GIA, suggesting that combining the two markers is not advantageous for ancestry inference. However, for loci in LD, GIA is often positive, suggesting that assignment can be improved by combining markers into haplotypes. Using GIA as a criterion for combining markers into haplotypes, we demonstrate for simulated data a significant improvement in assigning individuals to candidate populations. For the many cases that we investigate, incorrect assignment was reduced by between 26% and 97% using haplotype data. For empirical data from French and German individuals, the number of incorrectly assigned individuals can, for example, be reduced by 73% using haplotypes. Our results can be useful for challenging population structure and assignment problems, in particular for studies where large-scale population-genomic data are available.
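
The sign behaviour of GIA described in the abstract can be illustrated with a small sketch. The abstract does not give the formula, so the code below rests on two assumptions: that informativeness for assignment is the mutual information between population label and allele under equal population weights, and that GIA is the haplotype informativeness minus the sum of the two single-locus values. The function names (`informativeness`, `gia`, `haplotype_freqs`) and the example frequencies are hypothetical and are not taken from the article.

```python
import math


def informativeness(freqs):
    """Mutual information between population label and allele.

    freqs: one list of allele (or haplotype) frequencies per population,
    each summing to 1. Populations are weighted equally (an assumption).
    """
    k = len(freqs)
    total = 0.0
    for j in range(len(freqs[0])):
        mean_p = sum(pop[j] for pop in freqs) / k
        if mean_p > 0:
            total -= mean_p * math.log(mean_p)
        for pop in freqs:
            if pop[j] > 0:
                total += pop[j] * math.log(pop[j]) / k
    return total


def gia(hap_freqs):
    """Assumed GIA: haplotype informativeness minus the two single-locus values.

    hap_freqs: per population, frequencies of haplotypes [AB, Ab, aB, ab].
    """
    locus1 = [[h[0] + h[1], h[2] + h[3]] for h in hap_freqs]  # alleles A, a
    locus2 = [[h[0] + h[2], h[1] + h[3]] for h in hap_freqs]  # alleles B, b
    return (informativeness(hap_freqs)
            - informativeness(locus1)
            - informativeness(locus2))


def haplotype_freqs(p_a, p_b, d=0.0):
    """Two-locus, two-allele haplotype frequencies with LD coefficient d."""
    return [p_a * p_b + d,
            p_a * (1 - p_b) - d,
            (1 - p_a) * p_b - d,
            (1 - p_a) * (1 - p_b) + d]


# Linkage equilibrium within each population (d = 0): GIA is not positive.
le_case = [haplotype_freqs(0.7, 0.6), haplotype_freqs(0.3, 0.4)]

# Same allele frequencies in both populations but opposite LD: each locus
# alone carries no information, while the haplotype does, so GIA is positive.
ld_case = [haplotype_freqs(0.5, 0.5, 0.2), haplotype_freqs(0.5, 0.5, -0.2)]

print("GIA under linkage equilibrium:", gia(le_case))
print("GIA under opposite LD:", gia(ld_case))
```

Under these assumptions, the first case prints a negative value and the second a positive one, which matches the qualitative statements in the abstract. The numbers themselves depend on the illustrative frequencies chosen here.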

National Category
Biological Sciences
URN: urn:nbn:se:uu:diva-168746; DOI: 10.1534/genetics.111.131136; ISI: 000299166800011; OAI: oai:DiVA.org:uu-168746; DiVA: diva2:503677
Available from: 2012-02-16 Created: 2012-02-15 Last updated: 2015-10-01. Bibliographically approved
In thesis
1. Population Genetic Methods and Applications to Human Genomes
2015 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Population genetics has led to countless fruitful studies of evolution, owing to its ability to predict and describe the most important evolutionary processes, such as mutation, genetic drift and selection. The field is still growing today, with new methods and models being developed to answer questions of evolutionary relevance and to lift the veil on the past of all life forms. In this thesis, I present a modest contribution to the growth of population genetics. I investigate different questions related to the dynamics of populations, with particular focus on studying human evolution. I derive an upper bound and a lower bound for FST, a classical measure of population differentiation, as functions of the homozygosity in each of the two studied populations, and apply the result to discuss observed differentiation levels between human populations. I introduce a new criterion, the Gain of Informativeness for Assignment (GIA), to help decide whether two genetic markers should be combined into a haplotype marker in order to improve the assignment of individuals to a panel of reference populations. Applying the method to SNP data for French, German and Swiss individuals, I show how haplotypes can lead to better assignment results when their construction is guided by GIA. I also derive the population size over time as a function of the densities of cumulative coalescent times, show the robustness of this result to the number of loci as well as the sample size, and, together with a simple algorithm for gene-genealogy inference, apply the method to low-recombining regions of the human genome for four worldwide populations. I recover previously observed population size shapes and uncover an early divergence of the Yoruba population from the non-African populations, suggesting ancient population structure on the African continent prior to the Out-of-Africa event. Finally, I present a case study of human adaptation to an arsenic-rich environment.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2015. 63 p.
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1280
Keywords
Population genetics, Human evolution, Genetic diversity, Genetic differentiation, Adaptation, Population structure, Effective population size
National Category
Evolutionary Biology
Research subject
Biology with specialization in Evolutionary Genetics
urn:nbn:se:uu:diva-260998 (URN), 978-91-554-9319-6 (ISBN)
Public defence
2015-10-22, Lindahlsalen, Norbyvägen 18A, Uppsala, 13:15 (English)
Available from: 2015-09-29 Created: 2015-08-27 Last updated: 2015-10-01

By author/editor
Gattepaille, Lucie M.; Jakobsson, Mattias