Uppsala University Publications (uu.se)
Evaluation of methods handling missing data in PCA on genotype data: Applications for ancient DNA
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science. ORCID iD: 0000-0002-6212-539x
2019 (English). Report (Other academic)
Place, publisher, year, edition, pages
2019.
Series
Technical report / Department of Information Technology, Uppsala University, ISSN 1404-3203 ; 2019-009
National Category
Computational Mathematics; Genetics
Identifiers
URN: urn:nbn:se:uu:diva-396346
OAI: oai:DiVA.org:uu-396346
DiVA, id: diva2:1367445
Projects
eSSENCE
Available from: 2019-11-04 Created: 2019-11-04 Last updated: 2019-11-11 Bibliographically approved
In thesis
1. Efficient computational methods for applications in genomics
2019 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

During the last two decades, advances in molecular technology have facilitated the sequencing and analysis of ancient DNA recovered from archaeological finds, contributing to novel insights into human evolutionary history. As more ancient genetic information has become available, the need for specialized methods of analysis has also increased. In this thesis, we investigate statistical and computational models for analysis of genetic data, with a particular focus on the context of ancient DNA.

The main focus is on imputation, or the inference of missing genotypes based on observed sequence data. We present results from a systematic evaluation of a common imputation pipeline on empirical ancient samples, and show that imputed data can constitute a realistic option for population-genetic analyses. We also discuss preliminary results from a simulation study comparing two methods of phasing and imputation, which suggest that the parametric Li and Stephens framework may be more robust to extremely low levels of sparsity than the parsimonious Browning and Browning model.

An evaluation of methods to handle missing data in the application of PCA for dimensionality reduction of genotype data is also presented. We illustrate that non-overlapping sequence data can lead to artifacts in projected scores, and evaluate different methods for handling unobserved genotypes.
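
As a rough illustration of the kind of problem described above, the following sketch projects a sample with missing genotypes onto principal axes fitted from a complete reference panel. It is not the report's code or its evaluated methods: the function names, the simulated allele frequencies, and the two handling strategies shown (filling missing genotypes with the per-SNP mean, and a least-squares fit restricted to observed SNPs) are assumptions chosen for illustration. Mean filling sets the centered value of every missing site to zero, so a sparsely observed sample will typically be pulled toward the origin of the score plot.

```python
import numpy as np

def fit_pca(G, k=2):
    """Fit PCA on a complete genotype matrix G (samples x SNPs, values 0/1/2)."""
    mean = G.mean(axis=0)
    # SVD of the centered matrix; rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(G - mean, full_matrices=False)
    return mean, Vt[:k].T  # loadings have shape (SNPs, k)

def project_mean_imputed(g, mean, loadings):
    """Project one sample, replacing missing genotypes (NaN) with the SNP mean."""
    filled = np.where(np.isnan(g), mean, g)
    return (filled - mean) @ loadings

def project_observed_only(g, mean, loadings):
    """Least-squares projection that uses only the observed SNPs."""
    obs = ~np.isnan(g)
    scores, *_ = np.linalg.lstsq(loadings[obs], g[obs] - mean[obs], rcond=None)
    return scores

# Hypothetical simulated data: two populations with differing allele frequencies.
rng = np.random.default_rng(0)
n_snps = 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_snps), 0.05, 0.95)
panel = np.vstack([
    rng.binomial(2, freq_a, (20, n_snps)),
    rng.binomial(2, freq_b, (20, n_snps)),
]).astype(float)

mean, loadings = fit_pca(panel, k=2)

# A new sample from population A, then the same sample with ~90% of sites missing.
sample = rng.binomial(2, freq_a, n_snps).astype(float)
sparse = sample.copy()
sparse[rng.random(n_snps) < 0.9] = np.nan

print("complete sample:    ", (sample - mean) @ loadings)
print("mean-imputed sparse:", project_mean_imputed(sparse, mean, loadings))
print("observed-only fit:  ", project_observed_only(sparse, mean, loadings))
```

Comparing the three printed score vectors gives a feel for how the handling strategy alone can move a sample in the projected space. With no missing genotypes the two strategies coincide with the ordinary projection, because the loadings have orthonormal columns.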

In genomics, as in other fields of research, increasing sizes of data sets are placing larger demands on efficient data management and compute infrastructures. The last part of this thesis addresses the use of cloud resources for facilitating such analysis. We present two different cloud-based solutions, and exemplify them on applications from genomics.

Place, publisher, year, edition, pages
Uppsala University, 2019
Series
Information technology licentiate theses: Licentiate theses from the Department of Information Technology, ISSN 1404-5117 ; 2019-006
National Category
Computational Mathematics; Genetics
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-396409 (URN)
Supervisors
Projects
eSSENCE
Available from: 2019-11-04 Created: 2019-11-04 Last updated: 2019-11-11 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

http://www.it.uu.se/research/publications/reports/2019-009/

Authority records BETA

Ausmees, Kristiina
