Evaluation of serverless computing for scalable execution of a joint variant calling workflow
Univ Washington, Dept Biol, Seattle, WA 98195 USA.
Univ Washington, Dept Biomed Informat & Med Educ, Seattle, WA 98195 USA.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. ORCID iD: 0000-0002-6212-539x
2021 (English). In: PLOS ONE, E-ISSN 1932-6203, Vol. 16, no 7, article id e0254363. Article in journal (Refereed), Published
Abstract [en]

Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70.

Place, publisher, year, edition, pages
Public Library of Science (PLoS), 2021. Vol. 16, no 7, article id e0254363
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:uu:diva-452320
DOI: 10.1371/journal.pone.0254363
ISI: 000674301400079
PubMedID: 34242357
OAI: oai:DiVA.org:uu-452320
DiVA, id: diva2:1591258
Available from: 2021-09-06. Created: 2021-09-06. Last updated: 2024-01-15. Bibliographically approved

Open Access in DiVA

fulltext (886 kB), 297 downloads
File information
File name: FULLTEXT01.pdf
File size: 886 kB
Checksum (SHA-512): 52cce0675a4be058ef6d20caa59d15f65f9c8dbf2f95c962c6e29f95fc79b7b25971c42bda74c437a65d6178af9e1cf305ef4362181c1b7f1ebf1a59112fc512
Type: fulltext
Mimetype: application/pdf

Authority records

Ausmees, Kristiina
