Uppsala University Publications (uu.se)
On-demand virtual research environments using microservices
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science. ORCID iD: 0000-0002-4851-759x
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational Biology and Bioinformatics. ORCID iD: 0000-0002-2096-8102
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences. (Spjuth)
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational Biology and Bioinformatics. ORCID iD: 0000-0002-2187-5426
2019 (English). In: PeerJ Computer Science, ISSN 2376-5992, Vol. 5, article id e232. Article in journal (Refereed), Published
Place, publisher, year, edition, pages
2019. Vol. 5, article id e232
National Category
Computer Sciences; Bioinformatics and Systems Biology
Identifiers
URN: urn:nbn:se:uu:diva-390665
DOI: 10.7717/peerj-cs.232
ISI: 000496144800002
OAI: oai:DiVA.org:uu-390665
DiVA, id: diva2:1342440
Projects
eSSENCE
Available from: 2019-11-11 Created: 2019-08-13 Last updated: 2020-01-31 Bibliographically approved
In thesis
1. Enabling Scalable Data Analysis on Cloud Resources with Applications in Life Science
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over the past 20 years, the rise of high-throughput methods in life science has enabled research laboratories to produce massive datasets of biological interest. When dealing with this "data deluge" of modern biology, researchers encounter two major challenges: first, dealing with Big Data requires substantial technical skills; second, infrastructure procurement becomes difficult. In connection with this second challenge, the computing model and business trend originally popularized by Amazon under the name of cloud computing represents an interesting opportunity. Instead of requiring users to buy computing infrastructure upfront, cloud providers enable the allocation and release of virtual resources on demand. These resources are billed with a pay-per-use pricing model, and physical infrastructure management is delegated to the provider. In this thesis, we introduce a number of methods for running Big Data analyses of biological interest using cloud computing. Considerable effort went into enabling the application of trusted bioinformatics software to Big Data scenarios, as opposed to reimplementing the existing codebase. Further, we improve the accessibility of the technology with the aim of lowering the entry barrier for biologists. The thesis includes five papers. In Papers I and II, we explore the applicability of Apache Spark, one of the leading Big Data analytics platforms in cloud environments, to two drug-discovery use cases. In Paper III, we present a general method for running bioinformatics analyses on the cloud using a microservices-oriented architecture. In Paper IV, we introduce a method that combines microservices and Apache Spark with the aim of providing the best of both technologies. In Paper V, we discuss how to lower the entry barrier for the allocation of cloud research environments. We show that all of the developed methods scale well, and we provide high-level programming interfaces for improving accessibility. We have also made the developed software publicly available.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019. p. 71
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1846
Keywords
cloud computing, bioinformatics, Big Data, microservices, containers, MapReduce
National Category
Computational Mathematics
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-390666 (URN)
978-91-513-0730-5 (ISBN)
Public defence
2019-10-10, B42, Uppsala Biomedicinska Centrum, Husargatan 3, Uppsala, 13:15 (English)
Available from: 2019-09-17 Created: 2019-08-22 Last updated: 2019-10-15

Authority records BETA

Capuccini, Marco; Larsson, Anders; Novella, Jon Ander; Toor, Salman; Spjuth, Ola
