Large-scale virtual screening on public cloud resources with Apache Spark
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science. ORCID iD: 0000-0002-4851-759x
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Pharmacy, Department of Pharmaceutical Biosciences. (Pharmaceutical Bioinformatics) ORCID iD: 0000-0001-6770-0878
2017 (English). In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 9, article id 15. Article in journal (Refereed). Published
Place, publisher, year, edition, pages
2017. Vol. 9, article id 15
National Category
Identifiers
URN: urn:nbn:se:uu:diva-318693
DOI: 10.1186/s13321-017-0204-4
ISI: 000396830300001
PubMedID: 28316653
OAI: oai:DiVA.org:uu-318693
DiVA, id: diva2:1085075
Projects
eSSENCE
Available from: 2017-03-06 Created: 2017-03-27 Last updated: 2022-05-10 Bibliographically approved
In thesis
1. Enabling Scalable Data Analysis on Cloud Resources with Applications in Life Science
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over the past 20 years, the rise of high-throughput methods in life science has enabled research laboratories to produce massive datasets of biological interest. When dealing with this "data deluge" of modern biology, researchers encounter two major challenges: first, dealing with Big Data requires substantial technical skills; second, infrastructure procurement becomes difficult. In connection with this second challenge, the computing model and business trend originally popularized by Amazon under the name of cloud computing represents an interesting opportunity. Instead of requiring users to buy computing infrastructure upfront, cloud providers enable the allocation and release of virtual resources on demand. These resources are then billed with a pay-per-use pricing model, and physical infrastructure management is delegated to the provider. In this thesis, we introduce a number of methods for running Big Data analyses of biological interest using cloud computing. Considerable effort was made to enable the application of trusted bioinformatics software to Big Data scenarios, as opposed to reimplementing the existing codebase. Further, we improve the accessibility of the technology with the aim of reducing the entry barrier for biologists. The thesis includes 5 papers. In Papers I and II, we explore the applicability of Apache Spark, one of the leading Big Data analytics platforms in cloud environments, to two drug-discovery use cases. In Paper III, we present a general method for running bioinformatics analyses on the cloud using a microservices-oriented architecture. In Paper IV, we introduce a method that combines microservices and Apache Spark with the aim of providing the best of both technologies. In Paper V, we discuss how to reduce the entry barrier for the allocation of cloud research environments. We show that all of the developed methods scale well, and we provide high-level programming interfaces for improving accessibility. We have also made the developed software publicly available.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019. p. 71
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1846
Keywords
cloud computing, bioinformatics, Big Data, microservices, containers, MapReduce
National Category
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-390666 (URN)
978-91-513-0730-5 (ISBN)
Public defence
2019-10-10, B42, Uppsala Biomedicinska Centrum, Husargatan 3, Uppsala, 13:15 (English)
Opponent
Supervisors
Available from: 2019-09-17 Created: 2019-08-22 Last updated: 2019-10-15

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text, PubMed

Authors

Capuccini, Marco; Schaal, Wesley; Spjuth, Ola
