uu.seUppsala universitets publikationer
Ändra sökning
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Scientific analysis by queries in extended SPARQL over a scalable e-Science data store
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Datalogi. (UDBL)
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för beräkningsvetenskap. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Tillämpad beräkningsvetenskap.
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för beräkningsvetenskap. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Tillämpad beräkningsvetenskap.
Visa övriga samt affilieringar
2013 (Engelska)Ingår i: Proc. 9th International Conference on e-Science, Los Alamitos, CA: IEEE Computer Society, 2013, s. 98-106Konferensbidrag, Publicerat paper (Refereegranskat)
Ort, förlag, år, upplaga, sidor
Los Alamitos, CA: IEEE Computer Society, 2013. s. 98-106
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
URN: urn:nbn:se:uu:diva-173448DOI: 10.1109/eScience.2013.19ISI: 000330195500012ISBN: 978-0-7695-5083-1 (tryckt)OAI: oai:DiVA.org:uu-173448DiVA, id: diva2:519242
Projekt
eSSENCETillgänglig från: 2013-10-25 Skapad: 2012-04-24 Senast uppdaterad: 2018-01-12Bibliografiskt granskad
Ingår i avhandling
1. Managing Applications and Data in Distributed Computing Infrastructures
Öppna denna publikation i ny flik eller fönster >>Managing Applications and Data in Distributed Computing Infrastructures
2012 (Engelska)Doktorsavhandling, sammanläggning (Övrigt vetenskapligt)
Abstract [en]

During the last decades the demand for large-scale computational and storage resources in science has increased dramatically. New computational infrastructures enable scientists to enter a new mode of science, e-science, which complements traditional theory and experiments. E-science is inherently interdisciplinary, involving researchers from several disciplines, and also opens up for large-scale collaborative efforts where physically distributed groups of scientists share software tools and data to make scientific progress. Within the field of e-science, new challenges are emerging in managing large-scale distributed computing efforts and distributed data sets. Different models, e.g. grids and clouds, have been introduced over the years, but new solutions built on these models are needed to enable easy and flexible use of distributed computing infrastructures by application scientists.

In the first part of the thesis, application execution environments are studied. The goal is to hide technical details of the underlying distributed computing infrastructure and expose secure and user-friendly environments to the end users. First, a general-purpose solution using portal technology is described, enabling transparent and easy usage of a variety of grid systems. Then a problem-solving environment for genetic analysis is presented. Here the statistical software R is used as a workflow engine, enhanced with grid-enabled routines for performing the computationally demanding parts of the analysis. Finally, the issue of resource allocation in grid system is briefly studied and certain modifications in the distributed resource-brokering model for the ARC middleware are proposed.

The second part of the thesis presents solutions for managing and analyzing scientific data using distributed storage resources. First, a new reliable and secure file-oriented distributed storage system, Chelonia, is presented. The architectural design of the system is described and implementation issues are considered. Also, the stability and scalable performance of Chelonia is verified using several test scenarios. Then, tools for providing an efficient and easy-to-use platform for data analysis built on Chelonia are presented. Here, a database driven approach is explored. An extended architecture where Chelonia is combined with the Web-Service MEDiator (WSMED) system is implemented, providing web service tools to query data without any further programming. This approach is then developed further and Chelonia is combined with SciSPARQL, a query language that extends SPARQL to queries over numeric scientific data. This results in a system that is capable of interactive analysis of distributed data sets. Writing customized modules in Java, Python or C can fulfill advanced application-specific analysis requirements. The viability of the approach is demonstrated by applying the system to data produced by URDME, a computational environment in systems biology and results for sample queries expressed in SciSPARQL are presented.

Finally, the use of an open source storage cloud, Openstack – SWIFT, for analysis of data from CERN experiments is considered. Here, a pilot implementation for the ROOT data analysis framework is presented together with a performance evaluation.

Ort, förlag, år, upplaga, sidor
Uppsala: Acta Universitatis Upsaliensis, 2012. s. 62
Serie
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 940
Nyckelord
Distributed Computing Infrastructures, Grids, Clouds, Application Management, Distributed Storage, Resource Allocation
Nationell ämneskategori
Programvaruteknik
Forskningsämne
Beräkningsvetenskap
Identifikatorer
urn:nbn:se:uu:diva-173467 (URN)978-91-554-8381-4 (ISBN)
Disputation
2012-06-14, Room 2446, Polacksbacken, Lägerhyddsvägen 2D, Uppsala, 13:15 (Engelska)
Opponent
Handledare
Projekt
eSSENCE
Tillgänglig från: 2012-05-24 Skapad: 2012-04-25 Senast uppdaterad: 2018-01-12Bibliografiskt granskad

Open Access i DiVA

Fulltext saknas i DiVA

Övriga länkar

Förlagets fulltext

Personposter BETA

Andrejev, AndrejToor, SalmanHellander, AndreasHolmgren, SverkerRisch, Tore

Sök vidare i DiVA

Av författaren/redaktören
Andrejev, AndrejToor, SalmanHellander, AndreasHolmgren, SverkerRisch, Tore
Av organisationen
DatalogiAvdelningen för beräkningsvetenskapTillämpad beräkningsvetenskap
Datavetenskap (datalogi)

Sök vidare utanför DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetricpoäng

doi
isbn
urn-nbn
Totalt: 1297 träffar
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf