uu.se Uppsala University Publications
Semantic Web Queries over Scientific Data
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computing Science. (UDBL) ORCID iD: 0000-0002-7965-9128
2016 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

Semantic Web and Linked Open Data provide a potential platform for interoperability of scientific data, offering a flexible model for providing machine-readable and queryable metadata. However, RDF and SPARQL have gained only limited adoption within the scientific community, mainly because they lack support for managing massive numeric data and certain other important features, such as extensibility with user-defined functions, query modularity, and integration with existing environments and workflows.

We present the design, implementation and evaluation of Scientific SPARQL – a language for querying data and metadata combined, represented using the RDF graph model extended with numeric multidimensional arrays as node values – RDF with Arrays. The techniques used to store RDF with Arrays in a scalable way and to process Scientific SPARQL queries and updates are implemented in our prototype software – Scientific SPARQL Database Manager, SSDM, and its integrations with data storage systems and computational frameworks. This includes scalable storage solutions for numeric multidimensional arrays and an efficient implementation of array operations. The arrays can be physically stored in a variety of external storage systems, including files, relational databases, and specialized array data stores, using our Array Storage Extensibility Interface. Whenever possible, SSDM accumulates array operations and accesses array contents in a lazy fashion.
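
The lazy accumulation of array operations mentioned above can be illustrated with a minimal Python sketch. This is not SSDM code or its Array Storage Extensibility Interface: the `LazyArray` class, the `fetch` callback, and the in-memory `data` list standing in for an external store are all hypothetical names chosen for illustration. The sketch only shows the general idea of composing slices without touching storage until the contents are actually needed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class LazyArray:
    """Hypothetical proxy for a 1-D array held in an external store."""
    fetch: Callable[[int, int, int], List[float]]  # (start, stop, step) -> elements
    start: int
    stop: int
    step: int = 1

    def __len__(self) -> int:
        return len(range(self.start, self.stop, self.step))

    def slice(self, lo: int, hi: int, step: int = 1) -> "LazyArray":
        """Compose a slice with the pending one; no storage access happens here."""
        r = range(self.start, self.stop, self.step)[lo:hi:step]
        return LazyArray(self.fetch, r.start, r.stop, r.step)

    def materialize(self) -> List[float]:
        """Issue a single request covering all accumulated slices."""
        return self.fetch(self.start, self.stop, self.step)


# Stand-in for an external array store (assumption: a plain Python list).
data = [float(i) for i in range(100)]
calls: List[Tuple[int, int, int]] = []


def fetch(start: int, stop: int, step: int) -> List[float]:
    calls.append((start, stop, step))  # record every storage access
    return data[start:stop:step]


whole = LazyArray(fetch, 0, len(data))
view = whole.slice(10, 50, 2).slice(1, 5)  # two operations, still no access
calls_before = list(calls)
values = view.materialize()                # one access for the composed slice
```

In this sketch the two slices collapse into a single request for elements 12, 14, 16 and 18, so only the four needed values are read from the store.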

In scientific applications, numeric computations are often used for filtering or post-processing the retrieved data, which can be expressed in a functional way. Scientific SPARQL allows expressing common query sub-tasks with functions defined as parameterized queries. This becomes especially useful in combination with functional language abstractions such as lexical closures and second-order functions, e.g. array mappers.
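
As a rough illustration of the functional abstractions named above, the following Python sketch combines a lexical closure with a second-order array mapper. It does not use Scientific SPARQL syntax; `array_map`, `make_rescaler`, and the nested-list array representation are hypothetical and chosen only to show the pattern.

```python
from typing import Callable, Union

# Assumption: an array is modelled as arbitrarily nested Python lists of floats.
Nested = Union[float, list]


def array_map(fn: Callable[[float], float], arr: Nested) -> Nested:
    """Second-order function: apply fn to every element, preserving the shape."""
    if isinstance(arr, list):
        return [array_map(fn, sub) for sub in arr]
    return fn(arr)


def make_rescaler(offset: float, factor: float) -> Callable[[float], float]:
    """Return a closure that captures offset and factor from its defining scope."""
    def rescale(x: float) -> float:
        return (x - offset) * factor
    return rescale


rescale = make_rescaler(offset=32.0, factor=0.5)
grid = [[32.0, 34.0], [36.0, 40.0]]
rescaled = array_map(rescale, grid)
```

The mapper knows nothing about the parameters of the rescaling; they travel with the closure, which is what makes such functions convenient to pass around as query building blocks.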

Existing computational libraries can be interfaced and invoked from Scientific SPARQL queries as foreign functions. Cost estimates and alternative evaluation directions may be specified, aiding the construction of better execution plans. Costly array processing, e.g. filtering and aggregation, is thus performed on the server, reducing the amount of communication. Furthermore, common supported operations are delegated to the array storage back-ends, according to their capabilities. Both expressivity and performance of Scientific SPARQL are evaluated on a real-world example, and further performance tests are run using our mini-benchmark for array queries.
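
One way to picture foreign functions with cost estimates and alternative evaluation directions is the Python sketch below. It is an assumption-laden illustration, not the SSDM interface: the `Direction` record, the `cheapest_direction` helper, the cost values, and the example relation y = x * x are all made up for this purpose. A planner picks the cheapest implementation whose required inputs are already bound.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Direction:
    """One way of evaluating a foreign function (hypothetical structure)."""
    needs: FrozenSet[str]  # variables that must already be bound
    cost: float            # estimated cost, in arbitrary units
    run: Callable[[Dict[str, float]], Dict[str, float]]


def cheapest_direction(directions: List[Direction], bound: FrozenSet[str]) -> Direction:
    """Pick the lowest-cost direction whose inputs are all bound."""
    usable = [d for d in directions if d.needs <= bound]
    if not usable:
        raise ValueError("no evaluation direction matches the bound variables")
    return min(usable, key=lambda d: d.cost)


# Hypothetical foreign function relating x and y by y = x * x, for x >= 0.
square = [
    Direction(frozenset({"x"}), 1.0, lambda b: {**b, "y": b["x"] * b["x"]}),
    Direction(frozenset({"y"}), 5.0, lambda b: {**b, "x": b["y"] ** 0.5}),
]

forward = cheapest_direction(square, frozenset({"x"})).run({"x": 3.0})
inverse = cheapest_direction(square, frozenset({"y"})).run({"y": 16.0})
```

The same relation can therefore be used both to compute y from x and to recover x from y, with the cost estimates guiding which direction an execution plan should prefer when more than one applies.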

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2016, p. 214
Series
Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1104-2516 ; 121
Keywords [en]
RDF, SPARQL, Arrays, Query optimization, Second-order functions, Scientific workflows
National subject category
Computer Sciences
Research subject
Computer Science with specialization in Database Technology
Identifiers
URN: urn:nbn:se:uu:diva-274856; ISBN: 978-91-554-9465-0 (print); OAI: oai:DiVA.org:uu-274856; DiVA, id: diva2:897986
Public defence
2016-03-23, Lecture hall 2446, Polacksbacken, Uppsala, 14:00 (English)
Opponent
Supervisors
Available from: 2016-02-25. Created: 2016-01-26. Last updated: 2018-01-10. Bibliographically approved.

Open Access in DiVA

fulltext (1693 kB), 1137 downloads
File information
File name: FULLTEXT01.pdf; File size: 1693 kB; Checksum (SHA-512):
96fd85db5fe313a0086477a831fa0fa8872b3f0dd64f39cba29d0220c668f3f06a841acc99d183c4928dd4d3769f2b4528d7c6d1c4b1e591bf46fb6d6e0c4b14
Type: fulltext; Mimetype: application/pdf

Authority records BETA

Andrejev, Andrej
