Uppsala University Publications
Scalable Preservation, Reconstruction, and Querying of Databases in terms of Semantic Web Representations
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computing Science. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. (UDBL)
2013 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis addresses how Semantic Web representations, in particular RDF, can enable flexible and scalable preservation, recreation, and querying of databases.

An approach has been developed for selective, scalable long-term archival of relational databases (RDBs) as RDF, implemented in the SAQ (Semantic Archive and Query) system. The user selects which parts of an RDB to archive with queries in A-SPARQL, an extension of SPARQL. SAQ automatically generates an RDF view of the RDB, the RD-view. The result of an archival query is a set of RDF triples stored in: i) a data archive file containing the preserved RDB content, and ii) a schema archive file containing sufficient meta-data to reconstruct the archived database. To achieve scalable data preservation and recreation, SAQ applies special query rewriting optimizations to the archival queries. Experiments showed that these rewrites improve query execution and archival time compared with naïve processing. The performance of SAQ was also compared with that of other systems supporting SPARQL queries to views of existing RDBs.
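
As a rough illustration of the split between a data archive and a schema archive, the following Python sketch exposes one relational table as triples. It is not SAQ's mapping: the namespace `BASE`, the `ex:` property names, and the function `archive_table` are all hypothetical, and the sketch assumes one subject URI per row and one property per column.

```python
import sqlite3

BASE = "http://example.org/rdb/"  # hypothetical namespace, not taken from the thesis


def archive_table(conn, table, key):
    """Return (data_triples, schema_triples) for one table.

    Illustrative mapping only: one subject URI per row, one property per
    column, and a few made-up 'ex:' properties describing the schema.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]

    # Schema triples: enough meta-data to recreate the table structure.
    schema = [
        (f"{BASE}{table}", "rdf:type", "ex:Table"),
        (f"{BASE}{table}", "ex:primaryKey", key),
    ]
    schema += [(f"{BASE}{table}#{c}", "ex:columnOf", f"{BASE}{table}") for c in columns]

    # Data triples: one triple per non-NULL attribute value.
    data = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for c in columns:
            if record[c] is not None:
                data.append((subject, f"{BASE}{table}#{c}", record[c]))
    return data, schema


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [(1, "pen", 1.5), (2, "ink", None)])
data_triples, schema_triples = archive_table(conn, "product", "id")
```

In this sketch a NULL value simply produces no triple, which is one common convention for RDF views of relational data; whether SAQ does the same is not stated in the abstract.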

When an archived RDB is to be recreated, the reloader module of SAQ first reads the schema archive file and executes a schema reconstruction algorithm to automatically construct the RDB schema. The resulting RDB is then populated by reading the data archive and converting its contents into relational attribute values. For scalable recreation of RDF-archived data, we have developed the Triple Bulk Load (TBL) approach, in which the relational data is reconstructed using the bulk load facility of the RDBMS. Our experiments show that the TBL approach is substantially faster than the naïve Insert Attribute Value (IAV) approach, despite the added sorting and post-processing.
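
The contrast between loading one attribute value at a time and loading whole rows in bulk can be sketched in Python as follows. This is a toy under stated assumptions, not the TBL or IAV implementation: the triple layout, the `COLUMNS` tuple (assumed to come from the schema archive), and both loader functions are hypothetical, and `executemany` on an in-memory SQLite database merely stands in for the bulk load facility of an RDBMS.

```python
import sqlite3
from itertools import groupby

# Hypothetical data archive content: (subject, property, value), in arbitrary order.
triples = [
    ("product/2", "label", "ink"),
    ("product/1", "price", 1.5),
    ("product/1", "label", "pen"),
    ("product/2", "price", 0.9),
]
COLUMNS = ("label", "price")  # assumed to be known from the schema archive


def new_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, price REAL)")
    return conn


def key_of(subject):
    # The row key is the last path segment of the subject in this toy encoding.
    return int(subject.rsplit("/", 1)[1])


def load_per_value(conn, triples):
    """Per-value loading: create the row if needed, then set one attribute per triple."""
    for s, p, o in triples:
        conn.execute("INSERT OR IGNORE INTO product (id) VALUES (?)", (key_of(s),))
        conn.execute(f"UPDATE product SET {p} = ? WHERE id = ?", (o, key_of(s)))


def load_bulk(conn, triples):
    """Bulk loading: sort by subject, pivot each group into a full row, insert in one batch."""
    rows = []
    by_subject = sorted(triples, key=lambda t: t[0])
    for s, group in groupby(by_subject, key=lambda t: t[0]):
        values = {p: o for _, p, o in group}
        rows.append((key_of(s),) + tuple(values.get(c) for c in COLUMNS))
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)", rows)


a, b = new_db(), new_db()
load_per_value(a, triples)
load_bulk(b, triples)
query = "SELECT * FROM product ORDER BY id"
rows_a = a.execute(query).fetchall()
rows_b = b.execute(query).fetchall()
```

Both loaders produce the same table; the bulk variant pays for a sort and a pivot step up front but issues far fewer statements, which is the general trade-off the abstract refers to.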

To view and query semi-structured Topic Maps data as RDF, the prototype system TM-Viewer was implemented. A declarative RDF view of Topic Maps, the TM-view, is automatically generated by TM-Viewer using a conceptual schema developed for the Topic Maps data model. To process SPARQL queries to the TM-view efficiently, query rewrite transformations were developed and evaluated. It was shown that they significantly improve the query execution time.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2013. 59 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1052
Keyword [en]
RDF, RDFS, RDF view, SPARQL, SPARQL query processing, rewrite optimization, Topic Maps, querying of RDF views, archive relational databases, reconstruct archived databases
National Category
Computer Science
Research subject
Computer Science with specialization in Database Technology
Identifiers
URN: urn:nbn:se:uu:diva-199573
ISBN: 978-91-554-8690-7 (print)
OAI: oai:DiVA.org:uu-199573
DiVA: diva2:620172
Public defence
2013-06-14, Room 2446, Polacksbacken, Lägerhyddsvägen 2, Uppsala, 13:00 (English)
Projects
eSSENCE
Funder
eSSENCE - An eScience Collaboration
Available from: 2013-05-24 Created: 2013-05-07 Last updated: 2014-07-21. Bibliographically approved
List of papers
1. SPARQL queries to RDFS views of Topic Maps
2010 (English). In: International Journal of Metadata, Semantics and Ontologies (IJMSO), ISSN 1744-2621, Vol. 5, no 1, 1-16 p. Article in journal (Refereed), Published
Place, publisher, year, edition, pages
Inderscience, 2010
National Category
Computer Science
Research subject
Computer Science with specialization in Database Technology
Identifiers
urn:nbn:se:uu:diva-142495 (URN)
10.1504/IJMSO.2010.032647 (DOI)
Projects
eSSENCE
Available from: 2010-03-31 Created: 2011-01-14 Last updated: 2013-08-30. Bibliographically approved
2. Optimizing Unbound-property Queries to RDF Views of Relational Databases
2011 (English). Conference paper, Published paper (Refereed)
Abstract [en]

SAQ (Semantic Archive and Query) is a system for querying and long-term preservation of relational data in terms of RDF. In SAQ, relational data in a back-end DBMS is exposed as an RDF view, called the RD-view. SAQ can process arbitrary SPARQL queries to the RD-view. In addition, long-term preservation as RDF of selected parts of a relational database is specified by SPARQL queries to the RD-view. Such queries usually select sets of RDF properties, so in the query definition a property p is unknown. We call such queries unbound-property queries. This class of queries is also present in the SPARQL benchmarks. We optimize unbound-property queries by introducing a query transformation algorithm called Group Common Terms (GCT). It pulls out from a DNF-normalized query those common terms that can be translated to SQL predicates accessing the relational database. Our experiments using the Berlin SPARQL benchmark show that GCT substantially improves the query execution time to a back-end commercial relational DBMS for both selective and unselective unbound-property queries. We compared the performance of our approach with that of other systems processing SPARQL queries over views of relational databases and showed that GCT improves scalability compared to the approaches used by the other systems.
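
The core idea of factoring shared terms out of a DNF query can be sketched as below. This is only a minimal illustration of the general technique, not the GCT algorithm from the paper: the term strings, the function `group_common_terms`, and the `translatable` predicate (standing in for "can be translated to an SQL predicate") are hypothetical.

```python
def group_common_terms(dnf, translatable=lambda term: True):
    """Factor shared conjuncts out of a query in disjunctive normal form.

    dnf: list of conjunctions, each a frozenset of term strings.
    Returns (common, residual), to be read as
        AND(common) AND OR(AND(r) for r in residual).
    Only terms accepted by 'translatable' are pulled out.
    """
    if not dnf:
        return frozenset(), []
    # Terms that occur in every disjunct of the DNF.
    common = frozenset.intersection(*dnf)
    common = frozenset(t for t in common if translatable(t))
    # What remains of each disjunct once the common terms are removed.
    residual = [conj - common for conj in dnf]
    return common, residual


# Toy unbound-property query: the property p ranges over two candidates,
# so the DNF has one disjunct per candidate with identical selections.
dnf = [
    frozenset({"type(s)=Product", "s.id=7", "p=label"}),
    frozenset({"type(s)=Product", "s.id=7", "p=price"}),
]
common, residual = group_common_terms(dnf)
```

Here the two selections shared by both disjuncts are evaluated once rather than once per candidate property, which conveys why grouping common terms can reduce the work sent to the back-end database.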

Place, publisher, year, edition, pages
Bonn, Germany, 2011. 16 p.
Keyword
SPARQL queries, RDF views of relational databases, query optimization, query rewrites, unbound property queries
National Category
Computer Science
Research subject
Computer Science with specialization in Database Technology
Identifiers
urn:nbn:se:uu:diva-199569 (URN)
Conference
The 7th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2011) at the 10th International Semantic Web Conference (ISWC 2011), Bonn, Germany, October 24th, 2011
Available from: 2013-05-07 Created: 2013-05-07 Last updated: 2013-08-30
3. Scalable long-term preservation of relational data through SPARQL queries
2016 (English). In: Semantic Web, ISSN 1570-0844, E-ISSN 2210-4968, Vol. 7, no 2, 117-137 p. Article in journal (Refereed), Published
National Category
Computer Science
Identifiers
urn:nbn:se:uu:diva-199570 (URN)
10.3233/SW-150173 (DOI)
000373208100002 ()
Projects
eSSENCE
Available from: 2016-02-12 Created: 2013-05-07 Last updated: 2017-12-06. Bibliographically approved
4. Scalable reconstruction of RDF-archived relational databases
2013 (English). In: Proc. 5th International Workshop on Semantic Web Information Management, New York: ACM Press, 2013, 5:1-4 p. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
New York: ACM Press, 2013
National Category
Computer Science
Research subject
Computer Science with specialization in Database Technology
Identifiers
urn:nbn:se:uu:diva-199571 (URN)
10.1145/2484712.2484717 (DOI)
978-1-4503-2194-5 (ISBN)
Conference
5th International Workshop on Semantic Web Information Management (SWIM 2013), June 23, New York
Projects
eSSENCE
Funder
eSSENCE - An eScience Collaboration
Available from: 2013-06-23 Created: 2013-05-07 Last updated: 2013-12-17. Bibliographically approved

Open Access in DiVA

fulltext (1810 kB), 1126 downloads
File information
File name: FULLTEXT01.pdf
File size: 1810 kB
Checksum (SHA-512): e15715c7c7e90004cbd2be1b58aaa29ca3715c254d23ea2260498809d7dc76cc008c8b777f03764c9b48c47606e085ee2ef43629bdfe3f6ebb8d1478efffab03
Type: fulltext
Mimetype: application/pdf

Authority records

Stefanova, Silvia
