Scalable queries over log database collections
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. (UDBL)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. (UDBL)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. (UDBL)
2015 (English). In: Data Science, Springer, 2015, p. 173-185. Conference paper, Published paper (Refereed)
Resource type
Text
Place, publisher, year, edition, pages
Springer, 2015. p. 173-185
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 9147
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-274784
DOI: 10.1007/978-3-319-20424-6_17
ISI: 000364104600017
ISBN: 978-3-319-20423-9 (print)
OAI: oai:DiVA.org:uu-274784
DiVA, id: diva2:897636
Conference
BICOD 2015, July 6–8, Edinburgh, UK
Available from: 2015-06-11. Created: 2016-01-26. Last updated: 2021-11-22. Bibliographically approved.
In thesis
1. Scalable Queries over Log Database Collections
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In industrial settings, machines such as trucks and hydraulic pumps are widely distributed across geographic locations, and sensors on the machines produce large volumes of data. The data is stored locally in autonomous databases called log databases. The collection of log databases changes dynamically as sites are added to or removed from the federation.

In this application context, an efficient way to search and analyze the past behavior of products in use is desired. To enable scalable queries over collections of distributed and autonomous log databases, we developed the FLOQ (Fused LOg database Query processor) system, which provides a global view of the working status of all machines on the sites through a meta-database integrating the dynamic log database collection. A particular challenge in this scenario is to process, in a scalable way, numerical queries that identify anomalies by joining data from the meta-database with data selected from the collection of distributed and autonomous log databases. The thesis describes the architecture of FLOQ. In particular, different strategies for executing numerical queries over log database collections are investigated. FLOQ allows both the meta-database and the log databases to be stored in multiple formats using different kinds of data managers. FLOQ provides general and extensible mechanisms for efficient processing of queries over different kinds of distributed data sources.
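As an illustration of the kind of numerical query described above, the following is a minimal sketch that joins expected values held in a meta-database with measurements selected from per-site log databases. It is not FLOQ's implementation or any of its execution strategies, which this record does not detail. The table names (`machines`, `measures`), column names, and the helper functions are hypothetical, and in-memory SQLite databases stand in for the distributed sites.

```python
import sqlite3


def make_log_db(rows):
    """Create an in-memory stand-in for one site's log database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE measures (machine_id TEXT, ts INTEGER, value REAL)")
    db.executemany("INSERT INTO measures VALUES (?, ?, ?)", rows)
    return db


def find_anomalies(meta_db, log_dbs):
    """Join meta-database expectations with measurements selected at each site."""
    anomalies = []
    meta_query = (
        "SELECT machine_id, site, expected, tolerance "
        "FROM machines ORDER BY machine_id"
    )
    for machine_id, site, expected, tolerance in meta_db.execute(meta_query):
        # The numerical selection runs inside the site's own database,
        # so only deviating measurements are returned to the caller.
        cur = log_dbs[site].execute(
            "SELECT ts, value FROM measures "
            "WHERE machine_id = ? AND ABS(value - ?) > ? ORDER BY ts",
            (machine_id, expected, tolerance),
        )
        anomalies.extend((machine_id, site, ts, value) for ts, value in cur)
    return anomalies


# Hypothetical meta-database: one row per machine with its expected reading.
meta_db = sqlite3.connect(":memory:")
meta_db.execute(
    "CREATE TABLE machines (machine_id TEXT, site TEXT, expected REAL, tolerance REAL)"
)
meta_db.executemany(
    "INSERT INTO machines VALUES (?, ?, ?, ?)",
    [("pump1", "siteA", 10.0, 1.0), ("truck7", "siteB", 50.0, 5.0)],
)

# Hypothetical log databases, one per site.
log_dbs = {
    "siteA": make_log_db([("pump1", 1, 10.2), ("pump1", 2, 12.5)]),
    "siteB": make_log_db([("truck7", 1, 49.0), ("truck7", 2, 58.0)]),
}

anomalies = find_anomalies(meta_db, log_dbs)
print(anomalies)
```

The sketch sends one parameterized query per machine to the owning site. That is only one naive way to evaluate such a join, chosen here for brevity.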

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2016. p. 51
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1343
National Category
Computer Sciences
Research subject
Computer Science with specialization in Database Technology
Identifiers
urn:nbn:se:uu:diva-275044 (URN)
978-91-554-9472-8 (ISBN)
Public defence
2016-03-30, 2446, Department of Information Technology, Polacksbacken (Lägerhyddsvägen 2), Uppsala, 13:00 (English)
Available from: 2016-03-03. Created: 2016-01-28. Last updated: 2018-01-10. Bibliographically approved.
2. Scalable Data Management for Internet of Things
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Internet of Things (IoT) applications often involve considerable numbers of sensors that produce large volumes of data. In this context, efficient management of data could potentially enable automatic decision making based on analytics of sensors on equipment. However, these sensors are often geographically distributed and generate data in diverse formats as high-rate sensor streams. The combination of these properties poses significant challenges for existing database management systems (DBMSs) in providing scalable data storage and analytics.

This thesis addresses the problem of providing efficient data management for distributed IoT applications using DBMS technologies. Initially, we developed a prototype system, Fused LOg database Query Processor (FLOQ), which enables general query processing over collections of relational databases that are deployed locally on distributed sites to store sensor measurement logs. Although FLOQ provides efficient query execution when scaling the number of distributed databases, it exhibits complexity and scalability issues for large IoT applications with heterogeneous data. The limitations of FLOQ stem primarily from its use of relational database backends for storing sensor logs.

Using a relational database to store large-scale IoT data raises several challenges. Massive logs produced at high rates cannot be loaded fast enough because of the database's strong consistency mechanisms. Furthermore, the database can become a single point of failure that limits availability, and its inflexible schemas make heterogeneity difficult to manage. In contrast to relational databases, distributed NoSQL data stores could provide scalable storage of heterogeneous data through data partitioning, replication, and high availability by sacrificing strong consistency. To understand the suitability of NoSQL databases, this thesis also investigates to what degree NoSQL DBMSs provide scalable storage and analytics for IoT applications by comparing a variety of state-of-the-art relational and NoSQL databases on real-world industrial IoT data.

The experimental evaluations reveal that distributed NoSQL data stores can provide the required scalability; however, advanced data analytics is difficult to support because of their limited query processing capabilities. Furthermore, data management of distributed IoT applications often requires seamless integration between a real-time edge analytics platform, a distributed storage manager, effective data integration, and query processing techniques for handling heterogeneity. Therefore, to provide a holistic data management solution, this thesis presents the Extended Query Processing (EQP) system, which enables advanced analytics both at the edge and offline for large-scale IoT applications.

These contributions enable efficient data management of large-scale heterogeneous IoT applications and support advanced analytics.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2021. p. 44
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2095
Keywords
NoSQL, IoT, Smart Computing, MongoDB, IIoT, Data Streams, Edge Computing
National Category
Computer Sciences; Computer Systems
Research subject
Computer Science with specialization in Database Technology
Identifiers
urn:nbn:se:uu:diva-458420 (URN)
978-91-513-1346-7 (ISBN)
Public defence
2022-01-14, Room 2446, Polacksbacken, Lägerhyddsvägen 2, Uppsala, 13:15 (English)
Funder
eSSENCE - An eScience Collaboration
Available from: 2021-12-21. Created: 2021-11-22. Last updated: 2022-01-18. Bibliographically approved.

Authority records

Zhu, Minpeng; Mahmood, Khalid; Risch, Tore