uu.se: Publications from Uppsala University
Wrapping a NoSQL Datastore for Stream Analytics
Mahmood, Khalid; Orsborn, Kjell; Risch, Tore
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science.
2020 (English)
In: 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI 2020), Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 301-305
Conference paper, Published paper (Refereed)
Abstract [en]

With the advent of the Industrial Internet of Things (IIoT) and Industrial Analytics, numerous application scenarios are emerging in which business- and mission-critical decisions depend on large-scale analytics of sensor streams. However, the very large volumes of data generated by high-rate data streams make it substantially challenging for existing Database Management Systems (DBMSs) to provide scalable analytics. High-performance distributed datastores can provide the scalability, but because they support only simple query operations, access to high-level query-based data analytics is usually limited. This work combines high-level query-based data analytics with high-performance distributed scalability by applying a wrapper-mediator approach. The Amos II extensible main-memory DBMS provides an online query-processing data analytics engine in front of the MongoDB distributed NoSQL datastore to support large-scale distributed data analytics over persisted data streams. The implemented system thus enables query-based online analytics over data streams stored or logged in distributed NoSQL datastores.
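
The wrapper-mediator idea described in the abstract can be illustrated with a minimal sketch. It is not the Amos II or MongoDB interface: the classes DatastoreWrapper and Mediator, the document fields (sensor, ts, value), and the sample readings are all hypothetical, and an in-memory list stands in for the distributed datastore. The wrapper exposes only a simple selection, as a datastore with limited query operations would, while the mediator computes a higher-level analytic (a tumbling-window mean) on top of it.

```python
from statistics import mean


class DatastoreWrapper:
    """Hypothetical wrapper around a datastore that supports only simple filters."""

    def __init__(self, documents):
        # An in-memory list stands in for a distributed NoSQL collection.
        self._documents = list(documents)

    def find(self, sensor_id, since):
        """Simple selection that a real wrapper would push down to the datastore."""
        for doc in self._documents:
            if doc["sensor"] == sensor_id and doc["ts"] >= since:
                yield doc


class Mediator:
    """Hypothetical mediator that adds analytics the datastore does not offer."""

    def __init__(self, wrapper):
        self._wrapper = wrapper

    def windowed_mean(self, sensor_id, since, window):
        """Mean of each complete, non-overlapping window of `window` readings."""
        values = [d["value"] for d in self._wrapper.find(sensor_id, since)]
        return [
            mean(values[i:i + window])
            for i in range(0, len(values) - window + 1, window)
        ]


# Made-up sample log of persisted sensor readings.
log = [
    {"sensor": "s1", "ts": 0, "value": 1},
    {"sensor": "s2", "ts": 0, "value": 50},
    {"sensor": "s1", "ts": 1, "value": 2},
    {"sensor": "s1", "ts": 2, "value": 3},
    {"sensor": "s1", "ts": 3, "value": 4},
]
mediator = Mediator(DatastoreWrapper(log))
window_means = mediator.windowed_mean("s1", since=0, window=2)
```

In this sketch the filter is the only operation delegated to the storage layer; everything else runs in the mediator, which mirrors the division of labour the abstract describes without claiming to reproduce the actual system.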

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020. p. 301-305
Keywords [en]
NoSQL Datastores, MongoDB, IIoT, Data Streams
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:uu:diva-442569
DOI: 10.1109/IRI49571.2020.00050
ISI: 000635425100042
OAI: oai:DiVA.org:uu-442569
DiVA, id: diva2:1555209
Conference
2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI), Las Vegas, USA, 11-13 Aug. 2020
Funder
EU, FP7, Seventh Framework Programme
Available from: 2021-05-18 Created: 2021-05-18 Last updated: 2021-11-22 Bibliographically approved
In thesis
1. Scalable Data Management for Internet of Things
2021 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Internet of Things (IoT) applications often involve considerable numbers of sensors that produce large volumes of data. In this context, efficient data management could enable automatic decision making based on analytics of sensors on equipment. However, these sensors are often geographically distributed and generate data in diverse formats, in the form of sensor streams, at a high rate. The combination of these properties of IoT poses significant challenges for existing database management systems (DBMSs) in providing scalable data storage and analytics.

The problem of providing efficient data management of distributed IoT applications using DBMS technologies is addressed in this thesis. Initially, we developed a prototype system, Fused LOg database Query Processor (FLOQ), which enables general query processing over collections of relational databases that are deployed locally on distributed sites to store sensor measurement logs. Although FLOQ provides efficient query execution when scaling the number of distributed databases, it exhibits complexity and scalability issues for large IoT applications having heterogeneous data. The limitations of FLOQ are primarily inherent to its use of relational database backends for storage of sensor logs.
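
The notion of querying a collection of site-local relational databases as one can be illustrated with a small sketch. This is not the FLOQ implementation: the helper names make_site and fused_query, the measurements table, and the sample rows are hypothetical, and in-memory SQLite databases stand in for databases deployed on distributed sites. The sketch simply sends the same SQL query to every site and merges the rows, tagging each with the site it came from.

```python
import sqlite3


def make_site(rows):
    """Create a hypothetical site-local log database with made-up measurements."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurements (sensor TEXT, ts INTEGER, value REAL)"
    )
    conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", rows)
    return conn


def fused_query(sites, sql, params=()):
    """Run the same query on every site and merge the rows, tagged by site name."""
    results = []
    for site_name, conn in sites.items():
        for row in conn.execute(sql, params):
            results.append((site_name, *row))
    return results


# Two stand-in sites, each holding its own sensor measurement log.
sites = {
    "site_a": make_site([("s1", 1, 10.0), ("s1", 2, 30.0)]),
    "site_b": make_site([("s1", 1, 5.0), ("s2", 1, 99.0)]),
}

# Per-site maximum reading of sensor s1, collected into one result.
rows = fused_query(
    sites,
    "SELECT sensor, MAX(value) FROM measurements WHERE sensor = ? GROUP BY sensor",
    ("s1",),
)
```

Every site here shares one fixed schema, which is what makes a single SQL query applicable everywhere; heterogeneous data would not fit this pattern as easily, in line with the limitation the abstract attributes to relational backends.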

Using a relational database to store large-scale IoT data raises several challenges. Massive logs produced at high rates cannot be loaded fast enough because of the database's strong consistency mechanisms. Furthermore, the database can become a single point of failure that limits availability, and its inflexible schemas make heterogeneity difficult to manage. In contrast to relational databases, distributed NoSQL data stores could provide scalable storage of heterogeneous data through data partitioning, replication, and high availability by sacrificing strong consistency. To understand the suitability of NoSQL databases, this thesis also investigates to what degree NoSQL DBMSs provide scalable storage and analytics of IoT applications by comparing a variety of state-of-the-art relational and NoSQL databases for real-world industrial IoT data.

The experimental evaluations reveal that distributed NoSQL data stores can provide the required scalability; however, their limited query processing capabilities make advanced data analytics difficult to support. Furthermore, data management of distributed IoT applications often requires seamless integration between a real-time edge analytics platform, a distributed storage manager, effective data integration, and query processing techniques for handling heterogeneity. Therefore, to provide a holistic data management solution, the Extended Query Processing (EQP) system was developed in this thesis; it enables advanced analytics, supporting both edge and offline analytics for large-scale IoT applications.

These contributions enable efficient data management of large-scale heterogeneous IoT applications and support advanced analytics.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2021. p. 44
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2095
Keywords
NoSQL, IoT, Smart Computing, MongoDB, IIoT, Data Streams, Edge Computing
National Category
Computer Sciences; Computer Systems
Research subject
Computer Science with specialization in Database Technology
Identifiers
urn:nbn:se:uu:diva-458420 (URN)
978-91-513-1346-7 (ISBN)
Public defence
2022-01-14, Room 2446, Polacksbacken, Lägerhyddsvägen 2, Uppsala, 13:15 (English)
Opponent
Supervisors
Funder
eSSENCE - An eScience Collaboration
Available from: 2021-12-21 Created: 2021-11-22 Last updated: 2022-01-18Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Mahmood, Khalid; Orsborn, Kjell; Risch, Tore
