Scalable Validation of Data Streams
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
In manufacturing industries, sensors are often installed on industrial equipment, generating high volumes of data in real time. To shorten machine downtime and reduce maintenance costs, it is critical to analyze such streams efficiently in order to detect abnormal equipment behavior.
To validate data streams and detect anomalies, a data stream management system called SVALI has been developed. Based on the requirements of the application domain, different stream window semantics are explored and an extensible set of window forming functions is implemented, where dynamic registration of window aggregations allows incremental evaluation of aggregate functions over windows.
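The idea of incremental aggregation over windows can be illustrated with a minimal sketch. It is not SVALI's interface: the class names `SlidingWindow` and `IncrementalMean` and the `add`/`remove`/`value` protocol are assumptions made for this example. The window is count-based and slides by one element, and each registered aggregate is updated in constant time when an element arrives or expires, rather than being recomputed over the whole window.

```python
from collections import deque


class IncrementalMean:
    """Mean kept up to date in O(1) per arriving or expiring element."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        self.total += x
        self.count += 1

    def remove(self, x):
        self.total -= x
        self.count -= 1

    def value(self):
        return self.total / self.count if self.count else None


class SlidingWindow:
    """Hypothetical count-based window that slides by one element."""

    def __init__(self, size):
        self.size = size
        self.items = deque()
        self.aggregates = {}

    def register(self, name, aggregate):
        # Dynamic registration: an aggregate added while the stream is
        # running is first brought up to date with the current contents.
        for x in self.items:
            aggregate.add(x)
        self.aggregates[name] = aggregate

    def push(self, x):
        # Add the new element to the window and to every aggregate.
        self.items.append(x)
        for agg in self.aggregates.values():
            agg.add(x)
        # Expire the oldest element once the window is over capacity.
        if len(self.items) > self.size:
            expired = self.items.popleft()
            for agg in self.aggregates.values():
                agg.remove(expired)
        # Emit aggregate values only when the window is full.
        if len(self.items) < self.size:
            return None
        return {name: agg.value() for name, agg in self.aggregates.items()}
```

With a window of size 3 and a registered `IncrementalMean`, pushing 1.0, 2.0, 3.0, 4.0 returns nothing for the first two elements, then a mean of 2.0, then a mean of 3.0.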
To facilitate stream validation at a high level, the system provides two second-order system validation functions, model-and-validate and learn-and-validate. Model-and-validate allows the user to define mathematical models based on physical properties of the monitored equipment, while learn-and-validate builds statistical models by sampling the stream in real time as it flows.
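The two validation functions can be sketched as Python higher-order functions. This is an illustration under stated assumptions, not SVALI's actual functions: the signatures, the parameter names (`model`, `measured`, `tolerance`, `learn`, `n_train`, `is_anomalous`), the power model, and the three-standard-deviations rule are all hypothetical choices made for the example.

```python
import statistics
from itertools import islice


def model_and_validate(stream, model, measured, tolerance):
    """Yield (reading, expected) when a reading deviates from the model.

    model(reading) computes the expected value from physical properties;
    measured(reading) extracts the value actually observed.
    """
    for reading in stream:
        expected = model(reading)
        if abs(measured(reading) - expected) > tolerance:
            yield reading, expected


def learn_and_validate(stream, learn, n_train, is_anomalous):
    """Learn a reference from the first n_train elements, then validate.

    learn(sample) builds a statistical reference from the sampled prefix;
    is_anomalous(x, reference) decides whether a later element deviates.
    """
    it = iter(stream)
    sample = list(islice(it, n_train))
    reference = learn(sample)
    for x in it:
        if is_anomalous(x, reference):
            yield x


# Hypothetical physical model: electrical power = voltage * current.
def power_model(reading):
    return reading["u"] * reading["i"]


# Hypothetical statistical model: mean and sample standard deviation.
def learn_mean_std(sample):
    return statistics.mean(sample), statistics.stdev(sample)


def outside_three_sigma(x, reference):
    mean, std = reference
    return abs(x - mean) > 3 * std
```

For instance, `model_and_validate(readings, power_model, lambda r: r["p"], 1.0)` would report readings whose measured power `p` differs from `u * i` by more than 1.0, while `learn_and_validate(values, learn_mean_std, 5, outside_three_sigma)` would learn from the first five values and report later values more than three standard deviations from the learned mean.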
To validate geographically distributed equipment with short response times, SVALI is designed as a distributed system in which many SVALI instances can be started and run in parallel on board the equipment. Central analyses are performed at a monitoring center, where streams of detected anomalies are combined and analyzed on a cluster computer.
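One way to picture the combination step at the monitoring center is a merge of per-equipment anomaly streams into a single time-ordered stream. The sketch below is an assumption about how such a merge could look, not a description of SVALI's implementation: the function name `combine_anomaly_streams` and the tuple layout `(timestamp, equipment_id, value)` are hypothetical, and each input stream is assumed to be already ordered by timestamp.

```python
import heapq


def combine_anomaly_streams(*streams):
    """Lazily merge anomaly streams into one stream ordered by timestamp.

    Each stream yields (timestamp, equipment_id, value) tuples and must
    itself be sorted by timestamp.
    """
    return heapq.merge(*streams, key=lambda anomaly: anomaly[0])
```

Because `heapq.merge` is lazy, the combined stream can be consumed as anomalies arrive instead of waiting for any single equipment stream to end.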
SVALI is an extensible system where functions can be implemented using external libraries written in C, Java, and Python without any modifications of the original code.
The system and the developed functionality have been applied in several applications, both in industry and in sports analytics.
Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2016. 51 p.
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1384
Data Stream Management, Distributed Data Stream Processing, Data Stream Validation, Anomaly Detection
Research subject Computer Science with specialization in Database Technology
Identifiers URN: urn:nbn:se:uu:diva-291530; ISBN: 978-91-554-9600-5; OAI: oai:DiVA.org:uu-291530; DiVA: diva2:925793
2016-08-17, room 2446, ITC building 2, Lägerhyddsvägen 2, Uppsala, 13:15 (English)
Lee, Byung-Suk, Professor
Risch, Tore, Professor
List of papers