uu.seUppsala University Publications
Change search
ReferencesLink to record
Permanent link

Direct link
Highly Scalable Trip Grouping for Large Scale Collective Transportation Systems
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. (UDBL)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. (UDBL)
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computing Science. (UDBL)
2008 (English)In: Proc. 11th International Conference on Extending Database Technology, EDBT 2008, 2008Conference paper (Refereed)
Abstract [en]

Transportation–related problems, like road congestion, park-ing, and pollution are increasing in most cities. In order toreduce traffic, recent work has proposed methods for vehiclesharing, for example for sharing cabs by grouping “closeby”cab requests and thus minimizing transportation cost andutilizing cab space. However, the methods proposed so fardo not scale to large data volumes, which is necessary tofacilitate large scale collective transportation systems, e.g.,ride–sharing systems for large cities.This paper presents highly scalable “trip grouping” algo-rithms, that generalize previous techniques and support in-put rates that can be orders of magnitude larger. The follow-ing three contributions make the grouping algorithms scal-able. First, the basic grouping algorithm is expressed as acontinuous stream query in a data stream management sys-tem to allow for very large flows of requests. Second, follow-ing the divide–and–conquer paradigm, four space–partition-ing policies for dividing the input data stream into sub–streams are developed and implemented using continuousstream queries. Third, using the partitioning policies, par-allel implementations of the grouping algorithm in a paral-lel computing environment are described. Extensive experi-mental results show that the parallel implementation usingsimple adaptive partitioning methods can achieve speed–upsof several orders of magnitudes without significantly effect-ing the quality of the grouping.

Place, publisher, year, edition, pages
National Category
Computer Science Computer Science
URN: urn:nbn:se:uu:diva-103534OAI: oai:DiVA.org:uu-103534DiVA: diva2:218438
Available from: 2009-05-19 Created: 2009-05-19 Last updated: 2011-07-04Bibliographically approved
In thesis
1. Scalable Parallelization of Expensive Continuous Queries over Massive Data Streams
Open this publication in new window or tab >>Scalable Parallelization of Expensive Continuous Queries over Massive Data Streams
2011 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Numerous applications in for example science, engineering, and financial analysis increasingly require online analysis over streaming data. These data streams are often of such a high rate that saving them to disk is not desirable or feasible. Therefore, search and analysis must be performed directly over the data in motion. Such on-line search and analysis can be expressed as continuous queries (CQs) that are defined over the streams. The result of a CQ is a stream itself, which is continuously updated as new data appears in the queried stream(s). In many cases, the applications require non-trivial analysis, leading to CQs involving expensive processing. To provide scalability of such expensive CQs over high-volume streams, the execution of the CQs must be parallelized.

In order to investigate different approaches to parallel execution of CQs, a parallel data stream management system called SCSQ was implemented for this Thesis. Data and queries from space physics and traffic management applications are used in the evaluations, as well as synthetic data and the standard data stream benchmark; the Linear Road Benchmark. Declarative parallelization functions are introduced into the query language of SCSQ, allowing the user to specify customized parallelization. In particular, declarative stream splitting functions are introduced, which split a stream into parallel sub-streams, over which expensive CQ operators are continuously executed in parallel.

Naïvely implemented, stream splitting becomes a bottleneck if the input streams are of high volume, if the CQ operators are massively parallelized, or if the stream splitting conditions are expensive. To eliminate this bottleneck, different approaches are investigated to automatically generate parallel execution plans for stream splitting functions. This Thesis shows that by parallelizing the stream splitting itself, expensive CQs can be processed at stream rates close to network speed. Furthermore, it is demonstrated how parallelized stream splitting allows orders of magnitude higher stream rates than any previously published results for the Linear Road Benchmark.


Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2011. 35 p.
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 836
National Category
Computer Science
Research subject
Computer Science with specialization in Database Technology
urn:nbn:se:uu:diva-152255 (URN)978-91-554-8095-0 (ISBN)
Public defence
2011-09-20, Auditorium Minus, Museum Gustavianum, Akademigatan 3, Uppsala, 13:15 (English)
Available from: 2011-06-10 Created: 2011-04-27 Last updated: 2014-07-21

Open Access in DiVA

No full text

Other links


Search in DiVA

By author/editor
Gidófalvi, GyözöRisch, ToreZeitler, Erik
By organisation
Computing Science
Computer ScienceComputer Science

Search outside of DiVA

GoogleGoogle Scholar
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 281 hits
ReferencesLink to record
Permanent link

Direct link