• Media type: E-Article
  • Title: Massive scale-out of expensive continuous queries
  • Contributor: Zeitler, Erik; Risch, Tore
  • Published: Association for Computing Machinery (ACM), 2011
  • Published in: Proceedings of the VLDB Endowment, 4 (2011) 11, pages 1181-1188
  • Language: English
  • DOI: 10.14778/3402707.3402752
  • ISSN: 2150-8097
  • Keywords: General Medicine
  • Description: Scalable execution of expensive continuous queries over massive data streams requires input streams to be split into parallel sub-streams. The query operators are continuously executed in parallel over these sub-streams. Stream splitting involves both partitioning and replication of incoming tuples, depending on how the continuous query is parallelized. We provide a stream splitting operator that enables such customized stream splitting. However, it is critical that the stream splitting itself keeps up with high-volume input streams. This becomes a problem when the stream splitting predicates are costly to evaluate. Therefore, to enable customized splitting of high-volume streams, we introduce a parallelized stream splitting operator, called parasplit. We investigate the performance of parasplit both with a cost model and experimentally. Based on these results, a heuristic is devised to automatically parallelize the execution of parasplit. We show that the maximum stream rate of parasplit is network bound, and that the parallelization is energy efficient. Finally, the scalability of our approach is experimentally demonstrated on the Linear Road Benchmark, showing a stream processing rate an order of magnitude higher than previously published results, allowing at least 512 expressways.