Faculty Advisor or Committee Member

Elke A. Rundensteiner, Advisor

Identifier

etd-3686

Abstract

Streaming analytics applications deploy Kleene pattern queries to detect and aggregate event trends against high-rate data streams. Despite increasing workloads, most state-of-the-art systems process each query independently, thus missing cost-saving sharing opportunities. Sharing complex event trend aggregation poses several technical challenges. First, the execution of nested and diverse Kleene patterns is difficult to share. Second, we must share aggregate computation without the exponential costs of constructing the event trends. Third, not all sharing opportunities are beneficial because sharing aggregation introduces overhead. We propose a novel framework, Muse (Multi-query Snapshot Execution), that shares aggregation queries with Kleene patterns while avoiding expensive trend construction. It adopts an online sharing strategy that eliminates re-computations for shared sub-patterns. To determine a beneficial sharing plan, we introduce a cost model to estimate the sharing benefit and design the Muse refinement algorithm to efficiently select robust sharing candidates from the search space. Finally, we explore optimization decisions to further improve performance. Our experiments over a wide range of scenarios demonstrate that Muse increases throughput by 4 orders of magnitude compared to state-of-the-art approaches with negligible memory requirements.

Publisher

Worcester Polytechnic Institute

Degree Name

MS

Department

Data Science

Project Type

Thesis

Date Accepted

2020-05-07

Accessibility

Unrestricted

Subjects

complex event processing, event trend, incremental aggregation, multi-query optimization

Available for download on Friday, May 07, 2021