Elastic Scaling for Distributed Latency-sensitive Data Stream Operators

Abstract

High-volume data streams are straining the limits of stream processing frameworks which need advanced parallel processing capabilities to withstand the actual incoming bandwidth. Parallel processing must be synergically integrated with elastic features in order dynamically scale the amount of utilized resources by accomplishing the Quality of Service goals in a costeffective manner. This paper proposes a control-theoretic strategy to drive the elastic behavior of latency-sensitive streaming operators in distributed environments. The strategy takes scaling decisions in advance by relying on a predictive model-based approach. Our ideas have been experimentally evaluated on a cluster using a real-world streaming application fed by synthetic and real datasets. The results show that our approach takes the strictly necessary reconfigurations while providing reduced resource consumption. Furthermore, it allows the operator to meet desired average latency requirements with a significant reduction in the experienced latency jitter.

Publication
Proceedings of the 25th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2017