Abstract
The Internet of Things (IoT) has enabled an abundance of geographically distributed physical devices, or "things", equipped with sensors and actuators to exchange information with the Cloud. However, this paradigm remains largely under-exploited for real-time analytics applications. The benefit of real-time data acquisition at the Edge is lost when the data is not readily accessible to the more powerful data analytics tools in the Cloud because of wide-area network delays. In this paper, we present VRebalance, a virtual resource orchestrator that provides an end-to-end performance guarantee for concurrent stream processing workloads at the Edge. VRebalance employs Bayesian Optimization (BO) to quickly identify near-optimal resource configurations. Experimental results with RIoTBench, a real-time open-source IoT benchmark for distributed stream processing platforms, and Apache Storm, a representative stream processing engine, demonstrate the superior performance, resource efficiency, and adaptiveness of our BO-based resource management system. VRebalance meets the performance service level objective (SLO) targets for stream processing workloads even in the presence of acute system dynamics. Compared to a hill climbing method, it decreases the SLO violation rate by at least 34% for static workloads and by 62.5% for dynamic workloads. Compared to Storm's default resource scaling mechanism, it decreases the SLO violation rate by 83.7%.