Open Systems Laboratory at Illinois

Streaming analytics with adaptive near-data processing

By Atul Sandur, ChanHo Park, Stavros Volos, Gul Agha, and Myeongjae Jeon. In Companion Proceedings of the Web Conference 2022, WWW '22, 563–566. New York, NY, USA, 2022. Association for Computing Machinery.

DOI: 10.1145/3487553.3524858
Publisher Link: https://doi.org/10.1145/3487553.3524858

Abstract

Streaming analytics applications need to process massive volumes of data in a timely manner, in domains ranging from datacenter telemetry and geo-distributed log analytics to Internet-of-Things systems. Such applications suffer from significant network transfer costs to transport the data to a stream processor and compute costs to analyze the data in a timely manner. Pushing the computation closer to the data source by partitioning the analytics query is an effective strategy to reduce resource costs for the stream processor. However, the partitioning strategy depends on the nature of the resource bottleneck and the resource variability encountered at the compute resources near the data source. In this paper, we investigate issues that affect query partitioning strategies. We first study new partitioning techniques within cloud datacenters, which operate under constrained compute conditions that vary widely across data sources and time slots. With insights obtained from the study, we suggest several ways to improve the performance of stream analytics applications operating in different resource environments, by making effective partitioning decisions for a variety of use cases such as geo-distributed streaming analytics.

BibTeX

@inproceedings{10.1145/3487553.3524858,
    author = "Sandur, Atul and Park, ChanHo and Volos, Stavros and
              Agha, Gul and Jeon, Myeongjae",
    title = "Streaming Analytics with Adaptive Near-Data Processing",
    address = "New York, NY, USA",
    booktitle = "Companion Proceedings of the Web Conference 2022",
    doi = "10.1145/3487553.3524858",
    ee = "https://doi.org/10.1145/3487553.3524858",
    isbn = "9781450391306",
    keywords = "Datacenter monitoring, Streaming analytics, Edge
                computing, Wide area network, Query partitioning",
    location = "Virtual Event, Lyon, France",
    numpages = "4",
    pages = "563--566",
    publisher = "Association for Computing Machinery",
    series = "WWW '22",
    year = "2022",
}