Distributed placement of machine learning operators for IoT applications spanning edge and cloud resources

By Tarek Elgamal, Atul Sandur, Gul Agha, and Klara Nahrstedt. In SysML Conference 2018, Stanford, CA, USA, February 15-16, 2018, 1–3. 2018.

Publisher Link: https://www.sysml.cc/doc/204.pdf

Abstract

Internet of Things (IoT) applications generate massive amounts of real-time data that are typically processed for carrying out complex tasks such as vision and speech processing. Owners of such data strive to make predictions/inference from large streams of complex data such as video feeds, often using pre-trained neural network models. A typical deployment of IoT applications includes edge devices to acquire the input data and provide processing and storage capacity closer to the location where the data is captured. This can obviate the need to move all the data/processing to a remote cloud service. However, since edge devices are limited in computational capacity, we need to determine the optimal placement of operations between edge and remote cloud resources to optimize the performance of neural network model inference. In this paper we propose an algorithm to decide the partitioning of neural network operations across edge and cloud resources. Our algorithm is linear in the number of operations (m) of the neural network model and the overall complexity is O(m · (L!) · L), where L is the number of resources (typical deployments include one edge resource and the cloud, in which case L is 2).
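
To make the edge/cloud partitioning idea concrete, the following is a minimal sketch for the two-resource case (L = 2) only. It is not the authors' algorithm: it assumes the model is a simple chain of m operations, that a prefix of the chain runs on the edge and the remainder in the cloud, and that latency is the only objective. The function name `best_split` and the parameters `edge_time`, `cloud_time`, `out_size`, `input_size`, and `bandwidth` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def best_split(
    edge_time: List[float],   # assumed per-operation latency on the edge device
    cloud_time: List[float],  # assumed per-operation latency in the cloud
    out_size: List[float],    # assumed output size of each operation
    input_size: float,        # assumed size of the raw input captured at the edge
    bandwidth: float,         # assumed edge-to-cloud bandwidth (size units per time unit)
) -> Tuple[int, float]:
    """Return (k, latency): operations [0, k) run on the edge, [k, m) in the cloud.

    Illustrative only: a single pass over the m + 1 possible cut points of a
    chain, keeping running sums so the scan is linear in m.
    """
    m = len(edge_time)
    edge_prefix = 0.0               # time spent on the edge before the cut
    cloud_suffix = sum(cloud_time)  # time spent in the cloud after the cut
    best_k, best_latency = 0, float("inf")

    for k in range(m + 1):
        if k == m:
            # Everything runs on the edge; nothing is sent to the cloud.
            transfer = 0.0
        else:
            # Send the raw input (k == 0) or the output of operation k - 1.
            sent = input_size if k == 0 else out_size[k - 1]
            transfer = sent / bandwidth

        latency = edge_prefix + transfer + cloud_suffix
        if latency < best_latency:
            best_k, best_latency = k, latency

        if k < m:
            # Move operation k from the cloud side to the edge side.
            edge_prefix += edge_time[k]
            cloud_suffix -= cloud_time[k]

    return best_k, best_latency
```

With the made-up inputs `best_split([4, 4, 20], [1, 1, 2], [50, 10, 1], 100, 10)`, the sketch returns `(2, 11.0)`: the first two operations shrink the data enough that it pays to run them on the edge before sending the result to the cloud. Handling more than two resources would require also choosing an order in which the resources are used, which is where a factor such as L! in the stated complexity could plausibly come from; that generalization is not shown here.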

BibTeX

@inproceedings{conf/Sysml2018,
    author = "Elgamal, Tarek and Sandur, Atul and Agha, Gul and
              Nahrstedt, Klara",
    title = "Distributed Placement of Machine Learning Operators for
             IoT applications spanning Edge and Cloud Resources",
    booktitle = "SysML Conference 2018, Stanford, CA, USA,
                 February 15-16, 2018",
    pages = "1--3",
    timestamp = "Tue, 25 Sep 2018 01:00:00 +0200",
    url = "https://www.sysml.cc/doc/204.pdf",
    year = "2018",
}