
[TPDS'21] COSCO: Container Orchestration using Co-Simulation and Gradient Based Optimization for Fog Computing Environments


imperial-qore/COSCO


COSCO Framework

COSCO is an AI-based coupled-simulation and container orchestration framework for integrated Edge, Fog and Cloud Computing Environments. It is a simple Python-based software solution in which academics and practitioners can develop, simulate, test and deploy their scheduling policies. This repo also presents a novel gradient-based optimization strategy that uses deep neural networks as surrogate functions, together with co-simulation, to facilitate decision making. A tutorial on the COSCO framework was presented at the International Conference on Performance Engineering (ICPE) 2022; a recording is available (see the ICPE Tutorial entry under Links).

Advantages of COSCO

  1. Hassle-free development of AI-based scheduling algorithms in integrated edge, fog and cloud infrastructures.
  2. Provides seamless integration of scheduling policies with a simulated back-end for enhanced decision making.
  3. Supports container migration in physical deployments (not supported by other frameworks) using the CRIU utility.
  4. Supports multiple deployment options to suit developers' needs (Vagrant VM testbed, VLAN Fog environment, Cloud-based deployment using Azure/AWS/OpenStack).
  5. Equipped with smart real-time graph generation of utilization metrics using InfluxDB and Grafana.
  6. Real-time metrics monitoring, logging and consolidated graph generation using a custom Stats logger.

The basic architecture of COSCO has two main packages:

  1. Simulator: a discrete event simulator that runs on a standalone system.
  2. Framework: a tool for testing scheduling algorithms in a physical (real-time) fog/cloud environment with real-world applications.
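
For readers unfamiliar with the term, the idea of a discrete event simulator can be sketched in a few lines. This is a generic toy illustration, not COSCO's Simulator package: the `simulate` function, the `(arrival_time, duration)` task format and the first-come-first-served host assignment are all assumptions made for this example.

```python
import heapq
from collections import deque

# Event kinds. FINISH sorts before ARRIVE so that a host freed at time t
# can be reused by a task arriving at the same time t.
FINISH, ARRIVE = 0, 1

def simulate(tasks, num_hosts):
    """Toy discrete event loop (hypothetical, not COSCO's simulator).

    tasks: list of (arrival_time, duration) tuples.
    Returns a dict mapping task index to its finish time.
    """
    events = [(arrival, ARRIVE, tid) for tid, (arrival, _) in enumerate(tasks)]
    heapq.heapify(events)
    free_hosts = num_hosts
    waiting = deque()      # tasks that arrived but have no host yet
    finish_times = {}
    while events:
        # Jump straight to the next event instead of stepping a clock.
        now, kind, tid = heapq.heappop(events)
        if kind == FINISH:
            finish_times[tid] = now
            free_hosts += 1
        else:
            waiting.append(tid)
        # Start waiting tasks, in arrival order, while hosts are free.
        while waiting and free_hosts > 0:
            nxt = waiting.popleft()
            free_hosts -= 1
            heapq.heappush(events, (now + tasks[nxt][1], FINISH, nxt))
    return finish_times

if __name__ == "__main__":
    print(simulate([(0, 5), (1, 2), (2, 4)], num_hosts=2))
```

The priority queue keeps events ordered by time, so the loop only does work when the simulated state actually changes.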

Supported workloads: (Simulator) Bitbrains and Azure2017/2019; (Framework) DeFog and AIoTBench.

Our main COSCO work uses the Bitbrains and DeFog workloads. An extended work, MCDS (see the workflow branch), accepted in IEEE TPDS, uses scientific workflows. See the paper and code.

Novel Scheduling Algorithms

We present two novel algorithms in this work: GOBI and GOBI*. GOBI uses a neural network as a surrogate model and performs gradient-based optimization by backpropagating gradients to the input. Advances like cosine annealing and momentum allow it to converge to an optimum quickly. Moreover, GOBI* leverages a coupled simulation engine, like a digital twin, to further improve the surrogate accuracy and subsequently the scheduling decisions. Experiments conducted using real-world data on fog applications show that the GOBI and GOBI* methods give a significant improvement in terms of energy consumption, response time, Service Level Objective and scheduling time, by up to 15, 40, 4, and 82 percent respectively, when compared to the state-of-the-art algorithms.
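
The core idea behind GOBI, optimizing the input of a surrogate network by gradient descent, can be illustrated with a small self-contained sketch. This is not the repository's implementation: the tiny fixed-weight network, the two-dimensional input (standing in for a relaxed scheduling decision), the `[0, 1]` box bounds and all function names are assumptions chosen to keep the example dependency-free. The cosine schedule is the plain variant without restarts.

```python
import math

# Hypothetical fixed weights of a tiny surrogate: 2 inputs -> 3 tanh units -> 1 output.
# In practice the surrogate would be a trained neural network.
W1 = [[0.8, -0.5], [-0.3, 0.9], [0.6, 0.4]]
B1 = [0.1, -0.2, 0.05]
W2 = [0.7, -0.6, 0.5]

def surrogate(x):
    """Predicted objective (lower is better) for input x."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, B1)]
    return sum(w * h for w, h in zip(W2, hidden))

def grad_wrt_input(x):
    """Gradient of the surrogate output with respect to its input (chain rule)."""
    grad = [0.0] * len(x)
    for row, b, w2 in zip(W1, B1, W2):
        pre = sum(w * xi for w, xi in zip(row, x)) + b
        dtanh = 1.0 - math.tanh(pre) ** 2
        for j, w in enumerate(row):
            grad[j] += w2 * dtanh * w
    return grad

def optimise_input(x0, steps=200, lr_max=0.5, lr_min=0.01, momentum=0.9):
    """Momentum gradient descent on the input with a cosine-annealed step size."""
    x = list(x0)
    velocity = [0.0] * len(x)
    best_x, best_val = list(x), surrogate(x)
    for t in range(steps):
        # Cosine annealing: step size decays smoothly from lr_max towards lr_min.
        lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / steps))
        g = grad_wrt_input(x)
        velocity = [momentum * v - lr * gi for v, gi in zip(velocity, g)]
        # Keep the input inside the assumed [0, 1] box.
        x = [min(1.0, max(0.0, xi + v)) for xi, v in zip(x, velocity)]
        val = surrogate(x)
        if val < best_val:
            best_x, best_val = list(x), val
    return best_x, best_val

if __name__ == "__main__":
    start = [0.5, 0.5]
    x_opt, f_opt = optimise_input(start)
    print("start:", surrogate(start), "optimised:", f_opt, "at", x_opt)
```

The weights stay frozen throughout; only the input moves, which is what distinguishes this from ordinary network training.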

Supplementary video


A detailed course on using the COSCO framework for deep learning based scheduling (deep surrogate optimization and co-simulation) in fog environments is available as a YouTube playlist.

Quick Start Guide

To run the COSCO framework, install the required packages using

python3 install.py

To run the code with the required scheduler, modify line 106 of main.py to one of the several options including LRMMTR, RF, RL, RM, Random, RLRMMTR, TMCR, TMMR, TMMTR, GA, GOBI.

scheduler = GOBIScheduler('energy_latency_'+str(HOSTS))

To run the simulator, use the following command

python3 main.py

Gitpod

You can run the code and check the results directly in a Gitpod Workspace without installing anything on your local machine. Click "Open in Gitpod" below and test the code by running python3 main.py.

Open in Gitpod

Wiki

Access the wiki for detailed installation instructions, a guide to implementing a custom scheduler, and steps for replicating the results. All execution traces and training data are available at Zenodo under a CC License.

Links

Items           Contents
Paper           https://ieeexplore.ieee.org/document/9448450 (with the "Code Reviewed Badge")
Pre-print       https://arxiv.org/pdf/2104.14392.pdf
Documentation   https://github.com/imperial-qore/COSCO/wiki
Video           https://youtu.be/RZOWTj0rfBQ
Tutorial        https://www.youtube.com/playlist?list=PLN_nzHzuaOBQijEwy2Fy8c09-dWYVe4XO
ICPE Tutorial   https://youtu.be/osjpaNmkm_w
Extensions      QoS aware scheduling (TPDS'22, code), Energy aware sustainable computing (JSS'21, code), EdgeAI (SIGMETRICS'21 Poster, NeurIPS'21 Workshop) and fault-tolerance (INFOCOM'22, code)
Contact         Shreshth Tuli (@shreshthtuli)
Funding         Imperial President's scholarship, H2020-825040 (RADON)

Cite this work

Our work is published in the IEEE TPDS journal. Cite it using the following BibTeX entry.

@article{tuli2021cosco,
  author={Tuli, Shreshth and Poojara, Shivananda R. and Srirama, Satish N. and Casale, Giuliano and Jennings, Nicholas R.},
  journal={IEEE Transactions on Parallel and Distributed Systems}, 
  title={{COSCO: Container Orchestration Using Co-Simulation and Gradient Based Optimization for Fog Computing Environments}}, 
  year={2022},
  volume={33},
  number={1},
  pages={101-116},
}

License

BSD-3-Clause. Copyright (c) 2021, Shreshth Tuli. All rights reserved.

See License file for more details.