Saturn is a novel system for multi-model deep learning training that automatically optimizes submitted jobs for efficient execution. It selects parallelization techniques, determines resource allocations, and constructs execution schedules for those jobs. Applying Saturn to hyperparameter optimization or model selection requires only a few lines of code.
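To make the three decisions above concrete (parallelization technique, GPU allocation, and schedule), here is a minimal toy sketch. It is not Saturn's API or its solver: the `Option` class, the `simulate` and `plan` functions, the technique names, and all runtime numbers are hypothetical, and the brute-force search only stands in for whatever optimizer a real system would use.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Option:
    """One hypothetical way to run a job (all values are made up)."""
    technique: str   # e.g. "ddp" or "fsdp"
    gpus: int        # number of GPUs this option occupies
    runtime: float   # assumed profiled runtime estimate


def simulate(choices, total_gpus):
    """List-schedule the chosen options, longest job first.

    Each job takes the GPUs that become free earliest and starts
    once all of them are available.
    """
    free_at = [0.0] * total_gpus
    schedule = []
    for name, opt in sorted(choices.items(), key=lambda kv: -kv[1].runtime):
        idx = sorted(range(total_gpus), key=lambda i: free_at[i])[:opt.gpus]
        start = max(free_at[i] for i in idx)
        end = start + opt.runtime
        for i in idx:
            free_at[i] = end
        schedule.append((name, opt.technique, opt.gpus, start, end))
    makespan = max((entry[4] for entry in schedule), default=0.0)
    return makespan, schedule


def plan(jobs, total_gpus):
    """Try every combination of per-job options; keep the shortest makespan."""
    names = list(jobs)
    best = None
    for combo in product(*(jobs[n] for n in names)):
        if any(opt.gpus > total_gpus for opt in combo):
            continue  # this option cannot fit on the cluster at all
        makespan, schedule = simulate(dict(zip(names, combo)), total_gpus)
        if best is None or makespan < best[0]:
            best = (makespan, schedule)
    return best


# Two jobs on a 2-GPU node, each with a 1-GPU and a 2-GPU option.
jobs = {
    "model_a": [Option("ddp", 1, 10.0), Option("fsdp", 2, 6.0)],
    "model_b": [Option("ddp", 1, 10.0), Option("fsdp", 2, 6.0)],
}
makespan, schedule = plan(jobs, total_gpus=2)
print(makespan)
for entry in schedule:
    print(entry)
```

With these made-up numbers, running both jobs side by side on one GPU each finishes sooner than giving each job both GPUs in turn, which is why the three decisions are worth making jointly rather than one job at a time.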
Saturn is designed to be extensible: users can specify new execution procedures, which are then included in its optimization plan and search space. In this way, you can keep up with the latest advances in model execution optimizations without having to wait for library updates.
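One way to picture this kind of extensibility is a registry of execution procedures that the optimizer enumerates when it builds its search space. The sketch below is illustrative only and does not reflect Saturn's actual interface: the `register` decorator, the `(model_gb, gpus)` estimator signature, the 16 GB device memory limit, and every runtime formula are assumptions.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical estimator signature: (model size in GB, GPU count) ->
# estimated runtime, or None when the procedure cannot run that setup.
Estimator = Callable[[float, int], Optional[float]]

REGISTRY: Dict[str, Estimator] = {}
DEVICE_MEMORY_GB = 16.0  # assumed per-GPU memory


def register(name: str):
    """Decorator that adds an execution procedure to the registry."""
    def decorator(fn: Estimator) -> Estimator:
        REGISTRY[name] = fn
        return fn
    return decorator


@register("data_parallel")
def data_parallel(model_gb: float, gpus: int) -> Optional[float]:
    # Every GPU holds a full replica, so the model must fit on one device.
    if model_gb > DEVICE_MEMORY_GB:
        return None
    return 100.0 / gpus


@register("sharded")
def sharded(model_gb: float, gpus: int) -> Optional[float]:
    # The model is split evenly, so each shard must fit on one device.
    if model_gb / gpus > DEVICE_MEMORY_GB:
        return None
    return 120.0 / gpus


def search_space(model_gb: float, gpu_counts) -> Dict[Tuple[str, int], float]:
    """Enumerate every feasible (procedure, GPU count) pair."""
    space = {}
    for name, estimate in REGISTRY.items():
        for gpus in gpu_counts:
            runtime = estimate(model_gb, gpus)
            if runtime is not None:
                space[(name, gpus)] = runtime
    return space


print(search_space(40.0, [1, 2, 4]))


# A user-defined procedure joins the search space without touching
# the code above.
@register("cpu_offload")
def cpu_offload(model_gb: float, gpus: int) -> Optional[float]:
    return 300.0 / gpus


print(search_space(40.0, [1, 2, 4]))
```

The design point is that the optimizer depends only on the registry contents, so a newly registered procedure is considered alongside the built-in ones on the next planning pass.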
To install Saturn, please read the instructions. We're always excited to hear about new use cases and details of your experience with Saturn, so feel free to contact us at knagrech@ucsd.edu if you want to share news.
We currently prioritize PyTorch support, but Saturn's general techniques are framework-independent. We would welcome contributions for TensorFlow and JAX.
We welcome contributions to Saturn. Areas of particular interest are an alternative solver (e.g. using reinforcement learning), new interfaces, dashboards, and ways to support online job submissions. Please let us know if you encounter any bugs or have any suggestions by submitting an issue.
You can join the Slack here: https://join.slack.com/t/saturn-dl/shared_invite/zt-267mfi3s4-ifUYLiJUtaVeGFcYe9vbxA
You can find the docs for Saturn here.
If you use this system in an academic work, please cite our tech report as follows.
@article{nagrechasaturn,
title={Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads (Information System Architectures)},
author={Nagrecha, Kabir and Kumar, Arun}
}
Saturn is currently developed and maintained by Kabir Nagrecha at UCSD.
Saturn is released under the Apache License 2.0.