Add roadmap. (#1317)
* Add roadmap.

* Update ROADMAP.md
concretevitamin authored Oct 31, 2022
1 parent 99fe087 commit e49c726
Showing 1 changed file with 59 additions and 0 deletions.
59 changes: 59 additions & 0 deletions ROADMAP.md
@@ -0,0 +1,59 @@
# SkyPilot Roadmap

This doc lists general directions of interest to facilitate community contributions.

Note that:
- This list is not meant to be comprehensive (i.e., new work items of interest may pop up)
- Even though listed under a specific version, not all items need to be completed before we ship that version (i.e., some items can go into future versions)

## v0.3

### Managed Spot
- Minimize the cost of the controller
- Support running spot controller on an existing/local cluster
- Reduce the fixed cost of the controller (e.g., allow setting the controller VM type)
- Support a higher number of pending/concurrent jobs
- Framework-specific guides to add checkpointing/reloading using SkyPilot Storage

### Smarter Optimizer
- Fine-grained optimizer: try zones in cheapest-first order
- Better account for data egress time/cost
- Consider buckets/Storage objects in file_mounts
- Optimize data placement for SkyPilot Storage local uploads
- Use the optimizer to decide the bucket location
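
As a rough illustration of the cheapest-first zone ordering mentioned above, here is a minimal sketch. It is not SkyPilot's optimizer: the `ZoneOffer` class, the `order_zones_by_price` helper, and the prices are all hypothetical, and a real optimizer would also weigh availability and egress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoneOffer:
    """Hypothetical record of one zone's price for a given instance type."""
    cloud: str
    zone: str
    hourly_price: float  # USD per hour; made-up numbers for illustration


def order_zones_by_price(offers):
    """Return offers sorted cheapest first; ties are broken by zone name."""
    return sorted(offers, key=lambda o: (o.hourly_price, o.zone))


offers = [
    ZoneOffer("aws", "us-east-1a", 0.92),
    ZoneOffer("gcp", "us-central1-a", 0.88),
    ZoneOffer("aws", "us-west-2b", 0.95),
]
ranked = order_zones_by_price(offers)
for offer in ranked:
    print(f"{offer.cloud}:{offer.zone} ${offer.hourly_price:.2f}/hr")
```

A launcher could then attempt provisioning in `ranked` order and fall through to the next zone on failure.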

### Programmatic API
- Refactor/extend the current API to *make it easy to programmatically use SkyPilot*
- Expose core classes in docs

### Support more clouds
- Refactor interfaces to ease adding new clouds
- IBM Cloud
- Explore support for low-cost clouds (e.g., Lambda Labs/RunPod/Jarvis Labs)

### On-prem
- Make the on-prem feature more robust
- Design for switching between cloud and on-prem
- Explore/design a "local mode" to run SkyPilot tasks locally

### Faster launching speed
- Consider a more minimal image
- Azure speed investigation

### k8s support
- Ray-on-k8s backend
- To figure out: Launch a new k8s cluster? Launch SkyPilot Tasks to an existing k8s cluster?

### Cost: Optimization, Tracking, and Reporting
- Track and show costs related to a job/cluster
- For managed spot jobs, track and show % savings vs. on-demand
- Optimizer: take into account disk costs
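
The "% savings vs. on-demand" figure above could be computed along these lines. This is only a sketch of the arithmetic: the `spot_savings_percent` function and the dollar amounts are hypothetical and not part of SkyPilot.

```python
def spot_savings_percent(spot_cost, on_demand_cost):
    """Percent saved by running on spot instead of on-demand.

    Both arguments are total costs for the same job, in the same currency.
    """
    if on_demand_cost <= 0:
        raise ValueError("on_demand_cost must be positive")
    return 100.0 * (on_demand_cost - spot_cost) / on_demand_cost


# Hypothetical job: $3.00 on spot vs. an estimated $10.00 on-demand.
savings = spot_savings_percent(spot_cost=3.0, on_demand_cost=10.0)
print(f"Saved {savings:.0f}% vs. on-demand")
```

The on-demand figure here would have to be an estimate (e.g., job duration times the on-demand hourly price), since the job never actually ran on on-demand instances.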

### Serverless
- Design and prototype a "serverless jobs" submission API and CLI
- Initial use case: hundreds of hyperparameter tuning trials

### Backend
- Support heterogeneous node types in a cluster (e.g., in RL, CPU actor(s) and GPU learner(s) in the same cluster)
- Support CPUs as resource requirements
- General robustness/UX improvements
