English | 简体中文
cube-studio is a one-stop cloud-native machine learning platform open-sourced by Tencent Music. It currently provides the following features:
- 1. Data management: feature store with online and offline features; dataset management for structured and media data; data labeling platform
- 2. Development: notebooks (VS Code/Jupyter); Docker image management; online image building
- 3. Training: drag-and-drop online pipeline editor; open template market; distributed computing/training tasks, e.g. tf/pytorch/mxnet/spark/ray/horovod/kaldi/volcano; batch priority scheduling; resource monitoring/alerting/balancing; cron scheduling
- 4. AutoML: nni, katib, ray
- 5. Inference: model management; serverless traffic control; tf/pytorch/onnx/tensorrt model deployment with tfserving/torchserver/onnxruntime/triton inference; vGPU; load balancing, high availability, elastic scaling
- 6. Infrastructure: multi-user; multi-project; multi-cluster; edge cluster mode; blockchain-based sharing
https://github.com/tencentmusic/cube-studio/wiki
For learning, deployment, consulting, contribution, or cooperation, join the group: WeChat ID luanpeng1234, with the remark "open source". A construction guide is available in the wiki.
Tips:
- 1. You can develop your own templates; they are easy to build and can be tailored to your own scenarios
template | type | description |
---|---|---|
linux | base | Custom stand-alone operating environment; free to implement any custom single-machine function |
datax | import/export | Import and export between heterogeneous data sources |
media-download | data processing | Distributed download of media files |
video-audio | data processing | Distributed extraction of audio from video |
video-img | data processing | Distributed extraction of images from video |
sparkjob | data processing | Serverless Spark jobs |
ray | data processing | Multi-machine distributed computing with the Python Ray framework |
volcano | data processing | Multi-machine distributed computing with the Volcano framework |
xgb | machine learning | XGBoost model training and inference |
ray-sklearn | machine learning | scikit-learn on the Ray framework with multi-machine distributed parallel computing |
pytorchjob-train | model train | Multi-machine distributed training with PyTorch |
horovod-train | model train | Multi-machine distributed training with Horovod |
tfjob | model train | Multi-machine distributed training with TensorFlow |
tfjob-train | model train | Distributed training with TensorFlow: plain and runner methods |
tfjob-runner | model train | Distributed training with TensorFlow: runner method |
tfjob-plain | model train | Distributed training with TensorFlow: plain method |
kaldi-train | model train | Multi-machine distributed training with Kaldi |
tf-model-evaluation | model evaluate | Distributed model evaluation with TensorFlow 2.3 |
tf-offline-predict | model inference | Distributed offline model inference with TensorFlow 2.3 |
model-offline-predict | model inference | Distributed offline model inference, framework-agnostic |
deploy-service | model deploy | Deploy an inference service |
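As a rough illustration of what a custom template can look like: platform task templates are generally packaged as container images whose entrypoint receives its parameters as command-line flags that users fill in from the UI. The sketch below is a minimal, hypothetical entrypoint in that style; the flag names (`--input_path`, `--output_path`, `--num_workers`) are illustrative assumptions, not the cube-studio template specification.

```python
# Hypothetical sketch of a custom template entrypoint.
# Assumption: parameters arrive as command-line flags; the flag names
# below are made up for illustration, not part of any cube-studio spec.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="example custom template")
    parser.add_argument("--input_path", required=True, help="where to read data")
    parser.add_argument("--output_path", required=True, help="where to write results")
    parser.add_argument("--num_workers", type=int, default=1, help="degree of parallelism")
    return parser


def run(args: argparse.Namespace) -> str:
    # A real template would do its actual work here
    # (download, transform, train, ...). This sketch just
    # reports the parameters it was launched with.
    return (
        f"processing {args.input_path} -> {args.output_path} "
        f"with {args.num_workers} worker(s)"
    )


if __name__ == "__main__":
    print(run(build_parser().parse_args()))
```

Packaging this script as the entrypoint of a Docker image and registering that image as a template would then let the same container run both stand-alone (like the `linux` template) and as a step in a drag-and-drop pipeline.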
algorithm: @hujunaifuture @jaffe-fly @JLWLL @ma-chengcheng @chendile
platform: @xiaoyangmai @VincentWei2021 @SeibertronSS @cyxnzb @gilearn @wulingling0108