This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

ITREX Distributed Training

Version: 0.1.0 Type: application AppVersion: 1.16.0

A Helm chart for Kubernetes

Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| distributed.eval.batch_size | int | `64` | Evaluation batch size |
| distributed.image.image_name | string | `"intel/ai-tools"` | |
| distributed.image.image_tag | string | `"itrex-devel-1.1.0"` | |
| distributed.model_name_or_path | string | `"distilbert-base-uncased"` | Name of the model to train |
| distributed.resources.cpu | int | `32` | Number of CPUs per pod |
| distributed.resources.memory | string | `"16Gi"` | Amount of memory per pod |
| distributed.task_name | string | `"sst2"` | Name of the ITREX task |
| distributed.teacher_model_name_or_path | string | `"textattack/bert-base-uncased-SST-2"` | Name of the Hugging Face teacher model to train from |
| distributed.train.batch_size | int | `64` | Training batch size |
| distributed.workers | int | `4` | Number of workers (world size) |
| metadata.name | string | `"itrex-distributed"` | |
| metadata.namespace | string | `"kubeflow"` | |
| pvc.name | string | `"itrex"` | Name of the PVC for the output directory |
| pvc.resources | string | `"2Gi"` | Amount of storage for the output directory |
| pvc.scn | string | `"nil"` | StorageClassName of the PVC |
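
The keys above can be overridden with a custom values file. The following is a minimal sketch, assuming the dotted keys map to nested YAML in the usual Helm way. The file name `my-values.yaml` and every value shown are hypothetical and chosen only for illustration; they are not recommended settings.

```yaml
# my-values.yaml (hypothetical file name)
# Only the keys being changed need to appear; everything else keeps
# the chart defaults listed in the table above.
distributed:
  workers: 2            # illustrative: smaller world size than the default of 4
  train:
    batch_size: 32      # illustrative value
  eval:
    batch_size: 32      # illustrative value
  resources:
    cpu: 16             # illustrative: CPUs per pod
    memory: "16Gi"      # same as the chart default
pvc:
  name: "itrex"         # same as the chart default
  resources: "4Gi"      # illustrative: storage for the output directory
```

A file like this would typically be passed to Helm with the `-f` flag, for example `helm install <release-name> <path-to-chart> -f my-values.yaml`, where the release name and chart path are placeholders to fill in for your setup.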