Dummy Model training locally, on Polyaxon and on SageMaker

Train Locally

#!/bin/bash
pip install -r requirements.txt
python ./src/get_data.py --train_channel=/{{ train_channel }}
python ./src/local_train.py --penalty={{ penalty }} \
                            --C={{ C }} \
                            --train_channel={{ train_channel }} \
                            --model_dir={{ model_dir }}

Parameter	Description	Valid Values	Default
C	Intensity of regularisation	float	1.0
penalty	Penalty to be used for regularisation	l1, l2	l2
train_channel	Local directory of training data	str	-
model_dir	Local directory to export the model	str	-
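
As an illustration of how the parameters in the table map onto command-line flags, here is a minimal sketch using `argparse`. It is not taken from `src/local_train.py`, which may parse its arguments differently. The function name `build_parser` is hypothetical, and treating `train_channel` and `model_dir` as required (because the table lists no default) is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Train the dummy model locally")
    # Defaults below are the ones listed in the table.
    parser.add_argument("--C", type=float, default=1.0,
                        help="Intensity of regularisation")
    parser.add_argument("--penalty", choices=["l1", "l2"], default="l2",
                        help="Penalty to be used for regularisation")
    # Assumption: no default in the table means the flag must be supplied.
    parser.add_argument("--train_channel", type=str, required=True,
                        help="Local directory of training data")
    parser.add_argument("--model_dir", type=str, required=True,
                        help="Local directory to export the model")
    return parser
```

Calling `build_parser().parse_args(["--train_channel=data", "--model_dir=model"])` would then yield a namespace with `C` and `penalty` set to their defaults.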

Polyaxon

Log in to Polyaxon

pip install -U polyaxon-cli
polyaxon config set --host=****** --port=******
polyaxon login --username=****** --password=******

Validate that you are logged in: `polyaxon cluster`

Train on Polyaxon

Assumptions for the following to work

You need to have a Polyaxon cluster running.

- Create a project
`polyaxon project create --name=project-1`

- Initialise the project
`polyaxon init project-1`

- Download the data to the cluster
`polyaxon run -f polyaxonfiles/data.yml -u`

- Upload the code to Polyaxon and run experiments
`polyaxon run -f polyaxonfiles/cpu.yml`

- See how many resources experiment `3` is using
`polyaxon experiment -xp 3 resources`

- Start a Jupyter notebook
`polyaxon notebook start -f polyaxonfiles/notebook.yml`

SageMaker

High Level Workflow

With SageMaker, you first build a Docker image that holds the training environment and the training code, and push it to ECR. The training data lives in S3. At runtime, SageMaker pulls the Docker image from ECR and downloads the training data from S3. You therefore need the training image in ECR and the data in S3, and you must give the SageMaker configuration both locations together with a role that has access to these resources.
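
To make the three ingredients (ECR image, S3 data, role) concrete, the sketch below assembles a training job definition as a plain dictionary. The field names follow the `TrainingJobDefinition` structure of the boto3 SageMaker `create_hyper_parameter_tuning_job` request. The function name, the channel name `train`, the instance type, the volume size and the time limit are illustrative assumptions, and this is not the content of `sagemaker/create_hp_job.py`.

```python
from typing import Any, Dict


def training_job_definition(image_uri: str, train_s3_uri: str,
                            output_s3_uri: str, role_arn: str) -> Dict[str, Any]:
    """Hypothetical helper: ties the ECR image, S3 data and IAM role together."""
    return {
        # Docker image that SageMaker pulls from ECR at runtime.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        # Role that must be able to read the image and the S3 data.
        "RoleArn": role_arn,
        # Training data that SageMaker downloads from S3.
        "InputDataConfig": [
            {
                "ChannelName": "train",  # assumed channel name
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Placeholder resources; adjust to the workload.
        "ResourceConfig": {
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```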

Assumptions for the following to work

- Generate the dummy training data using the `get_data.py` script
- Upload the training data to an S3 bucket
- In `sagemaker/create_hp_job.py` update:
  - the S3 bucket
  - the ECR repo
  - the RoleArn

Build and Push Docker Image for Training

docker build -f sagemaker/dockerfiles/train.Dockerfile -t sm_train .
docker tag sm_train aws_account_id.dkr.ecr.region.amazonaws.com/ecr_repo_name:tag
docker push aws_account_id.dkr.ecr.region.amazonaws.com/ecr_repo_name:tag
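
The image name in the commands above follows the pattern `aws_account_id.dkr.ecr.region.amazonaws.com/ecr_repo_name:tag`. As a small sketch, the helpers below build that URI and the three Docker commands from their parts, so the same value can be reused when configuring the job. Both function names are hypothetical and the example values are placeholders.

```python
from typing import List


def ecr_image_uri(account_id: str, region: str, repo_name: str, tag: str) -> str:
    """Compose the ECR image URI used by `docker tag` and `docker push`."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:{tag}"


def docker_commands(image_uri: str, local_tag: str = "sm_train",
                    dockerfile: str = "sagemaker/dockerfiles/train.Dockerfile"
                    ) -> List[List[str]]:
    """Return the build, tag and push commands as argument lists."""
    return [
        ["docker", "build", "-f", dockerfile, "-t", local_tag, "."],
        ["docker", "tag", local_tag, image_uri],
        ["docker", "push", image_uri],
    ]
```

Each argument list could be passed to `subprocess.run(cmd, check=True)` to execute the corresponding step.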

Train

python sagemaker/create_hp_job.py --tuning_job_name={{ tuning_job_name }}
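
For orientation, a hyperparameter tuning job over the `C` and `penalty` parameters could be configured roughly as in the sketch below. The structure follows the `HyperParameterTuningJobConfig` argument of boto3's `create_hyper_parameter_tuning_job`. The search range for `C`, the objective metric name, the strategy and the job limits are all assumptions chosen for illustration, and the real `create_hp_job.py` may be set up differently.

```python
from typing import Any, Dict


def tuning_job_config(max_jobs: int = 10, max_parallel_jobs: int = 2) -> Dict[str, Any]:
    """Hypothetical tuning configuration for the C and penalty parameters."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            # Assumed metric name; a custom image also needs a matching
            # metric definition so SageMaker can read it from the logs.
            "MetricName": "validation:accuracy",
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel_jobs,
        },
        "ParameterRanges": {
            # The API expects range bounds as strings; the bounds are assumed.
            "ContinuousParameterRanges": [
                {"Name": "C", "MinValue": "0.01", "MaxValue": "10",
                 "ScalingType": "Logarithmic"},
            ],
            "CategoricalParameterRanges": [
                {"Name": "penalty", "Values": ["l1", "l2"]},
            ],
        },
    }
```

A dictionary like this would be passed as `HyperParameterTuningJobConfig`, alongside `HyperParameterTuningJobName` and `TrainingJobDefinition`, when calling `create_hyper_parameter_tuning_job` on a boto3 SageMaker client.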

andreas-gompos/hyperparameter-tuning-at-scale
