-
Notifications
You must be signed in to change notification settings - Fork 1
/
DESIGN
31 lines (26 loc) · 1.61 KB
/
DESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
There will be 2 file structures for the jobs. A local structure and a cloud structure. The local structure will be used by the extension to create jobs. The local agent will watch and read the local structure and then create a corresponding job with VM allocations in the remote structure. The remote agent will pull from the remote structure and execute jobs on the local VM.
Local Structure:
jobs/
|__<job1_id>.yaml (YAML file describing the job config)( <job1_id> A unique ID created by the extension)
|
|__<job2_id>.yaml ....
Remote Structure:
jobs/
|__<job1_id>
| |_ machine1 (machine name that should do the job)
| |_ <dir> (dir with content from the <dir> attribute in job config file)
| |_ <job1_id>.yaml (job config file)
| |_ RUNNING (An empty file marking that the job is already running)
| |_ DONE (An empty file marking thet the job is complete)
| |_ <my_notebook>.nbconvert.ipynb ( output notebook file)
| |_ machine2 ...
|
|__<job2_id> ....
config.yaml format:
machine_type: n1-standard-8 (or any of the types in https://cloud.google.com/compute/docs/machine-types)
machine_count: 2 ( number of gce instances to create. minimum is 1)
gpu_type: nvidia-tesla-p100
gpu_count: 1
notebook: relative/path/to/<my_notebook>.ipynb ( Notebook file to submit)
dir: dir/to/copy ( The directory that contains the above path.
zone: us-west1-b ( machines zone. If not present, we will use notebook machine zone)