neuston_sbatch
The neuston_sbatch.py module allows a user to create and submit an sbatch script for a neuston_net.py TRAIN or neuston_net.py RUN job on a SLURM-enabled system. An example invocation is sketched after the usage listing below.
usage: neuston_sbatch.py [-h] [--job-name STR] [--email EMAIL] [--walltime HH:MM:SS]
[--gpu-num INT] [--cpu-num INT] [--mem-per-cpu MB]
[--slurm-log-dir DIR] [--ofile OFILE] [--dry-run]
[--batch SIZE] [--loaders N] {TRAIN,RUN} ...
SLURM SBATCH auto-submitter for neuston_net.py
positional arguments:
{TRAIN,RUN} These sub-commands are mutually exclusive. The sub-commands are identical to
the TRAIN and RUN commands from "neuston_net.py"
Note: optional arguments (below) must be specified before "TRAIN" or "RUN"
TRAIN Train a new model.
RUN Run a previously trained model.
optional arguments:
-h, --help show this help message and exit
SLURM Args:
--job-name STR       Job Name that will appear in slurm jobs list. Default is "NN"
--email EMAIL Email address for slurm notifications. The default is "{USERNAME}@whoi.edu"
--walltime HH:MM:SS Set Slurm Task max runtime. Default is "24:00:00"
--gpu-num INT Number of GPUs to allocate per task. Default is 1
--cpu-num INT Number of CPUs to allocate per task. Default is 4
--mem-per-cpu MB Memory to allocate per cpu in MB. Default is 10240MB
--slurm-log-dir DIR Directory to save slurm log file to.
Defaults to OUTDIR (as defined by TRAIN or RUN subcommand)
--ofile OFILE Save location for generated sbatch file.
Defaults to "{OUTDIR}/{PID}.{JOB_NAME}.sbatch"
--dry-run Create the sbatch script but do not run it
NN Common Args:
--batch SIZE         Number of images per batch. Default is 108
--loaders N Number of data-loading threads. 4 per GPU is typical. Default is 4
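For example, a submission might look like the following minimal sketch. The job name, email address, and the `<TRAIN args>` / `<RUN args>` placeholders are assumptions for illustration only; the sub-command arguments themselves are identical to those of neuston_net.py and are documented on its page.

```bash
# Hypothetical TRAIN submission. Note that the SLURM options must come
# BEFORE the TRAIN/RUN sub-command, and the neuston_net.py arguments
# (shown here only as a placeholder) come after it.
python neuston_sbatch.py \
    --job-name my-training-run \
    --email username@whoi.edu \
    --walltime 24:00:00 \
    --gpu-num 1 --cpu-num 4 \
    TRAIN <TRAIN args>

# Generate the sbatch script without submitting it, so it can be inspected first:
python neuston_sbatch.py --dry-run --ofile preview.sbatch RUN <RUN args>
```

With `--dry-run`, the generated sbatch file (at the `--ofile` path, or the default "{OUTDIR}/{PID}.{JOB_NAME}.sbatch") can be reviewed and submitted manually later.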