Shreyas Bhat edited this page Aug 10, 2023 · 24 revisions

Overview

jobsub_lite is a wrapper for Condor job submission. It is intended to be backwards compatible with the actively used options of the past Fermilab jobsub tools, while being smaller, easier to maintain, and able to handle new requirements (e.g., SciTokens authentication).

User Documentation

Fermilab-facing documentation is provided on the Jobsub Lite page of the Fife Wiki.

Commands

jobsub_submit

Submits a new job or group of jobs to the batch system, where they will run. You can collect the output later with condor_transfer_data or jobsub_fetchlog.
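
As a rough illustration of how the options in the usage text combine, the following Python sketch assembles a jobsub_submit argument list without running it. The helper name `build_submit_argv`, the group `dune`, the script path, and all values are hypothetical and chosen only for this example; only the flags themselves come from the help output.

```python
import shlex


def build_submit_argv(group, executable, exe_args=(), copies=1,
                      memory=None, disk=None, lifetime=None, env=None):
    """Assemble a jobsub_submit argument list from flags in the usage text."""
    argv = ["jobsub_submit", "-G", group]
    if copies > 1:
        argv += ["-N", str(copies)]              # -N N: submit N copies
    if memory:
        argv += ["--memory", memory]             # NUMBER[UNITS], default MB
    if disk:
        argv += ["--disk", disk]                 # NUMBER[UNITS], default KB
    if lifetime:
        argv += ["--expected-lifetime", lifetime]
    for name, value in (env or {}).items():
        argv += ["-e", f"{name}={value}"]        # the -e VAR=VAL idiom
    argv.append(executable)                      # positional: executable
    argv += list(exe_args)                       # positional: its arguments
    return argv


# Hypothetical group, path, and values, for illustration only.
argv = build_submit_argv(
    "dune", "/path/to/my_job.sh", ["100"],
    copies=10, memory="2GB", disk="10GB", lifetime="3h",
    env={"FOO": "BAR"},
)
print(shlex.join(argv))
```

Building the command as a list keeps each option and its value as separate arguments, so the list can be passed to `subprocess.run` without shell quoting issues once you are ready to submit for real.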

usage: jobsub_submit [-h] [--auth-methods AUTH_METHODS] [-G GROUP]
                     [--global-pool GLOBAL_POOL] [--role ROLE]
                     [--subgroup SUBGROUP] [--verbose VERBOSE] [--debug]
                     [--devserver] [--version] [--support-email]
                     [--job-info JOB_INFO] [-c APPEND_CONDOR_REQUIREMENTS]
                     [--blocklist BLOCKLIST] [-r R] [-i I] [-t T]
                     [--cmtconfig CMTCONFIG] [--cpu CPU] [--dag]
                     [--dataset-definition DATASET_DEFINITION]
                     [--dd-percentage DD_PERCENTAGE]
                     [--dd-extra-dataset DD_EXTRA_DATASET] [--disk DISK]
                     [-d tag dir] [--email-to EMAIL_TO] [-e ENVIRONMENT]
                     [--expected-lifetime EXPECTED_LIFETIME] [-f INPUT_FILE]
                     [--generate-email-summary] [-L LOG_FILE] [-l LINES]
                     [--need-storage-modify NEED_STORAGE_MODIFY]
                     [--need-scope NEED_SCOPE] [--project-name PROJECT_NAME]
                     [-Q] [--mail_on_error] [--mail_always]
                     [--maxConcurrent MAXCONCURRENT] [--memory MEMORY] [-N N]
                     [-n] [--no-env-cleanup] [--OS OS]
                     [--overwrite-condor-requirements OVERWRITE_CONDOR_REQUIREMENTS]
                     [--resource-provides RESOURCE_PROVIDES]
                     [--skip-check SKIP_CHECK] [--tar_file_name TAR_FILE_NAME]
                     [--tarball-exclusion-file TARBALL_EXCLUSION_FILE]
                     [--timeout TIMEOUT] [--use-cvmfs-dropbox]
                     [--use-pnfs-dropbox] [--site SITE | --onsite | --offsite]
                     [--singularity-image SINGULARITY_IMAGE | --no-singularity]
                     [executable] ...

positional arguments:
  executable            executable for job to run
  exe_arguments         arguments to executable

optional arguments:
  -h, --help            show this help message and exit
  --global-pool GLOBAL_POOL
                        direct jobs/commands to a particular known global
                        pool. Currently known pools are: dune
  --devserver           Use jobsubdevgpvm01 etc. to submit
  --job-info JOB_INFO   script to call with jobid and command line when job is
                        submitted
  -c APPEND_CONDOR_REQUIREMENTS, --append_condor_requirements APPEND_CONDOR_REQUIREMENTS, --append-condor-requirements APPEND_CONDOR_REQUIREMENTS
                        append condor requirements
  --blocklist BLOCKLIST, --blacklist BLOCKLIST
                        ensure that jobs do not land at these (comma-
                        separated) sites
  -r R                  Experiment release version
  -i I                  Experiment release dir
  -t T                  Experiment test release dir
  --cmtconfig CMTCONFIG
                        Set up minervasoft release built with cmt
                        configuration. default is $CMTCONFIG
  --cpu CPU             request worker nodes have at least NUMBER cpus
  --dag                 submit and run a dagNabbit input file
  --dataset-definition DATASET_DEFINITION, --dataset_definition DATASET_DEFINITION, --dataset DATASET_DEFINITION
                        SAM dataset definition used in a Directed Acyclic
                        Graph (DAG)
  --dd-percentage DD_PERCENTAGE
                        percentage to apply to SAM dataset size for --dataset-
                        definition start job.
  --dd-extra-dataset DD_EXTRA_DATASET
                        SAM dataset definition start script extra dataset to
                        check as staged. You can add multiple of them.
  --disk DISK           Request worker nodes have at least NUMBER[UNITS] of
                        disk space. If UNITS is not specified default is 'KB'
                        (a typo in earlier versions said that default was
                        'MB', this was wrong). Allowed values for UNITS are
                        'KB','MB','GB', and 'TB'
  -d tag dir            -d <tag> <dir> Writable directory $CONDOR_DIR_<tag>
                        will exist on the execution node. After job
                        completion, its contents will be moved to <dir>
                        automatically. Specify as many <tag>/<dir> pairs as
                        you need.
  --email-to EMAIL_TO   email address to send job reports/summaries (default
                        is $USER@fnal.gov)
  -e ENVIRONMENT, --environment ENVIRONMENT
                        -e ADDED_ENVIRONMENT exports this variable with its
                        local value to worker node environment. For example
                        export FOO='BAR'; jobsub -e FOO <more stuff>
                        guarantees that the value of $FOO on the worker node
                        is 'BAR'. Alternate format which does not require
                        setting the env var first is the -e VAR=VAL idiom,
                        which sets the value of $VAR to 'VAL' in the worker
                        environment. The -e option can be used as many times
                        in one jobsub_submit invocation as desired.
  --expected-lifetime EXPECTED_LIFETIME
                        'short'|'medium'|'long'|NUMBER[UNITS] Expected
                        lifetime of the job. Used to match against resources
                        advertising that they have REMAINING_LIFETIME seconds
                        left. The shorter your EXPECTED_LIFETIME is, the more
                        resources (aka slots, cpus) your job can potentially
                        match against and the quicker it should start. If your
                        job runs longer than EXPECTED_LIFETIME it *may* be
                        killed by the batch system. If your specified
                        EXPECTED_LIFETIME is too long your job may take a long
                        time to match against a resource with a sufficiently
                        long REMAINING_LIFETIME. Valid inputs for this
                        parameter are 'short', 'medium', 'long', or
                        NUMBER[UNITS]. If [UNITS] is omitted, the value is
                        NUMBER seconds. Allowed values for UNITS are 's', 'm',
                        'h', 'd', representing seconds, minutes, hours, and
                        days. The values for 'short', 'medium', and 'long' are
                        configurable by Grid Operations; they currently are
                        '3h', '8h', and '85200s', but this may change in the
                        future.
  -f INPUT_FILE         At runtime, INPUT_FILE will be copied to
                        directory $CONDOR_DIR_INPUT on the execution node.
                        Example: -f /grid/data/minerva/my/input/file.xxx will
                        be copied to $CONDOR_DIR_INPUT/file.xxx. Specify as
                        many -f INPUT_FILE_1 -f INPUT_FILE_2 args as you need.
                        To copy file at submission time instead of run time,
                        use -f dropbox://INPUT_FILE to copy the file. If -f is
                        used without the dropbox:// URI, for example -f
                        /path/to/myfile, then the file (/path/to/myfile in
                        this example) MUST be grid-accessible via ifdh. For
                        more information, please see
                        https://github.com/fermitools/jobsub_lite/wiki/File-
                        Transfers-in-jobsub-lite
  --generate-email-summary
                        generate and mail a summary report of
                        completed/failed/removed jobs in a DAG
  -L LOG_FILE, --log-file LOG_FILE, --log_file LOG_FILE
                        Log file to hold log output from job.
  -l LINES, --lines LINES
                        Lines to append to the job file.
  --need-storage-modify NEED_STORAGE_MODIFY
                        directories needing storage.modify scope in job tokens
  --need-scope NEED_SCOPE
                        scopes needed in job tokens
  --project-name PROJECT_NAME, --project_name PROJECT_NAME
                        Project name for --dataset-definition DAGs to share
  -Q, --mail_never, --mail-never
                        never send mail about job results (default)
  --mail_on_error, --mail-on-error
                        send mail about job results if job fails
  --mail_always, --mail-always
                        send mail about job results
  --maxConcurrent MAXCONCURRENT
                        max number of jobs running concurrently at given time.
                        Use in conjunction with -N option to protect a shared
                        resource. Example: jobsub -N 1000 --maxConcurrent 20
                        will only run 20 jobs at a time until all 1000 have
                        completed. This is implemented by running the jobs in
                        a DAG. Normally when jobs are run with the -N option,
                        they all have the same $CLUSTER number and differing,
                        sequential $PROCESS numbers, and many submission
                        scripts take advantage of this. When jobs are run with
                        this option in a DAG each job has a different $CLUSTER
                        number and a $PROCESS number of 0, which may break
                        scripts that rely on the normal -N numbering scheme
                        for $CLUSTER and $PROCESS. Groups of jobs run with
                        this option will have the same $JOBSUBPARENTJOBID,
                        each individual job will have a unique and sequential
                        $JOBSUBJOBSECTION. Scripts may need modification to
                        take this into account
  --memory MEMORY       Request worker nodes have at least NUMBER[UNITS] of
                        memory. If UNITS is not specified default is 'MB'.
                        Allowed values for UNITS are 'KB','MB','GB', and 'TB'
  -N N                  submit N copies of this job. Each job will have access
                        to the environment variable $PROCESS that provides the
                        job number (0 to NUM-1), equivalent to the number
                        following the decimal point in the job ID (the '2' in
                        134567.2).
  -n, --no_submit, --no-submit
                        generate condor_command file but do not submit
  --no-env-cleanup      do not clean environment in wrapper script
  --OS OS               specify OS version of worker node. Example --OS=SL5
                        Comma separated list '--OS=SL4,SL5,SL6' works as well.
                        Default is any available OS
  --overwrite-condor-requirements OVERWRITE_CONDOR_REQUIREMENTS, --overwrite_condor_requirements OVERWRITE_CONDOR_REQUIREMENTS
                        overwrite default condor requirements with supplied
                        requirements
  --resource-provides RESOURCE_PROVIDES
                        request specific resources by changing condor jdf
                        file. For example: --resource-provides=CVMFS=OSG will
                        add +DESIRED_CVMFS="OSG" to the job classad attributes
                        and '&&(CVMFS=="OSG")' to the job requirements
  --skip-check SKIP_CHECK
                        Skip checks that jobsub_lite does by default. Add as
                        many --skip-check flags as desired. Available checks
                        are ['rcds']. Example: --skip-check rcds
  --tar_file_name TAR_FILE_NAME, --tar-file-name TAR_FILE_NAME
                        dropbox://PATH/TO/TAR_FILE tardir://PATH/TO/DIRECTORY
                        specify TAR_FILE or DIRECTORY to be transferred to
                        worker node. TAR_FILE will be copied with RCDS/cvmfs
                        (or /pnfs), transferred to the job and unpacked there.
                        The unpacked contents of TAR_FILE will be available
                        inside the directory $INPUT_TAR_DIR_LOCAL. If using
                        the PNFS dropbox (not default), TAR_FILE will be
                        accessible to the user job on the worker node via the
                        environment variable $INPUT_TAR_FILE. The unpacked
                        contents will be in the same directory as
                        $INPUT_TAR_FILE. For consistency, when using the
                        default (RCDS/cvmfs) dropbox, $INPUT_TAR_FILE will be
                        set in such a way that the parent directory of
                        $INPUT_TAR_FILE will contain the unpacked contents of
                        TAR_FILE. Successive --tar_file_name options will be
                        in $INPUT_TAR_DIR_LOCAL_1, $INPUT_TAR_DIR_LOCAL_2,
                        etc. and $INPUT_TAR_FILE_1, $INPUT_TAR_FILE_2, etc. We
                        note here that with this flag, it is recommended to
                        use the $INPUT_TAR_DIR_LOCAL environment variable,
                        rather than $INPUT_TAR_FILE. For more information,
                        please see
                        https://github.com/fermitools/jobsub_lite/wiki/File-
                        Transfers-in-jobsub-lite
  --tarball-exclusion-file TARBALL_EXCLUSION_FILE
                        File with patterns to exclude from tarfile creation
  --timeout TIMEOUT     kill user job if still running after NUMBER[UNITS] of
                        time. UNITS may be `s' for seconds (the default), `m'
                        for minutes, `h' for hours or `d' for days.
  --use-cvmfs-dropbox   use cvmfs for dropbox (default is cvmfs)
  --use-pnfs-dropbox    use pnfs resilient for dropbox (default is cvmfs)
  --site SITE           submit jobs to these (comma-separated) sites
  --onsite, --onsite-only
                        run jobs locally only;
                        usage_model=OPPORTUNISTIC,DEDICATED
  --offsite, --offsite-only
                        run jobs offsite; usage_model=OFFSITE
  --singularity-image SINGULARITY_IMAGE, --apptainer-image SINGULARITY_IMAGE
                        Singularity image to run jobs in. Default is
                        /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-
                        wn-sl7:latest
  --no-singularity, --no-apptainer
                        Don't request a singularity container. If the site
                        your job lands on runs all jobs in singularity
                        containers, your job will also run in one. If the site
                        does not run all jobs in singularity containers, your
                        job will run outside a singularity container.
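
The --maxConcurrent help above warns that scripts relying on $PROCESS may need changes, because DAG jobs all see $PROCESS equal to 0 and are distinguished by $JOBSUBJOBSECTION instead. A minimal Python sketch of a job-side helper that copes with both cases follows. The function names are hypothetical, and the help text does not say whether $JOBSUBJOBSECTION starts at 0 or 1, nor whether it is set for plain -N jobs, so `section_base` is an assumption you should verify against a test submission.

```python
import os


def job_index(env=os.environ, section_base=0):
    """Return an index for this job within its submission.

    Prefers $JOBSUBJOBSECTION (described for --maxConcurrent DAG jobs)
    and falls back to $PROCESS (described for plain -N submissions).
    """
    section = env.get("JOBSUBJOBSECTION")
    if section is not None:
        # section_base is an assumption: set it to the first value
        # you observe for $JOBSUBJOBSECTION in your own jobs.
        return int(section) - section_base
    return int(env.get("PROCESS", "0"))


def my_slice(items, index, total):
    """Pick every total-th item starting at index: a simple work split."""
    return items[index::total]


# Example: job 1 of 4 picks its share of ten hypothetical input files.
files = [f"input_{n}.dat" for n in range(10)]
print(my_slice(files, job_index({"PROCESS": "1"}), 4))
```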

general arguments:
  --auth-methods AUTH_METHODS
                        Authorization method to use for job management.
                        Multiple values should be given in a comma-separated
                        list, e.g. "token,proxy". Currently supported methods
                        are ['token', 'proxy']. The current infrastructure
                        requires the following auth methods: ['token']
  -G GROUP, --group GROUP
                        Group/Experiment/Subgroup for priorities and
                        accounting
  --role ROLE           VOMS Role for priorities and accounting
  --subgroup SUBGROUP   Subgroup for priorities and accounting. See
                        https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/
                        Jobsub_submit#Groups-Subgroups-Quotas-Priorities for
                        more documentation on using --subgroup to set job
                        quotas and priorities
  --verbose VERBOSE     Turn on more information on internal state of program.
                        --verbose 1 is the same as --debug
  --debug               dump internal state of program (useful for debugging)
  --version             version of jobsub_lite being used
  --support-email       jobsub_lite support email
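
To make the NUMBER[UNITS] convention used by --expected-lifetime and --timeout concrete, here is a small Python sketch that converts such a value to seconds. It is not jobsub_lite's own parser: the function name is hypothetical, it accepts whole numbers only, and the values for 'short', 'medium', and 'long' are the ones quoted in the help text above, which Grid Operations may change.

```python
import re

# Values quoted in the --expected-lifetime help text; subject to change.
NAMED_LIFETIMES = {"short": "3h", "medium": "8h", "long": "85200s"}
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def lifetime_seconds(value):
    """Convert 'short'|'medium'|'long'|NUMBER[UNITS] to seconds."""
    value = NAMED_LIFETIMES.get(value, value)
    match = re.fullmatch(r"(\d+)([smhd]?)", value)
    if match is None:
        raise ValueError(f"not a valid lifetime: {value!r}")
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit or "s"]   # no unit means seconds


for example in ("short", "medium", "long", "90", "15m", "2d"):
    print(example, lifetime_seconds(example))
```

Note that 'long' (85200 seconds) is 23 hours 40 minutes, slightly under one day.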

jobsub_q

Shows the queue of jobs you (or others) have submitted, along with their status.

usage: jobsub_q [-h] [-G GROUP] [--role ROLE] [--subgroup SUBGROUP]
                [--verbose VERBOSE] [--debug] [--version] [--support-email]
                [-J JOBID] [-name NAME] [--jobsub_server JOBSUB_SERVER]
                [--user USER]

optional arguments:
  -h, --help            show this help message and exit
  -J JOBID, --jobid JOBID
                        job/submission ID
  -name NAME            Set schedd name
  --jobsub_server JOBSUB_SERVER
                        backwards compatibility; ignored
  --user USER           username to query

general arguments:
  -G GROUP, --group GROUP
                        Group/Experiment/Subgroup for priorities and
                        accounting
  --role ROLE           VOMS Role for priorities and accounting
  --subgroup SUBGROUP   Subgroup for priorities and accounting. See
                        https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/
                        Jobsub_submit#Groups-Subgroups-Quotas-Priorities for
                        more documentation on using --subgroup to set job
                        quotas and priorities
  --verbose VERBOSE     Turn on more information on internal state of program.
                        --verbose 1 is the same as --debug
  --debug               dump internal state of program (useful for debugging)
  --version             version of jobsub_lite being used
  --support-email       jobsub_lite support email

also condor_q arguments: [general-opts] [restriction-list] [output-opts | analyze-opts]
(with single '-' or double '--' dashes)

    [general-opts] are
        -global                  Query all Schedulers in this pool
        -schedd-constraint       Query all Schedulers matching this constraint
        -submitter <submitter>   Get queue of specific submitter
        -name <name>             Name of Scheduler
        -pool <host>             Use host as the central manager to query
        -jobads[:<form>] <file>  Read queue from a file of job ClassAds
                   where <form> is one of:
               auto    default, guess the format from reading the input stream
               long    The traditional -long form
               xml     XML form, the same as -xml
               json    JSON classad form, the same as -json
               new     'new' classad form without newlines
        -userlog <file>          Read queue from a user log file

    [restriction-list] each restriction may be one of
        <cluster>                Get information about specific cluster
        <cluster>.<proc>         Get information about specific job
        <owner>                  Information about jobs owned by <owner>
        -factory                 Get information about late materialization job factories
        -autocluster             Get information about the SCHEDD's autoclusters
        -constraint <expr>       Get information about jobs that match <expr>
        -unmatchable             Get information about jobs that do not match any machines
        -allusers                Consider jobs from all users

    [output-opts] are
        -limit <num>             Limit the number of results to <num>
        -cputime                 Display CPU_TIME instead of RUN_TIME
        -currentrun              Display times only for current run
        -debug                   Display debugging info to console
        -dag                     Sort DAG jobs under their DAGMan
        -expert                  Display shorter error messages
        -grid                    Get information about grid jobs (includes globus)
        -goodput                 Display job goodput statistics
        -help [Universe|State]   Display this screen, JobUniverses, JobStates
        -hold                    Get information about jobs on hold
        -io                      Display information regarding I/O
        -batch                   Display DAGs or batches of similar jobs as a single line
        -nobatch                 Display one line per job, rather than one line per batch
        -idle                    Get information about idle jobs
        -run                     Get information about running jobs
        -totals                  Display only job totals
        -stream-results          Produce output as jobs are fetched
        -version                 Print the HTCondor version and exit
        -wide[:<width>]          Don't truncate data to fit in 80 columns.
                                 Truncates to console width or <width> argument.
        -autoformat[:jlhVr,tng] <attr> [<attr2> [...]]
        -af[:jlhVr,tng] <attr> [attr2 [...]]
            Print attr(s) with automatic formatting
            the [jlhVr,tng] options modify the formatting
                j   Display Job id
                l   attribute labels
                h   attribute column headings
                V   %V formatting (string values are quoted)
                r   %r formatting (raw/unparsed values)
                ,   comma after each value
                t   tab before each value (default is space)
                n   newline after each value
                g   newline between ClassAds, no space before values
            use -af:h to get tabular values with headings
            use -af:lrng to get -long equivalent format
        -format <fmt> <attr>     Print attribute attr using format fmt
        -print-format <file>     Use <file> to set display attributes and formatting
                                 (experimental, see htcondor-wiki for more information)
        -long[:<form>]           Display entire ClassAds in <form> format
                                 See -jobads for <form> choices
        -xml                     Display entire ClassAds in XML form
        -json                    Display entire ClassAds in JSON form
        -attributes X,Y,...      Attributes to show in -xml, -json, and -long

    [analyze-opts] are
        -analyze[:<qual>]        Perform matchmaking analysis on jobs
        -better-analyze[:<qual>] Perform more detailed match analysis
            <qual> is a comma separated list of one or more of
            priority    Consider user priority during analysis
            summary     Show a one-line summary for each job or machine
            reverse     Analyze machines rather than jobs
        -machine <name>          Machine name or slot name for analysis
        -mconstraint <expr>      Machine constraint for analysis
        -slotads[:<form>] <file> Read Machine ClassAds for analysis from <file>
                                 <file> can be the output of condor_status -long
        -userprios <file>        Read user priorities for analysis from <file>
                                 <file> can be the output of condor_userprio -l
        -nouserprios             Don't consider user priority during analysis (default)
        -reverse-analyze         Analyze Machine requirements against jobs
        -verbose                 Show progress and machine names in results

Only information about jobs owned by the current jobsub group will be returned.
This default is overridden when the restriction list has usernames and/or
job ids, when the -submitter or -allusers arguments are specified, or
when the current user is a queue superuser.
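
Because jobsub_q accepts condor_q output options such as -json and -attributes, its output can be post-processed by a script. The Python sketch below counts jobs by state from JSON text. It assumes that jobsub_q passes condor_q's JSON array of ClassAds through unchanged, which this page does not confirm; the function name is hypothetical and the sample data is invented for illustration, not real queue output. The JobStatus codes are the standard HTCondor ones.

```python
import json
from collections import Counter

# Standard HTCondor JobStatus codes.
JOB_STATUS = {1: "Idle", 2: "Running", 3: "Removed", 4: "Completed",
              5: "Held", 6: "Transferring Output", 7: "Suspended"}


def count_by_status(json_text):
    """Count job ClassAds in condor_q-style JSON output by JobStatus."""
    # An empty queue may produce no output at all, so treat "" as no jobs.
    ads = json.loads(json_text) if json_text.strip() else []
    return Counter(JOB_STATUS.get(ad.get("JobStatus"), "Unknown")
                   for ad in ads)


# Hypothetical sample, shaped like the output of a query restricted with
# -json -attributes ClusterId,ProcId,JobStatus
sample = """[
  {"ClusterId": 134567, "ProcId": 0, "JobStatus": 2},
  {"ClusterId": 134567, "ProcId": 1, "JobStatus": 1},
  {"ClusterId": 134567, "ProcId": 2, "JobStatus": 1}
]"""
print(count_by_status(sample))
```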

jobsub_fetchlog

Gets the log files, etc., from a job submission, either as a gzipped tar file or copied into a specified destination directory.

usage: jobsub_fetchlog [-h] [-G GROUP] [--role ROLE] [--subgroup SUBGROUP]
                       [--verbose VERBOSE] [--debug] [--version]
                       [--support-email] [-J JOBID] [--destdir DESTDIR]
                       [--archive-format ARCHIVE_FORMAT]
                       [job_id]

positional arguments:
  job_id                job/submission ID

optional arguments:
  -h, --help            show this help message and exit
  -J JOBID, --jobid JOBID
                        job/submission ID
  --destdir DESTDIR, --dest-dir DESTDIR, --unzipdir DESTDIR
                        Directory to automatically unarchive logs into
  --archive-format ARCHIVE_FORMAT
                        format for downloaded archive: "tar" (default,
                        compressed with gzip) or "zip"

general arguments:
  -G GROUP, --group GROUP
                        Group/Experiment/Subgroup for priorities and
                        accounting
  --role ROLE           VOMS Role for priorities and accounting
  --subgroup SUBGROUP   Subgroup for priorities and accounting. See
                        https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/
                        Jobsub_submit#Groups-Subgroups-Quotas-Priorities for
                        more documentation on using --subgroup to set job
                        quotas and priorities
  --verbose VERBOSE     Turn on more information on internal state of program.
                        --verbose 1 is the same as --debug
  --debug               dump internal state of program (useful for debugging)
  --version             version of jobsub_lite being used
  --support-email       jobsub_lite support email
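
If you download the archive rather than using --destdir, you may want to inspect it before unpacking. The following Python sketch lists the members of either archive format named in the --archive-format help. The function name and the file name `logs.tgz` are hypothetical; this page does not state what jobsub_fetchlog calls the downloaded file. The demo builds its own tiny archive so that it runs without a real job.

```python
import io
import os
import tarfile
import tempfile
import zipfile


def list_log_archive(path):
    """Return member names of a fetched log archive (gzipped tar or zip)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    with tarfile.open(path, "r:*") as archive:   # "r:*" auto-detects gzip
        return archive.getnames()


# Self-contained demo: build a tiny gzipped tar, then list it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "logs.tgz")    # hypothetical file name
    data = b"job finished\n"
    info = tarfile.TarInfo("job.out")
    info.size = len(data)
    with tarfile.open(demo_path, "w:gz") as archive:
        archive.addfile(info, io.BytesIO(data))
    demo_names = list_log_archive(demo_path)
print(demo_names)
```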

jobsub_hold

Flags a job as "held", preventing it from running until it is released with jobsub_release or deleted.

usage: jobsub_hold [-h] [-G GROUP] [--role ROLE] [--subgroup SUBGROUP]
                   [--verbose VERBOSE] [--debug] [--version] [--support-email]
                   [-J JOBID] [-name NAME] [--jobsub_server JOBSUB_SERVER]

optional arguments:
  -h, --help            show this help message and exit
  -J JOBID, --jobid JOBID
                        job/submission ID
  -name NAME            Set schedd name
  --jobsub_server JOBSUB_SERVER
                        backwards compatibility; ignored

general arguments:
  -G GROUP, --group GROUP
                        Group/Experiment/Subgroup for priorities and
                        accounting
  --role ROLE           VOMS Role for priorities and accounting
  --subgroup SUBGROUP   Subgroup for priorities and accounting. See
                        https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/
                        Jobsub_submit#Groups-Subgroups-Quotas-Priorities for
                        more documentation on using --subgroup to set job
                        quotas and priorities
  --verbose VERBOSE     Turn on more information on internal state of program.
                        --verbose 1 is the same as --debug
  --debug               dump internal state of program (useful for debugging)
  --version             version of jobsub_lite being used
  --support-email       jobsub_lite support email

also condor_hold arguments: [options] [constraints]
(with single '-' or double '--' dashes)
 where [options] is zero or more of:
  -help               Display this message and exit
  -version            Display version information and exit
  -long               Display full result classad
  -totals             Display success/failure totals
  -name schedd_name   Connect to the given schedd
  -pool hostname      Use the given central manager to find daemons
  -addr <ip:port>     Connect directly to the given "sinful string"
  -reason reason      Use the given HoldReason
  -subcode number     Set HoldReasonSubCode
 and where [constraints] is one of:
  cluster.proc        Hold the given job
  cluster             Hold the given cluster of jobs
  user                Hold all jobs owned by user
  -constraint expr    Hold all jobs matching the boolean expression
  -all                Hold all jobs (cannot be used with other constraints)
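
jobsub_hold, jobsub_release, and jobsub_rm share the same -G and -J options, so a script that manages jobs can build all three commands the same way. A minimal Python sketch follows; the helper name, the group `dune`, and the job ID are hypothetical, and the sketch only builds the argument list without running anything.

```python
ACTIONS = {
    "hold": "jobsub_hold",
    "release": "jobsub_release",
    "rm": "jobsub_rm",
}


def job_control_argv(action, group, jobid):
    """Build the argument list for holding, releasing, or removing a job."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # -G selects the group, -J the job/submission ID (see the usage text).
    return [ACTIONS[action], "-G", group, "-J", jobid]


# Hypothetical group and job ID, for illustration only.
jobid = "134567.0@schedd.example.org"
for action in ("hold", "release", "rm"):
    print(" ".join(job_control_argv(action, "dune", jobid)))
```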

jobsub_release

Releases jobs held by jobsub_hold, or by condor daemons or admins.

usage: jobsub_release [-h] [-G GROUP] [--role ROLE] [--subgroup SUBGROUP]
                      [--verbose VERBOSE] [--debug] [--version]
                      [--support-email] [-J JOBID] [-name NAME]
                      [--jobsub_server JOBSUB_SERVER]

optional arguments:
  -h, --help            show this help message and exit
  -J JOBID, --jobid JOBID
                        job/submission ID
  -name NAME            Set schedd name
  --jobsub_server JOBSUB_SERVER
                        backwards compatibility; ignored

general arguments:
  -G GROUP, --group GROUP
                        Group/Experiment/Subgroup for priorities and
                        accounting
  --role ROLE           VOMS Role for priorities and accounting
  --subgroup SUBGROUP   Subgroup for priorities and accounting. See
                        https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/
                        Jobsub_submit#Groups-Subgroups-Quotas-Priorities for
                        more documentation on using --subgroup to set job
                        quotas and priorities
  --verbose VERBOSE     Turn on more information on internal state of program.
                        --verbose 1 is the same as --debug
  --debug               dump internal state of program (useful for debugging)
  --version             version of jobsub_lite being used
  --support-email       jobsub_lite support email

also condor_release arguments: [options] [constraints]
(with single '-' or double '--' dashes)
 where [options] is zero or more of:
  -help               Display this message and exit
  -version            Display version information and exit
  -long               Display full result classad
  -totals             Display success/failure totals
  -name schedd_name   Connect to the given schedd
  -pool hostname      Use the given central manager to find daemons
  -addr <ip:port>     Connect directly to the given "sinful string"
  -reason reason      Use the given ReleaseReason
 and where [constraints] is one of:
  cluster.proc        Release the given job
  cluster             Release the given cluster of jobs
  user                Release all jobs owned by user
  -constraint expr    Release all jobs matching the boolean expression
  -all                Release all jobs (cannot be used with other constraints)

jobsub_rm

Removes a job from the batch queues. This kills the job if it is running.

usage: jobsub_rm [-h] [-G GROUP] [--role ROLE] [--subgroup SUBGROUP]
                 [--verbose VERBOSE] [--debug] [--version] [--support-email]
                 [-J JOBID] [-name NAME] [--jobsub_server JOBSUB_SERVER]

optional arguments:
  -h, --help            show this help message and exit
  -J JOBID, --jobid JOBID
                        job/submission ID
  -name NAME            Set schedd name
  --jobsub_server JOBSUB_SERVER
                        backwards compatibility; ignored

general arguments:
  -G GROUP, --group GROUP
                        Group/Experiment/Subgroup for priorities and
                        accounting
  --role ROLE           VOMS Role for priorities and accounting
  --subgroup SUBGROUP   Subgroup for priorities and accounting. See
                        https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/
                        Jobsub_submit#Groups-Subgroups-Quotas-Priorities for
                        more documentation on using --subgroup to set job
                        quotas and priorities
  --verbose VERBOSE     Turn on more information on internal state of program.
                        --verbose 1 is the same as --debug
  --debug               dump internal state of program (useful for debugging)
  --version             version of jobsub_lite being used
  --support-email       jobsub_lite support email

also condor_rm arguments: [options] [constraints]
(with single '-' or double '--' dashes)
 where [options] is zero or more of:
  -help               Display this message and exit
  -version            Display version information and exit
  -long               Display full result classad
  -totals             Display success/failure totals
  -name schedd_name   Connect to the given schedd
  -pool hostname      Use the given central manager to find daemons
  -addr <ip:port>     Connect directly to the given "sinful string"
  -reason reason      Use the given RemoveReason
  -forcex             Force the immediate local removal of jobs in the X state
                      (only affects jobs already being removed)
 and where [constraints] is one of:
  cluster.proc        Remove the given job
  cluster             Remove the given cluster of jobs
  user                Remove all jobs owned by user
  -constraint expr    Remove all jobs matching the boolean expression
  -all                Remove all jobs (cannot be used with other constraints)
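
As an illustration of how the options above combine, here is a minimal Python sketch that assembles a jobsub_rm argument list. The helper name build_rm_command, the group name, and the job ID are hypothetical and not part of jobsub_lite; only the -G, --role, -J and -constraint flags come from the help text above. The command is printed rather than executed, so nothing is removed by accident.

```python
import shlex
from typing import List, Optional


def build_rm_command(group: str,
                     jobid: Optional[str] = None,
                     constraint: Optional[str] = None,
                     role: Optional[str] = None) -> List[str]:
    """Assemble a jobsub_rm argument list (hypothetical helper)."""
    # Require exactly one way of selecting jobs, so a typo cannot
    # silently widen the selection.
    if (jobid is None) == (constraint is None):
        raise ValueError("give exactly one of jobid or constraint")
    cmd = ["jobsub_rm", "-G", group]
    if role is not None:
        cmd += ["--role", role]
    if jobid is not None:
        cmd += ["-J", jobid]
    else:
        # Passed through to condor_rm, per the help text above.
        cmd += ["-constraint", constraint]
    return cmd


# Placeholder group and job ID, for illustration only.
cmd = build_rm_command("myexperiment", jobid="1406.35@schedd.example.com")
print(shlex.join(cmd))
```

To actually run the command, the list could be handed to subprocess.run(cmd, check=True); keeping it as a list avoids shell quoting problems with constraint expressions that contain spaces.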

jobsub_wait

Wait for a job to complete or abort. Invoke as jobsub_wait cluster.proc@schedd.

usage: jobsub_wait [-h] [-G GROUP] [--role ROLE] [--subgroup SUBGROUP]
                   [--verbose VERBOSE] [--debug] [--version] [--support-email]
                   [-J JOBID] [-name NAME] [--jobsub_server JOBSUB_SERVER]

optional arguments:
  -h, --help            show this help message and exit
  -J JOBID, --jobid JOBID
                        job/submission ID
  -name NAME            Set schedd name
  --jobsub_server JOBSUB_SERVER
                        backwards compatibility; ignored

general arguments:
  -G GROUP, --group GROUP
                        Group/Experiment/Subgroup for priorities and
                        accounting
  --role ROLE           VOMS Role for priorities and accounting
  --subgroup SUBGROUP   Subgroup for priorities and accounting. See
                        https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/
                        Jobsub_submit#Groups-Subgroups-Quotas-Priorities for
                        more documentation on using --subgroup to set job
                        quotas and priorities
  --verbose VERBOSE     Turn on more information on internal state of program.
                        --verbose 1 is the same as --debug
  --debug               dump internal state of program (useful for debugging)
  --version             version of jobsub_lite being used
  --support-email       jobsub_lite support email

(with single '-' or double '--' dashes)
Use: /usr/bin/condor_wait [options] <log-file> [job-number]
Where options are:
    -help             Display options
    -version          Display Condor version
    -debug            Show extra debugging info
    -status           Show job start and terminate info
    -echo[:<fmt>]     Echo log events relevant to [job-number]
       optional <fmt> is one or more log format options:
         ISO_DATE     date in Year-Month-Day form
         UTC          echo time as UTC time
         XML          echo in XML log format
         JSON         echo in JSON log format
    -num <number>     Wait for this many jobs to end
                       (default is all jobs)
    -wait <seconds>   Wait no more than this time
                       (default is unlimited)
    -allevents        Continue on even if all jobs have ended.
                      use with -echo to transcribe the whole log
                      cannot be used with -num

This command watches a log file, and indicates when
a specific job (or all jobs mentioned in the log)
have completed or aborted. It returns success if
all such jobs have completed or aborted, and returns
failure otherwise.

Examples:
    /usr/bin/condor_wait logfile
    /usr/bin/condor_wait logfile 35
    /usr/bin/condor_wait logfile 1406.35
    /usr/bin/condor_wait -wait 60 logfile 13.25.3
    /usr/bin/condor_wait -num 2 logfile

Transcribe an entire log to UTC timestamps:
    /usr/bin/condor_wait -all -echo:UTC logfile
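
Since jobsub_wait is invoked with a job ID of the form cluster.proc@schedd, a script may want to validate that ID before waiting on it. The following is a minimal Python sketch under stated assumptions: the JobId type and parse_jobid helper are hypothetical, and accepting a cluster-only ID (cluster@schedd) is an assumption rather than documented behaviour.

```python
import re
from typing import NamedTuple, Optional


class JobId(NamedTuple):
    cluster: int
    proc: Optional[int]  # None when only a cluster number was given
    schedd: str


# cluster, optional ".proc", then "@schedd" (assumed layout).
_JOBID_RE = re.compile(r"^(\d+)(?:\.(\d+))?@(\S+)$")


def parse_jobid(jobid: str) -> JobId:
    """Split a job ID string into its parts (hypothetical helper)."""
    match = _JOBID_RE.match(jobid)
    if match is None:
        raise ValueError(f"expected cluster[.proc]@schedd, got {jobid!r}")
    cluster, proc, schedd = match.groups()
    return JobId(int(cluster), int(proc) if proc is not None else None, schedd)


# Placeholder job ID, for illustration only.
job = parse_jobid("1406.35@schedd.example.com")
print(job.cluster, job.proc, job.schedd)
```

Because the help text above says the wait succeeds only if the jobs completed or aborted, a caller would normally also check the exit status of jobsub_wait instead of assuming the job finished.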

Environment Variables Used By jobsub_lite

  • BEARER_TOKEN_FILE: The path to a valid bearer (access) token file for the user
  • X509_USER_PROXY: The path to a valid VOMS-extended X509 proxy certificate for the user
  • HTGETTOKENOPTS: Options to pass to underlying token-obtaining/storing code (htgettoken)
  • GROUP/JOBSUB_GROUP: Experiment/group used to run jobsub_lite commands. Either this must be set or -G must be passed to every command
  • JOBSUB_DROPBOX_SERVER_LIST: A space-separated list of server hostnames for jobsub_lite to query for the RCDS dropbox API endpoints
  • JOBSUB_OUTPUT_URL: HTTP endpoint to which jobsub_lite wrapper scripts send job logs at the end of a job
  • JOBSUB_FETCHLOG_URL: HTTP endpoint used by jobsub_lite by default to fetch logs
  • CMTCONFIG: Legacy environment variable for use with minervasoft job submissions
  • JOBSUB_POOL_MAP: JSON information for the --global-pool= command line option
  • JOBSUB_EXTRA_JOB_INFO: comma-separated values to add as --job-info script command line options (option also added in #373)
  • JOBSUB_EXTRA_LINES: comma-separated values to add as --lines options
  • JOBSUB_EXTRA_ENVIRONMENT: comma-separated values to add as extra --environment options
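
To make the comma-separated variables above concrete, the sketch below shows one way such values could expand into repeated command line options, along with a simple group lookup. It is an illustration, not jobsub_lite's actual code: the function names are hypothetical, and the whitespace stripping, the skipping of empty values, and the precedence order (-G, then JOBSUB_GROUP, then GROUP) are all assumptions.

```python
from typing import List, Mapping, Optional

# Environment variable -> option it is described as feeding in the list above.
EXTRA_OPTION_VARS = {
    "JOBSUB_EXTRA_JOB_INFO": "--job-info",
    "JOBSUB_EXTRA_LINES": "--lines",
    "JOBSUB_EXTRA_ENVIRONMENT": "--environment",
}


def extra_options(env: Mapping[str, str]) -> List[str]:
    """Expand the JOBSUB_EXTRA_* variables into option/value pairs."""
    args: List[str] = []
    for var, option in EXTRA_OPTION_VARS.items():
        for value in env.get(var, "").split(","):
            value = value.strip()  # assumption: surrounding spaces are ignored
            if value:              # assumption: empty entries are skipped
                args += [option, value]
    return args


def resolve_group(env: Mapping[str, str], cli_group: Optional[str] = None) -> str:
    """Pick the group from -G or the environment (assumed precedence)."""
    group = cli_group or env.get("JOBSUB_GROUP") or env.get("GROUP")
    if not group:
        raise ValueError("set GROUP/JOBSUB_GROUP or pass -G")
    return group


# Placeholder values, for illustration only.
example_env = {"GROUP": "myexperiment", "JOBSUB_EXTRA_LINES": "+A=1, +B=2"}
print(resolve_group(example_env), extra_options(example_env))
```

Passing a plain mapping instead of reading os.environ directly keeps the functions easy to exercise; a real caller could pass os.environ.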

Planned additions

  • Possible environment variable to set RCDS/PNFS dropbox default (in case of RCDS outage, for example)