Running a Command

To accommodate the enormous variety in syntax and semantics for input, runtime environment, invocation, and output of arbitrary programs, a CommandLineTool defines an "input binding" that describes how to translate abstract input parameters to a concrete program invocation, and an "output binding" that describes how to generate output parameters from program output.

Input binding

The tool command line is built by applying command line bindings to the input object. Bindings are listed either as part of an input parameter using the inputBinding field, or separately using the arguments field of the CommandLineTool.

The algorithm to build the command line is as follows. In this algorithm, the sort key is a list consisting of one or more numeric or string elements. Strings are sorted lexicographically based on UTF-8 encoding.

  1. Collect CommandLineBinding objects from arguments. Assign a sorting key [position, i] where position is CommandLineBinding.position and i is the index in the arguments list.

  2. Collect CommandLineBinding objects from the inputs schema and associate them with values from the input object. Where the input type is a record, array, or map, recursively walk the schema and input object, collecting nested CommandLineBinding objects and associating them with values from the input object.

  3. Create a sorting key by taking the value of the position field at each level leading to each leaf binding object. If position is not specified, it is not added to the sorting key. For bindings on arrays and maps, the sorting key must include the array index or map key following the position. If and only if two bindings have the same sort key, the tie must be broken using the ordering of the field or parameter name immediately containing the leaf binding.

  4. Sort elements using the assigned sorting keys. Numeric entries sort before strings.

  5. In the sorted order, apply the rules defined in CommandLineBinding to convert bindings to actual command line elements.

  6. Insert elements from baseCommand at the beginning of the command line.
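The ordering rules above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names and the `(sort_key, fragment)` pair layout are hypothetical, and it assumes the bindings have already been collected and converted to argument fragments (steps 1 to 3 and 5). It also assumes that a sort key that is a strict prefix of another sorts first, which the algorithm does not state.

```python
from functools import cmp_to_key

def compare_entries(a, b):
    """Compare two sort-key entries; numeric entries sort before strings."""
    a_is_str, b_is_str = isinstance(a, str), isinstance(b, str)
    if a_is_str != b_is_str:
        return 1 if a_is_str else -1
    if a_is_str:
        # Strings are ordered by their UTF-8 byte sequences.
        a, b = a.encode("utf-8"), b.encode("utf-8")
    return (a > b) - (a < b)

def compare_keys(k1, k2):
    """Compare two sort keys element by element."""
    for a, b in zip(k1, k2):
        result = compare_entries(a, b)
        if result != 0:
            return result
    # Assumption: a key that is a prefix of another sorts first.
    return (len(k1) > len(k2)) - (len(k1) < len(k2))

def build_command_line(base_command, bindings):
    """bindings is a list of (sort_key, fragment) pairs (hypothetical layout)."""
    ordered = sorted(
        bindings, key=cmp_to_key(lambda x, y: compare_keys(x[0], y[0]))
    )
    argv = list(base_command)  # baseCommand goes at the beginning
    for _key, fragment in ordered:
        argv.extend(fragment)
    return argv

example_bindings = [
    ([2, "output"], ["-o", "out.txt"]),  # input parameter "output", position 2
    ([1, 0], ["--verbose"]),             # arguments[0], position 1
    ([1, "input"], ["in.txt"]),          # input parameter "input", position 1
]
command_line = build_command_line(["tool"], example_bindings)
```

Here `[1, 0]` sorts ahead of `[1, "input"]` because the numeric entry precedes the string entry at the second level.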

Runtime environment

All files listed in the input object must be made available in the runtime environment. The implementation may use a shared or distributed file system or transfer files via explicit download to the host. Implementations may choose not to provide access to files not explicitly specified in the input object or process requirements.

Output files produced by tool execution must be written to the designated output directory. The initial current working directory when executing the tool must be the designated output directory. The designated output directory should be empty, except for files or directories specified using InitialWorkDirRequirement.

Files may also be written to the designated temporary directory. This directory must be isolated and not shared with other processes. Any files written to the designated temporary directory may be automatically deleted by the workflow platform immediately after the tool terminates.

For compatibility, files may be written to the system temporary directory, which must be located at /tmp. Because the system temporary directory may be shared with other processes on the system, files placed in it are not guaranteed to be deleted automatically. A tool must not use the system temporary directory as a back channel for communicating with other tools. It is valid for the system temporary directory to be the same as the designated temporary directory.

The tool must execute in a new, empty environment containing only the environment variables described below; the child process must not inherit environment variables from the parent process except as specified or at user option.

  • HOME must be set to the designated output directory.
  • TMPDIR must be set to the designated temporary directory.
  • PATH may be inherited from the parent process, except when run in a container that provides its own PATH.
  • Variables defined by EnvVarRequirement
  • The default environment of the container, such as when using DockerRequirement
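As a rough sketch of the list above, the following Python function builds the child environment from an empty dictionary. The function name, its parameters, and the precedence between sources are assumptions made for illustration; in particular, applying HOME and TMPDIR last is one possible reading and not something the text prescribes.

```python
import os

def build_tool_environment(outdir, tmpdir, env_var_requirement=None,
                           container_env=None, parent_env=None):
    """Build the environment for the tool process, starting from empty."""
    parent_env = os.environ if parent_env is None else parent_env
    env = {}
    # Default environment of the container, if one is used.
    if container_env:
        env.update(container_env)
    # PATH may be inherited unless the container provides its own.
    if "PATH" not in env and "PATH" in parent_env:
        env["PATH"] = parent_env["PATH"]
    # Variables defined by EnvVarRequirement.
    if env_var_requirement:
        env.update(env_var_requirement)
    # Assumption: HOME and TMPDIR are applied last so they always win.
    env["HOME"] = outdir
    env["TMPDIR"] = tmpdir
    return env

tool_env = build_tool_environment(
    "/work/out", "/work/tmp",
    env_var_requirement={"LC_ALL": "C"},
    parent_env={"PATH": "/usr/bin", "SECRET_TOKEN": "not-inherited"},
)
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run`, which replaces the inherited environment instead of extending it.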

An implementation may forbid the tool from writing to any location in the runtime environment file system other than the designated temporary directory, system temporary directory, and designated output directory. An implementation may provide read-only input files, and disallow in-place update of input files. The designated temporary directory, system temporary directory and designated output directory may each reside on different mount points on different file systems.

An implementation may forbid the tool from directly accessing network resources. Correct tools must not assume any network access unless the 'networkAccess' field of a 'NetworkAccess' requirement is set to true. Even then, this does not imply a publicly routable IP address or the ability to accept inbound connections.

The runtime section available in parameter references and expressions contains the following fields. As noted earlier, an implementation may perform deferred resolution of runtime fields by providing opaque strings for any or all of the following fields; parameter references and expressions may only use the literal string value of the field and must not perform computation on the contents.

  • runtime.outdir: an absolute path to the designated output directory
  • runtime.tmpdir: an absolute path to the designated temporary directory
  • runtime.cores: number of CPU cores reserved for the tool process
  • runtime.ram: amount of RAM in mebibytes (2**20 bytes) reserved for the tool process
  • runtime.outdirSize: reserved storage space available in the designated output directory
  • runtime.tmpdirSize: reserved storage space available in the designated temporary directory

For cores, ram, outdirSize and tmpdirSize, if an implementation cannot provide the actual amount of reserved resources at expression evaluation time, it should report the minimal requested amount.
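The fallback rule can be illustrated with a small Python sketch. The function name and the `reserved` dictionary are hypothetical; the `coresMin`, `ramMin`, `outdirMin` and `tmpdirMin` keys are assumed to come from a ResourceRequirement that has already been evaluated.

```python
def build_runtime(outdir, tmpdir, requested_min, reserved=None):
    """Assemble the runtime object, falling back to the minimal request."""
    reserved = reserved or {}
    return {
        "outdir": outdir,
        "tmpdir": tmpdir,
        # Use the actual reservation when known, else the requested minimum.
        "cores": reserved.get("cores", requested_min["coresMin"]),
        "ram": reserved.get("ram", requested_min["ramMin"]),
        "outdirSize": reserved.get("outdirSize", requested_min["outdirMin"]),
        "tmpdirSize": reserved.get("tmpdirSize", requested_min["tmpdirMin"]),
    }

runtime = build_runtime(
    "/work/out", "/work/tmp",
    requested_min={"coresMin": 2, "ramMin": 1024,
                   "outdirMin": 512, "tmpdirMin": 256},
    reserved={"cores": 4},  # only the core count is known at this point
)
```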

See ResourceRequirement for details on how to describe the hardware resources required by a tool.

The standard input stream, the standard output stream, and/or the standard error stream may be redirected as described in the stdin, stdout, and stderr fields.

Execution

Once the command line is built and the runtime environment is created, the actual tool is executed.

The standard error stream and standard output stream may be captured by platform logging facilities for storage and reporting. If multiple commands are logically chained (e.g. echo a && echo b), implementations must capture the output of all the commands, not only the output of the last command. For example, echo a && echo b > captured is incorrect, because the output of echo a is not included in captured.

Tools may be multithreaded or spawn child processes; however, when the parent process exits, the tool is considered finished regardless of whether any detached child processes are still running. Tools must not require any kind of console, GUI, or web based user interaction in order to start and run to completion.

The exit code of the process indicates if the process completed successfully. By convention, an exit code of zero is treated as success and non-zero exit codes are treated as failure. This may be customized by providing the fields successCodes, temporaryFailCodes, and permanentFailCodes. An implementation may choose to default unspecified non-zero exit codes to either temporaryFailure or permanentFailure.
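One way to express this classification is sketched below in Python. The function name, the returned "success" label, and the order in which the lists are consulted are assumptions; the `default_failure` parameter stands in for the implementation's choice for unspecified non-zero codes.

```python
def classify_exit_code(code, success_codes=None, temporary_fail_codes=None,
                       permanent_fail_codes=None,
                       default_failure="permanentFailure"):
    """Map a process exit code to a completion status."""
    # Explicitly listed codes are checked first (assumed order).
    if code in set(success_codes or ()):
        return "success"
    if code in set(temporary_fail_codes or ()):
        return "temporaryFailure"
    if code in set(permanent_fail_codes or ()):
        return "permanentFailure"
    # Convention: zero is success, anything else is a failure.
    if code == 0:
        return "success"
    return default_failure

status_default = classify_exit_code(1)
status_custom = classify_exit_code(3, success_codes=[3])
status_retry = classify_exit_code(75, temporary_fail_codes=[75])
```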

The exit code of the process is available to expressions in outputEval as runtime.exitCode.

Output binding

If the output directory contains a file named "cwl.output.json", that file must be loaded and used as the output object. In this case, the output object should still be type-checked against the outputs section, but outputBinding is ignored.

For Files and Directories, if the value of path is a relative path pattern (does not begin with a slash '/') then it is resolved relative to the output directory. If the value of the "path" is an absolute path pattern (it does begin with a slash '/') then it must refer to a path within the output directory. It is an error for "path" to refer outside the output directory.
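The path rule can be sketched as follows. This is an illustrative check only: the function name is hypothetical, the comparison is purely lexical on POSIX-style paths, and it does not resolve symbolic links, which a real implementation would likely need to consider.

```python
import posixpath

def resolve_output_path(path, outdir):
    """Resolve path against outdir and reject anything outside it."""
    root = posixpath.normpath(outdir)
    if path.startswith("/"):
        candidate = posixpath.normpath(path)
    else:
        # Relative paths are resolved relative to the output directory.
        candidate = posixpath.normpath(posixpath.join(root, path))
    inside = candidate == root or candidate.startswith(root.rstrip("/") + "/")
    if not inside:
        raise ValueError(f"{path!r} refers outside the output directory")
    return candidate

resolved_relative = resolve_output_path("results/a.txt", "/work/out")
resolved_absolute = resolve_output_path("/work/out/b.txt", "/work/out")
```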

Similarly, if a File or Directory in "cwl.output.json" contains location, it is resolved as a relative reference IRI with a base IRI representing the output directory. If location contains some other absolute IRI with a scheme supported by the implementation, the implementation may choose to accept it.

If both path and location are provided on a File or Directory in "cwl.output.json", path takes precedence.

If there is no "cwl.output.json", the output object must be generated by walking the parameters listed in outputs and applying output bindings to the tool output. Output bindings are associated with output parameters using the outputBinding field. See CommandOutputBinding for details.
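The choice between the two sources of the output object can be sketched in Python as below. The function name and the `apply_output_bindings` callback are hypothetical placeholders for the implementation's own output-binding logic, and type-checking against the outputs section is omitted.

```python
import json
import os

def collect_outputs(outdir, apply_output_bindings):
    """Return the output object for a finished tool run."""
    candidate = os.path.join(outdir, "cwl.output.json")
    if os.path.isfile(candidate):
        # The file is used as the output object; outputBinding is ignored.
        with open(candidate, encoding="utf-8") as handle:
            return json.load(handle)
    # Otherwise walk the outputs and apply their bindings (caller-supplied).
    return apply_output_bindings(outdir)
```

A caller would pass the designated output directory and a function that evaluates each outputBinding; the callback runs only when "cwl.output.json" is absent.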