saladVersion: v1.3
$base: "https://w3id.org/cwl/cwl#"
$namespaces:
cwl: "https://w3id.org/cwl/cwl#"
$graph:
- name: CommandLineToolDoc
type: documentation
doc:
- |
# Common Workflow Language (CWL) Command Line Tool Description, v1.3.0-dev1
This version:
* https://w3id.org/cwl/v1.3.0-dev1
Latest stable version:
* https://w3id.org/cwl/
- "\n\n"
- {$include: contrib.md}
- "\n\n"
- |
# Abstract
A Command Line Tool is a non-interactive executable program that reads
some input, performs a computation, and terminates after producing some
output. Command line programs are a flexible unit of code sharing and
reuse; unfortunately, the syntax and input/output semantics among command
line programs are extremely heterogeneous. A common layer for describing
the syntax and semantics of programs can reduce this incidental
complexity by providing a consistent way to connect programs together.
This specification defines the Common Workflow Language (CWL) Command
Line Tool Description, a vendor-neutral standard for describing the
syntax and input/output semantics of command line programs.
- {$include: intro.md}
- |
## Introduction to the CWL Command Line Tool standard v1.3.0-dev1
This specification represents the latest development version from the
CWL group.
Documents should use `cwlVersion: v1.3.0-dev1` to make use of new
syntax and features introduced in v1.3.0-dev1. Existing v1.2 documents
should be trivially updatable by changing `cwlVersion`; however,
CWL documents that relied on previously undefined or
underspecified behavior may have slightly different behavior in
v1.3.0-dev1.
## Changelog for v1.3.0-dev1
See also the [CWL Workflow Description, v1.3.0-dev1 changelog](Workflow.html#Changelog).
For other changes since CWL v1.0, see the
[CWL Command Line Tool Description, v1.1 changelog](https://www.commonwl.org/v1.1/CommandLineTool.html#Changelog)
and
[CWL Command Line Tool Description, v1.2.1 changelog](https://www.commonwl.org/v1.2/CommandLineTool.html#Changelog).
## Purpose
Standalone programs are a flexible and interoperable form of code reuse.
Unlike monolithic applications, applications and analysis workflows which
are composed of multiple separate programs can be written in multiple
languages and execute concurrently on multiple hosts. However, POSIX
does not dictate computer-readable grammar or semantics for program input
and output, resulting in extremely diverse command line grammar and
input/output semantics among programs. This is a particular problem in
distributed computing (multi-node compute clusters) and virtualized
environments (such as Docker containers) where it is often necessary to
provision resources such as input files before executing the program.
Often this gap is filled by hard coding program invocation and
implicitly assuming requirements will be met, or abstracting program
invocation with wrapper scripts or descriptor documents. Unfortunately,
where these approaches are application or platform specific, they create a
significant barrier to reproducibility and portability, as methods
developed for one platform must be manually ported to be used on new
platforms. Similarly, they create redundant work, as wrappers for popular
tools must be rewritten for each application or platform in use.
The Common Workflow Language Command Line Tool Description is designed to
provide a common standard description of grammar and semantics for
invoking programs used in data-intensive fields such as Bioinformatics,
Chemistry, Physics, Astronomy, and Statistics. This specification
attempts to define a precise data and execution model for Command Line Tools that
can be implemented on a variety of computing platforms, ranging from a
single workstation to cluster, grid, cloud, and high performance
computing platforms. Details related to execution of these programs not
laid out in this specification are open to interpretation by the computing
platform implementing this specification.
- {$include: concepts.md}
- {$include: invocation.md}
- type: record
name: EnvironmentDef
doc: |
Define an environment variable that will be set in the runtime environment
by the workflow platform when executing the command line tool. May be the
result of executing an expression, such as getting a parameter from input.
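As an illustrative sketch, an `EnvironmentDef` entry listed under
`EnvVarRequirement` might look like the following. The variable name
`GREETING` and the input name `greeting` are hypothetical and chosen only
for this example.

```yaml
requirements:
  EnvVarRequirement:
    envDef:
      # envValue may be a literal string or an expression;
      # here it is taken from a (hypothetical) input parameter.
      - envName: GREETING
        envValue: $(inputs.greeting)
```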
fields:
- name: envName
type: string
doc: The environment variable name
- name: envValue
type: [string, Expression]
doc: The environment variable value
- type: record
name: CommandLineBinding
extends: InputBinding
docParent: "#CommandInputParameter"
doc: |
When listed under `inputBinding` in the input schema, the term
"value" refers to the corresponding value in the input object. For
binding objects listed in `CommandLineTool.arguments`, the term "value"
refers to the effective value after evaluating `valueFrom`.
The binding behavior when building the command line depends on the data
type of the value. If there is a mismatch between the type described by
the input schema and the effective value, such as resulting from an
expression evaluation, an implementation must use the data type of the
effective value.
- **string**: Add `prefix` and the string to the command line.
- **number**: Add `prefix` and the decimal representation to the command line.
- **boolean**: If true, add `prefix` to the command line. If false, add
nothing.
- **File**: Add `prefix` and the value of
[`File.path`](#File) to the command line.
- **Directory**: Add `prefix` and the value of
[`Directory.path`](#Directory) to the command line.
- **array**: If `itemSeparator` is specified, add `prefix` and then join
the array into a single string with `itemSeparator` separating the
items. Otherwise, first add `prefix`, then recursively process
individual elements.
If the array is empty, add nothing to the command line.
- **object**: Add `prefix` only, and recursively add object fields for
which `inputBinding` is specified.
- **null**: Add nothing.
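The rules above can be sketched with a small `inputs` fragment. The
parameter names and prefixes are hypothetical. Assuming an input object of
`message: hello`, `verbose: true` and `samples: [a, b]`, these bindings
would be expected to contribute the arguments
`--message hello -v --samples a,b`.

```yaml
inputs:
  message:
    type: string
    inputBinding:
      position: 1
      prefix: --message      # string: prefix, then the value
  verbose:
    type: boolean
    inputBinding:
      position: 2
      prefix: -v             # boolean: prefix only, and only when true
  samples:
    type: string[]
    inputBinding:
      position: 3
      prefix: --samples
      itemSeparator: ","     # array: items joined into a single argument
```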
fields:
- name: position
type: [ "null", int, Expression ]
default: 0
doc: |
The sorting key. Default position is 0. If a [CWL Parameter Reference](#Parameter_references)
or [CWL Expression](#Expressions_(Optional)) is used and if the
inputBinding is associated with an input parameter, then the value of
`self` will be the value of the input parameter. Input parameter
defaults (as specified by the `InputParameter.default` field) must be
applied before evaluating the expression. Expressions must return a
single value of type int or a null.
- name: prefix
type: string?
doc: "Command line prefix to add before the value."
- name: separate
type: boolean?
default: true
doc: |
If true (default), then the prefix and value must be added as separate
command line arguments; if false, prefix and value must be concatenated
into a single command line argument.
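For instance, in the following sketch (the parameter name `threads` and the
prefix are hypothetical), an input value of `4` would be expected to
produce the single argument `--threads=4` rather than the two arguments
`--threads=` and `4`.

```yaml
inputs:
  threads:
    type: int
    inputBinding:
      prefix: --threads=
      separate: false   # concatenate prefix and value into one argument
```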
- name: itemSeparator
type: string?
doc: |
Join the array elements into a single string with the elements
separated by `itemSeparator`.
- name: valueFrom
type:
- "null"
- string
- Expression
jsonldPredicate: "cwl:valueFrom"
doc: |
If `valueFrom` is a constant string value, use this as the value and
apply the binding rules above.
If `valueFrom` is an expression, evaluate the expression to yield the
actual value to use to build the command line and apply the binding
rules above. If the inputBinding is associated with an input
parameter, the value of `self` in the expression will be the value of
the input parameter. Input parameter defaults (as specified by the
`InputParameter.default` field) must be applied before evaluating the
expression.
If the value of the associated input parameter is `null`, `valueFrom` is
not evaluated and nothing is added to the command line.
When a binding is part of the `CommandLineTool.arguments` field,
the `valueFrom` field is required.
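A minimal sketch of both uses follows. The names `sample` and `--label`
and the `sample-` label format are assumptions made for illustration.

```yaml
inputs:
  sample:
    type: string
    inputBinding:
      prefix: --label
      # `self` is the value of the `sample` input
      valueFrom: sample-$(self)
arguments:
  # Not tied to any input, so `valueFrom` is required here
  - prefix: --outdir
    valueFrom: $(runtime.outdir)
```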
- name: shellQuote
type: boolean?
default: true
doc: |
If `ShellCommandRequirement` is in the requirements for the current command,
this controls whether the value is quoted on the command line (default is true).
Use `shellQuote: false` to inject metacharacters for operations such as pipes.
If `shellQuote` is true or not provided, the implementation must not
permit interpretation of any shell metacharacters or directives.
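As a hedged illustration, the following fragment pipes the output of `cat`
into `wc -l`. The input name `text_file` is hypothetical; only the pipe
character is left unquoted, and every other argument keeps the default
quoting.

```yaml
requirements:
  ShellCommandRequirement: {}
baseCommand: cat
inputs:
  text_file:
    type: File
    inputBinding:
      position: 1
arguments:
  # Left unquoted so that the shell interprets it as a pipe
  - valueFrom: "|"
    position: 2
    shellQuote: false
  - valueFrom: wc
    position: 3
  - valueFrom: "-l"
    position: 4
```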
- type: record
name: CommandOutputBinding
extends: LoadContents
doc: |
Describes how to generate an output parameter based on the files produced
by a CommandLineTool.
The output parameter value is generated by applying these operations in the
following order:
- glob
- loadContents
- outputEval
- secondaryFiles
fields:
- name: glob
type:
- "null"
- string
- Expression
- type: array
items: string
doc: |
Find files or directories relative to the output directory, using POSIX
glob(3) pathname matching. If an array is provided, find files or
directories that match any pattern in the array. If an expression is
provided, the expression must return a string or an array of strings,
which will then be evaluated as one or more glob patterns. Must only
match and return files/directories which actually exist.
If the value of glob is a relative path pattern (does not
begin with a slash '/') then it is resolved relative to the
output directory. If the value of the glob is an absolute
path pattern (it does begin with a slash '/') then it must
refer to a path within the output directory. It is an error
if any glob resolves to a path outside the output directory.
Specifically this means globs that resolve to paths outside the output
directory are illegal.
A glob may match a path within the output directory which is
actually a symlink to another file. In this case, the
expected behavior is for the resulting File/Directory object to take the
`basename` (and corresponding `nameroot` and `nameext`) of the
symlink. The `location` of the File/Directory is implementation
dependent, but logically the File/Directory should have the same content
as the symlink target. Platforms may stage output files/directories to
cloud storage that lacks the concept of a symlink. In
this case file content and directories may be duplicated, or (to avoid
duplication) the File/Directory `location` may refer to the symlink
target.
It is an error if a symlink in the output directory (or any
symlink in a chain of links) refers to any file or directory
that is not under an input or output directory.
Implementations may shut down a container before globbing
output, so globs and expressions must not assume access to the
container filesystem except for declared input and output.
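A short sketch of the single-pattern and array forms follows. The output
names and file patterns are hypothetical; all patterns are relative and
therefore resolved against the output directory.

```yaml
outputs:
  summary:
    type: File
    outputBinding:
      glob: summary.txt               # a single relative pattern
  reports:
    type: File[]
    outputBinding:
      glob: ["*.csv", "logs/*.log"]   # files matching any pattern in the array
```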
- name: outputEval
type: Expression?
doc: |
Evaluate an expression to generate the output value. If
`glob` was specified, the value of `self` must be an array
containing file objects that were matched. If no files were
matched, `self` must be a zero length array; if a single file
was matched, the value of `self` is an array of a single
element. The exit code of the process is
available in the expression as `runtime.exitCode`.
Additionally, if `loadContents` is true, the file must be a
UTF-8 text file 64 KiB or smaller, and the implementation must
read the entire contents of the file (or file array) and place
it in the `contents` field of the File object for use in
`outputEval`. If the size of the file is greater than 64 KiB,
the implementation must raise a fatal error.
If a tool needs to return a large amount of structured data to
the workflow, loading the output object from `cwl.output.json`
bypasses `outputEval` and is not subject to the 64 KiB
`loadContents` limit.
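As an illustration, a tool that writes a number to a small text file could
expose it as an `int` output roughly as follows. The file name `count.txt`
and the output name `line_count` are assumptions. Because `parseInt` is
JavaScript, this sketch also assumes `InlineJavascriptRequirement`.

```yaml
requirements:
  InlineJavascriptRequirement: {}
outputs:
  line_count:
    type: int
    outputBinding:
      glob: count.txt
      loadContents: true
      # `self` is an array of matched File objects, even for a single match
      outputEval: $(parseInt(self[0].contents))
```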
- name: CommandLineBindable
type: record
fields:
inputBinding:
type: CommandLineBinding?
jsonldPredicate: "cwl:inputBinding"
doc: Describes how to turn this object into command line arguments.
- name: CommandInputRecordField
type: record
extends: [InputRecordField, CommandLineBindable]
specialize:
- specializeFrom: InputRecordSchema
specializeTo: CommandInputRecordSchema
- specializeFrom: InputEnumSchema
specializeTo: CommandInputEnumSchema
- specializeFrom: InputArraySchema
specializeTo: CommandInputArraySchema
- specializeFrom: InputBinding
specializeTo: CommandLineBinding
- name: CommandInputRecordSchema
type: record
extends: [InputRecordSchema, CommandInputSchema, CommandLineBindable]
specialize:
- specializeFrom: InputRecordField
specializeTo: CommandInputRecordField
- specializeFrom: InputBinding
specializeTo: CommandLineBinding
- name: CommandInputEnumSchema
type: record
extends: [InputEnumSchema, CommandInputSchema, CommandLineBindable]
specialize:
- specializeFrom: InputBinding
specializeTo: CommandLineBinding
- name: CommandInputArraySchema
type: record
extends: [InputArraySchema, CommandInputSchema, CommandLineBindable]
specialize:
- specializeFrom: InputRecordSchema
specializeTo: CommandInputRecordSchema
- specializeFrom: InputEnumSchema
specializeTo: CommandInputEnumSchema
- specializeFrom: InputArraySchema
specializeTo: CommandInputArraySchema
- specializeFrom: InputBinding
specializeTo: CommandLineBinding
- name: CommandOutputRecordField
type: record
extends: OutputRecordField
specialize:
- specializeFrom: OutputRecordSchema
specializeTo: CommandOutputRecordSchema
- specializeFrom: OutputEnumSchema
specializeTo: CommandOutputEnumSchema
- specializeFrom: OutputArraySchema
specializeTo: CommandOutputArraySchema
fields:
- name: outputBinding
type: CommandOutputBinding?
jsonldPredicate: "cwl:outputBinding"
doc: |
Describes how to generate this output object based on the files
produced by a CommandLineTool
- name: CommandOutputRecordSchema
type: record
extends: OutputRecordSchema
specialize:
- specializeFrom: OutputRecordField
specializeTo: CommandOutputRecordField
- name: CommandOutputEnumSchema
type: record
extends: OutputEnumSchema
specialize:
- specializeFrom: OutputRecordSchema
specializeTo: CommandOutputRecordSchema
- specializeFrom: OutputEnumSchema
specializeTo: CommandOutputEnumSchema
- specializeFrom: OutputArraySchema
specializeTo: CommandOutputArraySchema
- name: CommandOutputArraySchema
type: record
extends: OutputArraySchema
specialize:
- specializeFrom: OutputRecordSchema
specializeTo: CommandOutputRecordSchema
- specializeFrom: OutputEnumSchema
specializeTo: CommandOutputEnumSchema
- specializeFrom: OutputArraySchema
specializeTo: CommandOutputArraySchema
- type: record
name: CommandInputParameter
extends: InputParameter
doc: An input parameter for a CommandLineTool.
fields:
- name: type
type:
- CWLType
- stdin
- CommandInputRecordSchema
- CommandInputEnumSchema
- CommandInputArraySchema
- string
- type: array
items:
- CWLType
- CommandInputRecordSchema
- CommandInputEnumSchema
- CommandInputArraySchema
- string
jsonldPredicate:
"_id": "sld:type"
"_type": "@vocab"
refScope: 2
typeDSL: True
doc: |
Specify valid types of data that may be assigned to this parameter.
- name: inputBinding
type: CommandLineBinding?
doc: |
Describes how to turn the input parameters of a process into
command line arguments.
jsonldPredicate: "cwl:inputBinding"
- type: record
name: CommandOutputParameter
extends: OutputParameter
doc: An output parameter for a CommandLineTool.
fields:
- name: type
type:
- CWLType
- stdout
- stderr
- CommandOutputRecordSchema
- CommandOutputEnumSchema
- CommandOutputArraySchema
- string
- type: array
items:
- CWLType
- CommandOutputRecordSchema
- CommandOutputEnumSchema
- CommandOutputArraySchema
- string
jsonldPredicate:
"_id": "sld:type"
"_type": "@vocab"
refScope: 2
typeDSL: True
doc: |
Specify valid types of data that may be assigned to this parameter.
- name: outputBinding
type: CommandOutputBinding?
jsonldPredicate: "cwl:outputBinding"
doc: Describes how to generate this output object based on the files
produced by a CommandLineTool
- name: stdin
type: enum
symbols: [ "cwl:stdin" ]
docParent: "#CommandOutputParameter"
doc: |
Only valid as a `type` for a `CommandLineTool` input with no
`inputBinding` set. `stdin` must not be specified at the `CommandLineTool`
level.
The following
```
inputs:
  an_input_name:
    type: stdin
```
is equivalent to
```
inputs:
  an_input_name:
    type: File
    streamable: true

stdin: $(inputs.an_input_name.path)
```
- name: stdout
type: enum
symbols: [ "cwl:stdout" ]
docParent: "#CommandOutputParameter"
doc: |
Only valid as a `type` for a `CommandLineTool` output with no
`outputBinding` set.
The following
```
outputs:
  an_output_name:
    type: stdout

stdout: a_stdout_file
```
is equivalent to
```
outputs:
  an_output_name:
    type: File
    streamable: true
    outputBinding:
      glob: a_stdout_file

stdout: a_stdout_file
```
If there is no `stdout` name provided, a random filename will be created.
For example, the following
```
outputs:
  an_output_name:
    type: stdout
```
is equivalent to
```
outputs:
  an_output_name:
    type: File
    streamable: true
    outputBinding:
      glob: random_stdout_filenameABCDEFG

stdout: random_stdout_filenameABCDEFG
```
If the `CommandLineTool` contains logically chained commands
(e.g. `echo a && echo b`) `stdout` must include the output of
every command.
- name: stderr
type: enum
symbols: [ "cwl:stderr" ]
docParent: "#CommandOutputParameter"
doc: |
Only valid as a `type` for a `CommandLineTool` output with no
`outputBinding` set.
The following
```
outputs:
  an_output_name:
    type: stderr

stderr: a_stderr_file
```
is equivalent to
```
outputs:
  an_output_name:
    type: File
    streamable: true
    outputBinding:
      glob: a_stderr_file

stderr: a_stderr_file
```
If there is no `stderr` name provided, a random filename will be created.
For example, the following
```
outputs:
  an_output_name:
    type: stderr
```
is equivalent to
```
outputs:
  an_output_name:
    type: File
    streamable: true
    outputBinding:
      glob: random_stderr_filenameABCDEFG

stderr: random_stderr_filenameABCDEFG
```
- type: record
name: CommandLineTool
extends: Process
documentRoot: true
specialize:
- specializeFrom: InputParameter
specializeTo: CommandInputParameter
- specializeFrom: OutputParameter
specializeTo: CommandOutputParameter
doc: |
This defines the schema of the CWL Command Line Tool Description document.
fields:
- name: class
jsonldPredicate:
"_id": "@type"
"_type": "@vocab"
type:
type: enum
name: CommandLineTool_class
symbols:
- cwl:CommandLineTool
- name: baseCommand
doc: |
Specifies the program to execute. If an array, the first element of
the array is the command to execute, and subsequent elements are
mandatory command line arguments. The elements in `baseCommand` must
appear before any command line bindings from `inputBinding` or
`arguments`.
If `baseCommand` is not provided or is an empty array, the first
element of the command line produced after processing `inputBinding` or
`arguments` must be used as the program to execute.
If the program includes a path separator character it must
be an absolute path, otherwise it is an error. If the program does not
include a path separator, search the `$PATH` variable in the runtime
environment of the workflow runner to find the absolute path of the
executable.
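As a sketch of the array form (the input name `archive` is hypothetical),
the following fragment would be expected to produce a command line of the
form `tar --extract --file <path of archive>`, with the `baseCommand`
elements placed before the bound input.

```yaml
baseCommand: [tar, --extract]   # program, then a mandatory argument
inputs:
  archive:
    type: File
    inputBinding:
      prefix: --file
```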
type:
- string?
- string[]?
jsonldPredicate:
"_id": "cwl:baseCommand"
"_container": "@list"
- name: arguments
doc: |
Command line bindings which are not directly associated with input
parameters. If the value is a string, it is used as a string literal
argument. If it is an Expression, the result of the evaluation is used
as an argument.
type:
- "null"
- type: array
items: [string, Expression, CommandLineBinding]
jsonldPredicate:
"_id": "cwl:arguments"
"_container": "@list"
- name: stdin
type: ["null", string, Expression]
jsonldPredicate: "https://w3id.org/cwl/cwl#stdin"
doc: |
A path to a file whose contents must be piped into the command's
standard input stream.
- name: stderr
type: ["null", string, Expression]
jsonldPredicate: "https://w3id.org/cwl/cwl#stderr"
doc: |
Capture the command's standard error stream to a file written to
the designated output directory.
If `stderr` is a string, it specifies the file name to use.
If `stderr` is an expression, the expression is evaluated and must
return a string with the file name to use to capture stderr. If the
return value is not a string, or the resulting path contains illegal
characters (such as the path separator `/`) it is an error.
- name: stdout
type: ["null", string, Expression]
jsonldPredicate: "https://w3id.org/cwl/cwl#stdout"
doc: |
Capture the command's standard output stream to a file written to
the designated output directory.
If the `CommandLineTool` contains logically chained commands
(e.g. `echo a && echo b`) `stdout` must include the output of
every command.
If `stdout` is a string, it specifies the file name to use.
If `stdout` is an expression, the expression is evaluated and must
return a string with the file name to use to capture stdout. If the
return value is not a string, or the resulting path contains illegal
characters (such as the path separator `/`) it is an error.
- name: successCodes
type: int[]?
doc: |
Exit codes that indicate the process completed successfully.
If not specified, only exit code 0 is considered success.
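For example, `grep` exits with status 1 when no lines are selected, which a
tool description may choose to treat as success. A minimal sketch:

```yaml
baseCommand: grep
# 0: lines were selected, 1: no lines were selected; both accepted here
successCodes: [0, 1]
```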
- name: temporaryFailCodes
type: int[]?
doc: |
Exit codes that indicate the process failed due to a possibly
temporary condition, where executing the process with the same
runtime environment and inputs may produce different results.
If not specified, no exit codes are considered temporary failure.
- name: permanentFailCodes
type: int[]?
doc:
Exit codes that indicate the process failed due to a permanent logic
error, where executing the process with the same runtime environment and
same inputs is expected to always fail.
If not specified, all exit codes except 0 are considered permanent failure.
- type: record
name: DockerRequirement
extends: ProcessRequirement
doc: |
Indicates that a workflow component should be run in a
[Docker](https://docker.com) or Docker-compatible (such as
[Singularity](https://www.sylabs.io/) and [udocker](https://github.com/indigo-dc/udocker)) container environment and
specifies how to fetch or build the image.
If a CommandLineTool lists `DockerRequirement` under
`hints` (or `requirements`), it may (or must) be run in the specified Docker
container.
The platform must first acquire or install the correct Docker image as
specified by `dockerPull`, `dockerImport`, `dockerLoad` or `dockerFile`.
The platform must execute the tool in the container using `docker run` with
the appropriate Docker image and tool command line.
The workflow platform may provide input files and the designated output
directory through the use of volume bind mounts. The platform should rewrite
file paths in the input object to correspond to the Docker bind mounted
locations. That is, the platform should rewrite values in the parameter context
such as `runtime.outdir`, `runtime.tmpdir` and others to be valid paths
within the container. The platform must ensure that `runtime.outdir` and
`runtime.tmpdir` are distinct directories.
When running a tool contained in Docker, the workflow platform must not
assume anything about the contents of the Docker container, such as the
presence or absence of specific software, except to assume that the
generated command line represents a valid command within the runtime
environment of the container.
A container image may specify an
[ENTRYPOINT](https://docs.docker.com/engine/reference/builder/#entrypoint)
and/or
[CMD](https://docs.docker.com/engine/reference/builder/#cmd).
Command line arguments will be appended after all elements of
ENTRYPOINT, and will override all elements specified using CMD (in
other words, CMD is only used when the CommandLineTool definition
produces an empty command line).
Use of implicit ENTRYPOINT or CMD is discouraged due to reproducibility
concerns of the implicit hidden execution point (for further discussion, see
[https://doi.org/10.12688/f1000research.15140.1](https://doi.org/10.12688/f1000research.15140.1)). Portable
CommandLineTool wrappers in which use of a container is optional must not rely on ENTRYPOINT or CMD.
CommandLineTools which do rely on ENTRYPOINT or CMD must list `DockerRequirement` in the
`requirements` section.
## Interaction with other requirements
If [EnvVarRequirement](#EnvVarRequirement) is specified alongside a
DockerRequirement, the environment variables must be provided to Docker
using `--env` or `--env-file` and interact with the container's preexisting
environment as defined by Docker.
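A minimal sketch of the two requirements used together is shown below. The
image name and the variable are illustrative assumptions, not
recommendations.

```yaml
requirements:
  DockerRequirement:
    dockerPull: debian:stable-slim
  EnvVarRequirement:
    envDef:
      # Passed to the container by the platform, e.g. via `--env`
      LC_ALL: C
```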
fields:
- name: class
type:
type: enum
name: DockerRequirement_class
symbols:
- cwl:DockerRequirement
doc: "Always 'DockerRequirement'"
jsonldPredicate:
"_id": "@type"
"_type": "@vocab"
- name: dockerPull
type: string?
doc: |
Specify a Docker image to retrieve using `docker pull`. Can contain the
immutable digest to ensure an exact container is used:
`dockerPull: ubuntu@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2`
- name: dockerLoad
type: string?
doc: "Specify an HTTP URL from which to download a Docker image using `docker load`."
- name: dockerFile
type: string?
doc: "Supply the contents of a Dockerfile which will be built using `docker build`."
- name: dockerImport
type: string?
doc: "Provide an HTTP URL to download and gunzip a Docker image using `docker import`."
- name: dockerImageId
type: string?
doc: |
The image id that will be used for `docker run`. May be a
human-readable image name or the image identifier hash. May be skipped
if `dockerPull` is specified, in which case the `dockerPull` image id
must be used.
- name: dockerOutputDirectory
type: string?
doc: |
Set the designated output directory to a specific location inside the
Docker container.
- type: record
name: SoftwareRequirement
extends: ProcessRequirement
doc: |
A list of software packages that should be configured in the environment of
the defined process.
fields:
- name: class
type:
type: enum
name: SoftwareRequirement_class
symbols:
- cwl:SoftwareRequirement
doc: "Always 'SoftwareRequirement'"
jsonldPredicate:
"_id": "@type"
"_type": "@vocab"
- name: packages
type: SoftwarePackage[]
doc: "The list of software to be configured."
jsonldPredicate:
mapSubject: package
mapPredicate: specs
- name: SoftwarePackage
type: record
fields:
- name: package
type: string
doc: |
The name of the software to be made available. If the name is
common, inconsistent, or otherwise ambiguous it should be combined with
one or more identifiers in the `specs` field.
- name: version
type: string[]?
doc: |
The (optional) versions of the software that are known to be
compatible.
- name: specs
type: string[]?
jsonldPredicate: {_type: "@id", noLinkCheck: true}
doc: |
One or more [IRI](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier)s
identifying resources for installing or enabling the software named in
the `package` field. Implementations may provide resolvers which map
these software identifier IRIs to some configuration action; or they can
use only the name from the `package` field on a best effort basis.
For example, the IRI https://packages.debian.org/bowtie could
be resolved with `apt-get install bowtie`. The IRI
https://anaconda.org/bioconda/bowtie could be resolved with `conda
install -c bioconda bowtie`.
IRIs can also be system independent and used to map to a specific
software installation or selection mechanism.
Using [RRID](https://www.identifiers.org/rrid/) as an example:
https://identifiers.org/rrid/RRID:SCR_005476
could be fulfilled using the above-mentioned Debian or bioconda
package, a local installation managed by [Environment Modules](https://modules.sourceforge.net/),
or any other mechanism the platform chooses. IRIs can also be from
identifier sources that are discipline specific yet still system
independent. As an example, the equivalent [ELIXIR Tools and Data
Service Registry](https://bio.tools) IRI to the previous RRID example is
https://bio.tools/tool/bowtie2/version/2.2.8.
If supported by a given registry, implementations are encouraged to
query these system independent software identifier IRIs directly for
links to packaging systems.
A site specific IRI can be listed as well. For example, an academic
computing cluster using Environment Modules could list the IRI
`https://hpc.example.edu/modules/bowtie-tbb/1.1.2` to indicate that
`module load bowtie-tbb/1.1.2` should be executed to make available
`bowtie` version 1.1.2 compiled with the TBB library prior to running
the accompanying Workflow or CommandLineTool. Note that the example IRI
is specific to a particular institution and computing environment as
the Environment Modules system does not have a common namespace or
standardized naming convention.
This last example is the least portable and should only be used if
mechanisms based on the `package` field or more generic IRIs are
unavailable or unsuitable. While harmless to other sites, site specific
software IRIs should be left out of shared CWL descriptions to avoid
clutter.
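
As a non-normative illustration of the fields described above, a
`SoftwareRequirement` hint might combine a package name, a
known-compatible version, and the two packaging IRIs mentioned earlier.
The version value is illustrative only, and whether a platform resolves
any of these IRIs depends on the implementation.

```yaml
hints:
  SoftwareRequirement:
    packages:
      - package: bowtie
        # Illustrative value; list versions known to work with this tool.
        version: ["1.1.2"]
        specs:
          # Could be resolved with `apt-get install bowtie`.
          - https://packages.debian.org/bowtie
          # Could be resolved with `conda install -c bioconda bowtie`.
          - https://anaconda.org/bioconda/bowtie
```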
- name: Dirent
type: record
doc: |
Define a file or subdirectory that must be staged to a particular
place prior to executing the command line tool. May be the result
of executing an expression, such as building a configuration file
from a template.
Usually files are staged within the [designated output directory](#Runtime_environment).
However, under certain circumstances, files may be staged at
arbitrary locations; see the discussion of `entryname`.
fields:
- name: entryname
type: ["null", string, Expression]
jsonldPredicate:
_id: cwl:entryname
doc: |
The "target" name of the file or subdirectory. If `entry` is
a File or Directory, the `entryname` field overrides the value
of `basename` of the File or Directory object.

* Required when `entry` evaluates to file contents only
* Optional when `entry` evaluates to a File or Directory object with a `basename`
* Invalid when `entry` evaluates to an array of File or Directory objects.

If `entryname` is a relative path, it specifies a name within
the designated output directory. A relative path starting
with `../` or that resolves to a location above the designated output directory is an error.

If `entryname` is an absolute path (starts with a slash `/`)
it is an error unless the following conditions are met:

* `DockerRequirement` is present in `requirements`
* The program will run inside a software container
where, from the perspective of the program, the root
filesystem is not shared with any other user or
running program.

If the above conditions are met, then
`entryname` may specify the absolute path within the container
where the file or directory must be placed.
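
The relative and absolute cases can be sketched as follows. This is a
non-normative example: the container image, the file names, and the
`settings` input (assumed to be a `File`) are hypothetical.

```yaml
requirements:
  DockerRequirement:
    dockerPull: docker.io/debian:stable-slim
  InitialWorkDirRequirement:
    listing:
      # Relative path: created inside the designated output directory.
      # `entryname` is required here because `entry` is file contents only.
      - entryname: settings.ini
        entry: |
          [run]
          threads = 4
      # Absolute path: permitted only because DockerRequirement appears
      # under `requirements` and the tool runs inside a container.
      - entryname: /etc/example-tool/settings.ini
        entry: $(inputs.settings)
```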
- name: entry
type: [string, Expression]
jsonldPredicate:
_id: cwl:entry
doc: |
If the value is a string literal or an expression which evaluates to a
string, a new text file must be created with the string as the file contents.
If the value is an expression that evaluates to a `File` or
`Directory` object, or an array of `File` or `Directory`
objects, this indicates the referenced file or directory
should be added to the designated output directory prior to
executing the tool.
If the value is an expression that evaluates to `null`,
nothing is added to the designated output directory; the entry
has no effect.
If the value is an expression that evaluates to some other
array, number, or object not consisting of `File` or
`Directory` objects, a new file must be created with the value
serialized to JSON text as the file contents. The JSON
serialization behavior should match the behavior of string
interpolation of [Parameter
references](#Parameter_references).
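
A non-normative sketch of three of the cases above. The input names
are hypothetical: `reference` is assumed to be a `File` input and
`thresholds` an `int[]` input.

```yaml
requirements:
  InitialWorkDirRequirement:
    listing:
      # String literal: a new text file is created with this content.
      - entryname: greeting.txt
        entry: "Hello, world"
      # Evaluates to a File object: the file is staged under its
      # `basename`, since no `entryname` is given.
      - entry: $(inputs.reference)
      # Evaluates to an array of numbers: a new file is created with
      # the value serialized to JSON text.
      - entryname: thresholds.json
        entry: $(inputs.thresholds)
```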
- name: writable
type: boolean?
default: false
doc: |
If true, the File or Directory (or array of Files or
Directories) declared in `entry` must be writable by the tool.
Changes to the file or directory must be isolated and not
visible to any other CommandLineTool process. This may be
implemented by making a copy of the original file or
directory.
Disruptive changes to the referenced file or directory must not
be allowed unless `InplaceUpdateRequirement.inplaceUpdate` is true.
Default false (files and directories read-only by default).
A directory marked as `writable: true` implies that all files and
subdirectories are recursively writable as well.
If `writable` is false, the file may be made available using a
bind mount or file system link to avoid unnecessary copying of
the input file. Command line tools may receive an error on
attempting to rename or delete files or directories that are
not explicitly marked as writable.
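
For illustration only, a listing that requests a private writable
copy of one input while leaving another read-only might look like
this. The input names `index_dir` (assumed `Directory`) and
`reference` (assumed `File`) are hypothetical.

```yaml
requirements:
  InitialWorkDirRequirement:
    listing:
      # The tool may modify this directory and everything beneath it;
      # changes must not be visible to other CommandLineTool processes.
      - entry: $(inputs.index_dir)
        writable: true
      # Read-only (the default); may be provided through a bind mount
      # or file system link instead of a copy.
      - entry: $(inputs.reference)
        writable: false
```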
- name: InitialWorkDirRequirement
type: record
extends: ProcessRequirement
doc:
Define a list of files and subdirectories that must be staged by
the workflow platform prior to executing the command line tool.
Normally files are staged within the designated output directory.
However, when running inside containers, files may be staged at
arbitrary locations; see the discussion of [`Dirent.entryname`](#Dirent).