Scatter before a when occurs #1668

Open
jjkoehorst opened this issue May 20, 2022 · 2 comments · May be fixed by #1669

@jjkoehorst
Contributor

Expected Behavior

We use a when to check whether a variable was given to the workflow, and so decide whether to execute a specific step.
For this variable (a String[]) we apply a when: $(inputs.kraken_database !== null) check to see whether it was given. In addition, we scatter over this variable to run the step once for each of its elements.

Actual Behavior

It tries to perform the scatter before the when is processed.

INFO [workflow ] starting step illumina_quality_kraken2
ERROR Unhandled exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/cwltool/workflow_job.py", line 731, in try_make_job
    emptyscatter = [
  File "/usr/local/lib/python3.8/dist-packages/cwltool/workflow_job.py", line 732, in <listcomp>
    shortname(s) for s in scatter if len(cast(Sized, inputobj[s])) == 0
TypeError: object of type 'NoneType' has no len()
INFO [workflow ] completed permanentFail
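
The failing expression in the traceback can be reproduced outside cwltool. The following is a minimal sketch, assuming the step's input object is a plain dict in which an unset optional input is None. The function name find_empty_scatter is hypothetical and is not part of cwltool; it only mirrors the list comprehension shown above, without cwltool's shortname() and cast() helpers.

```python
from typing import Any, Dict, List


def find_empty_scatter(scatter: List[str], inputobj: Dict[str, Any]) -> List[str]:
    """Return the scattered input names whose value has length zero.

    Hypothetical stand-in for the comprehension in the traceback.
    """
    return [s for s in scatter if len(inputobj[s]) == 0]


# An empty array is accepted and reported as an empty scatter.
print(find_empty_scatter(["database"], {"database": []}))

# A null (None) input raises the same TypeError as in the log.
try:
    find_empty_scatter(["database"], {"database": None})
except TypeError as err:
    print(f"TypeError: {err}")
```

Under these assumptions, only a null value for the scattered input triggers the exception; an empty or populated array does not.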

Workflow Code

  illumina_quality_kraken2:
    label: Kraken2
    doc: Taxonomic classification of FASTQ reads
    when: $(inputs.kraken_database !== null)
    run: ../kraken2/kraken2.cwl
    scatter: database
    in:
      tmp_id: identifier
      identifier:
        valueFrom: $(inputs.tmp_id)_Qfiltered
      threads: threads
      kraken_database: kraken_database
      database: kraken_database
      forward_reads: phix_filter/out_forward_reads
      reverse_reads: phix_filter/out_reverse_reads
      paired_end: 
        default: true
    out: [sample_report]

Your Environment

  • cwltool version:
    /usr/local/bin/cwltool 3.1.20220224085855
@mr-c mr-c linked a pull request May 20, 2022 that will close this issue
1 task
@mr-c
Member

mr-c commented Aug 15, 2022

https://www.commonwl.org/v1.2/Workflow.html#Scatter/gather tells us that "upstream parameters which are connected to scattered parameters must be arrays" but I can only reproduce your bug when the database workflow level input is null (as opposed to an empty array or an array with some null elements).

https://www.commonwl.org/v1.2/Workflow.html#Conditional_execution_(Optional) tells us that the when condition is evaluated after scattering
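
To make that ordering concrete, here is a rough model of "scatter first, then when". It is a sketch only and not cwltool's implementation: run_step and not_null are hypothetical names, the when expression is represented by a Python callable, and only a single scattered input is handled.

```python
from typing import Any, Callable, Dict, List, Optional


def run_step(
    inputs: Dict[str, Any],
    scatter: str,
    when: Callable[[Dict[str, Any]], bool],
) -> List[Optional[str]]:
    """Scatter over one input, then evaluate `when` once per scattered job."""
    values = inputs[scatter]
    if values is None:
        # Scattering needs an array, so a null input fails before
        # the `when` condition is ever consulted.
        raise TypeError(f"cannot scatter over null input {scatter!r}")
    results: List[Optional[str]] = []
    for value in values:
        job = {**inputs, scatter: value}
        # A job whose condition is false is skipped and yields null.
        results.append(f"ran with {value}" if when(job) else None)
    return results


def not_null(job: Dict[str, Any]) -> bool:
    """Stand-in for: when: $(inputs.kraken_database !== null)"""
    return job["kraken_database"] is not None


dbs = ["db1", "db2"]
print(run_step({"kraken_database": dbs, "database": dbs}, "database", not_null))
print(run_step({"kraken_database": [], "database": []}, "database", not_null))
```

In this model an empty array produces zero jobs and an empty result, while a null scattered input raises before not_null is called, which matches the behaviour reported in this issue.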

So with CWL v1.2, skipping a step because an input is missing, when that same input is also being scattered over, requires nesting the scatter in a sub-workflow.

  illumina_quality_kraken2:
    label: Kraken2
    doc: Taxonomic classification of FASTQ reads
    when: $(inputs.kraken_database !== null)
    in:
      tmp_id: identifier
      identifier:
        valueFrom: $(inputs.tmp_id)_Qfiltered
      threads: threads
      kraken_database: kraken_database
      database: kraken_database
      forward_reads: phix_filter/out_forward_reads
      reverse_reads: phix_filter/out_reverse_reads
      paired_end: 
        default: true
    run: 
      class: Workflow
      inputs:
        identifier: string  # a guess at the type
        threads: int  # a guess at the type
        kraken_database: File[]  # replace with the real type
        forward_reads: Any  # replace with the real type
        reverse_reads: Any  # replace with the real type
        paired_end: boolean
      steps:
        kraken2:
          run: ../kraken2/kraken2.cwl
          scatter: database
          in:
            identifier: identifier
            threads: threads
            kraken_database: kraken_database
            database: kraken_database
            forward_reads: forward_reads
            reverse_reads: reverse_reads
            paired_end: paired_end 
          out: [sample_report]
      outputs:
        sample_report:
          type: File[]  # replace with the real type
          outputSource: kraken2/sample_report
    out: [sample_report]

@mr-c
Member

mr-c commented Aug 15, 2022

Another workaround would be to declare the workflow input kraken_database as an array type instead of an optional array, and to give it a default value of [] (an empty array).
