
Implicit variables are null when using kuberun #1050

Closed

bentsherman opened this issue Feb 26, 2019 · 23 comments

@bentsherman (Member)

Bug report

Expected behavior and actual behavior

When I use ${baseDir} in my config file, it is properly recognized when I run my pipeline using nextflow run:

$ cat nextflow.config
[...]
      clip_path = "${baseDir}/files/fasta_adapter.txt"
[...]
$ nextflow -C nextflow.config run systemsgenetics/GEMmaker -with-docker
[...]
  Trimmomatic clip path:      ~/.nextflow/assets/systemsgenetics/GEMmaker/files/fasta_adapter.txt
[...]

Here, nextflow.config is taken from nextflow.config.example in the GEMmaker repo. However, when I try the same thing with kuberun, nextflow does not recognize ${baseDir}:

$ nextflow -C nextflow.config kuberun systemsgenetics/GEMmaker
[...]
  Trimmomatic clip path:      null/files/fasta_adapter.txt
[...]

Steps to reproduce the problem

See above example.

Program output

See above example.

Environment

  • Nextflow version: 19.01.0
  • Java version: 1.8.0_191
  • Operating system: Ubuntu
@pditommaso (Member)

I ran some tests but was not able to replicate this.

@bentsherman (Member, Author)

@pditommaso Could you be more specific? Do you have a kubernetes cluster that you could test it on?

Also I'm using 19.01 but is there an easy way to test the latest build, like a nightly build?

@pditommaso (Member)

I'm using a local K8s cluster deployed along with Docker.

Also I'm using 19.01 but is there an easy way to test the latest build, like a nightly build?

Yes, you can try the latest edge release using this command:

NXF_VER=19.03.0-edge nextflow run .. etc

Or export the NXF_VER variable in your (host) environment and use NF as usual.
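
For example, pinning the version for the shell session and rerunning the command from this report:

$ export NXF_VER=19.03.0-edge
$ nextflow -C nextflow.config kuberun systemsgenetics/GEMmaker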

@bentsherman (Member, Author)

It looks like $baseDir is evaluated in nextflow.config before it arrives on the k8s cluster. I checked the configmap which transmits nextflow.config:

$ kubectl get configmap nf-config-f5791ed2 -o yaml
[...]
             clip_path = 'null/files/fasta_adapter.txt'
[...]

Could we change nextflow so that it doesn't evaluate variables like $baseDir until the config file is in kubernetes?

@pditommaso (Member)

ah, this helps a lot. Let me check.

@pditommaso (Member)

This is quite tricky to solve. Unfortunately, for now, you will need to avoid using baseDir in the config when deploying via kuberun.

@bentsherman (Member, Author)

Yes, I suspected it might be that way. It's no problem, I can work around it pretty easily.

stale bot commented Apr 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Apr 27, 2020
@pditommaso pditommaso added stale and removed wontfix labels Apr 27, 2020
@stale stale bot closed this as completed Jun 26, 2020
@bentsherman (Member, Author)

Just wanted to add that projectDir has the same issue, so I think this problem is general to all implicit variables, as they are resolved in nextflow.config before nextflow run is executed on the k8s cluster. For anyone dealing with this problem: for any param that uses an implicit variable, you can work around it by specifying that param with the explicit project directory on k8s, via the command line or a params file (see the sketch below).
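
A minimal sketch of that workaround, assuming the project is staged at /workspace/GEMmaker on the cluster (the path and params file name are illustrative; use whatever your PVC layout actually is):

$ cat params.yml
clip_path: "/workspace/GEMmaker/files/fasta_adapter.txt"

$ nextflow -C nextflow.config kuberun systemsgenetics/GEMmaker -params-file params.yml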

@bentsherman bentsherman changed the title baseDir in config file is null when using kuberun Implicit variables are null when using kuberun Sep 28, 2021
@xhejtman (Contributor)

Paolo pointed me to this issue. I might have a similar issue: functions defined in nextflow.config are not passed along when using kuberun, which makes some pipelines fail, e.g. nf-core/sarek. Is there any workaround for this? Ours is to use patched nextflow binaries so that a hook is executed before running nextflow, adding the functions back to the pod's nextflow.config. Is there any possibility to do this without a patch? Or are you open to implementing some kind of hook support? I can create a PR for hooks.

@bentsherman (Member, Author)

What if you include the functions in the main nextflow.config in the github repo? Does it work then?

kuberun is really only intended for basic usage of k8s, so it doesn't support everything you might want to do. You can always fall back to doing manually what kuberun does automatically: create a yaml for a submitter pod that launches your nextflow pipeline with whatever options, then create the pod. In fact I created a script to do just this, which I made by reverse engineering the pod yaml generated by kuberun. IIRC it does not have any problem with implicit variables, and it shouldn't have a problem with extra functions.
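
A minimal sketch of such a submitter pod, with illustrative names throughout (the image tag, pod name, PVC, and mount path are assumptions; the yaml that kuberun actually generates is more elaborate):

$ kubectl create -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nf-submitter
spec:
  restartPolicy: Never
  containers:
    - name: nextflow
      image: nextflow/nextflow:22.04.0   # assumed image/tag
      workingDir: /workspace
      command: ["nextflow", "run", "nf-core/sarek", "-profile", "test"]
      volumeMounts:
        - name: workdir
          mountPath: /workspace
  volumes:
    - name: workdir
      persistentVolumeClaim:
        claimName: <pvc-name>
EOF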

Feel free to submit a PR based on your work. We might be able to figure out a solution from it. But in general it is recommended that you use an external service like Nextflow Tower to manage submitter pods instead of kuberun.

@bentsherman bentsherman reopened this Jan 18, 2022
@stale stale bot removed the stale label Jan 18, 2022
@xhejtman (Contributor)

Are you able to run the nf-core/sarek pipeline then? I think you do not need any data, as it fails on the functions.

To be more clear, I do not use functions in my local nextflow.config.

As can be seen at the very end of https://github.com/nf-core/sarek/blob/master/nextflow.config, there is a function check_resources. kuberun downloads this config from git (I know you can alter the local copy, but that is a bit ugly to me) and creates a config map from it. In this config map the function is missing, and the run fails because check_resources is not defined.

So we created a script that is injected into the command of the master pod; it simply appends the function to the nextflow.config taken from the config map.
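
The hook is roughly the following (paths are illustrative, and this assumes the config has already been copied to a writable location rather than being read directly from the read-only config map mount):

$ cat append-functions.sh
#!/bin/bash
# executed in the master pod before nextflow itself:
# re-add the function definitions that were lost from the config map
cat /opt/hooks/missing-functions.config >> /workspace/nextflow.config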

As for the PR, I was wondering whether adding a script name to be executed in the master pod before executing nextflow itself would be accepted. If so, I will happily do the PR. I suppose a command-line option to specify the path and script name, something like -headscript?

@bentsherman (Member, Author)

I will try nf-core/sarek with kuberun when I get a chance this week. But I understand the problem you describe.

I don't think your general solution of injecting an arbitrary pre-script hook will be accepted into Nextflow. We might be able to fix this particular problem with kuberun. I will look into it when I get a chance, this week or next week hopefully.

Either way, you can provision your own submitter node (like a head node on an HPC cluster) and launch the pipeline from that node using nextflow run with the k8s executor. That is essentially what kuberun does for you in the background.
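
For reference, a minimal sketch of the config for that setup (the claim name is a placeholder):

process.executor = 'k8s'

k8s {
  storageClaimName = '<pvc-name>'
}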

@xhejtman (Contributor)

I don't think your general solution of injecting an arbitrary pre-script hook will be accepted into Nextflow. We might be able to fix this particular problem with kuberun. I will look into it when I get a chance, this week or next week hopefully.

Well, I see that tower.nf offers to run a pre-run script. So why not add the same for the CLI?

@bentsherman (Member, Author)

Ah, I haven't played with that feature in Tower, so I'll have to look at it and see how it might be imitated in Nextflow itself. But also, if you're already using Tower then why do you still need kuberun?

@xhejtman (Contributor) commented Jan 22, 2022

No, I just tried your demo to see what Tower offers and whether it can be used by someone who is not familiar with nextflow, pipelines, and the command line. But some users still prefer the CLI, as they have a single shell script that they just need to run.

@bentsherman (Member, Author)

I was finally able to spin up a GKE cluster and test the sarek pipeline. It fails with kuberun as you say, but I was able to get around kuberun with the following steps (a kubectl sketch follows below):

  • provision a head pod manually via kubectl create
  • log in to the head pod via kubectl exec
  • run the nextflow pipeline like normal: nextflow run nf-core/sarek -profile test

I provide the PVC through a config setting:

k8s {
  storageClaimName = '<pvc-name>'
}
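
In kubectl terms, the steps above look roughly like this (the pod name and yaml are whatever you provisioned; they are illustrative here):

$ kubectl create -f head-pod.yaml
$ kubectl exec -it head-pod -- bash

and then, inside the pod:

$ nextflow run nf-core/sarek -profile test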

I used kube-login.sh from this repo to provision the head node.

With Tower it makes sense to have a pre-run script because Tower is responsible for provisioning the head node. Nextflow isn't intended to provision head nodes in general; it just so happens that it can hack one together in k8s via kuberun. But kuberun has many limitations, including this issue, so it is not fit for production use. If someone can find a solution to this issue then a PR is welcome, but ultimately the solution is to use an external service like Tower.

If someone prefers the CLI, there is now a Tower CLI too. If they just have a small test and kuberun isn't working for them, they can use the solution I described here.

@xhejtman (Contributor) commented Feb 5, 2022

OK, thanks for testing.

Do you know why kuberun manipulates the nextflow.config? Is there some strong reason for it? I basically need kuberun just to provision the head node and run nextflow, mostly the same thing kube-login.sh does. Was kuberun intended for something more?

We already run nextflow from a 'proxy' pod, but you need to run it at least from tmux/screen to keep it running in the background.

To conclude, if kuberun is not for production use, we can provide a workaround (e.g. as you suggested); however, I see kuberun as easier for the end user.

@bentsherman (Member, Author) commented Feb 7, 2022

Eventually I'd like to study how kuberun works under the hood, but I think the problem is that kuberun resolves the nextflow config prematurely, which is why implicit variables and functions get lost in the process.

The other script in that repo, kube-run.sh, really does what I want kuberun to do, which is to provision a head pod, copy any local nextflow config, launch nextflow like normal from the head pod, and then sit back and forward the logs. That script also avoids the implicit variable issue, and you don't have to stay connected to the pod. It is really the ideal workaround, more so than kube-login.sh, and it's what I recommend you use as long as this issue persists in kuberun.
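
A rough sketch of that flow in plain kubectl, with illustrative names (pod name, paths, and log file are assumptions; the actual script is more careful):

$ kubectl create -f head-pod.yaml
$ kubectl cp nextflow.config nf-head:/workspace/nextflow.config
$ kubectl exec nf-head -- bash -c 'cd /workspace && nohup nextflow run nf-core/sarek -profile test > run.log 2>&1 &'
$ kubectl exec nf-head -- tail -f /workspace/run.log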

@bentsherman (Member, Author)

See also #870 (comment)

@olgabot commented Jun 2, 2022

Seems related to this: nextflow-io/rnaseq-nf#13

stale bot commented Jan 16, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@bentsherman (Member, Author)

Closing since we are moving away from kuberun. Use a workflow platform (e.g. Tower) to manage your submitter pods, or use Wave+Fusion to run Nextflow without a submitter pod / PVC.

@bentsherman bentsherman closed this as not planned Jan 17, 2023