Confusion caused by parameter + multiple execution of global section. #1219
Comments
I think I've raised the concern a few times in the past that the global section gets executed multiple times. The expectation is that the global section should be executed only once. But I recall you gave some good reasons why it executes multiple times ... clearly this now causes trouble. Maybe let's revisit those reasons? |
The easiest example is
because Cannot recall immediately the reason for ignoring |
I see, now I recall the |
Yes, we can hack and separate |
For example
and We can enforce the |
This is a more acceptable behavior than the example above that For global objects that are pickled, are they shared with substeps? In other words, will each substep use the environment loaded by the step rather than loading and unpickling the saved objects on its own? If objects are shared, are they read-only? |
Right now we send Let me revisit this after I complete #1218 |
Use the following to test the pickleability of things:
So
|
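As a rough illustration of what such a pickleability check can look like, here is a minimal sketch using only the standard `pickle` module. The helper name `is_picklable` and the sample objects are hypothetical and are not taken from the snippet posted in this thread.

```python
import pickle
import types


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        # pickle raises different exception types depending on the object
        # (PicklingError, TypeError, AttributeError), so catch broadly here.
        return False
    return True


# Hypothetical objects of the kind a global section might define.
candidates = {
    "an_int": 1,
    "a_list": [1, 2],
    "a_lambda": lambda x: x,  # lambdas cannot be pickled by the standard pickle module
    "a_module": types,        # module objects cannot be pickled either
}

results = {name: is_picklable(value) for name, value in candidates.items()}
for name, ok in results.items():
    print(f"{name}: {'picklable' if ok else 'NOT picklable'}")
```

Plain values such as numbers and lists pass the check, while modules and lambdas do not, which is why imports and some function definitions in a global section cannot simply be shipped as pickled objects.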
Can we then instead only pickle parameters and variables, and run the rest of the code -- I guess this is not easy to do? |
We can do something complicated but I am not sure how reliable it can be... Basically, we can parse the global section and
Now the global section is separated into
This should work most of the time but
but I guess we can tolerate such limitations. |
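To make the idea of parsing and splitting the global section concrete, here is a sketch of one possible split. It is an assumption about how this could be done, not the actual SoS implementation: imports and function/class definitions are kept as source to be re-run on each worker, while the remaining statements are run once and their resulting variables are pickled. The function `split_global_section` and the sample source are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
import pickle


def split_global_section(source):
    """Split source into (rerun_src, run_once_src).

    rerun_src: imports and def/class statements, re-executed on each worker.
    run_once_src: everything else, executed once; its variables get pickled.
    """
    rerun_types = (ast.Import, ast.ImportFrom, ast.FunctionDef, ast.ClassDef)
    rerun, run_once = [], []
    for node in ast.parse(source).body:
        target = rerun if isinstance(node, rerun_types) else run_once
        target.append(ast.unparse(node))
    return "\n".join(rerun), "\n".join(run_once)


# Hypothetical global section content.
source = """
import math

def area(r):
    return math.pi * r * r

seed = 42
values = [seed + i for i in range(3)]
"""

rerun_src, once_src = split_global_section(source)

# "Master" side: run everything once and pickle only the new variables.
local_ns = {}
exec(rerun_src, local_ns)
before = set(local_ns)
exec(once_src, local_ns)
payload = pickle.dumps({k: local_ns[k] for k in set(local_ns) - before})

# "Worker" side: re-run imports/definitions, then load the pickled variables.
worker_ns = {}
exec(rerun_src, worker_ns)
worker_ns.update(pickle.loads(payload))
print(worker_ns["values"], worker_ns["area"](1.0))
```

A split like this has obvious limitations: for instance, a statement with side effects, or a variable holding an unpicklable object, would not be handled by this sketch.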
Not sure if you are still using it but I will have to remove the |
I do not use these any more. I think you managed to convince me to stop using these at some point. |
Yes, I think I was under the influence of snakemake, which can manage multiple files, but I frowned upon the complexity of the implementation and especially the use of these features, because they are not in line with our design philosophy ... I am removing them now. |
Done. This also resolves #1155 (execution of global statement on remote host) because now the global statement will only be executed once (locally) and be pickled to all steps, substeps, and tasks. |
Great! Since it is pickled, how does each substep use this pickled data? Do they each load it on their own, or the step loads it and copies to all substeps -- by reference or by value? |
As far as I can tell, each substep will load it because everything is mixed up on the same worker. This could be improved but right now correctness is more important. |
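As a side note on the by-reference versus by-value question, the following sketch shows what "each substep loads the pickled data" implies in plain Python: every `pickle.loads` call yields an independent copy, so changes made in one substep are not visible to the others or to the original. The names `global_vars` and `run_substep` are hypothetical and only mimic the behavior described here; this is not SoS code.

```python
import pickle

# Hypothetical result of executing the global section once, locally.
global_vars = {"samples": ["a", "b"], "cutoff": 0.05}
payload = pickle.dumps(global_vars)


def run_substep(payload, index):
    """Mimic a substep that unpickles its own copy of the global variables."""
    env = pickle.loads(payload)               # independent copy (by value)
    env["samples"].append(f"extra_{index}")   # local modification only
    return env


env0 = run_substep(payload, 0)
env1 = run_substep(payload, 1)

print(env0["samples"])         # modified copy of substep 0
print(env1["samples"])         # modified copy of substep 1
print(global_vars["samples"])  # original is untouched
```

Under this model the objects are effectively private to each substep rather than shared or read-only, at the cost of unpickling them once per substep.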
The trunk is not ready, as there are still randomly failing tests and I can now reproduce the DAG problem, but I believe the key parts are done. |
Can this now be reproduced reliably on your end with some example? I do not think I get it every time, at least I did not see it in versions before 0.18.7. |
No. Only with the complicated example you have with 20+k jobs and many nested workflows. I can possibly try to trim down the workflow to create a test case though. |
A user reported a case in gitter which boils down to
and the output is
We should investigate the behavior of parameter in the global section more closely.