Fix the import of the new industry module #386
What, if any, disadvantage is there to having the industry config unavailable from the top level? E.g., if submodularising Euro-Calliope, how would one now provide overrides for default industry module config items?
I don't know how and whether that would work at all. It would be good to test that. I'd personally have the config in the top level, but isn't that somewhat out of scope?
Still, good comment because it made me aware that I broke the config and that went through unnoticed because of a
I suppose it is, if our ultimate aim is to move the industry "module" out of the core EC repo.
I agree. Before we do that, we should have a proper plan of the pros and cons though. I wouldn't do that any time soon -- likely not before the next release.
Having said that, I figured there is another problem: #387.
I do not approve this pull request; it would break our entire design. The module was designed according to Snakemake specs:
In theory any EC project will be completely free to use the industry module as they see fit once we move things outside. Please reconsider merging this in. |
Can you please try doing the following in the code without your changes?
|
@irm-codebase I'm not sure this change breaks things as you suggest. The module rules are renamed as suggested, they are just "imported" from the We could (and should) fix the default rule issue by using the I'd stick with the config file as it was (under the |
This would be the content of
configfile: "modules/industry/config.yaml"
validate(config["industry"], "../modules/industry/schema.yaml")
module module_industry:
    snakefile: "../modules/industry/industry.smk"
    config: config["industry"]
use rule * from module_industry as module_industry_*
In addition, the first level of the schema would be removed and
@brynpickering I can settle for that! Just please keep the As for For |
Here are general comments on how to fix these bugs while keeping modularity.
modules/industry/config.yaml
Outdated
@@ -1,13 +1,12 @@
industry:
Please keep the industry top level. Otherwise, module and general configuration will conflict with each other if developers use similar names.
properties:
  industry:
Same as above. The definition of this level is crucial for modules to work properly.
It's fine to remove it from here (and necessary for schema validation given that the config is merged with the rest of the project config). Validating would become validate(config["industry"], "../modules/industry/schema.yaml"), which means industry: should not be in the schema.
I think it's important to have it here so that the module makes sure you are specifying the module configuration separately. Otherwise, someone might add all these at the top level, causing conflicts!
The only limitation that this might impose is that modules cannot expect the same name, which is good, imo.
rules/industry.smk
Outdated
Please consider using one file for module imports, not individual files per module. This should minimize bloat down the line.
We can go with individual files if we think we may write additional module rules at the project level (snakemake allows this).
I do not have strong opinions on this, though.
I like the one file idea.
Snakefile
Outdated
The core bug is fixed by adding default_target: True to rule all. This should avoid having to heavily modify the industry module beyond the schema improvements.
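A minimal sketch of this fix, not the exact Snakefile from the PR: by default Snakemake uses the first rule it encounters as the target, so rules imported from a module ahead of rule all can end up as the default. Marking rule all with default_target: True makes the choice explicit. The include path follows the rules/modules.smk file discussed in this PR; the input path is a hypothetical placeholder.

```snakemake
# Snakefile (sketch)

# Module imports; this file runs `use rule * from module_industry ...`
include: "./rules/modules.smk"

rule all:
    # Keep this rule as the default target regardless of where
    # the module rules are imported.
    default_target: True
    input:
        "build/example-output.txt"  # hypothetical placeholder target
```

With this in place, running snakemake without naming a target should build the inputs of rule all rather than those of an imported module rule.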
Sounds good @irm-codebase. Happy to make the updates @timtroendle ? |
@timtroendle @brynpickering The reasoning was this:
If you set it to |
Not if you only pass |
The schema is directly checking for I guess it's just preference, then. Either is fine. |
c628e65 to 4bb09da
Sorry, this was based on my wrong assumption that one cannot call "configfile" twice. I changed it according to all your comments, @irm-codebase and @brynpickering, please have another look.
4bb09da to cb3c2f7
Looks good. But I added one last suggestion.
rules/modules.smk
Outdated
@@ -0,0 +1,9 @@
# Industry
configfile: "./modules/industry/config.yaml"
validate(config["industry"], "../modules/industry/schema.yaml")
Just one final improvement.
I think we should move this validation within the module's smk. This way, we avoid users skipping validation because they find it cumbersome and always enforce it.
That way, we still force users to separate a module's configuration due to additionalProperties: false (although it's not really explicit code-wise).
Could you test this?
Makes sense to me, also in terms of separation of concerns. I tested it and it works.
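A sketch of how this split might look, assuming the module receives config["industry"] from the parent workflow and that the schema sits next to industry.smk. The relative schema path inside the module is an assumption; the other paths are taken from the diff under review.

```snakemake
# rules/modules.smk (project level, sketch)
configfile: "./modules/industry/config.yaml"

module module_industry:
    snakefile: "../modules/industry/industry.smk"
    config: config["industry"]  # only the module's own section is passed in

use rule * from module_industry as module_industry_*


# modules/industry/industry.smk (module level, sketch)
from snakemake.utils import validate

# Inside the module, `config` is whatever the parent passed via `config:`.
# Assumed path: schema.yaml located in the same folder as this file.
validate(config, "./schema.yaml")
```

Because the module validates its own configuration, a project cannot import the module without the check running.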
Changes look good.
I'll make sure to merge these changes into future module updates.
I have one more comment myself. Personally, I don't think the "industry" key in the config in What I suggest to do:
|
I agree. Down the line, the configuration inside the module will only serve as an example. In theory, projects should be able to specify their own configuration, overriding the default. So I do not see this as redundant at all. However, I tried the approach you mention, but it led to issues. Snakemake would return an error when it detected the There is another reason why I kept the configuration within the folder: the module is still in development, and I want to avoid doubling efforts by having a config file in another location as well. Once it's fully ready, we should take your approach. I added this note to #381. Still, please let's avoid having two config files before the rest of the steps are done.
One last argument in favor of keeping the "industry" key even for the example configuration @timtroendle : It keeps correct module compartmentalization explicit. I believe there should be minimal friction between the configuration file that users see as an example, and how users will actually employ it. If we remove this level, this becomes implicit: we need extra documentation or users reading the schema. Otherwise they will get errors. I know it's a minor detail, but I think that this kind of thing is what makes code easy to use for others. We can go either way, of course. |
Ok, that's fine by me. @brynpickering any other comments or do you approve? |
There were two problems with the industry module that broke the main workflow:
- all rule was overwritten
- config was overwritten
These are fixed here.
Checklist
Any checks which are not relevant to the PR can be pre-checked by the PR creator. All others should be checked by the reviewer. You can add extra checklist items here if required by the PR.