Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We shouldn't require authors to be in config-references.yml to run a recipe #28

Open
ledm opened this issue May 15, 2019 · 15 comments
Open
Assignees
Labels
enhancement New feature or request

Comments

@ledm
Copy link
Contributor

ledm commented May 15, 2019

In the documentation section of recipes, there is an obligatory field: authors. This requires the names listed in the authors section of the recipe to be in the config-references.yml file.

Otherwise, you will get an error like this one:

ValueError: Tag 'lee' does not exist in section 'authors' of /users/modellers/ledm/workspace/ESMValToolTest/ESMValTool/esmvaltool/config-references.yml

This should 100% be a requirement for pushing recipes back to github.

However, this excludes new users from ownership of their own recipe if they don't have access to the source code. For instance they are running a shared installation or ESMValTool, or they are using the user installation instead of the develop installation.

Can we relax this requirement when running a recipe, but keep it for submitting recipes?

@valeriupredoi and I spoke about it earlier today. Any thoughts @mattiarighi, @bouweandela ?

@valeriupredoi
Copy link
Contributor

Right on @ledm ! This is particularily needed when the tool is centrally installed on a large data-store cluster like Jasmin or DKRZ - brand new users will want to first run an existing recipe with that sourceable installation and then write their own recipe - they have the option to keep the provenance info on a given existing recipe (which is not nice) or they should be able to use a new_author or newbie or jasmin-user before they get more serious and get added as authors themselves

@mattiarighi
Copy link
Contributor

What about adding an entry in config-references.yml for such cases, something like:

undef:
  name: undefined user
  institute: -

@valeriupredoi
Copy link
Contributor

sounds good @mattiarighi - can it be Batman instead? 😁
@bouweandela you the main provenance person, what say you (about putting this in, not Batman)

@mattiarighi
Copy link
Contributor

sounds good @mattiarighi - can it be Batman instead?

No, that might disappoint Marvel fans.

@bouweandela
Copy link
Member

bouweandela commented May 16, 2019

I'm not so sure about this issue: I think it is best if people check out the diagnostics repository (after the planned split) and start developing in there, this will make it much easier to later create a pull request to add what you're developing back to esmvaltool. If you just want to try out esmvaltool, I think it's not necessary to put yourself as an author on the recipe, it only makes sense to do that by the time you want to publish your results and by that time it would be good if your code is in a shape that a pull request can be made anyway. However, I'm not a scientist of course, so maybe I'm missing something here..

@ledm
Copy link
Contributor Author

ledm commented May 16, 2019

I've got to disagree with you @bouweandela. It takes several hours to install a personal copy of ESMValTool from scratch. When host a 4 hour ESMValTool tutorial, we don't want to spend half of it installing ESMVallTol. This means that nearly all new users will always use a central installation ESMValTool, at least on day one.

Say you're a new user: you take some one else's recipe, changes the few bits that you can and run it using a central installation of ESMValTool. Your recipe will then fail immediately because your name is not on the accepted authorship list. To that user, this will feel like we are purposely preventing them from contributing, and it will make them feel excluded and they will have less drive to continue using ESMValTool.

At the very least, we can make it so that the authors field is not a required in a recipe. People can then just leave it blank until they start a PR.

@bouweandela
Copy link
Member

Say you're a new user: you take some one else's recipe, changes the few bits that you can and run it using a central installation of ESMValTool.

After we've split the core and diagnostics installation (in preparation to actually splitting the repositories), it should be easy to use a centrally installed version of esmvaltool with a custom diagnostics scripts location, so long as you're not changing the dependencies, because the diagnostic scripts do not actually need to be installed, just their dependencies.

It takes several hours to install a personal copy of ESMValTool from scratch

It' that's the case, this is a serious problem and something that probably needs fixing more urgently. It takes me about 15 minutes to install esmvaltool (or less, depending on how fast the machine is), but I guess I'm not a good example of a new user.

@ledm
Copy link
Contributor Author

ledm commented May 16, 2019

With regards to an installation from scratch, it takes a while to install on jasmin, as users need to install miniconda2 in addition to all the packages listed in environment.yml.

In all cases, a full installation can be a significant hurdle to jump for a new user.

We're going to be hosting an ESMValTool or UKESM users in mid-june so it would be great to have resolved this issue before then. (I assume that the split will occur in the first week of june at the DLR meeting.)

@valeriupredoi
Copy link
Contributor

yes, I fully support a john-doe user group

@mattiarighi mattiarighi transferred this issue from ESMValGroup/ESMValTool Jun 11, 2019
@mattiarighi mattiarighi added paper enhancement New feature or request labels Jun 11, 2019
@bouweandela
Copy link
Member

I took this into account when splitting the packages. @ledm After ESMValGroup/ESMValTool#1148 is merged, can you please test if things now work as you would like?

@bouweandela
Copy link
Member

This has now been implemented in the code, but it should still be documented.

Note that when the config-references.yml file is not available, no information from that file can be retrieved, so authors should write their full name and references etc. in the recipe and diagnostic scripts if they still want reasonable looking provenance output.

@stefsmeets
Copy link
Contributor

Hi @bouweandela , has this ever been resolved? As we are expanding the API, it is getting increasingly difficult for me to test recipe / metadata rendering since this depends on the ability of author names (tags) being resolved. Part of this is the forward dependency on esmvaltool to define author names, and the other is that there is no 'default' user to fall back to or to test with.

@bouweandela
Copy link
Member

No, this has not been completely solved, or the issue would have been closed. If you try to run a recipe with an author tag that is not available in the config-references.yml file (author some_author in this case), the run will stop with this error message:

ValueError: Tag 'some_author' does not exist in section 'authors' of /home/bandela/src/esmvalgroup/esmvaltool/esmvaltool/config-references.yml

The problem is that a collection of recipes, diagnostic scripts, bibtex references, and config-references.yml files needs to match each other. However, it is possible to use ESMValCore to just run a recipe and associated diagnostic script without the bibtex references and config-references.yml file, but obviously, you cannot replace tags with a value if you do not have the contents to replace them with, so this information will be missing from the output.

@stefsmeets
Copy link
Contributor

From your previous comment I got the impression that it was resolved, just not documented. I don't just want to run a recipe, I want to make sure the resolvers work correctly for the API / html generation. The outcome depends too much on whether esmvaltool is installed or not.

I can work around the issue in the tests, but I am not super happy with the direction that is taking. I see other parts of the code have similar work-arounds. I think it would be useful to have a sort of representative 'mini-repo' inside the core which we can import from and test with (filled with john-doe's and tags / recipes / references for testing).

@stefsmeets
Copy link
Contributor

Example of the issues with the tests from #991 : https://app.circleci.com/pipelines/github/ESMValGroup/ESMValCore/4261/workflows/1c532868-d1b1-4e91-8673-cf00bba300da/jobs/19824/steps

This is a run of esmvaltool in a subprocess in the tests, where we cannot patch in mocked diagnostics / tags. This problem gets exposed in #991 because we need to do this write an html summary. This means the author tags must be resolved, but they can't be, because the TAGS dict is empty I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants