consider a new file for specifying workflow options #3900

Open
oliver-sanders opened this issue Oct 28, 2020 · 5 comments
Labels
small speculative blue-skies ideas
Milestone

Comments

@oliver-sanders
Member

Note: From previous discussions (I think at CylcCon2019) which never made it into an issue.

Note: Raising an issue after @TomekTrzeciak made me aware we don't currently have one.

Small, low priority issue for consideration after 8.0.0 release.

TL;DR

Should we provide a Cylc alternative for defining suite variables, implemented in Python, to help bridge the gap to Cylc9?

This would effectively be a slightly more sophisticated way of doing {% from ".cylc_conf" import suite_variables %} in the flow.cylc file.

The main use cases are expected to be the more programmatic workflows, where the definition changes wildly with different input values, or where users define workflows based on datasets, etc.

Cylc Suite Variables & Set Files

cylc run currently supports loading suite variables from an arbitrary file using the --set-file option.

Similar to the -s option, the --set-file option loads all values as strings, so in the following "set file" example both answer and question would be interpreted as strings:

answer=42
question=what do you get when you multiply 6 by 7

This is highly restrictive. It can be overcome to some extent by parsing values in the flow.cylc file, but that quickly gets a bit ridiculous:

#!Jinja2
{% from "ast" import literal_eval %}
{% set parsed_answer = literal_eval(answer) %}

Set files must be specified manually on the command line. All in all they are a bit cumbersome and not well used.
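
The difference between string-only set files and literal parsing can be sketched in plain Python. This is an illustration only, not Cylc's implementation: `load_set_file` and `parse_literals` are hypothetical helper names, and falling back to the raw string when a value is not a valid literal is an assumption.

```python
from ast import literal_eval


def load_set_file(text):
    """Parse KEY=VALUE lines; every value is kept as a string."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        variables[key.strip()] = value.strip()
    return variables


def parse_literals(variables):
    """Try to read each string as a Python literal.

    Assumption: values that are not valid literals stay as strings.
    """
    parsed = {}
    for key, value in variables.items():
        try:
            parsed[key] = literal_eval(value)
        except (ValueError, SyntaxError):
            parsed[key] = value
    return parsed


SET_FILE = """
answer=42
question=what do you get when you multiply 6 by 7
"""

raw = load_set_file(SET_FILE)
typed = parse_literals(raw)
print(type(raw["answer"]).__name__)    # the set-file value is a string
print(type(typed["answer"]).__name__)  # after literal_eval it is an integer
```

The unquoted question is not a valid Python literal, so under the assumed fallback it is left as a string.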

Rose *:suite.rc Variables

Related To: #3819

Of course Rose provides an interface which supports literals, currently implemented by lazily dumping the raw values into the suite.rc file and leaving Jinja2 to parse them. In Cylc8 we plan to use literal_eval. Rose example:

[jinja2:suite.rc]
answer=42
question="what do you get when you multiply 6 by 7"

Literal values are a big improvement; however, passing objects more advanced than basic literals may be desired. For a compelling example, consider Pandas datasets.

Rose Optional Configurations

The Rose framework also supports multiple pre-defined variants of the same config via optional configurations. These are often used to enhance portability, commonly in combination with the following Jinja2 include pattern:

#!Jinja2

{% include "site/" + SITE + "-suite.rc" %}

This functionality is nice for cases where there is a finite number of potential options which are known in advance. Optional configurations can be used in a multi-dimensional manner; however, this is not the intended usage, and Rose linearises the configuration, making it impossible to script logic to handle edge cases or resolve conflicts.
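
To illustrate the limitation, here is a simplified model, not Rose's actual implementation: each optional configuration is treated as a dictionary applied in order, with later ones overriding earlier ones, and compared against a Python function that can inspect the combination of options. The option names and values are invented for the example.

```python
def linearise(base, *optional_configs):
    """Apply optional configs in order; later settings win (simplified model)."""
    merged = dict(base)
    for config in optional_configs:
        merged.update(config)
    return merged


BASE = {"nodes": 1, "queue": "normal"}
SITE_A = {"queue": "site-a-batch"}           # hypothetical site option
HIGH_RES = {"nodes": 64, "queue": "large"}   # hypothetical resolution option

# Linear merging cannot know that site A has no "large" queue.
linear = linearise(BASE, SITE_A, HIGH_RES)


def scripted(site, high_res):
    """Scripted logic can resolve the conflict between the two dimensions."""
    config = dict(BASE)
    if site == "a":
        config["queue"] = "site-a-batch"
    if high_res:
        config["nodes"] = 64
        # Edge case: only switch queue where the "large" queue exists.
        if site != "a":
            config["queue"] = "large"
    return config


print(linear)
print(scripted("a", True))
```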

Direct Jinja2 Imports

Of course you can just define the options you want in a Python file and import that using Jinja2:

#!Jinja2

{% from ".my_module" import variables %}

This imposes no restrictions on data type and exposes Python's import mechanism as a proxy to modularity.
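
As a sketch of what such a module might contain, the following hypothetical `my_module.py` defines a `variables` dictionary holding values that a string-based or literal-based interface could not carry directly. All names and values are invented for illustration.

```python
# my_module.py (hypothetical): lives alongside the workflow definition.
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Domain:
    """An example of a structured, non-literal input."""
    name: str
    resolution_km: float


variables = {
    "answer": 6 * 7,                   # computed in Python, already an int
    "start": datetime(2020, 10, 28),   # not expressible as a basic literal
    "cycle_step": timedelta(hours=6),
    "domains": [Domain("global", 10.0), Domain("regional", 1.5)],
}

# Derived values can be computed here rather than in Jinja2.
variables["n_domains"] = len(variables["domains"])
```

Because this is an ordinary module, it can in turn import from other modules, which is where the modularity comes from.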

Cylc9 & The Future

Related To: #1962

We plan to develop a Python API for workflow configuration for a future release of Cylc. With this change users will be able to define variables in Python with no restriction on data type, utilise Python's modularity and do all sorts of wonderful and potentially dangerous things.

Even in Cylc9 there is some merit to the idea of keeping inputs / driving data separate from program logic, i.e. define top-level "suite variables" in one place and generate the workflow configuration from these variables in another place.

Cylc8 & The Now

Idea:

  • Cylc could provide its own alternative to the rose-suite.conf file for defining suite variables implemented in Python with no restriction on data types.
  • This file could provide an interface for returning suite variables and environment variables but also potentially things like xtriggers, etc.
  • These "suite variables" would not be suitable for storage in the database as is currently done as that would require serialisation. They would live only for the lifetime of the workflow configuration.

Benefits:

  • This interface would help to bridge the gap to Cylc9.
  • This may still have relevance into the future.
  • Provides a pattern for separating inputs and configuration.

Implementation:

  • A Python module inside the workflow with a standard name.
  • The module would be loaded before the configuration so that it can be used within the configuration.
  • Note nothing here can't be achieved currently by just importing a Python module from Jinja2 directly (even in Cylc7).
  • However, this would provide a somewhat nicer way of doing it.
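
The implementation points above could be sketched roughly as follows. This is not an existing Cylc feature: the standard module name `cylc_conf.py` (borrowed from the `.cylc_conf` import earlier in this issue), the `suite_variables` attribute, and the `load_workflow_variables` helper are all assumptions made for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path

MODULE_NAME = "cylc_conf"  # hypothetical standard name


def load_workflow_variables(workflow_dir):
    """Import <workflow_dir>/cylc_conf.py, if present, and return its variables.

    The returned objects live only in memory; nothing is serialised.
    """
    path = Path(workflow_dir) / f"{MODULE_NAME}.py"
    if not path.is_file():
        return {}
    spec = importlib.util.spec_from_file_location(MODULE_NAME, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return dict(getattr(module, "suite_variables", {}))


# Usage: build a throwaway workflow directory and load from it.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cylc_conf.py").write_text(
        "suite_variables = {'answer': 6 * 7, 'members': list(range(3))}\n"
    )
    loaded = load_workflow_variables(tmp)
    missing = load_workflow_variables(Path(tmp) / "no_such_dir")

print(loaded)
```

The loaded dictionary would then be handed to the template processor before the configuration is parsed, so its values are available within the configuration.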
@oliver-sanders oliver-sanders added small question Flag this as a question for the next Cylc project meeting. labels Oct 28, 2020
@oliver-sanders oliver-sanders added this to the 8.x milestone Oct 28, 2020
@hjoliver
Member

Nice write-up @oliver-sanders. However,

Note nothing here can't be achieved currently by just importing a Python module from Jinja2 directly (even in Cylc7).

Given the amount of other stuff we need to get done as well, is it not sufficient to just document and publicize this capability? (Which was added relatively recently).

@oliver-sanders
Member Author

Yep, agreed. I'm not convinced of the value of a new file; I'm just following up on an old lead and documenting the options and functionality we currently have.

We should document the pattern of importing Python modules from Jinja2 for the purpose of defining inputs at some point.

Tagged for consideration against 8.x for now.

@TomekTrzeciak
Contributor

Note nothing here can't be achieved currently by just importing a Python module from Jinja2 directly (even in Cylc7).

That depends on what you're trying to achieve. For example, with rose suite-run and Cylc 7, Jinja2 gets executed only after the suite install, so if I want to perform some action prior to that, it is already too late. Not sure what that will look like in Cylc 8.

What I would hope for here is to execute a piece of custom code that would have an awareness of the context it is executed in, such as Cylc command and options being used, early on in the process. This sort of thing could be better suited to custom plugins, provided there would be a way to bundle them with the suite and load automatically by cylc command(s).

@oliver-sanders
Member Author

provided there would be a way to bundle them with the suite and load automatically by cylc command(s).

Possibly - #3780

This sort of thing could be better suited to custom plugins

Likely

Would be good to get a handle on the use case(s) here, what exactly are you trying to achieve? Control over workflow installation? An intermediate configuration stage between installation and run?

@TomekTrzeciak
Contributor

Would be good to get a handle on the use case(s) here, what exactly are you trying to achieve? Control over workflow installation? An intermediate configuration stage between installation and run?

We have a suite build step that generates an include file (and a bunch of other files) with most of the workflow definition in it. This obviously has to run at a very early stage of the suite life cycle, and it currently gets fired off via a custom Jinja2 filter from suite.rc. This works but is a bit convoluted, and I wonder if a plugin would offer a cleaner solution, in particular if custom environments #3780 were possible (e.g., we need networkx for graph processing).

Another nice thing about plugins is that you can offer several entry points to target various stages along the suite life line. And you can add more in the future (within reason) as the use cases present themselves.
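
A minimal sketch of the idea of several entry points along the life cycle, assuming a simple in-process hook registry. This is not Cylc's plugin API: the stage names `pre_install` and `pre_configure`, the `register` decorator and the `run_stage` function are all hypothetical.

```python
from collections import defaultdict

_hooks = defaultdict(list)  # stage name -> list of registered callables


def register(stage):
    """Decorator that attaches a function to a named life-cycle stage."""
    def decorator(func):
        _hooks[stage].append(func)
        return func
    return decorator


def run_stage(stage, context):
    """Call each hook for a stage; a hook may return a dict of updates."""
    for hook in _hooks[stage]:
        context = {**context, **(hook(context) or {})}
    return context


@register("pre_install")
def build_includes(context):
    # Stand-in for a build step that generates include files.
    return {"generated": [f"{context['workflow']}-graph.rc"]}


@register("pre_configure")
def report_command(context):
    # Hooks can see which command is being run.
    return {"note": f"invoked via {context['command']}"}


ctx = {"workflow": "demo", "command": "install"}
ctx = run_stage("pre_install", ctx)
ctx = run_stage("pre_configure", ctx)
print(ctx)
```

New stages can be added later without touching existing hooks, since a stage with nothing registered simply passes the context through.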

@oliver-sanders oliver-sanders added speculative blue-skies ideas and removed question Flag this as a question for the next Cylc project meeting. labels Apr 22, 2022