Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate pre-populating default arguments based on JSON Schemas? #1686

Closed
tcompa opened this issue Jul 25, 2024 · 2 comments · Fixed by #1688
Closed

Deprecate pre-populating default arguments based on JSON Schemas? #1686

tcompa opened this issue Jul 25, 2024 · 2 comments · Fixed by #1688

Comments

@tcompa
Copy link
Collaborator

tcompa commented Jul 25, 2024

We currently use e.g. the TaskV2.default_args_parallel_from_args_schema property within the task creation, to pre-populate task arguments with the top-level default coming from JSON Schemas - if any.

I think this is the evolution of some old feature (probably called something like "default_arguments"), that was more relevant when the defaults were really only set in the database (as opposed, e.g., to the call signature and JSON Schemas).
In this moment, my best guess is that this is serving no scope: if argument a has a default which is not set, nothing should change in the way a task runs. Upon runtime, the task would set the argument a to its default and continue as usual.

Note that there is one difference, namely that the value of a is not stored in the database explicitly, but only implicitly (via the version of the task that we are using, which then maps to the source code that includes the defaults). However this is an incomplete information, since it only holds for top-level arguments. If we decide that we need the backend to fully store the task arguments, that's not what we have now and we would need to introduce more complex logic that actually uses JSON Schemas in the backend.


Let's review this issue, and deprecate the feature if appropriate. This would also contribute to clarifying some fractal-web issue (ref fractal-analytics-platform/fractal-web#538).

@tcompa tcompa linked a pull request Jul 25, 2024 that will close this issue
2 tasks
@jluethi
Copy link
Collaborator

jluethi commented Jul 25, 2024

Good summary to help us get back to that later!

Note that there is one difference, namely that the value of a is not stored in the database explicitly, but only implicitly (via the version of the task that we are using, which then maps to the source code that includes the defaults). However this is an incomplete information, since it only holds for top-level arguments. If we decide that we need the backend to fully store the task arguments, that's not what we have now and we would need to introduce more complex logic that actually uses JSON Schemas in the backend.

Agreed, that also came up as a potential weird edge-case. What if I never change a default value and then update my task? Does that mean after the update, it will use the new default parameters? Is that the expected behavior?
And how would I ensure that the default value is really what I want to be set there?

Clearly that also wouldn't work too well in the current version, e.g. doesn't handle all the nested structures. Maybe this will be something we need to handle as a feature of task updating (e.g. warning a user if any defaults change) and make the backend side of this much simpler by not storing such default arguments.

The broader question here for me is:
Do we have a concept of "this value has been set by the user", @lorenzocerrone describes it as "activate this parameter"? Would we want that? Or is this unneeded complexity?
I think exploring this in the context of more complex models will be relevant

@tcompa
Copy link
Collaborator Author

tcompa commented Jul 25, 2024

One quick feedback:

Do we have a concept of "this value has been set by the user", @lorenzocerrone describes it as "activate this parameter"? Would we want that? Or is this unneeded complexity?

What is described here and implemented in #1686 is that the backend would not ever read the JSON schemas any more. It would take whatever was associated to the tasks (which is expected to be up-to-date with the tasks' Python code) and send it to the webclient for form generation. The content of the args_schema fields would be just an arbitrary string, on the backend side.

If we proceed in this way, then there doesn't even exist a concept of "list of possible arguments", since this would need to come from the JSON Schema.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

2 participants