Use JSON schemas for WorkflowTask arguments #82
Comments
An example of a schema could be the following: for a given workflow task, we have a property that is a list of objects. Each object represents the definition of one workflow task argument. Arguments can be nested.
{
"workflow_task_name": "workflow_task_name",
"workflow_task_id": 1,
"workflow_task_schema": [
{
"argument_name": "task_argument_name",
"argument_type": "object",
"argument_description": "description",
"default_value": "default value, with respect to argument type",
"is_required": true,
"inner_argument": {
"argument_name": "inner_argument_name",
"argument_type": "name of type",
"argument_description": null,
"default_value": null,
"is_required": true,
"inner_argument": null
}
}
]
} |
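To make the nesting explicit, the structure proposed above could be modelled as a small recursive type. This is only a sketch: the class name `ArgumentDefinition` and the helper `parse_argument` are hypothetical, and only the field names are taken from the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ArgumentDefinition:
    """One entry of "workflow_task_schema" (hypothetical class name)."""

    argument_name: str
    argument_type: str
    argument_description: Optional[str] = None
    default_value: Any = None
    is_required: bool = False
    # Nested arguments are expressed by pointing to another definition.
    inner_argument: Optional["ArgumentDefinition"] = None


def parse_argument(raw: Dict[str, Any]) -> ArgumentDefinition:
    """Recursively build an ArgumentDefinition from a plain dict."""
    inner = raw.get("inner_argument")
    return ArgumentDefinition(
        argument_name=raw["argument_name"],
        argument_type=raw["argument_type"],
        argument_description=raw.get("argument_description"),
        default_value=raw.get("default_value"),
        is_required=raw.get("is_required", False),
        inner_argument=parse_argument(inner) if inner is not None else None,
    )
```

Note that a single `inner_argument` slot only allows one nested argument per level; an object with several inner fields would presumably need a list there instead.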
Below is a realistic example of what comes out of pydantic. If needed, I suggest that the logic for converting it to a different kind of schema (like the one in #82 (comment)) is implemented as part of the web client, to avoid relying on custom definitions (i.e. different from pydantic) in multiple places (tasks repository and web-client repository).
Example 1
Task arguments:
from pydantic import BaseModel, Extra, Field
from typing import Optional
import json

class TaskArguments(BaseModel, extra=Extra.forbid):
    x: int = Field(description="This is the description of argument x")
    y: Optional[str]

print(json.dumps(TaskArguments.schema(), indent=2))
JSON schema:
{
"title": "TaskArguments",
"type": "object",
"properties": {
"x": {
"title": "X",
"description": "This is the description of argument x",
"type": "integer"
},
"y": {
"title": "Y",
"type": "string"
}
},
"required": [
"x"
],
"additionalProperties": false
}
Example 2
Task arguments:
from pydantic import BaseModel, Extra
from typing import Any, Dict, Optional, Sequence
import json

class TaskArguments(BaseModel, extra=Extra.forbid):
    input_paths: Sequence[str]
    output_path: str
    metadata: Dict[str, Any]
    image_extension: str
    image_glob_patterns: Optional[list[str]]
    allowed_channels: Sequence[Dict[str, Any]]
    num_levels: Optional[int]
    coarsening_xy: Optional[int]
    metadata_table: Optional[str]

print(json.dumps(TaskArguments.schema(), indent=2))
JSON schema:
{
"title": "TaskArguments",
"type": "object",
"properties": {
"input_paths": {
"title": "Input Paths",
"type": "array",
"items": {
"type": "string"
}
},
"output_path": {
"title": "Output Path",
"type": "string"
},
"metadata": {
"title": "Metadata",
"type": "object"
},
"image_extension": {
"title": "Image Extension",
"type": "string"
},
"image_glob_patterns": {
"title": "Image Glob Patterns",
"type": "array",
"items": {
"type": "string"
}
},
"allowed_channels": {
"title": "Allowed Channels",
"type": "array",
"items": {
"type": "object"
}
},
"num_levels": {
"title": "Num Levels",
"type": "integer"
},
"coarsening_xy": {
"title": "Coarsening Xy",
"type": "integer"
},
"metadata_table": {
"title": "Metadata Table",
"type": "string"
}
},
"required": [
"input_paths",
"output_path",
"metadata",
"image_extension",
"allowed_channels"
],
"additionalProperties": false
} |
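As a rough illustration of the conversion mentioned at the top of this comment, here is a sketch that turns the top-level properties of a pydantic-generated schema into the list-of-arguments format from the first comment. It is written in Python only for consistency with the examples here (the web client would use its own language), the function name `to_argument_list` is hypothetical, and nested items are deliberately not handled.

```python
from typing import Any, Dict, List


def to_argument_list(schema: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the top-level properties of a JSON Schema into argument entries."""
    required = set(schema.get("required", []))
    arguments = []
    for name, prop in schema.get("properties", {}).items():
        arguments.append(
            {
                "argument_name": name,
                # "type" may be missing, e.g. for properties that use "$ref"
                "argument_type": prop.get("type"),
                "argument_description": prop.get("description"),
                "default_value": prop.get("default"),
                "is_required": name in required,
            }
        )
    return arguments


# The Example 1 schema from above
example_schema = {
    "title": "TaskArguments",
    "type": "object",
    "properties": {
        "x": {
            "title": "X",
            "description": "This is the description of argument x",
            "type": "integer",
        },
        "y": {"title": "Y", "type": "string"},
    },
    "required": ["x"],
    "additionalProperties": False,
}
arguments = to_argument_list(example_schema)
```

Here `x` ends up with `is_required` set to `True` and `y` with `False`, which mirrors the `required` list of the pydantic output.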
Another example:
from pydantic import BaseModel
from pydantic import Extra
from typing import Optional
import json

class Channel(BaseModel):
    x: int
    y: Optional[int]

class TaskArguments(BaseModel, extra=Extra.forbid):
    channels: list[Channel]

print(json.dumps(TaskArguments.schema(), indent=2))
JSON schema:
{
"title": "TaskArguments",
"type": "object",
"properties": {
"channels": {
"title": "Channels",
"type": "array",
"items": {
"$ref": "#/definitions/Channel"
}
}
},
"required": [
"channels"
],
"additionalProperties": false,
"definitions": {
"Channel": {
"title": "Channel",
"type": "object",
"properties": {
"x": {
"title": "X",
"type": "integer"
},
"y": {
"title": "Y",
"type": "integer"
}
},
"required": [
"x"
]
}
}
} |
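A consumer of this schema has to follow the `$ref` pointer into `definitions` before it can render the `Channel` fields. A minimal sketch of that step, assuming only local `#/definitions/...` references and no recursive models (a self-referencing model would make this loop forever); the function names are hypothetical:

```python
from typing import Any, Dict


def resolve_refs(node: Any, definitions: Dict[str, Any]) -> Any:
    """Return a copy of `node` with local "#/definitions/..." refs inlined."""
    prefix = "#/definitions/"
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(prefix):
            # Replace the whole node by the referenced definition
            # (any sibling keys of "$ref" are dropped in this sketch).
            return resolve_refs(definitions[ref[len(prefix):]], definitions)
        return {key: resolve_refs(value, definitions) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, definitions) for item in node]
    return node


def inline_definitions(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Inline all definitions and drop the top-level "definitions" block."""
    definitions = schema.get("definitions", {})
    top_level = {key: value for key, value in schema.items() if key != "definitions"}
    return resolve_refs(top_level, definitions)


# Shortened version of the schema above
schema = {
    "title": "TaskArguments",
    "type": "object",
    "properties": {
        "channels": {
            "title": "Channels",
            "type": "array",
            "items": {"$ref": "#/definitions/Channel"},
        }
    },
    "required": ["channels"],
    "definitions": {
        "Channel": {
            "title": "Channel",
            "type": "object",
            "properties": {"x": {"title": "X", "type": "integer"}},
            "required": ["x"],
        }
    },
}
flat = inline_definitions(schema)
```

After this, `flat["properties"]["channels"]["items"]` holds the full `Channel` object schema instead of a reference.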
args JSON schema
Here are some first thoughts, with @rkpasia. A lot more will come up during implementation, and also we will open more specific issues.
Something a bit more general:
|
Here is the first example we should address (note: priority is obviously for scalars). Possible additional complexity would come from defining custom types, as in #82 (comment). Note:
Pydantic:
from pydantic import BaseModel
from pydantic import Extra
from pydantic import Field
from typing import Optional
import json

class TaskArguments(BaseModel, extra=Extra.forbid):
    i1: int
    i2: int = 1
    i3: int = Field(description="Description of i3")
    i4: int = Field(examples=["i4=8"])
    i5: Optional[int] = None
    f1: float
    f2: float = 0.5
    b1: bool
    b2: bool = Field(
        description="Description of b2",
        default=True,
        title="b2 argument",
    )
    b3: Optional[bool]
    a1: list
    a2: list[int]
    a3: list[list[int]] = Field(default=[[1, 2], [3, 4]])
    o1: dict
    o2: dict[str, int]
    o3: dict[str, list[int]]

print(json.dumps(TaskArguments.schema(), indent=2))
JSON:
{
"title": "TaskArguments",
"type": "object",
"properties": {
"i1": {
"title": "I1",
"type": "integer"
},
"i2": {
"title": "I2",
"default": 1,
"type": "integer"
},
"i3": {
"title": "I3",
"description": "Description of i3",
"type": "integer"
},
"i4": {
"title": "I4",
"examples": [
"i4=8"
],
"type": "integer"
},
"i5": {
"title": "I5",
"type": "integer"
},
"f1": {
"title": "F1",
"type": "number"
},
"f2": {
"title": "F2",
"default": 0.5,
"type": "number"
},
"b1": {
"title": "B1",
"type": "boolean"
},
"b2": {
"title": "b2 argument",
"description": "Description of b2",
"default": true,
"type": "boolean"
},
"b3": {
"title": "B3",
"type": "boolean"
},
"a1": {
"title": "A1",
"type": "array",
"items": {}
},
"a2": {
"title": "A2",
"type": "array",
"items": {
"type": "integer"
}
},
"a3": {
"title": "A3",
"default": [
[
1,
2
],
[
3,
4
]
],
"type": "array",
"items": {
"type": "array",
"items": {
"type": "integer"
}
}
},
"o1": {
"title": "O1",
"type": "object"
},
"o2": {
"title": "O2",
"type": "object",
"additionalProperties": {
"type": "integer"
}
},
"o3": {
"title": "O3",
"type": "object",
"additionalProperties": {
"type": "array",
"items": {
"type": "integer"
}
}
}
},
"required": [
"i1",
"i3",
"i4",
"f1",
"b1",
"a1",
"a2",
"o1",
"o2",
"o3"
],
"additionalProperties": false
} |
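Since scalars are the priority, a first pass over such a schema could map each scalar `type` to a form widget and mark everything else as not yet supported. The sketch below assumes hypothetical widget names and a hypothetical `describe_fields` helper; only the JSON Schema keywords (`type`, `title`, `default`, `description`, `required`) come from the output above.

```python
from typing import Any, Dict, List

# Hypothetical widget names; only the JSON Schema type names are standard.
SCALAR_WIDGETS = {
    "integer": "integer-input",
    "number": "float-input",
    "boolean": "checkbox",
    "string": "text-input",
}


def describe_fields(schema: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build one form-field description per top-level property."""
    required = set(schema.get("required", []))
    fields = []
    for name, prop in schema.get("properties", {}).items():
        json_type = prop.get("type")
        # Arrays, objects and anything unexpected are left for later.
        widget = "unsupported"
        if isinstance(json_type, str):
            widget = SCALAR_WIDGETS.get(json_type, "unsupported")
        fields.append(
            {
                "name": name,
                "label": prop.get("title", name),
                "widget": widget,
                "required": name in required,
                "default": prop.get("default"),
                "help": prop.get("description"),
            }
        )
    return fields


# A few properties taken from the schema above
schema = {
    "properties": {
        "i2": {"title": "I2", "default": 1, "type": "integer"},
        "b2": {
            "title": "b2 argument",
            "description": "Description of b2",
            "default": True,
            "type": "boolean",
        },
        "a2": {"title": "A2", "type": "array", "items": {"type": "integer"}},
    },
    "required": ["a2"],
}
fields = describe_fields(schema)
```

With this input, `i2` and `b2` get scalar widgets with their defaults pre-filled, while `a2` is flagged as `"unsupported"` until lists are tackled.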
The only use case where this rule would not be appropriate is (as far as I can tell) the following: I am developing a new task where I do not use pydantic for argument validation (*). I still want an argument schema, so I write it from scratch (since I have no pydantic model to export). I include arguments A, B and C, but later in development I want to add another argument D. In principle I would have to modify the schema and call the task-edit endpoint (go to the task page, find my own task, click "edit", send the new schema), but maybe I'd rather just add the new argument from within the WorkflowTask editor, without changing the schema. To be honest, this seems enough of an edge case that we should not support it, at least for the moment. (*) |
Agreed.
We can start that way. Would make sense to focus on the required arguments and eventually have a way to define what are "advanced" arguments. For example, the cellpose task has many potential arguments that users will very rarely change, but they always need to set the level.
I'd prefer if optional arguments just have some default and None is a valid default for them. Adding a second complexity level of whether or not each argument is active sounds cumbersome.
I had some reservations about this, but your arguments are convincing. At least for the top level.
I think we should also move the defaults into the pydantic schemas, and optional parameters should have a default. If no default is provided, it defaults to None => empty box. None then needs to be a valid value for optional arguments, but not for required arguments. And all defaults are shown.
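A minimal sketch of that rule, assuming the schema is the plain dict that pydantic exports: every argument is pre-filled with its schema default (or None when there is none), and None is only rejected for arguments listed under `required`. Both function names are hypothetical.

```python
from typing import Any, Dict, List


def initial_args(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Pre-fill every argument with its schema default, or None (empty box)."""
    return {
        name: prop.get("default")
        for name, prop in schema.get("properties", {}).items()
    }


def missing_required(schema: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Names of required arguments that are absent or still None."""
    return [name for name in schema.get("required", []) if args.get(name) is None]


# Three of the integer arguments from the example schema
schema = {
    "properties": {
        "i1": {"title": "I1", "type": "integer"},
        "i2": {"title": "I2", "default": 1, "type": "integer"},
        "i5": {"title": "I5", "type": "integer"},
    },
    "required": ["i1"],
}

args = initial_args(schema)
print(args)                            # {'i1': None, 'i2': 1, 'i5': None}
print(missing_required(schema, args))  # ['i1']

args["i1"] = 3
print(missing_required(schema, args))  # []
```

The optional `i5` may stay None, while the required `i1` is reported until the user sets it.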
Agreed. Not sure about the best user-interface for this. It could be a description that is always shown. But an info button is a good start. Additional complexities to worry about later: |
Many changes to task registration do sound cumbersome though, so I can imagine we'd eventually want to support this. But it is clearly not a priority as long as there are the two ways of defining parameters: through a schema, or adding them fresh. When we tackle 7 (adding lists, object content), it automatically means allowing the addition of some arguments in the schema example (though limited), so that seems to me to be the point for generalizing these two approaches. |
Reference about pydantic v1 vs v2: |
This is obviously complete, at least in its first version. Closing. |
An example of a WorkflowTask.args JSON schema could be:
Where Channel is an "object of string->boolean key/value pairs, with keys "A", "B" and "C"".
Each argument comes with required=True/False (even more complex: what about WorkflowTask.task.default_args?)