[Feature][workflow] Namespace for workflow #18818

Open
1 of 2 tasks
fishbone opened this issue Sep 22, 2021 · 4 comments
Assignees
Labels
beta Beta release feture enhancement Request for new feature and/or capability P2 Important issue, but not time-critical usability workflow

Comments

@fishbone
Contributor

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

Ray already has a namespace concept, which should be useful for workflows as well.

The recommended API is:

workflow.init(namespace=name_space)
Actor.get_or_create(namespace=name_space)

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@fishbone fishbone added enhancement Request for new feature and/or capability workflow-usability workflow labels Sep 22, 2021
@fishbone fishbone added this to the Workflows Pre-Beta milestone Sep 22, 2021
@fishbone fishbone self-assigned this Sep 22, 2021
@fishbone fishbone added the P1 Issue that should be fixed within a few weeks label Sep 22, 2021
@fishbone fishbone assigned wuisawesome and unassigned fishbone Sep 29, 2021
@lchu-ibm
Contributor

lchu-ibm commented Oct 7, 2021

@iycheng can you share a few more details on the expected design? E.g., should the namespace also act as a physical namespace in the checkpoint storage structure, or should it only affect the "names"?

For the first choice, I mean we would go from:
workflow_data/workflow_id/steps/step_name/...
to
workflow_data/namespace_id/workflow_id/steps/step_name/...
So we could have the same workflow_id in different namespaces:
workflow_data/namespace=aaaaaaa/workflow_id/ vs workflow_data/namespace=bbbbbb/workflow_id/

For the second choice, we would simply prepend the namespace to the "names":
workflow_data/aaaaaaa-workflow_id/ vs workflow_data/bbbbbb-workflow_id/.
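
The two layouts described above can be compared with a small sketch. This is only an illustration and not code from the workflow storage layer: the helper names `nested_path` and `prefixed_path` are hypothetical, and only the `workflow_data` root and the path shapes come from the comment.

```python
from pathlib import PurePosixPath

# Root directory name taken from the paths in the comment above.
ROOT = PurePosixPath("workflow_data")


def nested_path(namespace: str, workflow_id: str) -> PurePosixPath:
    """First choice: the namespace is its own directory level."""
    return ROOT / namespace / workflow_id


def prefixed_path(namespace: str, workflow_id: str) -> PurePosixPath:
    """Second choice: the namespace is folded into the directory name."""
    return ROOT / f"{namespace}-{workflow_id}"


if __name__ == "__main__":
    print(nested_path("aaaaaaa", "workflow_id"))
    print(prefixed_path("aaaaaaa", "workflow_id"))
    # With a "-" separator, two different (namespace, workflow_id) pairs
    # can map to the same directory under the second choice.
    print(prefixed_path("a-b", "c") == prefixed_path("a", "b-c"))
```

One thing the sketch makes visible: with the prefixed layout, a separator that can also appear in a namespace or workflow id makes the mapping ambiguous, whereas the nested layout keeps the two components apart.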

@wuisawesome
Contributor

Probably better to do the first option. I think we want to be able to do operations like list or delete by namespace, so having the extra structure would let us efficiently look things up by namespace.
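
As a rough illustration of why the extra directory level helps, here is a minimal sketch of list and delete by namespace on a local filesystem. It assumes the nested layout `workflow_data/<namespace>/<workflow_id>/`; the functions `list_workflows` and `delete_namespace` are hypothetical names, not part of the workflow API.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def list_workflows(root: Path, namespace: str) -> List[str]:
    """List workflow ids in one namespace by reading a single directory."""
    ns_dir = root / namespace
    if not ns_dir.is_dir():
        return []
    return sorted(p.name for p in ns_dir.iterdir() if p.is_dir())


def delete_namespace(root: Path, namespace: str) -> None:
    """Delete every workflow in a namespace by removing its directory."""
    shutil.rmtree(root / namespace, ignore_errors=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "workflow_data"
        for ns, wf in [("aaaaaaa", "wf1"), ("aaaaaaa", "wf2"), ("bbbbbb", "wf1")]:
            (root / ns / wf / "steps").mkdir(parents=True)
        print(list_workflows(root, "aaaaaaa"))
        delete_namespace(root, "bbbbbb")
        print(list_workflows(root, "bbbbbb"))
```

With the prefixed layout, the same operations would need to scan every entry under the root and match on the name prefix.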

@lchu-ibm
Contributor

lchu-ibm commented Nov 18, 2021

@wuisawesome @iycheng

I am picking up this lost thread today. One question:
Is it better for the namespace to have "session" scope or not? What I mean is:

  1. If we want a session-scoped namespace, then after we call workflow.init(namespace=name_space), all workflows in the current session should use this namespace. So workflow storage will always use this namespace as the prefix for checkpoint reading and writing, and no end-user API needs to specify a namespace. If the user needs to put/get from a different namespace, we can provide a switch-namespace function to change the session-level namespace.
workflow.init(namespace=A)
workflow.run(workflow_id)  # no need to give namespace here
Actor.get_or_create(actor_id) # no need to give namespace here
workflow.get_output(workflow_id)  # no need to give namespace here
workflow.get_metadata(workflow_id)  # no need to give namespace here

workflow.set_namespace(namespace=B)
Actor.get_or_create(actor_id) # get actor from another namespace
  2. For a non-session namespace, we add a namespace parameter to all related APIs, so the namespace is controlled at a lower (finer) level.

workflow.init(namespace=A)   # this is actually not quite useful here
workflow.run(workflow_id, namespace=A)
Actor.get_or_create(actor_id, namespace=A)
Actor.get_or_create(actor_id_2, namespace=B)
Actor.get_or_create(actor_id_3, namespace=C)
workflow.get_output(workflow_id, namespace=A)
workflow.get_metadata(workflow_id, namespace=A)

Option 1 is easier to use and backward compatible. It also hides the notion of namespace from users who don't need it.

Option 2 gives more flexibility, and we can do cross-namespace operations, e.g. maybe I can use a virtual actor from namespace A to run my workflow in namespace B. But it involves adding a namespace parameter to a LOT of APIs, as far as I can imagine: from "static" ones like get_actor, get_output, get_metadata to internal ones like those in execution.py, recovery.py, etc.

There is also a way #3, which is a hybrid: we still make the namespace session-scoped, but also add the parameter to all related individual functions with a default value of None. If the parameter is not given, the function picks up the session's namespace; otherwise it temporarily overrides the session's default namespace. We can use a context manager similar to workflow_step_context.

I haven't given this enough thought yet, but would like to know your thoughts here.
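
A minimal sketch of how the hybrid approach could look, assuming the session namespace is kept in a `contextvars.ContextVar`. Everything here (`init`, `current_namespace`, `namespace_override`, and the toy `get_output`) is a hypothetical stand-in written for illustration; it is not the actual workflow implementation and does not reproduce workflow_step_context.

```python
import contextlib
import contextvars
from typing import Iterator, Optional

# Hypothetical holder for the session-level namespace.
_session_namespace = contextvars.ContextVar("workflow_namespace", default=None)


def init(namespace: Optional[str] = None) -> None:
    """Set the session-level default namespace."""
    _session_namespace.set(namespace)


def current_namespace() -> Optional[str]:
    return _session_namespace.get()


@contextlib.contextmanager
def namespace_override(namespace: Optional[str]) -> Iterator[None]:
    """Temporarily replace the session namespace; None means no override."""
    if namespace is None:
        yield
        return
    token = _session_namespace.set(namespace)
    try:
        yield
    finally:
        # Restore whatever namespace was active before the override.
        _session_namespace.reset(token)


def get_output(workflow_id: str, namespace: Optional[str] = None) -> str:
    """Toy API: returns the storage key that would be read."""
    with namespace_override(namespace):
        return f"{current_namespace()}/{workflow_id}"


if __name__ == "__main__":
    init(namespace="A")
    print(get_output("wf"))                 # uses the session namespace
    print(get_output("wf", namespace="B"))  # per-call override
    print(current_namespace())              # session namespace is unchanged
```

The point of the context manager is that internal code can keep calling `current_namespace()` without every internal function needing its own namespace parameter.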

@fishbone
Contributor Author

@lchu-ibm I think way #3 is better

workflow.init(namespace=A)   # sets the default namespace for the session
workflow.run(workflow_id) # run in namespace=A
workflow.run(workflow_id, namespace=B) # run in namespace=B

Actor.get_or_create(actor_id) # fetch actor in A
Actor.get_or_create(actor_id_2, namespace=B) # fetch actor in B

This is more aligned with Ray's API:

get_actor(name: str, namespace: Union[str, NoneType] = None) -> 'ray.actor.ActorHandle'
    Get a handle to a named actor.

Basically, namespace defaults to None. If it is not given, the call uses the namespace from the context; otherwise, it uses the given one.
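
The fallback rule described here fits in a few lines. This is a sketch only: `resolve_namespace` is a hypothetical helper, and the context namespace is passed in explicitly to keep the example self-contained.

```python
from typing import Optional


def resolve_namespace(
    given: Optional[str], context_namespace: Optional[str]
) -> Optional[str]:
    """Return the explicitly given namespace, else the one from the context."""
    # Compare against None rather than relying on truthiness, so that any
    # explicitly passed string takes precedence over the context value.
    return context_namespace if given is None else given


if __name__ == "__main__":
    print(resolve_namespace(None, "A"))  # falls back to the context namespace
    print(resolve_namespace("B", "A"))   # the explicit argument wins
```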

@fishbone fishbone added P2 Important issue, but not time-critical and removed P1 Issue that should be fixed within a few weeks labels Apr 19, 2022