This repository has been archived by the owner on Jul 3, 2023. It is now read-only.

Fixes ray workflow adapter to work with Ray 2.0 #189

Merged 2 commits on Aug 29, 2022

Conversation

elijahbenizzy
Collaborator

@elijahbenizzy elijahbenizzy commented Aug 28, 2022

New ray workflows API integration. This breaks when running locally (for me) with a very cryptic RecursionError. I spent some time digging in, and I think it's something odd with my env?

Changes

Testing

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist

Python - local testing

  • python 3.6
  • python 3.7

@elijahbenizzy elijahbenizzy force-pushed the ray-fixes branch 3 times, most recently from 40ed6ba to 479da70 on August 28, 2022 23:51
@elijahbenizzy elijahbenizzy marked this pull request as ready for review August 28, 2022 23:55
@elijahbenizzy elijahbenizzy requested a review from skrawcz August 29, 2022 00:00
@elijahbenizzy
Collaborator Author

Local error:

            # TODO(ujvl): Consider how to allow user to retrieve the ready objects.
            values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
            for i, value in enumerate(values):
                if isinstance(value, RayError):
                    if isinstance(value, ray.exceptions.ObjectLostError):
                        worker.core_worker.dump_object_store_memory_usage()
                    if isinstance(value, RayTaskError):
>                       raise value.as_instanceof_cause()
E                       ray.exceptions.RayTaskError(WorkflowExecutionError): ray::WorkflowManagementActor.execute_workflow() (pid=90345, ip=127.0.0.1, repr=<ray.workflow.workflow_access.WorkflowManagementActor object at 0x11bf64670>)
E                       ray.exceptions.RayTaskError(RecursionError): ray::_workflow_task_executor_remote() (pid=90344, ip=127.0.0.1)
E                         File "/Users/elijahbenizzy/.pyenv/versions/hamilton-fresh/lib/python3.9/site-packages/ray/workflow/task_executor.py", line 115, in _workflow_task_executor_remote
E                           return _workflow_task_executor(
E                         File "/Users/elijahbenizzy/.pyenv/versions/hamilton-fresh/lib/python3.9/site-packages/ray/workflow/task_executor.py", line 84, in _workflow_task_executor
E                           raise e
E                         File "/Users/elijahbenizzy/.pyenv/versions/hamilton-fresh/lib/python3.9/site-packages/ray/workflow/task_executor.py", line 79, in _workflow_task_executor
E                           output = func(*args, **kwargs)
E                         File "/Users/elijahbenizzy/dev/dagworks/hamilton/hamilton/experimental/h_ray.py", line 151, in <lambda>
E                           def check_node_type_equivalence(node_type: typing.Type, input_type: typing.Type) -> bool:
E                         File "/Users/elijahbenizzy/dev/dagworks/hamilton/hamilton/experimental/h_ray.py", line 151, in <lambda>
E                           def check_node_type_equivalence(node_type: typing.Type, input_type: typing.Type) -> bool:
E                         File "/Users/elijahbenizzy/dev/dagworks/hamilton/hamilton/experimental/h_ray.py", line 151, in <lambda>
E                           def check_node_type_equivalence(node_type: typing.Type, input_type: typing.Type) -> bool:
E                         [Previous line repeated 991 more times]
E                       RecursionError: maximum recursion depth exceeded
E
E                       The above exception was the direct cause of the following exception:
E
E                       ray::WorkflowManagementActor.execute_workflow() (pid=90345, ip=127.0.0.1, repr=<ray.workflow.workflow_access.WorkflowManagementActor object at 0x11bf64670>)
E                         File "/Users/elijahbenizzy/.pyenv/versions/3.9.10/lib/python3.9/concurrent/futures/_base.py", line 439, in result
E                           return self.__get_result()
E                         File "/Users/elijahbenizzy/.pyenv/versions/3.9.10/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
E                           raise self._exception
E                         File "/Users/elijahbenizzy/.pyenv/versions/hamilton-fresh/lib/python3.9/site-packages/ray/workflow/workflow_access.py", line 209, in execute_workflow
E                           await executor.run_until_complete(job_id, context, wf_store)
E                         File "/Users/elijahbenizzy/.pyenv/versions/hamilton-fresh/lib/python3.9/site-packages/ray/workflow/workflow_executor.py", line 109, in run_until_complete
E                           await asyncio.gather(
E                         File "/Users/elijahbenizzy/.pyenv/versions/hamilton-fresh/lib/python3.9/site-packages/ray/workflow/workflow_executor.py", line 356, in _handle_ready_task
E                           raise err
E                       ray.workflow.exceptions.WorkflowExecutionError: Workflow[id=test-test_smoke_screen_module] failed during execution.
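
The repeated `<lambda>` frame in the traceback above has the shape Python produces when a lambda ends up calling itself. The PR does not identify a root cause, so the sketch below is only a hypothetical illustration of one common way to get that pattern: a lambda captures a name that is later rebound to the lambda itself. The function names are made up and are not taken from `h_ray.py`. Separately, tracebacks read source text from disk when they are displayed, so a `<lambda>` frame that shows a `def` line may mean the file on disk differs from the code that actually ran, which would be consistent with an environment issue.

```python
def make_wrapper_buggy():
    """Hypothetical example: the returned wrapper ends up calling itself."""
    func = lambda x: x + 1
    # The lambda below looks up `func` when it is *called*, not when it is
    # defined. By then `func` refers to the wrapper itself, so it recurses
    # until Python raises RecursionError.
    func = lambda x: func(x)
    return func


def make_wrapper_fixed():
    """Bind the inner callable at definition time via a default argument."""
    func = lambda x: x + 1
    func = lambda x, _inner=func: _inner(x)
    return func


if __name__ == "__main__":
    print(make_wrapper_fixed()(1))
    try:
        make_wrapper_buggy()(1)
    except RecursionError as err:
        print(f"RecursionError: {err}")
```

Binding through a default argument (or `functools.partial`) freezes the reference when the wrapper is created, which is why the second variant terminates.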

@elijahbenizzy elijahbenizzy changed the title from "Fixes ray integration tests WIP" to "Fixes ray integration tests" Aug 29, 2022
@elijahbenizzy elijahbenizzy changed the title from "Fixes ray integration tests" to "Fixes ray workflow adapter to work with Ray 2.0" Aug 29, 2022
Collaborator

@skrawcz skrawcz left a comment

Can you put more info into the commit message, please? E.g., Ray 2.0 came out, and since Ray workflows are in alpha the API broke, and now they use bind ...

hamilton/experimental/h_ray.py (outdated, resolved)
Ray workflows was in an experimental state. With the release of Ray
2.0.0 the workflow API changed. This adapts to work with the new API.
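
For readers unfamiliar with the change this commit message refers to, here is a minimal sketch of the bind-style workflow API, assuming Ray 2.0.x with workflow support installed. The functions `add` and `double` and the workflow id are made up for illustration; this is not Hamilton's adapter code. Ray is imported lazily so that the plain functions can be exercised without it.

```python
def add(a: int, b: int) -> int:
    """Plain Python function used as a workflow task (illustrative only)."""
    return a + b


def double(x: int) -> int:
    """Plain Python function used as a workflow task (illustrative only)."""
    return 2 * x


def run_as_ray_workflow(workflow_id: str = "example-workflow") -> int:
    """Build a two-node DAG with bind() and run it as a Ray workflow.

    Assumes Ray 2.0.x. Depending on the setup, workflow storage may need
    to be configured before this call succeeds.
    """
    import ray
    from ray import workflow

    add_remote = ray.remote(add)
    double_remote = ray.remote(double)
    # bind() only describes the DAG; nothing executes until workflow.run().
    dag = double_remote.bind(add_remote.bind(1, 2))
    return workflow.run(dag, workflow_id=workflow_id)
```

Keeping the task bodies as ordinary functions makes them testable on their own, while the DAG construction stays in one place that can be adjusted if the workflow API changes again.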
@elijahbenizzy elijahbenizzy merged commit 5a1d698 into main Aug 29, 2022
@elijahbenizzy elijahbenizzy deleted the ray-fixes branch August 29, 2022 04:03
2 participants