[core][experimental] Raise an exception if a leaf node is found during compilation #47757

Open
wants to merge 3 commits into
base: master

kevin85421
Member

@kevin85421 kevin85421 commented Sep 20, 2024

Why are these changes needed?

Leaf nodes are nodes that are not output nodes and have no downstream nodes. If a leaf node raises an exception, it will not be propagated to the driver. Therefore, this PR raises an exception if a leaf node is found during compilation.
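
To make the definition concrete, here is a minimal sketch of a compile-time leaf-node check on a toy graph. `ToyNode`, `find_leaf_nodes`, and `check_no_leaf_nodes` are hypothetical names made up for illustration; they are not Ray's classes and not the code in this PR.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a DAG node; not Ray's DAGNode class."""
    name: str
    downstream: list = field(default_factory=list)


def find_leaf_nodes(nodes, output_nodes):
    """Return nodes that have no downstream nodes and are not output nodes."""
    output_names = {n.name for n in output_nodes}
    return [
        n for n in nodes
        if not n.downstream and n.name not in output_names
    ]


def check_no_leaf_nodes(nodes, output_nodes):
    """Raise during the (toy) compilation step if any leaf node exists."""
    leaves = find_leaf_nodes(nodes, output_nodes)
    if leaves:
        names = ", ".join(n.name for n in leaves)
        raise ValueError(f"Found {len(leaves)} leaf node(s): {names}")


# input -> a -> b (output); input -> c (leaf: no consumer, not an output)
b = ToyNode("b")
c = ToyNode("c")
a = ToyNode("a", downstream=[b])
inp = ToyNode("input", downstream=[a, c])

leaf_names = [n.name for n in find_leaf_nodes([inp, a, b, c], output_nodes=[b])]
```

In this toy graph only `c` qualifies: `b` also has no downstream nodes, but it is an output node, so the driver would still observe its result or its exception.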

Another solution: implicitly add leaf node to MultiOutputNode

Currently, execute can return multiple CompiledDAGRefs. The UX we want to provide is to implicitly add leaf nodes to the MultiOutputNode without returning references to them. For example, suppose a MultiOutputNode contains 3 DAG nodes (2 normal DAG nodes + 1 leaf node):

x, y = compiled_dag.execute(input_vals) # We don't return the ref for the leaf node.

However, the ref for the leaf node would be garbage-collected inside execute, and CompiledDAGRef's __del__ calls get if get was never called. That would make execute a synchronous rather than an asynchronous operation, which is not acceptable.
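
The blocking behavior described above can be reproduced with a small toy model. `ToyRef` and `toy_execute` are hypothetical stand-ins, not Ray's `CompiledDAGRef` or `execute`, and the sketch assumes CPython, where an object is destroyed as soon as its last reference goes away.

```python
import time


class ToyRef:
    """Hypothetical stand-in for a result reference; not Ray's CompiledDAGRef."""

    def __init__(self, name, log):
        self.name = name
        self._log = log
        self._got = False

    def get(self):
        # Stands in for a blocking wait on the result.
        time.sleep(0.01)
        self._got = True
        self._log.append(f"get({self.name})")
        return self.name

    def __del__(self):
        # Mirrors the behavior described above: fetch on destruction
        # if the caller never called get.
        if not self._got:
            self.get()


def toy_execute(log):
    x, y, leaf = ToyRef("x", log), ToyRef("y", log), ToyRef("leaf", log)
    # Only x and y are returned. `leaf` loses its last reference when this
    # function returns, so its __del__ runs and performs the blocking get.
    return x, y


log = []
x, y = toy_execute(log)
```

By the time `toy_execute` has returned, `log` already contains the leaf's `get` call, even though the caller never asked for any result. That is the sense in which hiding the leaf ref would turn an asynchronous call into a synchronous one.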

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
@kevin85421 kevin85421 changed the title [core][experimental] Propagate leaf node errors to users [WIP][core][experimental] Propagate leaf node errors to users Sep 20, 2024
Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
@kevin85421 kevin85421 changed the title [WIP][core][experimental] Propagate leaf node errors to users [core][experimental] Propagate leaf node errors to users Sep 25, 2024
@kevin85421 kevin85421 changed the title [core][experimental] Propagate leaf node errors to users [core][experimental] Raise an exception if a leaf node is found during compilation Sep 25, 2024
@kevin85421 kevin85421 marked this pull request as ready for review September 25, 2024 20:22
@rkooo567
Contributor

Another solution: implicitly add leaf node to MultiOutputNode

Can you create an issue for this? Also, can you share the error message with me?

@rkooo567 rkooo567 left a comment

nit comment for improving error message further.

"Compiled DAG doesn't support leaf nodes that don't have "
"downstream nodes and are not output nodes. There are "
f"{len(leaf_nodes)} leaf nodes in the DAG. Please add them to "
f"the MultiOutputNode. These nodes are: {leaf_nodes}"

Can you improve the error message to show how to solve this error step by step? For example, assuming a leaf node is w.f.bind(), it could say something like "add the output of w.f.bind() to MultiOutputNode".

I recommend triggering the error yourself and then trying to fix it using only this error message. I don't think the fix is obvious if you assume you are not a developer.
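
One possible shape for such a step-by-step message is sketched below. `format_leaf_node_error` is a hypothetical helper, and it assumes each leaf node can be rendered as a short string such as `w.f.bind()`; the actual node representation in the PR may differ.

```python
def format_leaf_node_error(leaf_node_names):
    """Build a step-by-step message from leaf node names (plain strings here)."""
    lines = [
        f"Compiled DAG found {len(leaf_node_names)} leaf node(s): nodes with "
        "no downstream nodes that are not output nodes.",
        "To fix this:",
    ]
    # One numbered, actionable step per offending node.
    for i, name in enumerate(leaf_node_names, start=1):
        lines.append(f"  {i}. Add the output of {name} to the MultiOutputNode.")
    return "\n".join(lines)


message = format_leaf_node_error(["w.f.bind()"])
```

Listing one numbered action per node keeps the message usable for someone who does not know what a leaf node is, which seems to be the concern raised in this review.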

3 participants