Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core][experimental] Correct num_input_consumers for CachedChannel #47489

Merged
merged 4 commits into from
Sep 5, 2024

Conversation

kevin85421
Copy link
Member

@kevin85421 kevin85421 commented Sep 4, 2024

Why are these changes needed?

Problem statement

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

def test_actor_method_bind_diff_input_attr_1(ray_start_regular):
    actor = Actor.remote(0)
    with InputNode() as inp:
        # Two class methods are bound to two different input
        # attribute nodes.
        output1 = actor.inc.bind(inp[0])
        output2 = actor.inc.bind(inp[1])
        dag = MultiOutputNode([output1, output2])
    compiled_dag = dag.experimental_compile()
    ref = compiled_dag.execute(0, 1)
    assert ray.get(ref) == [0, 1]

    ref = compiled_dag.execute(1, 2)
    assert ray.get(ref) == [2, 4]

    compiled_dag.teardown()
image

Solution

If a task is bound to either an InputNode or an InputAttributeNode, increment num_input_consumers by 1. Therefore, num_input_consumers will be 2, and a CachedChannel will be created.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
@kevin85421 kevin85421 marked this pull request as ready for review September 4, 2024 23:05
Signed-off-by: kaihsun <kaihsun@anyscale.com>
Copy link
Contributor

@ruisearch42 ruisearch42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

Need to change title?

Signed-off-by: kaihsun <kaihsun@anyscale.com>
Signed-off-by: kaihsun <kaihsun@anyscale.com>
@kevin85421 kevin85421 changed the title [core][experimental] Support reading different InputAttributeNodes in different tasks within an actor [core][experimental] Correct num_input_consumers for CachedChannel Sep 5, 2024
@kevin85421 kevin85421 changed the title [core][experimental] Correct num_input_consumers for CachedChannel [core][experimental] Correct num_input_consumers for CachedChannel Sep 5, 2024
@kevin85421
Copy link
Member Author

Need to change title?

How about "Correct num_input_consumers for CachedChannel"?

@rkooo567
Copy link
Contributor

rkooo567 commented Sep 5, 2024

ready to be merged @kevin85421 ?

Copy link
Contributor

@ruisearch42 ruisearch42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM when tests pass!

@kevin85421
Copy link
Member Author

ready to be merged @kevin85421 ?

It will be ready to be merged after tests pass.

@rkooo567 rkooo567 enabled auto-merge (squash) September 5, 2024 02:03
@github-actions github-actions bot added the go add ONLY when ready to merge, run all tests label Sep 5, 2024
@rkooo567
Copy link
Contributor

rkooo567 commented Sep 5, 2024

premerge failures

@kevin85421
Copy link
Member Author

Retry the unrelated RLlib failures.

@rkooo567 rkooo567 merged commit 03a387c into ray-project:master Sep 5, 2024
7 checks passed
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
…ay-project#47489)

Without this PR, the num_input_consumers would be 1 because both inp[0] and inp[1] are only referred to in one task on the actor, so CachedChannel will not be created. The read will eventually time out because the mutable object is being read by the same actor twice.

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants