
[Autoscaler][Placement Group] Skip placed bundle when requesting resource #48924

Open
wants to merge 4 commits into base: master

Conversation

mimiliaogo (Contributor)

@mimiliaogo mimiliaogo commented Nov 25, 2024

Why are these changes needed?

Before this PR, when a node hosting part of a placement group (PG) goes down, the autoscaler attempts to reschedule the entire PG (all bundles), which leads to overprovisioning. Details: #40212

This PR fixes that by skipping already placed bundles (i.e., bundles with an associated node_id) when requesting resources in the autoscaler.
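
To make the change concrete, here is a minimal sketch of the idea; the helper name pending_bundle_shapes is hypothetical and the real autoscaler code differs in detail, but only bundles without an assigned node should contribute to the resource demand:

# Hypothetical sketch, not the actual Ray autoscaler code.
def pending_bundle_shapes(placement_group):
    """Return resource shapes only for bundles that still need a node."""
    shapes = []
    for bundle in placement_group.bundles:
        # A placed bundle carries the ID of the node hosting it; an unplaced
        # bundle's node_id is empty, so only unplaced bundles add demand.
        if bundle.node_id:
            continue
        shapes.append(dict(bundle.unit_resources))
    return shapes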

Before: every bundle gets rescheduled

[screenshot]

After: only one node is scaled up

[screenshot]

Related issue number

Closes #40212

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@mimiliaogo mimiliaogo requested review from hongchaodeng and a team as code owners November 25, 2024 18:57
@jcotant1 jcotant1 added the 'core' label (Issues that should be addressed in Ray Core) on Nov 25, 2024
@kevin85421 kevin85421 self-assigned this Nov 26, 2024
@kevin85421 kevin85421 (Member) left a comment:

add a test

shapes = [dict(bundle.unit_resources) for bundle in placement_group.bundles]
# Skip **placed** bundle (which has node id associated with it).
for bundle in placement_group.bundles:
    if bundle.node_id:
kevin85421 (Member):

Is it an empty string or None? If it is None, use is instead.

Suggested change:
- if bundle.node_id:
+ if bundle.node_id is not None:

mimiliaogo (Contributor, Author):

Fixed in 7a44207; it is an empty byte string, not None.
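
A small standalone illustration of why the truthiness check is the right one here (assuming node_id is a protobuf bytes field that defaults to an empty byte string for unplaced bundles):

# Illustration only; not taken from the Ray code base.
unplaced_node_id = b""        # unplaced bundle: empty byte string (proto default)
placed_node_id = b"\xab\xcd"  # placed bundle: non-empty binary node ID

assert not unplaced_node_id          # b"" is falsy, so `if bundle.node_id:` keeps unplaced bundles
assert placed_node_id                # non-empty bytes are truthy, so placed bundles are skipped
assert unplaced_node_id is not None  # `is not None` would wrongly treat unplaced bundles as placed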

Signed-off-by: Mimi Liao <mimiliao2000@gmail.com>
Signed-off-by: Mimi Liao <mimiliao2000@gmail.com>
break

# TODO(mimi): kill_raylet won't trigger reschedule in autoscaler v1
# kill_raylet(node["NodeManagerAddress"], node["NodeManagerPort"])
mimiliaogo (Contributor, Author):

I found that when using kill_raylet, rescheduling isn't triggered in autoscaler v1, even though the cluster status shows the node is killed. In that case, v1 fails and v2 passes.
Both v1 and v2 pass when using kill_node.

Signed-off-by: Mimi Liao <mimiliao2000@gmail.com>

from ray.autoscaler.v2.sdk import get_cluster_status

def verify_nodes(active=3, idle=1):
kevin85421 (Member):

Suggested change:
- def verify_nodes(active=3, idle=1):
+ def verify_nodes(active, idle):
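
For context, a rough sketch of what such a helper could look like on top of get_cluster_status; the active_nodes and idle_nodes field names are assumptions about ClusterStatus, not taken from this PR:

import ray
from ray.autoscaler.v2.sdk import get_cluster_status


def verify_nodes(active, idle):
    # Assumption: the returned ClusterStatus exposes `active_nodes` and
    # `idle_nodes` lists of node records (field names may differ by version).
    status = get_cluster_status(ray.get_runtime_context().gcs_address)
    return len(status.active_nodes) == active and len(status.idle_nodes) == idle


# The test then polls it, e.g. wait_for_condition(lambda: verify_nodes(3, 1)).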


def kill_node(node_id):
    # kill -9
    import subprocess
kevin85421 (Member):

Move import to top-level. Typically, Ray uses deferred import only to avoid circular dependencies.

wait_for_condition(lambda: verify_nodes(3, 1))

# Kill a node
def kill_raylet(ip, port, graceful=True):
kevin85421 (Member):

Remove this function because it is not used for now.

# Wait for the node to be removed
wait_for_condition(lambda: verify_nodes(2, 1), 20)

# Check that the placement group is rescheduled
kevin85421 (Member):

Where is the logic that checks the placement group is rescheduled?

mimiliaogo (Contributor, Author):

wait_for_condition(lambda: verify_nodes(3, 1)) checks the autoscaler rescheduling. However, this comment is redundant; I've removed it.


ray.get(pg.ready())

from ray.autoscaler.v2.sdk import get_cluster_status
kevin85421 (Member):

Do we need to import this? It seems to have already been imported at the top level.

mimiliaogo (Contributor, Author):

The suggestions above are fixed in 415dcf8.

Signed-off-by: Mimi Liao <mimiliao2000@gmail.com>
@kevin85421 kevin85421 added the 'go' label (add ONLY when ready to merge, run all tests) on Dec 17, 2024
@kevin85421 kevin85421 (Member):

CI fails. Can you fix the CI errors?

Labels
core: Issues that should be addressed in Ray Core
go: add ONLY when ready to merge, run all tests
Development

Successfully merging this pull request may close these issues.

[core][autoscaler][v1] Autoscaler overprovisions nodes when strict placement group is rescheduling
3 participants