Spread replicas with custom resources in torch tune serve release test #46093

Merged 1 commit on Jun 20, 2024

@zcin zcin commented Jun 17, 2024

For the Golden Notebook Torch Tune Serve release test.
Use custom resources to make sure one replica gets scheduled for each of the two nodes in the cluster.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>

@zcin zcin marked this pull request as ready for review June 17, 2024 21:44
@zcin zcin requested review from edoakes and aslonnie June 17, 2024 21:45
@zcin zcin added the "go" label (add ONLY when ready to merge, run all tests) Jun 17, 2024
num_replicas=2,
ray_actor_options={"num_gpus": 1, "resources": {"worker": 1}}
if use_gpu
else {},
Contributor

This won't spread the replicas across nodes if no GPU is used. Is that intentional?

Contributor Author

Yeah, I think no_gpu is used for the smoke test. There is only one worker node type in this test, so no GPU should mean all replicas get started on the head node.

Contributor

ok
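
To make the exchange above concrete, here is a minimal sketch of why requesting a custom resource can force one replica per node. It is a toy first-fit placement model, not Ray's scheduler, and it does not import Ray. The node names, the 4-GPU capacity, and the assumption that each node advertises exactly one unit of the `worker` resource are hypothetical; only the `ray_actor_options` conditional comes from the diff.

```python
from typing import Any, Dict, List


def replica_actor_options(use_gpu: bool) -> Dict[str, Any]:
    """Mirror the conditional from the diff under review."""
    if use_gpu:
        return {"num_gpus": 1, "resources": {"worker": 1}}
    return {}


def flatten_demand(options: Dict[str, Any]) -> Dict[str, float]:
    """Collapse actor options into one resource-demand mapping."""
    demand = dict(options.get("resources", {}))
    if "num_gpus" in options:
        demand["GPU"] = options["num_gpus"]
    return demand


def place_replicas(
    nodes: Dict[str, Dict[str, float]],
    demand: Dict[str, float],
    num_replicas: int,
) -> List[str]:
    """Toy first-fit placement (an illustration, NOT Ray's scheduling policy)."""
    # Copy capacities so the caller's cluster description is not mutated.
    free = {name: dict(caps) for name, caps in nodes.items()}
    placement = []
    for _ in range(num_replicas):
        for name, caps in free.items():
            if all(caps.get(res, 0) >= amt for res, amt in demand.items()):
                for res, amt in demand.items():
                    caps[res] -= amt
                placement.append(name)
                break
        else:
            raise RuntimeError("no node can fit another replica")
    return placement


# Hypothetical two-node cluster: several GPUs per node, but only one
# unit of the custom "worker" resource on each node (an assumption).
NODES = {
    "node-a": {"GPU": 4, "worker": 1},
    "node-b": {"GPU": 4, "worker": 1},
}

with_custom = place_replicas(NODES, flatten_demand(replica_actor_options(True)), 2)
gpu_only = place_replicas(NODES, {"GPU": 1}, 2)
no_gpu = place_replicas(NODES, flatten_demand(replica_actor_options(False)), 2)

print(with_custom)  # ['node-a', 'node-b']
print(gpu_only)     # ['node-a', 'node-a']
print(no_gpu)       # ['node-a', 'node-a']
```

In this model, the GPU request alone leaves room for both replicas on one node, while the single unit of `worker` per node caps each node at one replica. With empty options nothing constrains placement, which matches the reviewer's observation that the no-GPU path does not spread.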

Collaborator

@aslonnie aslonnie left a comment

(if ed says it is good, it is good)


Contributor Author

zcin commented Jun 20, 2024

@edoakes ready to merge?

@edoakes edoakes merged commit f6c819f into ray-project:master Jun 20, 2024
9 checks passed