-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core] Make Placement Group Wildcard and Indexed Resource Assignments Consistent #48088
Conversation
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
can you run some pg related release tests @MengjinYan ? |
@MengjinYan there are some failures from core tests. can you take a look? I am in progress with reviewing |
Yes. I took a look and found the bug. I didn't consider the case where multiple PGs coexists on a host and allocated the indexed bundle regardless of the pg_id when the index is not specified. Will submit a fix for that soon. |
…ces in the case where the bundle id is not specified Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally lgtm! Some concerns around performance (because I remember the scheduler was sensitive to performance for some large stress tests)
/*for_indexed_resource=*/true); | ||
if (data) { | ||
ResourceID original_resource_id(data->original_resource); | ||
absl::flat_hash_set<ResourceID> &resource_set = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you get data by bracket and if the data doesn't exist, it. creates a new one
absl::flat_hash_set<ResourceID> &resource_set = | |
pg_indexed_resources_.find(original_resource_id) != pg_indexed_resources_.end() { | |
it->second.erase(...) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Will update.
/*for_wildcard_resource=*/false, | ||
/*for_indexed_resource=*/true); | ||
if (data) { | ||
pg_indexed_resources_[ResourceID(data->original_resource)].emplace(resource_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually for this particular case, I'd like to create a default vector when the particular resource id doesn't exist. I'm wondering if square brackets is okay here. As I don't have a lot of context on c++ best practices so please let me know what's the best pattern.
@@ -43,6 +45,21 @@ bool NodeResourceInstanceSet::Has(ResourceID resource_id) const { | |||
|
|||
void NodeResourceInstanceSet::Remove(ResourceID resource_id) { | |||
resources_.erase(resource_id); | |||
|
|||
// Remove from the pg_indexed_resources_ as well | |||
auto data = ParsePgFormattedResource(resource_id.Binary(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this will add significant overhead to the scheduler (because parsing string is so much more expensive). is there any way to do this only when we know resource id is pg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline. And I checked the code. The Remove
function is only called in the DeleteLocalResource
function in LocalResourceManager
when returning the pg bundle. The overhead added shouldn't impact the schedule of the task.
At the same time, the concerns on the Set
and TryAllocate
are valid. One potential idea I can think of is adding a boolean in the ResourceID
to indicate whether it is a pg or not. Will further explore the idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the overhead will be smaller if PG resources starts with a prefix so we can check using startswith
instead of regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense!
At same time, I think the downside is: we need to make sure any custom resources won't start with the same prefix and we still need to do the parsing for all the placement group allocation.
Also, as the release didn't show noticeable performance regression, the code can be checked in as it is. I'll add the idea to the TODO for the future improvement.
@@ -69,6 +86,14 @@ NodeResourceInstanceSet &NodeResourceInstanceSet::Set(ResourceID resource_id, | |||
resources_.erase(resource_id); | |||
} else { | |||
resources_[resource_id] = std::move(instances); | |||
|
|||
// Popluate the pg_indexed_resources_map_ | |||
auto data = ParsePgFormattedResource(resource_id.Binary(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
@@ -87,6 +87,12 @@ class NodeResourceInstanceSet { | |||
return resources_; | |||
} | |||
|
|||
/// Only for testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it really used? I couldn't find it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. It is not used. I was to follow the above function for resources_
.
At the same time, I think it makes sense to remove the function and add it when it is needed in the test
allocations[resource_id] = std::move(*allocation); | ||
auto data = ParsePgFormattedResource(resource_id.Binary(), | ||
/*for_wildcard_resource*/ true, | ||
/*for_indexed_resource*/ true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this supposed to be false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is to find all pg resources including both wildcard resources and indexed resources.
And my understanding is that we need to set both "for_wildcard_resource" and "for_indexed_resource" to be true
to find the resource ids for both.
So here I think we should still set it to be true
.
return std::nullopt; | ||
} | ||
} else { | ||
// The case where the bundle index is specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there no case where the demands only include indexed resources, but not wildcard resources? I assume no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(feel like the logic assumes this to be correct. maybe add it to the docstring?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, here:
ray/src/ray/common/bundle_spec.cc
Line 199 in d80b22f
// Note that all nodes that have placement group have a special |
At the same time, I'll add the assumption to the comment as well.
auto allocation = TryAllocate(pair.first, resource_demands.Get(pair.first)); | ||
|
||
if (allocation) { | ||
wildcard_allocation = *allocation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe do this only once to avoid repetitive copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Currently, the same allocation will used for both wildcard resource and indexed resource.
Instead of doing a copy here and 2 moves for both wildcard and index resource, I think I can be a copy for the indexed resource and a move for the wildcard resource,
Co-authored-by: SangBin Cho <rkooo567@gmail.com> Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Co-authored-by: SangBin Cho <rkooo567@gmail.com> Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Discussed offline and ran the "many tasks" benchmark test. From the benchmark test result, we didn't see noticeable performance regression compare to results from tests running on the master branch. At the same time, the current idea that we have for the improvement will need non-trivial effort and would be worth a separate effort for that. I'll also run the stress_test_many_tasks tests to further verify the latency. And if there still noticeable regression, the code should e able to check in without further performance improvements regarding the string parsing. At the same time, I also added "TODO" comments at the place where we do the parsing with the improvement idea. |
src/ray/common/bundle_spec.cc
Outdated
@@ -95,15 +95,15 @@ std::string BundleSpecification::DebugString() const { | |||
} | |||
|
|||
std::string FormatPlacementGroupResource(const std::string &original_resource_name, | |||
const PlacementGroupID &group_id, | |||
const std::string &group_id_str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
group_id_hex?
src/ray/common/bundle_spec.h
Outdated
/// [original_resource_name]_group_[bundle_index]_[group_id_str] | ||
/// \return The corresponding formatted placement group resource string. | ||
std::string FormatPlacementGroupResource(const std::string &original_resource_name, | ||
const std::string &group_id_str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
group_id_hex?
@@ -43,6 +45,21 @@ bool NodeResourceInstanceSet::Has(ResourceID resource_id) const { | |||
|
|||
void NodeResourceInstanceSet::Remove(ResourceID resource_id) { | |||
resources_.erase(resource_id); | |||
|
|||
// Remove from the pg_indexed_resources_ as well | |||
auto data = ParsePgFormattedResource(resource_id.Binary(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the overhead will be smaller if PG resources starts with a prefix so we can check using startswith
instead of regex.
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com> Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
premerge failures! |
Test failures are unrelated rllib tests |
… Consistent (ray-project#48088) Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
… Consistent (ray-project#48088) Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com> Signed-off-by: mohitjain2504 <mohit.jain@dream11.com>
Why are these changes needed?
Currently when we allocate resources to tasks or actors with placement group,
The behavior above leads to incorrect GPU assignment when specifying bundle index. Also, task can potentially be allocated across bundles if no bundle index specified.
This PR fix the above behavior to make the allocation on the wildcard resource and the indexed resource consistent with the following changes:
TryAllocate
function inresource_instance_set.cc
to add logic to handle the pg resource allocation. If the resource requirement include resource in a placement group,(1) it will always first try to allocation on index resources. For resource requirements without a bundle index, it will iterate through all the bundles and try to find one with enough space.
(2)Then it will try to allocate the wildcard resource based on the index resource allocation.
Related issue number
Closes #29811
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.