Skip to content

Commit

Permalink
[core][autoscaler] Fix idle time duration when node resource is not u…
Browse files Browse the repository at this point in the history
…pdated periodically (#37121)

Why are these changes needed?
It was assumed resource update is broadcasted periodically (which isn't the case), so the idle time wasn't updated when the node keeps in the idle state.

This PR makes the raylet sent the last idle time (if idle) to the GCS, and allows GCS to calculate the duration.
---------

Signed-off-by: rickyyx <rickyx@anyscale.com>
  • Loading branch information
rickyyx authored Jul 6, 2023
1 parent e876693 commit c6b969a
Show file tree
Hide file tree
Showing 4 changed files with 32 additions and 5 deletions.
14 changes: 12 additions & 2 deletions python/ray/autoscaler/v2/tests/test_sdk.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import os
import sys
import time

# coding: utf-8
from dataclasses import dataclass
Expand Down Expand Up @@ -102,7 +103,7 @@ def verify():


def test_node_state_lifecycle_basic(ray_start_cluster):

start_s = time.perf_counter()
cluster = ray_start_cluster
cluster.add_node(num_cpus=0)
ray.init(address=cluster.address)
Expand Down Expand Up @@ -174,13 +175,22 @@ def verify_cluster_busy():
# Kill the node.
cluster.remove_node(node)

# Sleep for a bit so head node should be idle longer than this.
time.sleep(3)

def verify_cluster_no_node():
state = get_cluster_resource_state(stub)
now_s = time.perf_counter()
test_dur_ms = (now_s - start_s) * 1000
assert_node_states(
state,
[
NodeState(node_id, NodeStatus.DEAD),
NodeState(head_node_id, NodeStatus.IDLE, lambda idle_ms: idle_ms > 0),
NodeState(
head_node_id,
NodeStatus.IDLE,
lambda idle_ms: idle_ms > 3 * 1000 and idle_ms < test_dur_ms,
),
],
)
return True
Expand Down
15 changes: 13 additions & 2 deletions src/ray/gcs/gcs_server/gcs_autoscaler_state_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -200,10 +200,21 @@ void GcsAutoscalerStateManager::GetNodeStates(
auto const &node_resource_data = cluster_resource_manager_.GetNodeResources(
scheduling::NodeID(node_state_proto->node_id()));
if (node_resource_data.idle_resource_duration_ms > 0) {
// The node is idle.
// The node was reported idle.
node_state_proto->set_status(rpc::autoscaler::NodeStatus::IDLE);

// We approximate the idle duration by the time since the last idle report
// plus the idle duration reported by the node:
// idle_dur = <idle-dur-reported-by-raylet> + <time-since-gcs-gets-last-report>
//
// This is because with lightweight resource update, we don't keep reporting
// the idle time duration when there's no resource change. We also don't want to
// use raylet reported idle timestamp since there might be clock skew.
RAY_CHECK(node_resource_data.last_resource_update_time != absl::nullopt);
node_state_proto->set_idle_duration_ms(
node_resource_data.idle_resource_duration_ms);
node_resource_data.idle_resource_duration_ms +
absl::ToInt64Milliseconds(
absl::Now() - node_resource_data.last_resource_update_time.value()));
} else {
node_state_proto->set_status(rpc::autoscaler::NodeStatus::RUNNING);
}
Expand Down
5 changes: 4 additions & 1 deletion src/ray/raylet/scheduling/cluster_resource_data.h
Original file line number Diff line number Diff line change
Expand Up @@ -436,9 +436,12 @@ class NodeResources {
// The key-value labels of this node.
absl::flat_hash_map<std::string, std::string> labels;

// The idle duration of the node from resources.
// The idle duration of the node from resources reported by raylet.
int64_t idle_resource_duration_ms = 0;

// The timestamp of the last resource update if there was a resource report.
absl::optional<absl::Time> last_resource_update_time = absl::nullopt;

/// Normal task resources could be uploaded by 1) Raylets' periodical reporters; 2)
/// Rejected RequestWorkerLeaseReply. So we need the timestamps to decide whether an
/// upload is latest.
Expand Down
3 changes: 3 additions & 0 deletions src/ray/raylet/scheduling/cluster_resource_manager.cc
Original file line number Diff line number Diff line change
Expand Up @@ -256,6 +256,9 @@ bool ClusterResourceManager::UpdateNodeAvailableResourcesIfExist(

// Update the idle duration for the node in terms of resources usage.
node_resources->idle_resource_duration_ms = resource_data.idle_duration_ms();

// Last update time to the local node resources view.
node_resources->last_resource_update_time = absl::Now();
return true;
}

Expand Down

0 comments on commit c6b969a

Please sign in to comment.