[Ray Core] The node storing the actor will be killed unexpectedly when autoscaler is turned on #46172
Labels
bug
Something that is supposed to be working; but isn't
core
Issues that should be addressed in Ray Core
core-autoscaler
autoscaler related issues
P3
Issue moderate in impact or severity
What happened + What you expected to happen
The scenario: an actor is created to synchronise some data between workers. If the data in the actor is not updated for a period of time, the node on which the actor resides may be judged idle and deleted by the autoscaler. This doesn't meet my expectations, because I still hold the actor's handle.
Expected behavior: if there is still an actor alive on a node, it should not be deleted.
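A minimal sketch of the scenario described above, for illustration only. It is not the script from this report: the `SyncStore` class, its `put`/`get` methods, and the `reproduce` function are hypothetical names, and the sketch assumes a driver that can attach to an already running autoscaling cluster via `ray.init(address="auto")` and that the actor gets placed on a worker node. The two constants are the values quoted in this issue.

```python
import time

IDLE_TIMEOUT_SECONDS = 60  # idleTimeoutSeconds value quoted in this issue
SLEEP_SECONDS = 120        # sleep duration quoted in this issue


class SyncStore:
    """Hypothetical stand-in for the actor that synchronises data between workers."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def reproduce():
    """Create the actor, leave it untouched past the idle timeout, then call it again."""
    import ray  # imported here so SyncStore can be exercised without a cluster

    ray.init(address="auto")  # assumption: attach to an existing autoscaling cluster

    # Wrap the plain class as an actor with default resource requirements.
    store = ray.remote(SyncStore).remote()
    ray.get(store.put.remote("key", 1))

    # Keep the handle but do not touch the actor for longer than the idle timeout.
    time.sleep(SLEEP_SECONDS)

    try:
        print(ray.get(store.get.remote("key")))
    except ray.exceptions.RayActorError as err:
        # Raised when the actor is no longer alive, e.g. because its node was removed.
        print("actor lost:", err)
```

Calling `reproduce()` from a driver on the cluster exercises the idle window. Whether the node is actually removed depends on where the actor is scheduled and on the autoscaler configuration, so the sketch may not trigger the failure on every setup.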
Versions / Dependencies
Ray 2.23.0
Reproduction script
The above Python script reproduces the bug reliably, reporting
The actor is dead because its node has died.
after time.sleep(120). The configured idleTimeoutSeconds is 60.
Issue Severity
High: It blocks me from completing my task.