When a cluster scales in, the terminating nodes are shut down almost immediately, with no consideration for requests they may still be processing. In-flight requests simply error out, and for proxied requests the results are lost.
The desired solution is to implement a graceful shutdown using ASG lifecycle hooks, specifically the terminate hook. The terminating node should consume the notification, stop registering with the orchestrator, and then wait 10 minutes. Ideally, if the node can determine whether any requests are still running on it, it could instead wait only until all of them have completed before sending the continue command to the lifecycle hook (this would avoid the blanket 10-minute wait).
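A rough sketch of the drain step, to make the proposal concrete. It is not taken from this codebase: `drain_and_continue`, `deregister`, and `in_flight` are hypothetical names, and how the node receives the terminate notification is left out entirely. The `autoscaling` argument is assumed to be a boto3 Auto Scaling client (or anything exposing the same `complete_lifecycle_action` call), and the hook's heartbeat timeout is assumed to be at least as long as `max_wait_s`.

```python
import time
from typing import Callable


def drain_and_continue(
    autoscaling,
    *,
    group_name: str,
    hook_name: str,
    instance_id: str,
    deregister: Callable[[], None],
    in_flight: Callable[[], int],
    max_wait_s: float = 600.0,  # the 10-minute upper bound from the proposal
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Drain this node, then let the ASG proceed with termination.

    Returns True if all requests finished before the deadline,
    False if the deadline was hit with requests still running.
    """
    # Hypothetical callback: stop registering with the orchestrator
    # so that no new requests are routed to this node.
    deregister()

    deadline = clock() + max_wait_s
    # Hypothetical callback: number of requests still running here.
    drained = in_flight() == 0
    while not drained and clock() < deadline:
        sleep(poll_s)
        drained = in_flight() == 0

    # Tell the lifecycle hook to continue, whether or not we fully drained.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=group_name,
        LifecycleActionResult="CONTINUE",
        InstanceId=instance_id,
    )
    return drained
```

With boto3 installed, the client would be created with `boto3.client("autoscaling")`. If the node has no way to count its running requests, passing an `in_flight` that always returns a non-zero value degrades this to the fixed 10-minute wait.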