Add ability to re-add recovered tasks to the load balancer #2168

Merged
ssalinas merged 5 commits into master from reregistration_handling on Jan 22, 2021

ssalinas
Member

Previously, if a recovered task had already been removed from the load balancer, we just cleaned everything up. This PR adds a few things to the reregistration flow:

  • Always enqueue a pending request to reevaluate state and scale, so that we don't end up stuck in a state with multiple instance 1s
  • If the task is recoverable in zk (not persisted yet) but has already been removed from the lb, treat it as if it were a new healthy task that just finished passing healthchecks instead of cleaning it up. This way we can recover running tasks, especially after large network partitions, where relaunching that many new tasks could take a long time
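
For reviewers, the decision described in the two bullets above can be sketched roughly as follows. This is a simplified illustration and not the code in this PR: the class name, the `Action` enum, and the `decide` method are all hypothetical. The behavior when the task is still in the lb (`KEEP_AS_IS`) and the fallback to cleanup when the task is not recoverable are assumptions based on the description of the previous handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the reregistration decision; none of these names
// come from the actual codebase.
class ReregistrationSketch {
  enum Action {
    ENQUEUE_PENDING_REQUEST, // reevaluate state/scale for the request
    ADD_TO_LOAD_BALANCER,    // treat as a new healthy task, add to lb
    CLEAN_UP_TASK,           // previous behavior: clean everything up
    KEEP_AS_IS               // assumption: nothing lb-related to do
  }

  static List<Action> decide(boolean recoverableInZk, boolean removedFromLb) {
    List<Action> actions = new ArrayList<>();
    // Always enqueue a pending request so scale gets reevaluated.
    actions.add(Action.ENQUEUE_PENDING_REQUEST);

    if (!removedFromLb) {
      // Assumption: a task still in the lb needs no extra handling.
      actions.add(Action.KEEP_AS_IS);
    } else if (recoverableInZk) {
      // Recoverable but already out of the lb: re-add it instead of
      // cleaning it up, as if it had just passed healthchecks.
      actions.add(Action.ADD_TO_LOAD_BALANCER);
    } else {
      // Assumption: unrecoverable tasks fall back to cleanup.
      actions.add(Action.CLEAN_UP_TASK);
    }
    return actions;
  }

  public static void main(String[] args) {
    System.out.println(decide(true, true));
  }
}
```

The case the updated unit test targets corresponds to `decide(true, true)`, where a pending lb add should be present alongside the pending request.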

Updated the unit test for this to make sure the lb pending add is present, but would appreciate extra 👀 on it

cc @pschoenfelder @ajammala @rosalind210

@ssalinas added the "staging" label (Merged to staging branch) on Jan 22, 2021
@pschoenfelder
Contributor

🚢

@ssalinas merged commit d59909c into master on Jan 22, 2021
@ssalinas deleted the reregistration_handling branch on January 22, 2021 15:59
@ssalinas added this to the 1.5.0 milestone on May 4, 2022