Lifecycle tweaks for Singularity #2190

Merged
ssalinas merged 7 commits into master from lifecycle_tweaks
Mar 19, 2021

ssalinas
Member

Small one:

  • Make sure to cancel in-progress task reconciliation on shutdown

Bigger one:
Currently, when a new leader takes over, we can hit a sequence of events like the following:

  • New leader gains leadership
  • Other API instances see this and start sending writes to the new leader
  • Leader cache bootstrapping starts
  • Leader cache bootstrapping loads type X (e.g. requests, deploys, etc.), but the cache is not yet activated
  • Writes make it through to ZK, but not to the leader cache, because active() is false
  • Leader cache finishes bootstrapping

We now have a write for type X that made it to ZK but not to the leader cache. This can result in missing deploy data, missing requests, requests that are still present in the cache but were actually deleted, etc.

This PR adds an additional check within active() that blocks the calling thread while bootstrapping of the leader cache is still in progress. So we delay all incoming writes until bootstrapping finishes, rather than trying to reconcile state afterwards.
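
The blocking check described above could look roughly like the following. This is a minimal sketch and not the code from this PR: the class name `LeaderCacheGate`, the fields `bootstrapping` and `activated`, and the methods `startBootstrap()`, `finishBootstrap()`, and `stop()` are all hypothetical names chosen for illustration. Only `active()` and `syncObject` appear in the PR itself.

```java
// Hypothetical sketch of a gate that blocks callers of active() while
// bootstrapping is in progress. Names other than active() and syncObject
// are assumptions made for this example.
final class LeaderCacheGate {
  private final Object syncObject = new Object();
  private volatile boolean bootstrapping = false;
  private volatile boolean activated = false;

  // Called when this instance gains leadership and begins loading the cache.
  void startBootstrap() {
    synchronized (syncObject) {
      bootstrapping = true;
    }
  }

  // Called once every type has been loaded into the cache.
  void finishBootstrap() {
    synchronized (syncObject) {
      activated = true;
      bootstrapping = false;
      syncObject.notifyAll(); // wake every caller blocked in active()
    }
  }

  // Called on shutdown or loss of leadership; also releases any waiters.
  void stop() {
    synchronized (syncObject) {
      activated = false;
      bootstrapping = false;
      syncObject.notifyAll();
    }
  }

  boolean active() {
    // Regular operation: a single volatile read, no locking.
    if (bootstrapping) {
      synchronized (syncObject) {
        // Re-check under the lock; the loop also guards against spurious wakeups.
        while (bootstrapping) {
          try {
            syncObject.wait();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
          }
        }
      }
    }
    return activated;
  }
}
```

In this sketch the flag is only cleared while holding `syncObject`, and waiters re-check it under the same lock before calling `wait()`, so a caller should not be able to miss the wakeup. Clearing the flag in `stop()` as well is one way to avoid leaving callers stuck if leadership is lost mid-bootstrap.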

Would definitely like feedback on the Java-y bits here:

  • Is this performant enough during regular operation? From staring at it, it should just be one additional if statement on a volatile boolean
  • Are there any race conditions where the calling thread could get stuck permanently? I think I covered this with the while loop and synchronized objects, but want to be sure

Comment on lines 69 to 71
synchronized (syncObject) {
  syncObject.notify();
}
Contributor


what would this do?

Member Author


There's a bit of background on wait + notify at https://www.baeldung.com/java-wait-notify. Though I'm realizing I probably need notifyAll here, actually. Leaving that as a TODO for later.
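
To illustrate why notifyAll matters when more than one thread may be waiting on the same monitor, here is a small self-contained sketch. It is not taken from this PR: the class `NotifyAllDemo` and its methods are hypothetical. With `notify()`, only one waiting thread is woken per call, so other threads already parked in `wait()` could stay blocked; `notifyAll()` wakes all of them, and each re-checks the condition in its loop.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: several threads wait on one monitor and are all
// released by a single notifyAll() call.
final class NotifyAllDemo {
  private final Object syncObject = new Object();
  private boolean ready = false; // guarded by syncObject

  void awaitReady() throws InterruptedException {
    synchronized (syncObject) {
      while (!ready) { // loop guards against spurious wakeups
        syncObject.wait();
      }
    }
  }

  void markReady() {
    synchronized (syncObject) {
      ready = true;
      syncObject.notifyAll(); // notify() would wake only one waiter
    }
  }

  // Starts waiterCount threads, releases them, and reports whether all of
  // them returned from awaitReady() within the timeout.
  static boolean allWaitersReleased(int waiterCount, long timeoutMillis) {
    NotifyAllDemo demo = new NotifyAllDemo();
    CountDownLatch done = new CountDownLatch(waiterCount);
    for (int i = 0; i < waiterCount; i++) {
      Thread t = new Thread(() -> {
        try {
          demo.awaitReady();
          done.countDown();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      t.setDaemon(true);
      t.start();
    }
    demo.markReady();
    try {
      return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Calling `NotifyAllDemo.allWaitersReleased(4, 5000)` should return true, since every waiter either sees `ready` already set or is woken by `notifyAll()`.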

@ssalinas ssalinas added the staging Merged to staging branch label Mar 18, 2021
@rosalind210
Contributor

🚢

@ssalinas ssalinas merged commit fa151c3 into master Mar 19, 2021
@ssalinas ssalinas deleted the lifecycle_tweaks branch March 19, 2021 13:20
@ssalinas ssalinas added this to the 1.5.0 milestone May 4, 2022