Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaner failover + update dependencies #1908

Merged
merged 32 commits into from
Mar 6, 2019
Merged

Cleaner failover + update dependencies #1908

merged 32 commits into from
Mar 6, 2019

Conversation

ssalinas
Copy link
Member

This consolidates our many Managed classes and scheduled executor usages so we have clear control over the order in which they shut down (and make sure they all actually do...). This should help keep state more consistent on singularity failover. Ideally I'll add in the ability for Singulairty to not fail over for loss of mesos connection, and instead try to reconnect

@ssalinas ssalinas changed the title (WIP) Cleaner failover Cleaner failover Feb 19, 2019
@ssalinas ssalinas changed the title Cleaner failover Cleaner failover + update dependencies Feb 20, 2019
@ssalinas ssalinas added the hs_qa label Feb 20, 2019
@ssalinas
Copy link
Member Author

notes:

  • NP_NONNULL_PARAM_VIOLATION came with the new findbugs version. Wasn't going to fix all occurrences as part of this PR as many were pretty far reaching or covered by logic a few lines above (which findbugs does not account for). Suppressed for now
  • metrics-jmx split off from the normal dropwizard package and is now a new dep
  • abort emails should have a full stack trace now
  • leader only pollers are auto-discovered. As long as they are bound in guice, the managed class will start/stop them

@@ -403,14 +424,25 @@ public SingularityPendingRequestParent scheduleImmediately(SingularityUser user,

checkConflict(requestWithState.getState() != RequestState.PAUSED, "Request %s is paused. Unable to run now (it must be manually unpaused first)", requestWithState.getRequest().getId());

// Check these to avoid unnecessary calls to taskManager
Integer activeTasks = null;
Integer pendingTasks = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to default these to 0 instead of null to avoid NPE's in checkRunNowRequest().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can do that 👍

@baconmania
Copy link
Contributor

🚢

@baconmania
Copy link
Contributor

🚢

@ssalinas ssalinas merged commit 274c39a into master Mar 6, 2019
@ssalinas ssalinas deleted the clean_failover branch March 6, 2019 15:22
@ssalinas ssalinas added this to the 0.23.0 milestone Jun 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants