Singularity should be able to handle registration, re-registration, and leadership changes without aborting #273

Closed
wsorenson opened this issue Oct 22, 2014 · 4 comments

@wsorenson
Contributor

No description provided.

@wsorenson wsorenson self-assigned this Oct 22, 2014
@wsorenson
Contributor Author

Some complications:

Losing leadership:

  • Mesos sends an error when the framework fails over. In the past, there were some cases where this error was the only way we found out about a failover, so we needed to stop leadership when it arrived.
  • Deleting the leader latch doesn't re-create it.
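
To make the two cases above concrete, here is a rough sketch of stepping down instead of aborting. It is only an illustration under stated assumptions: `LeadershipSketch` and all of its methods are hypothetical names, not Singularity, Curator, or Mesos APIs, and the "driver" and "latch" are reduced to recorded events.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from Singularity, Curator, or Mesos.
class LeadershipSketch {
  enum State { STANDBY, LEADER }

  private State state = State.STANDBY;
  // Records what a real implementation would do, in order.
  private final List<String> events = new ArrayList<>();

  // The latch reports that this instance won the election.
  void onLatchAcquired() {
    state = State.LEADER;
    events.add("start-driver");
  }

  // Mesos reported an error (for example a framework failover): step down, do not abort.
  void onMesosError(String message) {
    stepDown("mesos-error: " + message);
  }

  // The latch was deleted: step down if leading, then re-create the latch
  // explicitly, because deletion alone does not bring it back.
  void onLatchDeleted() {
    stepDown("latch-deleted");
    events.add("recreate-latch");
  }

  private void stepDown(String reason) {
    if (state == State.LEADER) {
      state = State.STANDBY;
      events.add("stop-driver (" + reason + ")");
    }
  }

  State state() { return state; }

  List<String> events() { return events; }
}
```

The point of the sketch is that both signals funnel into one `stepDown` path, so the process stays alive and can be re-elected later.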

Uncaught exceptions:

  • Probably could handle these, but I would feel more comfortable here if we had more control over the SchedulerDriver. Right now some of our aborting is a hedge against Mesos Java bugs.
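
For the uncaught-exception case, one possible shape is a handler that gives up leadership rather than exiting the JVM. This is a minimal sketch, not what Singularity does: `StepDownOnUncaught` and its methods are hypothetical, and only `Thread.UncaughtExceptionHandler` is a real Java interface. It also does nothing about the Mesos Java bugs mentioned above, which is the case aborting was meant to cover.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the class and method names are illustrative only.
class StepDownOnUncaught implements Thread.UncaughtExceptionHandler {
  // Assumes this instance currently holds leadership.
  private final AtomicBoolean leader = new AtomicBoolean(true);
  private final AtomicInteger stepDowns = new AtomicInteger();

  @Override
  public void uncaughtException(Thread t, Throwable e) {
    // Relinquish leadership at most once per term instead of exiting the JVM.
    if (leader.compareAndSet(true, false)) {
      stepDowns.incrementAndGet();
    }
  }

  boolean isLeader() { return leader.get(); }

  int stepDownCount() { return stepDowns.get(); }
}
```

A handler like this could be installed per thread with `Thread.setUncaughtExceptionHandler`, or process-wide with `Thread.setDefaultUncaughtExceptionHandler`.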

@tpetr
Contributor

tpetr commented Oct 14, 2015

@wsorenson thoughts on tackling this? or are we happy with our abort strategy?

@stevenschlansker
Contributor

End-user opinion: the aborts are worrying but once you learn to stop fretting they really do not present any sort of urgent problem to us.

@ssalinas ssalinas mentioned this issue Oct 11, 2017
2 tasks
@ssalinas
Member

ssalinas commented Nov 7, 2019

fixed in #2032

@ssalinas ssalinas closed this as completed Nov 7, 2019