-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shutdown cluster cleanly when SIGTERM or SIGINT is received #1705
Conversation
To test, use the Python alphabet partitioned and alphabet examples. Start up each, verify recovery files exist in /tmp Do ctrl-c to shutdown verify that files no longer exist. In the case of alphabet partitioned, you should test by doing ctrl-c on different workers. |
lib/wallaroo/startup.pony
Outdated
@@ -634,3 +639,30 @@ actor Startup | |||
"the Wallaroo Community License at https://github.com/WallarooLabs/" + | |||
"wallaroo/blob/master/LICENSE.md for details, and also please visit " + | |||
"the page at http://www.wallaroolabs.com/pricing****\n").cstring()) | |||
|
|||
fun ref _setup_shutdown_handler(c: Connections, f: Startup, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name f
is unclear in this context
fac3672
to
358f9e2
Compare
358f9e2
to
fe37410
Compare
@@ -33,7 +33,7 @@ actor Main | |||
Machida.start_python() | |||
|
|||
try | |||
SignalHandler(ShutdownHandler, Sig.int()) | |||
SignalHandler(MachidaShutdownHandler, Sig.int()) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Sig.term()
be added here also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slfritchie it wasn't needed previously to get Python to exit correctly, but SIGINT was. So I veer towards, no. But, let's get @aturley's thoughts as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @SeanTAllen says, this was a special case that we added to fix a specific problem. Unless there's a problem with the behavior there no reason to handle Sig.term()
.
419bccd
to
1ddda2f
Compare
Previously, if a developer working with Wallaroo sent a SIGINT via ctrl-c in the terminal, Wallaroo would exit but wouldn't gracefully shutdown. This could lead to errors on the next application startup as recovery files weren't dispose of properly. With this commit, sending SIGTERM or SIGINT will shut down Wallaroo gracefully in the same fashion that the external "shutdown cluster" command work. Once we have "shrink to fit" in place, the plan would be to have SIGTERM and SIGINT remove that individual worker from the cluster rathter than shut the entire cluster down. Once we have shrink to fit in place, we might change our mind and go in a different direction and leave this functionality as is. Closes #1701
1ddda2f
to
8b52a68
Compare
Previously, if a developer working with Wallaroo sent a SIGINT via
ctrl-c in the terminal, Wallaroo would exit but wouldn't gracefully
shutdown. This could lead to errors on the next application startup as
recovery files weren't dispose of properly.
With this commit, sending SIGTERM or SIGINT will shut down Wallaroo
gracefully in the same fashion that the external "shutdown cluster"
command work.
Once we have "shrink to fit" in place, the plan would be to have SIGTERM
and SIGINT remove that individual worker from the cluster rathter than
shut the entire cluster down. Once we have shrink to fit in place, we
might change our mind and go in a different direction and leave this
functionality as is.
Closes #1701