Allow running processes in such a way that stop signals are preserved #3461
One-off processes, including `convox run`, `convox exec`, and timers, don't pass stop signals to processes properly.
For exec and non-detached runs, this makes sense, as Convox runs a `sleep {timeout}` command in the container and then attaches a secondary process to it (à la `docker exec`). But for timers and detached runs, SIGTERM is never received by the process. The problem stems from the use of `["sh", "-c", command]`, as `sh -c` does not propagate signals to apps.
This violates 12-factor principles, as processes aren't given the opportunity to run any cleanup tasks. While apps should also be robust against the occasional SIGTERM, Convox should do its best to allow more graceful stoppage.
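To make the cleanup concern concrete, here is a minimal sketch of the kind of worker that loses out when SIGTERM stops at the wrapping shell. It is not code from this change: `runWorker`, the job names, and the cleanup message are all made up for illustration. The cleanup step only runs if the signal actually reaches the process.

```go
package main

import (
	"context"
	"fmt"
	"os/signal"
	"syscall"
	"time"
)

// runWorker (hypothetical helper) processes jobs until stop is closed or the
// jobs channel is closed, then runs cleanup exactly once before returning.
func runWorker(stop <-chan struct{}, jobs <-chan string, process func(string), cleanup func()) {
	defer cleanup()
	for {
		select {
		case <-stop:
			return
		case job, ok := <-jobs:
			if !ok {
				return
			}
			process(job)
		}
	}
}

func main() {
	// ctx is cancelled when SIGTERM is delivered to this process. If the
	// process sits behind a shell that does not forward the signal, this
	// never fires and the cleanup below is skipped on stop.
	ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGTERM)
	defer cancel()

	// Illustrative job source: three batches, then the channel is closed.
	jobs := make(chan string)
	go func() {
		defer close(jobs)
		for i := 1; i <= 3; i++ {
			jobs <- fmt.Sprintf("batch-%d", i)
			time.Sleep(100 * time.Millisecond)
		}
	}()

	runWorker(ctx.Done(), jobs,
		func(job string) { fmt.Println("processing", job) },
		func() { fmt.Println("flushing state and exiting") },
	)
}
```

Sending SIGTERM to this program directly triggers the cleanup function. Sending it to a parent shell that does not forward signals would not.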
Our use case revolves around spinning up ETL pipelines at regular intervals, but really any worker-process use case that wants to use Convox's autoscaling features to create processes on demand will benefit from this.
This change allows you to disable the shell wrapper for detached runs. Simply removing `sh -c` would likely be a breaking change for users who have come to expect their commands to be run in a shell, so instead I put the behavior behind a flag. I called the flag "bare" but I don't love the name; perhaps `[--no]-shell`, `--shell=false`, or similar would be better. Alternatively, we could use a rack parameter, or we could make this the default behavior going forward with a version check. Looking for guidance on this.
I only added support to the `aws` provider, as the `k8s` provider (and `local` by extension) already behave this way.
Note this MR does not fix the same issue for timers, as I wasn't sure of a good backwards-compatible solution.
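As a rough illustration of the two modes, the sketch below builds the container command with and without the shell wrapper. It is not the code in this change: `containerCommand` and its `bare` parameter are hypothetical names, and the whitespace split via `strings.Fields` is a deliberate simplification that ignores quoting and shell expansion.

```go
package main

import (
	"fmt"
	"strings"
)

// containerCommand (hypothetical) returns the argv for a detached run.
// With bare=false the command is wrapped in a shell, as described above.
// With bare=true the command becomes the container's main process directly,
// so stop signals are delivered to it rather than to sh.
func containerCommand(command string, bare bool) []string {
	if bare {
		// Assumption: a naive whitespace split. No quoting, globbing, or
		// variable expansion is performed, unlike a real shell.
		return strings.Fields(command)
	}
	return []string{"sh", "-c", command}
}

func main() {
	fmt.Printf("%q\n", containerCommand("bin/etl --since 24h", false))
	// prints ["sh" "-c" "bin/etl --since 24h"]
	fmt.Printf("%q\n", containerCommand("bin/etl --since 24h", true))
	// prints ["bin/etl" "--since" "24h"]
}
```

The trade-off is visible in the second form: anything that relied on shell features (pipes, `&&`, `$VAR` expansion) stops working, which is why the behavior is opt-in.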
By the way, detached runs don't actually support the timeout option. This could be resolved by using the Unix `timeout` command. I could add that to this MR as well.
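One possible shape for that follow-up, sketched under assumptions: the image ships a `timeout` binary that accepts the GNU coreutils form `timeout DURATION COMMAND [ARG]...`, and the run's command is already available as an argv slice. `withTimeout` is a hypothetical helper, not something in the codebase.

```go
package main

import (
	"fmt"
	"strconv"
)

// withTimeout (hypothetical) prefixes argv with `timeout <seconds>`.
// A non-positive value means "no timeout" and leaves argv unchanged.
// Assumption: the container image provides a coreutils-style timeout binary.
func withTimeout(argv []string, seconds int) []string {
	if seconds <= 0 {
		return argv
	}
	return append([]string{"timeout", strconv.Itoa(seconds)}, argv...)
}

func main() {
	fmt.Printf("%q\n", withTimeout([]string{"bin/etl", "--since", "24h"}, 3600))
	// prints ["timeout" "3600" "bin/etl" "--since" "24h"]
}
```

Images without a compatible `timeout` binary would need a different approach, so this would probably have to be opt-in too.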