Feature: after stop callback #740

Closed
rajyan opened this issue Sep 4, 2023 · 0 comments · Fixed by #741


rajyan commented Sep 4, 2023

Description

Currently, we can configure callbacks that run at "shutdown":

  if (callback = Shoryuken.stop_callback)
    logger.debug { 'Calling stop_callback' }
    callback.call
  end
  fire_event(:shutdown, true)
end

This is done by setting stop_callback or by registering a callback for the :shutdown event. However, we cannot configure a callback that runs after the executor has stopped (after executor.kill in stop!, or after wait_for_termination in stop).

def stop!
  initiate_stop
  executor.shutdown
  # Wait up to :timeout seconds for running jobs to finish.
  return if executor.wait_for_termination(Shoryuken.options[:timeout])
  # Timed out: force the executor to stop.
  executor.kill
end
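
To make the request concrete, here is a minimal sketch of where such a hook could run. It is not Shoryuken's actual code: FakeExecutor, SketchLauncher, and the after_stop option are hypothetical names used only for illustration, and the fake executor stands in for the real one so the example runs on its own.

```ruby
# Hypothetical stand-in for the real executor, so the sketch is self-contained.
class FakeExecutor
  attr_reader :killed

  def initialize(finishes_in_time:)
    @finishes_in_time = finishes_in_time
    @killed = false
  end

  def shutdown; end

  # Returns true when all work finished within the timeout.
  def wait_for_termination(_timeout)
    @finishes_in_time
  end

  def kill
    @killed = true
  end
end

# Hypothetical launcher showing the proposed hook position.
class SketchLauncher
  def initialize(executor, timeout:, after_stop: nil)
    @executor = executor
    @timeout = timeout
    @after_stop = after_stop # assumed option, not an existing Shoryuken setting
  end

  def stop!
    @executor.shutdown
    # Kill only when the jobs did not finish within the timeout.
    @executor.kill unless @executor.wait_for_termination(@timeout)
    # Proposed: runs after the executor has stopped, on both paths.
    @after_stop&.call
  end
end

executor = FakeExecutor.new(finishes_in_time: false)
events = []
launcher = SketchLauncher.new(executor, timeout: 60, after_stop: -> { events << :after_stop })
launcher.stop!
```

The point of the sketch is only the ordering: the hook fires once the executor is no longer running jobs, whether it terminated cleanly or was killed.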

I believe being able to register such a callback would be useful in cases where we want to shut down the Shoryuken worker gracefully.

Usage example

For example, we have a job that takes anywhere from about 10 seconds to 5 minutes, with a sufficiently long visibility timeout.
When the job starts, we take a DB lock and change the status of the job (model) to "job_running" so that only one instance runs at a time (to guard against at-least-once duplicate delivery and run it exactly once).

We want to stop this job as gracefully as possible on deploy, so we have configured the "timeout" option to wait for the job to finish after the shutdown is triggered by SIGTERM.

return if executor.wait_for_termination(Shoryuken.options[:timeout])

However, we run the job worker on AWS Fargate (and Fargate Spot), which waits at most 2 minutes after SIGTERM (see https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html stop timeout), so we currently set the ":timeout" option to 60 seconds.

In this case, I want to configure a callback that runs after stop to check for jobs that were killed while running (status "job_running"), and either roll the status back to "job_ready" if the job is retryable (SQS retries it after the visibility timeout) or change the status to "job_failed" and send a notification.
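
The recovery logic described above could look roughly like the sketch below. Job, its fields, and recover_interrupted_jobs are hypothetical names for illustration; a real implementation would query the database and would need to limit itself to jobs owned by the worker that was just stopped.

```ruby
# Hypothetical job record; the real model would be a database row.
Job = Struct.new(:id, :status, :retryable, keyword_init: true)

# Resets jobs that were still "job_running" when the worker was killed.
# Returns the jobs marked as failed so the caller can send notifications.
def recover_interrupted_jobs(jobs)
  interrupted = jobs.select { |job| job.status == 'job_running' }
  interrupted.each do |job|
    # Retryable jobs go back to the queue state; SQS redelivers the message
    # once the visibility timeout expires.
    job.status = job.retryable ? 'job_ready' : 'job_failed'
  end
  interrupted.select { |job| job.status == 'job_failed' }
end

jobs = [
  Job.new(id: 1, status: 'job_running', retryable: true),
  Job.new(id: 2, status: 'job_running', retryable: false),
  Job.new(id: 3, status: 'job_done', retryable: true)
]
failed = recover_interrupted_jobs(jobs)
```

Here the returned list is what the after-stop callback would hand to whatever notification mechanism is in use.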

@rajyan rajyan changed the title After stop callback Feature: after stop callback Sep 4, 2023
@phstc phstc closed this as completed in #741 Sep 5, 2023