Handle stale jobs more carefully before purging them. #4615
stale_jobs = []
for failed_job in failed_jobs:
    # the job may not actually exist anymore in Redis
    if not failed_job:
How about we just compact these?
# the job could have an empty ended_at value in case
# of a worker dying before it can save the ended_at value,
# in which case we also consider them stale
if not failed_job.ended_at:
This feels more like an or conditional and less like a multi-branch statement to me.
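A minimal sketch of the single-condition form suggested here. FakeJob and FAILURE_TTL are hypothetical stand-ins for the real job object and settings.JOB_DEFAULT_FAILURE_TTL, and the sketch passes `now` in explicitly so it is reproducible. It also uses total_seconds() rather than .seconds, which is a choice of this sketch: .seconds holds only the sub-day part of a timedelta, so a job that ended exactly 30 days ago would report 0.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

FAILURE_TTL = 7 * 24 * 60 * 60  # hypothetical TTL, in seconds


@dataclass
class FakeJob:
    """Hypothetical stand-in for a failed job; only ended_at matters here."""
    ended_at: Optional[datetime]


def is_stale(job, now):
    # One "or" condition instead of two branches that do the same thing:
    # a missing ended_at (worker died) or an old ended_at both mean stale.
    return (
        not job.ended_at
        or (now - job.ended_at).total_seconds() > FAILURE_TTL
    )


now = datetime(2020, 2, 1)
jobs = [
    FakeJob(None),                          # worker died before saving ended_at
    FakeJob(now - timedelta(days=30)),      # ended long ago
    FakeJob(now - timedelta(minutes=5)),    # ended recently
]
stale_flags = [is_stale(job, now) for job in jobs]
```

Because `or` short-circuits, the subtraction is never attempted when ended_at is missing.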
for job in stale_jobs:
    job.delete()
stale_jobs = []
for failed_job in failed_jobs:
If you share my thoughts on the other couple of comments, this whole block might be better represented by a filter on failed_jobs.
I'm not sure I follow. What do you mean by "a filter on failed_jobs"?
I just mean that it feels like stale_jobs is just a sub-list of failed_jobs that satisfies a predicate. Something like:
# total_seconds() counts whole days too; .seconds only holds the sub-day remainder
is_stale = lambda job: (
    job.ended_at is None
    or (datetime.utcnow() - job.ended_at).total_seconds() > settings.JOB_DEFAULT_FAILURE_TTL
)
stale_jobs = filter(is_stale, compact(failed_jobs))
While you may be right that this is another way to write it, I don't see this as more readable. But it's up to you, feel free to change the patch the way you like it better.
I'm fine with keeping @jezdez's implementation as is, as it leaves room for explaining the different steps.
👍
Don't feel strongly either way, but generally I'm more in the camp of having descriptive variable / function / lambda names instead of comments (e.g. is_stale = worker_died or too_old). They just expire more slowly than comments.
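A rough sketch of what the named-condition style could look like. The function signature and the FAILURE_TTL constant are assumptions for illustration, not code from the patch.

```python
from datetime import datetime, timedelta

FAILURE_TTL = 3600  # hypothetical TTL, in seconds


def is_stale(ended_at, now, ttl=FAILURE_TTL):
    # Each name documents one reason a failed job counts as stale.
    worker_died = ended_at is None
    too_old = not worker_died and (now - ended_at).total_seconds() > ttl
    return worker_died or too_old


now = datetime(2020, 2, 1, 12, 0, 0)
died = is_stale(None, now)
old = is_stale(now - timedelta(hours=2), now)
fresh = is_stale(now - timedelta(minutes=10), now)
```

The names carry the explanation that the comments carry in the loop version, at the cost of one extra boolean per reason.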
Description
If a worker dies before it can save a job's ended_at value, that value
can be None. This makes the purge task fail every time it runs, so
failed jobs are never purged.
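A small illustration of the likely failure mode, assuming the purge task computes a job's age by subtracting ended_at from the current time (the pre-patch code is not shown here, so this is an assumption). Subtracting None from a datetime raises TypeError in Python:

```python
from datetime import datetime

now = datetime(2020, 2, 1)
ended_at = None  # what a job may look like after its worker died

try:
    age = now - ended_at  # datetime minus None is not supported
    error = None
except TypeError as exc:
    error = exc
```

An unhandled exception like this on a single job would abort the whole purge run, which matches the behavior described above.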