RQ: periodically clear failed jobs #4306
Conversation
redash/tasks/general.py (Outdated)
def purge_failed_jobs():
    jobs = rq_redis_connection.scan_iter('rq:job:*')

    is_idle = lambda key: rq_redis_connection.object('idletime', key) > settings.JOB_DEFAULT_FAILURE_TTL
This subcommand is available when maxmemory-policy is set to an LRU policy or noeviction.

(The command in question being idletime.) Is this the default config for Redis?
The default is noeviction. On AWS, for example, it defaults to volatile-lru.
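Since OBJECT IDLETIME depends on the eviction policy, the caller could verify the configuration up front. A minimal sketch using redis-py's CONFIG GET; the guard itself is an illustration, not part of this PR:

from redis import Redis

redis_conn = Redis()  # stand-in for rq_redis_connection

# OBJECT IDLETIME only works when Redis keeps LRU idle-time data,
# i.e. when maxmemory-policy is noeviction or one of the *-lru policies.
policy = redis_conn.config_get('maxmemory-policy')['maxmemory-policy']
if policy != 'noeviction' and not policy.endswith('-lru'):
    raise RuntimeError('OBJECT IDLETIME unavailable under maxmemory-policy=%s' % policy)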
redash/tasks/general.py (Outdated)
stale_jobs = [key for key in jobs if is_idle(key) and has_failed(key) and not_in_any_failed_registry(key)]

for key in stale_jobs:
    rq_redis_connection.delete(key)
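The diff references two helpers, has_failed and not_in_any_failed_registry, whose definitions aren't shown in this excerpt. A plausible sketch of what they could look like, assuming rq's usual key layout (rq:job:<id> hashes with a status field); this is illustrative, not necessarily the PR's actual code:

from rq import Queue
from rq.registry import FailedJobRegistry

def has_failed(key):
    # rq stores each job as a Redis hash whose 'status' field holds
    # the job's state ('queued', 'started', 'failed', ...).
    return rq_redis_connection.hget(key, 'status') == b'failed'

def not_in_any_failed_registry(key):
    # Job keys look like b'rq:job:<id>'; registries hold bare job ids.
    job_id = key.decode().split(':')[-1]
    registries = [FailedJobRegistry(queue=q)
                  for q in Queue.all(connection=rq_redis_connection)]
    return not any(job_id in registry.get_job_ids()
                   for registry in registries)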
Maybe worth removing it from the FailedJobRegistry while we're at it?
We could do that, but the point is that we want to let the FailedJobRegistry handle its own state. From what I can tell, there aren't any dire consequences to having ghost job ids in the FailedJobRegistry (it is only used for requeueing, and in that case these jobs will simply not get requeued).
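For reference, if we did want to drop the ghost ids too: each registry is a plain Redis sorted set keyed by job id, so the removal would be a ZREM alongside the DELETE. A hedged sketch of that alternative (not what the PR adopted), reusing stale_jobs from the diff above:

from rq import Queue
from rq.registry import FailedJobRegistry

# Registries are sorted sets; removing an id is a ZREM on the
# registry key. Illustration of the alternative discussed here only.
for key in stale_jobs:
    job_id = key.decode().split(':')[-1]
    for queue in Queue.all(connection=rq_redis_connection):
        rq_redis_connection.zrem(FailedJobRegistry(queue=queue).key, job_id)
    rq_redis_connection.delete(key)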
If we delete from FailedJobRegistry, we might as well just do that and avoid checking for job inclusion (and avoid the whole comment+bypass at the top of the function). 🤔
I'm just worried that over a year it could accumulate quite a lot of job ids there, which might have consequences for performance or at least memory usage.
Re: avoiding the job inclusion check: I guess we can skip it.
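Worth noting: rq's registries already expire their own entries. Each id is scored with an expiry timestamp when it is added, and cleanup() (also called implicitly by get_job_ids()) drops everything past its deadline, which should bound the accumulation. A small sketch of triggering that self-cleaning, assuming rq's FailedJobRegistry.cleanup():

from rq import Queue
from rq.registry import FailedJobRegistry

# Entries are scored with their expiry timestamp; cleanup() issues a
# ZREMRANGEBYSCORE up to "now", so ids added with a finite failure_ttl
# age out of the registry on their own.
for queue in Queue.all(connection=rq_redis_connection):
    FailedJobRegistry(queue=queue).cleanup()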
Yeah 02555b9 makes things simpler.
Co-Authored-By: Arik Fraimovich <arik@arikfr.com>
👍
As per rq/rq#1143, failed jobs stay in Redis forever. If this is true, we should implement our own periodic cleanup of these jobs.
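One way to wire up such a periodic cleanup, sketched with rq-scheduler; this is illustrative, and Redash's actual scheduling machinery may differ:

from datetime import datetime
from rq_scheduler import Scheduler

scheduler = Scheduler(connection=rq_redis_connection)

# Enqueue purge_failed_jobs every 24 hours, repeating indefinitely.
scheduler.schedule(
    scheduled_time=datetime.utcnow(),
    func=purge_failed_jobs,
    interval=60 * 60 * 24,
    repeat=None,
)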