FileInfoPollerServer lookback exceeds limit leads to reprocessing files #855

Merged 3 commits into main on Aug 21, 2024

Conversation

michaeldjeffrey (Contributor)

Some Background

A FileInfoPollerServer keeps track of processed files in a db table. To keep it from growing forever, this table is trimmed to the most recent 100 processed files when a FileInfoPollerServer starts, and every 12 hours afterwards. In most cases, this is well and good.
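As a rough sketch of that retention rule (the names and types here are illustrative, not the crate's actual API), trimming keeps the newest `limit` rows and drops the rest:

```rust
use chrono::{DateTime, Utc};

// Hypothetical helper: given the processed_at timestamps of all tracked
// rows, return the timestamps of the rows the cleanup would delete.
fn rows_to_delete(mut processed_at: Vec<DateTime<Utc>>, limit: usize) -> Vec<DateTime<Utc>> {
    processed_at.sort_unstable_by(|a, b| b.cmp(a)); // newest first
    processed_at.split_off(limit.min(processed_at.len())) // everything past the limit
}
```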

To fetch files for processing, the timestamp of the latest processed file and an offset Duration are used to determine how far back to look.
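Concretely, the start of the fetch window looks roughly like this (a sketch; the function name is hypothetical):

```rust
use chrono::{DateTime, Duration, Utc};

// Everything in s3 with a timestamp after this point is pulled down
// on the next poll.
fn lookback_start(latest_processed: DateTime<Utc>, offset: Duration) -> DateTime<Utc> {
    latest_processed - offset
}
```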

The Situation

A burst of files arrived on devnet, exceeding the 100-row limit. And they were processed. It was good.

For the sake of this tale, let's say 150 files arrived in 10 minutes, and the offset provided was 1 hour.

Later, the cleanup job came around (or the service was restarted) and removed the 50 files that were over the limit.

The next time the FileInfoPollerServer was asked for files to process, it took the latest timestamp, subtracted the offset, and requested those files from s3. It pulled down everything from the last hour.

However, the oldest 50 files it pulled down had already been processed, but they were no longer in the database because of the cleanup job. So they were dutifully re-processed. It was bad.

A Solution

When the cleanup job runs, whether on its interval or at startup, we find the timestamp of the row at the allowed limit (in this case, still 100); call that timestamp t100.

- If the start of the lookback window (now minus the offset) is earlier than t100, the cleanup job removes only rows older than that lookback boundary.
- If t100 is older, rows older than t100 are removed.

In this way, files that may be fetched for processing should always exist in the database until there is no longer a chance they will be fetched. And we keep around some history for debugging.
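A minimal sketch of that comparison, assuming the cleanup job can read the timestamp of the row at the limit (`t100` and the function name are illustrative):

```rust
use chrono::{DateTime, Duration, Utc};

// Rows older than the returned cutoff are safe to delete. Anything inside
// the lookback window (now - offset, now] may still be fetched from s3,
// so it must stay in the table even if that keeps more than 100 rows.
fn cleanup_cutoff(now: DateTime<Utc>, t100: DateTime<Utc>, offset: Duration) -> DateTime<Utc> {
    std::cmp::min(t100, now - offset)
}
```

With the numbers from the tale, t100 falls inside the 1-hour lookback window, so the cutoff is the window's start and all 150 files stay in the table until they age out of it.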

…okback offset

When the cleaning process is triggered, it gets the timestamp of the 100th
most recent processed file. If that timestamp falls inside the lookback
window, we only remove files older than the lookback boundary. Otherwise,
we remove any file older than the 100th entry.

- add cache clean logging
- break out getting FileInfo from s3 with a `FileInfoPollerStore` trait
- `FileInfoPollerServer` now has a test that uses a database
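As a rough illustration of that trait split (the trait shape below is an assumption, and the real code is presumably async), putting the s3 listing behind a trait lets the new test substitute an in-memory store:

```rust
use chrono::{DateTime, Utc};

// Illustrative stand-in for the real FileInfo type.
struct FileInfo {
    key: String,
    timestamp: DateTime<Utc>,
}

// Hypothetical shape of the trait: the poller asks the store for files
// newer than a point in time, whether the store is backed by s3 or by
// a test fixture.
trait FileInfoPollerStore {
    fn list_after(&self, after: DateTime<Utc>) -> Vec<FileInfo>;
}
```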
@michaeldjeffrey merged commit 0cced21 into main on Aug 21, 2024
17 checks passed
@michaeldjeffrey deleted the mj/file-store-lookback-exceeds-limit branch on Aug 21, 2024 at 17:02