FileInfoPollerServer lookback exceeds limit leads to reprocessing files #855

Merged 3 commits into main on Aug 21, 2024

Conversation

michaeldjeffrey (Contributor)

Some Background

A FileInfoPollerServer keeps track of processed files in a db table. To keep it from growing forever, this table is trimmed to the most recent 100 processed files when a FileInfoPollerServer starts, and every 12 hours afterwards. In most cases, this is well and good.
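As a rough sketch of that retention rule (the names and types here are illustrative, not the crate's actual API), trimming keeps the newest `limit` rows and drops the rest:

```rust
use chrono::{DateTime, Utc};

// Hypothetical helper: given the processed_at timestamps of all tracked
// rows, return the timestamps of the rows the cleanup would delete.
fn rows_to_delete(mut processed_at: Vec<DateTime<Utc>>, limit: usize) -> Vec<DateTime<Utc>> {
    processed_at.sort_unstable_by(|a, b| b.cmp(a)); // newest first
    processed_at.split_off(limit.min(processed_at.len())) // everything past the limit
}
```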

To fetch files for processing, the timestamp of the latest processed file and an offset Duration are used to determine how far back to look.
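Concretely, the start of the fetch window looks roughly like this (a sketch; the function name is hypothetical):

```rust
use chrono::{DateTime, Duration, Utc};

// Everything in s3 with a timestamp after this point is pulled down
// on the next poll.
fn lookback_start(latest_processed: DateTime<Utc>, offset: Duration) -> DateTime<Utc> {
    latest_processed - offset
}
```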

The Situation

A burst of files arrived on devnet, exceeding the 100-row limit. And they were processed. It was good.

For the sake of this tale, let's say 150 files arrived in 10 minutes, and the offset provided was 1 hour.

Later, the cleanup job came around (or the service was restarted) and removed the 50 files that were over the limit.

The next time the FileInfoPollerServer was asked for files to process, it took the latest timestamp, subtracted the offset, and requested those files from s3. It pulled down everything from the last hour.

However, the oldest 50 files it pulled down had already been processed, but they were no longer in the database because of the cleanup job. So they were dutifully re-processed. It was bad.

A Solution

When the cleanup job runs, whether on its interval or at startup, we find the timestamp of the row at the allowed limit (in this case, still 100); call that timestamp t100.

- If the start of the lookback window (now minus the offset) is earlier than t100, the cleanup job removes only rows older than that lookback boundary.
- If t100 is older, rows older than t100 are removed.

In this way, files that may be fetched for processing should always exist in the database until there is no longer a chance they will be fetched. And we keep around some history for debugging.
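A minimal sketch of that comparison, assuming the cleanup job can read the timestamp of the row at the limit (`t100` and the function name are illustrative):

```rust
use chrono::{DateTime, Duration, Utc};

// Rows older than the returned cutoff are safe to delete. Anything inside
// the lookback window (now - offset, now] may still be fetched from s3,
// so it must stay in the table even if that keeps more than 100 rows.
fn cleanup_cutoff(now: DateTime<Utc>, t100: DateTime<Utc>, offset: Duration) -> DateTime<Utc> {
    std::cmp::min(t100, now - offset)
}
```

With the numbers from the tale, t100 falls inside the 1-hour lookback window, so the cutoff is the window's start and all 150 files stay in the table until they age out of it.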

…okback offset

When the cleaning process is triggered, it gets the timestamp of the 100th
most recent processed file. If that timestamp falls inside the lookback
window, we only remove files older than the lookback boundary. Otherwise,
we remove any file older than the 100th entry.

- add cache clean logging
- break out getting FileInfo from s3 with a `FileInfoPollerStore` trait
- `FileInfoPollerServer` now has a test that uses a database
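As a rough illustration of that trait split (the trait shape below is an assumption, and the real code is presumably async), putting the s3 listing behind a trait lets the new test substitute an in-memory store:

```rust
use chrono::{DateTime, Utc};

// Illustrative stand-in for the real FileInfo type.
struct FileInfo {
    key: String,
    timestamp: DateTime<Utc>,
}

// Hypothetical shape of the trait: the poller asks the store for files
// newer than a point in time, whether the store is backed by s3 or by
// a test fixture.
trait FileInfoPollerStore {
    fn list_after(&self, after: DateTime<Utc>) -> Vec<FileInfo>;
}
```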
@michaeldjeffrey merged commit 0cced21 into main on Aug 21, 2024
17 checks passed
@michaeldjeffrey deleted the mj/file-store-lookback-exceeds-limit branch on Aug 21, 2024 at 17:02