Add history purge command to minder server cli. #3976
cmd/server/app/history_purge.go
Outdated
return nil
}

// filterRecords sift through the records separating the latest for
I wonder if this could be implemented in the SQL query by using the `latest_evaluation_statuses` table and the `EXCEPT` operator.
You're right. I integrated this suggestion, but implemented it using joins rather than the `EXCEPT` operator.
The `latest_evaluation_statuses` table tracks the latest evaluation ID for each entity/rule pair. Left-joining against it lets us determine which records older than 30 days are not the latest ones, relying entirely on the database rather than on processing in application code. This also slightly reduces the resources needed to process deletions.
Summary
This command manages the life cycle of the history log. The business requirement is to delete all records older than 30 days while keeping the most recent one for each entity/rule pair, even if it is older than 30 days.
Implementing this requirement means processing the whole set of records older than 30 days; the set cannot be processed in chunks without creating arbitrary holes in the history. As a first approximation, the proposed implementation loads all records in RAM, filters out the ones to keep, and issues a series of deletions of up to 1000 records each. A test was added to keep track of the record size. A future improvement would be to spill records to secondary storage and perform sorting and filtering there, but that would be overly complex and unjustified at this point in time.
Fixes #3636
Change Type
Testing
Some unit tests, mostly manual tests.
Review Checklist: