-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cleanup mechanism for Iceberg Glue tests #11903
Conversation
After one day any schemas with the 'ci_iceberg_integration_test' property will be deleted.
b11123c
to
e8bea25
Compare
public class TestGlueCleanup | ||
{ | ||
// All tests creating Glue schemas should add this key/value pair to the schema properties. | ||
public static final String GLUE_SCHEMA_PROPERTY_KEY = "ci_iceberg_integration_test"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the property-based approach clever and neat.
I am concerned however that some new tests will forget to set it. Also, it's less obvious than a name. Simple persons like me will notice a name, but won't notice the propery.
I would rather have it based on schema name only.
It is not a hard ask to require each test schema name to begin with test_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but was worried about deleting schemas that might be used for other things. I could see there being test_*
schemas that are supposed to be long-lived, but I don't know what else might be in the CI's AWS account.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI must run on an isolated account.
tests may erroneously delete something, so you won't let tests run where you keep anything precious.
i agree ci_
or ci_test_
would be a better prefix, but we have already existing cases of using test_
and am not worried about enforcing this in a generic fashion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI must run on an isolated account.
tests may erroneously delete something, so you won't let tests run where you keep anything precious.
I guess I agree that you shouldn't run tests anywhere with precious databases, but when writing the tests I ran them against my personal AWS account, so I guess I wrote this with the same level of care as I would treat that account.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use your company's testing account when working locally. this will allow the test code to be simpler, and your personal AWS account to be safe. Still, removing test_
schemas isn't "very destructive" even in a personal account.
Description
After one day any schemas with the 'ci_iceberg_integration_test' property will be deleted.
Test infra improvement
Test only
Ensure test cancellation does not result in databases which are never deleted.
Related issues, pull requests, and links
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
(x) No release notes entries required.
( ) Release notes entries required with the following suggested text: