Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: change constraint alters to non blocking for postgres #25164

Closed

Conversation

eschutho
Copy link
Member

@eschutho eschutho commented Sep 1, 2023

SUMMARY

This migration is marked as causing potential downtime, which we did see using a postgres db where it was locking the table while dropping the constraint and deadlocking with production db requests. I’m using two different techniques here to attempt to make this a more performant/non blocking migration so that we can run it without any downtime.

  1. instead of dropping the old constraint and creating a new one, I am renaming the old constraint which is a non-blocking action. Then creating the new constraint with the cascade on delete.
  2. utilizing postgres’ feature to not validate constraint updates. I’m using this on the subsequent drop constraint as again it is a way to drop the column in a non-blocking way. Then we validate the changes which also don’t block.

I have run the benchmark scripts several times up to over 1m rows which is much more than the size of the table that we had issues with. Other than the tests locally, this hasn’t been validated in production yet.

The other callout is that with renaming and then creating a new constraint, there is a period of time when there will be two constraints before the original one is dropped. From what I can tell it should work fine under these conditions but it has only been tested locally.

These changes only affect migrations running on Postgres metadata databases.

for the embedded table:
Results:

Current: 0.40 s
10+: 0.44 s
100+: 0.44 s
1000+: 0.26 s
10000+: 0.26 s

For the dashboard slices table:
Current: 0.30 s
10+: 0.45 s
100+: 0.45 s
1000+: 0.28 s

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@eschutho eschutho requested a review from a team as a code owner September 1, 2023 23:02
@eschutho eschutho force-pushed the elizabeth/non-blocking-migrations branch from c9982d2 to 9bafa76 Compare September 1, 2023 23:23
@pull-request-size pull-request-size bot added size/L and removed size/M labels Sep 1, 2023
@eschutho eschutho force-pushed the elizabeth/non-blocking-migrations branch 3 times, most recently from a5c10b5 to 3cef55e Compare September 2, 2023 00:08
@pull-request-size pull-request-size bot removed the size/L label Sep 2, 2023
@eschutho eschutho changed the title DRAFT: change constraint alters to non blocking DRAFT: change constraint alters to non blocking for postgres Sep 2, 2023
@eschutho eschutho force-pushed the elizabeth/non-blocking-migrations branch from 3cef55e to 424ed12 Compare September 2, 2023 00:51
@pull-request-size pull-request-size bot added size/L and removed size/M labels Sep 2, 2023
@eschutho eschutho changed the title DRAFT: change constraint alters to non blocking for postgres fix: change constraint alters to non blocking for postgres Sep 5, 2023
@michael-s-molina
Copy link
Member

@eschutho Thank you for the improvement. I'll run this in our staging environment to test the change. Assuming it works, #24939, #24938, #24628, #24488 all add ON DELETE CASCADE constraints. Can you extract the non-blocking logic from the migrations and move it to migrations/shared/constraints.py? Can you also use batch_alter_table like redefine does?

Copy link
Member

@craig-rueda craig-rueda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Agree with @michael-s-molina that this should prob be dried up

@michael-s-molina
Copy link
Member

@eschutho There's also a typo in one script name:

__dd_on_delete_cascade_for_embedded_dashboards

I think you meant _add_on_delete_cascade_for_embedded_dashboards

@eschutho
Copy link
Member Author

eschutho commented Sep 5, 2023

thanks for the review @michael-s-molina. Since this is only improving Postgres, I'm thinking of just closing the Pr and running it locally unless others feel there's a benefit to the codebase in master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants