[db] Eliminating duplicate key constraint violations #3712

Merged 1 commit on Sep 20, 2023

Contributor

@bruntib bruntib commented Jul 22, 2022

Concurrent storage of two runs containing the same files leads to a
duplicate key constraint violation. Some DBMSs can handle this
gracefully by supporting an "ON CONFLICT DO NOTHING" clause in the
INSERT statement.
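
As a minimal sketch of the clause described above, the following uses Python's built-in `sqlite3` module as a stand-in for the server's DBMS (an assumption; the error log later in this thread comes from a different database). The simplified `files` table and the `store_file` helper are hypothetical and only mirror column names mentioned in this PR. The upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, hypothetical 'files' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        " id INTEGER PRIMARY KEY,"
        " filepath TEXT NOT NULL,"
        " content_hash TEXT NOT NULL,"
        " UNIQUE (filepath, content_hash))"
    )
    return conn


def store_file(conn, filepath, content_hash):
    """Insert a file row; skip it silently if an identical row already exists."""
    conn.execute(
        "INSERT INTO files (filepath, content_hash) VALUES (?, ?) "
        "ON CONFLICT DO NOTHING",
        (filepath, content_hash),
    )


conn = make_db()
store_file(conn, "/path/to/blabla.h", "abc123")
store_file(conn, "/path/to/blabla.h", "abc123")  # duplicate: skipped, no error
row_count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```

The second call would raise a unique constraint error with a plain `INSERT`; with the clause it becomes a no-op and the table keeps a single row.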

@bruntib bruntib added database 🗄️ Issues related to the database schema. bugfix 🔨 labels Jul 22, 2022
@bruntib bruntib added this to the release 6.20.0 milestone Jul 22, 2022
@bruntib bruntib requested a review from Szelethus July 22, 2022 15:20
@bruntib bruntib requested a review from vodorok as a code owner July 22, 2022 15:20
Contributor

@vodorok vodorok left a comment

LGTM

Member

@dkrupp dkrupp left a comment

Please add test cases. Please reproduce the error and check if it is solved after the change.

filepath=file_path,
filename=os.path.basename(file_path),
content_hash=content_hash).on_conflict_do_nothing(
index_elements=['id'])
Member

Should this be index_elements=["content_hash"]? Shouldn't we list the fields of the clashing key constraint here?
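
To illustrate why the conflict target matters, here is a small sketch using the standard-library `sqlite3` module (an assumption made for runnability; it is not the server's DBMS). The table layout and the `try_insert` helper are hypothetical. The conflict target plays the role that `index_elements` plays in the snippet under review.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    " id INTEGER PRIMARY KEY,"
    " filepath TEXT NOT NULL,"
    " content_hash TEXT NOT NULL,"
    " UNIQUE (filepath, content_hash))"
)


def try_insert(conflict_target):
    """Insert the same file row; return True if no constraint error was raised."""
    # conflict_target is a hard-coded literal here, never user input.
    sql = (
        "INSERT INTO files (filepath, content_hash) VALUES (?, ?) "
        f"ON CONFLICT ({conflict_target}) DO NOTHING"
    )
    try:
        conn.execute(sql, ("/path/to/blabla.h", "abc123"))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("id")  # table is empty, so the row is simply inserted
wrong_target = try_insert("id")  # clash is on the unique pair, not on id
right_target = try_insert("filepath, content_hash")  # target matches the clash
```

With `id` as the target, the duplicate still fails, because the violated constraint is the unique pair and not the primary key. Naming the clashing columns lets the insert be skipped quietly.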

Member

Maybe you can test this fix and provoke the error by storing a file to the server (with masstore run) that has been stored already.
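
A regression test along these lines could look roughly like the sketch below. It does not use the project's real test harness or a real store run; it only imitates "storing an already stored file" against an in-memory `sqlite3` database. All helper and test names are hypothetical.

```python
import sqlite3


def make_db():
    """In-memory database with a simplified, hypothetical 'files' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        " id INTEGER PRIMARY KEY,"
        " filepath TEXT NOT NULL,"
        " content_hash TEXT NOT NULL,"
        " UNIQUE (filepath, content_hash))"
    )
    return conn


def store_plain(conn, filepath, content_hash):
    """Behaviour before the fix: a plain INSERT."""
    conn.execute(
        "INSERT INTO files (filepath, content_hash) VALUES (?, ?)",
        (filepath, content_hash),
    )


def store_ignoring_duplicates(conn, filepath, content_hash):
    """Behaviour after the fix: duplicates are skipped."""
    conn.execute(
        "INSERT INTO files (filepath, content_hash) VALUES (?, ?) "
        "ON CONFLICT (filepath, content_hash) DO NOTHING",
        (filepath, content_hash),
    )


def test_plain_insert_reproduces_violation():
    conn = make_db()
    store_plain(conn, "/path/to/blabla.h", "abc123")
    try:
        store_plain(conn, "/path/to/blabla.h", "abc123")
    except sqlite3.IntegrityError:
        return  # the duplicate key violation was reproduced
    raise AssertionError("expected a unique constraint violation")


def test_fixed_insert_accepts_already_stored_file():
    conn = make_db()
    for _ in range(2):
        store_ignoring_duplicates(conn, "/path/to/blabla.h", "abc123")
    count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
    assert count == 1
```

The first test pins down the failure, the second checks that the same sequence succeeds once the clause is in place. A truly concurrent reproduction would need two parallel store operations against the real server.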

Contributor Author

[2022-08-15 09:30:01 UTC] (23505) - ERROR: duplicate key value violates unique constraint "uq_files_filepath"
[2022-08-15 09:30:01 UTC] (23505) - DETAIL: Key (filepath, content_hash)=(/path/to/blabla.h, <hash_value>) already exists.
[2022-08-15 09:30:01 UTC] <???> (23505) - STATEMENT: INSERT INTO files (filepath, filename, content_hash, remote_url, tracking_branch) VALUES ('/path/to/blabla.h', 'blabla.h', '<hash_value>', NULL, NULL) RETURNING files.id
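
The logged statement ends in `RETURNING files.id`. When an insert is skipped by `ON CONFLICT DO NOTHING`, PostgreSQL returns no row for it, so code that needs the id has to look up the existing row. The sketch below shows one possible way to do that, using `sqlite3` as a stand-in; the `get_or_create_file_id` helper is hypothetical and not necessarily how this PR handles it.

```python
import sqlite3


def get_or_create_file_id(conn, filepath, content_hash):
    """Insert the row if it is missing, then return the id of the stored row."""
    conn.execute(
        "INSERT INTO files (filepath, content_hash) VALUES (?, ?) "
        "ON CONFLICT (filepath, content_hash) DO NOTHING",
        (filepath, content_hash),
    )
    # Whether the insert happened or was skipped, the row exists now.
    row = conn.execute(
        "SELECT id FROM files WHERE filepath = ? AND content_hash = ?",
        (filepath, content_hash),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    " id INTEGER PRIMARY KEY,"
    " filepath TEXT NOT NULL,"
    " content_hash TEXT NOT NULL,"
    " UNIQUE (filepath, content_hash))"
)

first_id = get_or_create_file_id(conn, "/path/to/blabla.h", "abc123")
second_id = get_or_create_file_id(conn, "/path/to/blabla.h", "abc123")
```

Both calls yield the same id, which is what a second, concurrent store of the same file would need in order to reference the already stored row.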

@bruntib bruntib force-pushed the duplicate_key_violation branch from ef405cb to eebce30 August 18, 2023 12:59
@bruntib bruntib requested a review from dkrupp August 18, 2023 13:22
@dkrupp dkrupp merged commit 90464ac into Ericsson:master Sep 20, 2023
@bruntib bruntib deleted the duplicate_key_violation branch October 2, 2023 05:42