"duplicate key value violates unique constraint" when syncing two repositories with identical sub-repositories in parallel #2278
Comments
From: wilful (wilful) I thought the artifact would be reused for de-duplication. But I had a conflict =(
From: @dralley (dalley) Hi wilful, Could you provide a little more information? Which versions of Pulp are you running, and what steps did you take that led you to that error?
From: @dralley (dalley) This can be reproduced if you sync the same url into two repos at the same time, or by syncing two different urls with the same repo content at the same time. It's a race condition in the sync pipeline. @wilful, does this match your experience? Or did you experience this while syncing the repos one after another, independently and not in parallel?
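The race described above can be illustrated with a minimal sketch. This is not Pulp's sync pipeline: it uses an in-memory SQLite table instead of PostgreSQL, and the table name `subrepo`, the value `addons-abc123`, and the function `simulate_interleaved_syncs` are all hypothetical. The interleaving is forced by hand (both "syncs" check for the row before either inserts) so the outcome is deterministic.

```python
import sqlite3


def simulate_interleaved_syncs():
    """Replay a check-then-insert race against a unique constraint."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for any model with a uniqueness constraint.
    conn.execute("CREATE TABLE subrepo (name TEXT UNIQUE)")
    name = "addons-abc123"  # hypothetical shared sub-repository name

    def exists(candidate):
        row = conn.execute(
            "SELECT 1 FROM subrepo WHERE name = ?", (candidate,)
        ).fetchone()
        return row is not None

    # Step 1: both syncs look for the row before either has written it.
    sync_a_saw_row = exists(name)
    sync_b_saw_row = exists(name)

    # Step 2: both syncs act on their (now stale) answer and try to insert.
    errors = []
    for worker, saw_row in (("sync-a", sync_a_saw_row), ("sync-b", sync_b_saw_row)):
        if not saw_row:
            try:
                conn.execute("INSERT INTO subrepo (name) VALUES (?)", (name,))
            except sqlite3.IntegrityError as exc:
                errors.append((worker, str(exc)))
    conn.close()
    return errors


errors = simulate_interleaved_syncs()
for worker, message in errors:
    print(f"{worker} failed: {message}")
```

In this sketch the first insert succeeds and the second one fails on the unique constraint, which is the same class of failure as the "duplicate key value violates unique constraint" error in the issue title.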
From: @dralley (dalley) The duplicate 7828 mentions
So we should try that out
From: @ggainey (ggainey) I experimented with the mentioned OLE repos on current-master and was unable to reproduce. Used this script:
(Note: 4 cycles took a little over an hour on my system)
From: @dkliban (dkliban@redhat.com) Based on the previous comment, I am closing.
From: @dralley (dalley) I was able to reproduce this with a different traceback 3 times in a row - script attached
We have the same problem with the RPM plugin.
From: @bmbouter (bmbouter) I closed my PR because I don't see a change in pulpcore that can be made to fix this. I've summarized my findings here: pulp/pulpcore#1717 (comment) Per convo in matrix I am moving to pulp_rpm to get some input there. If there is something pulpcore can do to resolve this, please share the idea.
Can no longer reproduce - we've fixed a lot of concurrency bugs though, I bet this is one of them.
I am able to reproduce this issue and I have created a bugzilla for it. For more information please refer to the bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2077363 @dralley, @ggainey Can we reopen this issue? It seems I don't have permission to reopen it.
Unassigning, I have a new top priority
This pulp-cli/jq script follows @hao-yu 's observations from BZ# 2077363 to reproduce the problem when run against a 'clean' system:
The suggestion at #2278 (comment) definitely makes the problem go away, resulting in a copy of a given subrepo being created for each repo syncing that content. This connects the sub-repos to their parent-repos, where the current behavior results in a subrepo with a given name/treeinfo-hash being shared by all repos that specify that name/treeinfo tuple. That sharing doesn't buy much for the Pulp instance (since the content is de-duplicated), and it feels like a potential source of other subtly-wrong behavior that we haven't noticed yet. The remaining question is, "what (if anything) do we need to do to fix existing systems that have already sync'd using the current behavior?" Will need some investigation and thinking.
@goosemania has a great description of Why This Approach Won't Work, here: #2304 (comment)
DistributionTree digest and subrepo-names now both end with the pulp-id of the "owning" Repository, making them unique to that repo and therefore protected from concurrent-updates against anything that is changing that Repository. Addon/Variant/Image are transitively made unique by virtue of having their DistributionTree be part of their unique-together. Sub-repo **content** (e.g. Packages et al) are de-duplicated via their existing uniqueness constraints. The end result is a minor increase in Content objects (i.e., DistTrees/Addons/Images/Variants that used to have only one instance are now one-per-containing-repo), and a small impact on subrepo-syncing (since previously-unique subrepos will now have a first-sync that would have been skipped). Content will continue to only be sync'd once. fixes pulp#2278. fixes pulp#2775. closes pulp#2304. [nocoverage]
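A minimal sketch of the idea in the commit message above, assuming a simple "append the owning repository's id" scheme. The helper names `scoped_digest` and `scoped_subrepo_name`, the `-` separator, and the sample values are hypothetical and are not taken from the pulp_rpm source; the sketch only shows why repo-scoped keys cannot collide across repositories.

```python
import uuid


def scoped_digest(treeinfo_digest: str, repo_id: uuid.UUID) -> str:
    """Return a digest that ends with the owning repository's id (hypothetical format)."""
    return f"{treeinfo_digest}-{repo_id}"


def scoped_subrepo_name(variant: str, treeinfo_digest: str, repo_id: uuid.UUID) -> str:
    """Return a sub-repository name that ends with the owning repository's id."""
    return f"{variant}-{treeinfo_digest}-{repo_id}"


# Two repositories syncing identical treeinfo content (sample values are made up).
treeinfo_digest = "abc123"
repo_one = uuid.uuid4()
repo_two = uuid.uuid4()

key_one = scoped_digest(treeinfo_digest, repo_one)
key_two = scoped_digest(treeinfo_digest, repo_two)
name_one = scoped_subrepo_name("addons", treeinfo_digest, repo_one)
name_two = scoped_subrepo_name("addons", treeinfo_digest, repo_two)

# Unscoped, both repositories would compete for the single key "abc123".
# Scoped, each repository owns its own key, so parallel syncs never target the same row.
unique_keys = {key_one, key_two}
print(len(unique_keys))
```

The trade-off matches what the commit message states: one DistributionTree (and one set of sub-repositories) per containing repository instead of one shared instance, while the packages themselves remain de-duplicated by their own uniqueness constraints.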
DistributionTree digest and subrepo-names now both end with the pulp-id of the "owning" Repository, making them unique to that repo and therefore protected from concurrent-updates against anything that is changing that Repository. Addon/Variant/Image are transitively made unique by virtue of having their DistributionTree be part of their unique-together. Sub-repo **content** (e.g. Packages et al) are de-duplicated via their existing uniqueness constraints. The end result is a minor increase in Content objects (i.e., DistTrees/Addons/Images/Variants that used to have only one instance are now one-per-containing-repo), and a small impact on subrepo-syncing (since previously-unique subrepos will now have a first-sync that would have been skipped). Content will continue to only be sync'd once. fixes #2278. [nocoverage] (cherry picked from commit 52a9acc)
Author: wilful (wilful)
Redmine Issue: 8967, https://pulp.plan.io/issues/8967
The original issue is now difficult to reproduce, but there are similar issues that can be reproduced. See https://pulp.plan.io/issues/8967#note-16
========================
Hi all!
I need to add two repositories to my Pulp server:
http://downloads.linux.hpe.com/SDR/repo/spp/redhat/7/x86_64/current/
http://downloads.linux.hpe.com/SDR/repo/mcp/CentOS/7/x86_64/current/
But I can't, because:
How can I find out which repository this package is in?