Create better primary keys for subtrees #2304
Comments
From: pulpbot (pulpbot) PR: #2176
From: pulpbot (pulpbot) PR: #2180
From: pulpbot (pulpbot) PR: #2185
TL;DR: I'm using pulp_rpm 3.17.1. Reproducer:
The situation is illustrated below. We do not track repo versions for subrepos, so the latest version is always used.
Problems with changing subrepos names only:
Potential solution:
@HolgerHees, for your information, since you brought this problem to our attention. I also believe the fix won't be backportable (3.14/3.16 will need a separate solution). I suggest encouraging folks to upgrade to 3.17 rather than trying to come up with a solution for the old branches. That would be very error-prone.
That's definitely fine for 3.16 since 3.17 is replacing it going forwards, I don't anticipate we will do any more 3.16.z releases. For 3.14 it would be good to summarize the practical impact of this bug for any users who might be stuck on that release on a BZ. It sounds like it presents a data correctness problem for anyone using the standard test / prod promotion workflow - they might get a newer distribution tree on their slow branch than they were intending.
Good insight! Using repomd.xml hash sounds like the right way to go
I don't have useful feedback on the solution here - just want to record that pulp/pulpcore#2192 is related, and we'll need to update whatever we/I end up doing to fix that import/export problem, when we address this.
DistributionTree digest and subrepo-names now both end with the pulp-id of the "owning" Repository, making them unique to that repo and therefore protected from concurrent-updates against anything that is changing that Repository. Addon/Variant/Image are transitively made unique by virtue of having their DistributionTree be part of their unique-together. Sub-repo **content** (e.g. Packages et al) are de-duplicated via their existing uniqueness constraints. The end result is a minor increase in Content objects (i.e., DistTrees/Addons/Images/Variants that used to have only one instance are now one-per-containing-repo), and a small impact on subrepo-syncing (since previously-unique subrepos will now have a first-sync that would have been skipped). Content will continue to only be sync'd once. fixes pulp#2278. fixes pulp#2775. closes pulp#2304. [nocoverage]
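The effect described in this commit message can be sketched with plain Python sets standing in for the database uniqueness constraints. This is only an illustration under stated assumptions: `sync`, `dist_trees`, and `packages` are hypothetical names, not pulp_rpm models or APIs.

```python
import uuid

# Hypothetical stand-ins for uniqueness constraints (not pulp_rpm code).
dist_trees = set()  # unique on the digest suffixed with the owning repository id
packages = set()    # unique on the package checksum alone


def sync(repo_id, tree_digest, package_checksums):
    """Record one synced tree and its packages for the given repository."""
    # The tree digest ends with the owning repository's id, so it is per-repo.
    dist_trees.add(f"{tree_digest}-{repo_id}")
    # Sub-repo content keeps its existing uniqueness, so it is de-duplicated.
    packages.update(package_checksums)


staging, production = uuid.uuid4(), uuid.uuid4()
sync(staging, "abc123", {"pkg-1", "pkg-2"})
sync(production, "abc123", {"pkg-1", "pkg-2"})
```

After both syncs, the sketch holds two tree entries (one per containing repository) but still only two package entries, which matches the "minor increase in Content objects" described above.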
I'm addressing this under #2278 - closing this as a dup so we can bring all the discussion to one place.
Author: holger.hees (holger.hees)
Date: 2021-11-12
Redmine Issue: 9566, https://pulp.plan.io/issues/9566
Scenario:
My primary Pulp instance hosts two distributions of the same repository (like staging and production), which reference different versions of that repository. After my initial run, both distributions point to version 1. So far so good.
Now I have a secondary Pulp instance that mirrors the two primary distributions by creating separate remotes and repositories.
The repository on the primary node now contains a subtree that is identical in version 1 for staging and production, which means it has the same hash.
During the sync process, the metadata for this subtree is stored by creating a primary key like "{repodata}-{treeinfo['hash']}". This key collides between staging and production, because the subtree has the same content, and therefore the same hash, in both. The key should be something like "{repodata}-{treeinfo['hash']}-{repository_pk}".
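A minimal sketch of the collision and the proposed key, assuming only the two key formats quoted above. The helper `subtree_key` and the sample values are hypothetical and are not taken from the pulp_rpm code base.

```python
def subtree_key(repodata, treeinfo_hash, repository_pk=None):
    """Build a subtree key; appending repository_pk is the proposed change."""
    key = f"{repodata}-{treeinfo_hash}"
    if repository_pk is not None:
        key = f"{key}-{repository_pk}"
    return key


# Hypothetical values: the subtree is identical for staging and production.
repodata, tree_hash = "AppStream", "abc123"

# Current scheme: both mirrored repositories produce the same key.
old_staging = subtree_key(repodata, tree_hash)
old_production = subtree_key(repodata, tree_hash)

# Proposed scheme: the repository primary key keeps the two keys apart.
new_staging = subtree_key(repodata, tree_hash, repository_pk="repo-staging")
new_production = subtree_key(repodata, tree_hash, repository_pk="repo-production")
```

With the current format the staging and production keys are equal, so the second sync reuses the first one's entry. With the repository primary key appended, they differ.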
#2173