-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pulp_deb duplicate package handling is erroneous #921
Comments
quba42
added
Triage-Needed
.bugfix
CHANGES/<issue_number>.bugfix
and removed
Triage-Needed
labels
Oct 25, 2023
downstream: OR-4275 |
quba42
added a commit
to ATIX-AG/pulp_deb
that referenced
this issue
Oct 30, 2023
closes pulp#921 As a result of the behaviour fixes, we can also drop the now superfluous duplicate distribution checking. We are using validate_duplicate_content to catch incoming duplicates, and we are removing old duplicates, so it is not possible to run into this error. As a result it is not worth the performance cost to check for it.
quba42
added a commit
to ATIX-AG/pulp_deb
that referenced
this issue
Oct 31, 2023
closes pulp#921 As a result of the behaviour fixes, we can also drop the now superfluous duplicate distribution checking. We are using validate_duplicate_content to catch incoming duplicates, and we are removing old duplicates, so it is not possible to run into this error. As a result it is not worth the performance cost to check for it.
quba42
added a commit
to ATIX-AG/pulp_deb
that referenced
this issue
Oct 31, 2023
closes pulp#921 As a result of the behaviour fixes, we can also drop the now superfluous duplicate distribution checking. We are using validate_duplicate_content to catch incoming duplicates, and we are removing old duplicates, so it is not possible to run into this error. As a result it is not worth the performance cost to check for it.
quba42
added a commit
to ATIX-AG/pulp_deb
that referenced
this issue
Nov 13, 2023
closes pulp#921 As a result of the behaviour fixes, we can also drop the now superfluous duplicate distribution checking. We are using validate_duplicate_content to catch incoming duplicates, and we are removing old duplicates, so it is not possible to run into this error. As a result it is not worth the performance cost to check for it.
quba42
added a commit
to ATIX-AG/pulp_deb
that referenced
this issue
Nov 15, 2023
closes pulp#921 As a result of the behaviour fixes, we can also drop the now superfluous duplicate distribution checking. We are using validate_duplicate_content to catch incoming duplicates, and we are removing old duplicates, so it is not possible to run into this error. As a result it is not worth the performance cost to check for it.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Version
all versions affected
Describe the bug
We use
remove_duplicates
from pulpcore to handle duplicates: https://github.com/pulp/pulp_deb/blob/main/pulp_deb/app/models/repository.py#L96As it turns out, this does not do exactly what we want for APT repos in two ways:
pool/
folder locations, are unwise/bad practice, but not actually forbidden.remove_duplicates
does not discriminate between these two cases, and ends up removing packages from perfectly legitimate APT repos.remove_duplicates
only compares incoming packages with previously present packages, it does not search the incoming packages for duplicates. Instead it is meant to be combined withvalidate_duplicate_content
which throws an error if the incoming packages contain duplicates. However, we don't and can't use this because this would completely prevent syncing of upstream repos containing "acceptable duplicates".To Reproduce
One version of the problem is to perform a pulp to pulp sync of only the default distribution, followed by a sync of all distributions.
The default distribution contains an "acceptable duplicate" of every package, and all of these are suddenly kicked out by
remove_duplicates
. With the next sync, the incoming default distribution packages will kick out all the packages from the structured distributions and each sync will oscillate back and forth between no packages fordefault
and no packages for everything else.Expected behavior
We need our own APT specific version of
remove_duplicates
, that will kick out illigitimate duplicates, but not acceptable duplicates.We also need our own version of the functionality of
validate_duplicate_content
. Unfortunately this may require significantly more queries than the version in pulpcore, because of the need to distinguish between the two cases.The text was updated successfully, but these errors were encountered: