AptByHash #795
Comments
While this is a good idea, is it fully implemented by at least one widely used apt client? Edit: Since Pulp acts like a man-in-the-middle for software delivery, this may also be interesting on the consuming side of Pulp (sync).
There is this: https://wiki.ubuntu.com/AptByHash. At least on the repo side, the official Debian and Ubuntu repos appear to support it, e.g.: http://ftp.de.debian.org/debian/dists/bullseye/main/by-hash/ Merely having publications serve everything referenced in their InRelease file "by hash" as well should be pretty straightforward. The big implementation challenge I see for supporting this within Pulp is that the specification wants a new Pulp publication (served at a particular base_path) to retain part of the old publication while loading the new publication. By Pulp design, publications exist very independently of each other, so that part might be quite tricky. It would be great if we could get a workflow description for how this feature should work in the Pulp context, and clarify what level of support we want to achieve.
I had not thought of that, but yes, having sync download by hash instead of by path (where supported by the remote repo) should also be fairly straightforward (at least in principle).
This is a good question. I think the simplest solution would be to use a cache that proxies requests to Pulp, and in fact this is what we're doing. Ideally you also want to keep the files around for a set period of time, which caching the files would do (unlike republishing them). Maybe documenting how to set this up in the pulp_deb docs might be a solution? Alternatively, I think having the publish code look at the latest publication (or publications?) for the repository and copy over the necessary PublishedMetadata records shouldn't be too difficult.
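A rough sketch of the "copy over from the latest publication" idea, using plain dictionaries rather than Pulp's actual models. The function name `carry_over_by_hash` and the path-to-content mapping are hypothetical and only illustrate the intent: the new publication keeps serving the by-hash files of the previous one.

```python
def carry_over_by_hash(previous: dict[str, bytes], current: dict[str, bytes]) -> dict[str, bytes]:
    """Return the files of `current` plus the by-hash files of `previous`.

    Both arguments map a relative path to file content. This is a
    hypothetical stand-in for publication metadata, not Pulp's API.
    """
    merged = dict(current)
    for path, data in previous.items():
        # Only by-hash entries are retained; path-based files always
        # come from the new publication.
        if "/by-hash/" in path and path not in merged:
            merged[path] = data
    return merged


old_publication = {
    "main/binary-amd64/Packages.gz": b"old index",
    "main/binary-amd64/by-hash/SHA256/aaa": b"old index",
}
new_publication = {
    "main/binary-amd64/Packages.gz": b"new index",
    "main/binary-amd64/by-hash/SHA256/bbb": b"new index",
}
merged = carry_over_by_hash(old_publication, new_publication)
```

How many old publications to look at, and for how long to retain their entries, is left open here, as it is in the discussion.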
👍
👍 I don't know that @adamsanaglo will have time during his internship to implement this but we'd certainly be happy to file a feature.
Squashing aptbyhash commits with correct commit message closes pulp#795
restored one new line for improved readability closes pulp#795
…ByHash that was initially started by adamsanaglo. Addressed git commit and code formatting issues pointed out by quba42 closes pulp#795
Problem
Almost all users of apt have seen the 'Hash Sum Mismatch' errors during an 'apt-get update'. Apt repos have an inherent race condition. If a repo is updated while a user is fetching metadata, they may get mismatched files (e.g. an old copy of InRelease and a new copy of Packages.gz), which would result in a checksum mismatch.
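The race can be illustrated with a small simulation. This is a simplified sketch: the repo is modelled as a dictionary, and `InRelease` is reduced to a single SHA256 value, whereas a real InRelease file is a signed document listing many files. The names `publish` and `sha256` are illustrative only.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def publish(repo: dict, packages: bytes) -> None:
    """Replace the index and the checksum that describes it."""
    repo["Packages.gz"] = packages
    repo["InRelease"] = sha256(packages)


repo = {}
publish(repo, b"packages v1")

expected = repo["InRelease"]      # client fetches InRelease (describes v1)
publish(repo, b"packages v2")     # repo is updated in between
downloaded = repo["Packages.gz"]  # client fetches by path and gets v2

hash_sum_mismatch = sha256(downloaded) != expected
print("Hash Sum Mismatch" if hash_sum_mismatch else "OK")
```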
Solution
The proposed solution is to mitigate this by adding the checksum of the package data/metadata files to their names. So rather than downloading the file (based on its path) and using the checksum to validate its contents, the client will look up the expected checksum of a file in the release metadata, and then download a file whose name matches that checksum.
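As a minimal sketch of the naming scheme, assuming the layout described on the AptByHash wiki page, where an index file is additionally served under `by-hash/<algorithm>/<digest>` in its own directory. The helper name `by_hash_path` is hypothetical.

```python
import hashlib
import posixpath


def by_hash_path(index_path: str, data: bytes, algo: str = "SHA256") -> str:
    """Return the by-hash path for an index file with the given content.

    `index_path` is relative to the distribution directory, for example
    "main/binary-amd64/Packages.gz".
    """
    digest = hashlib.new(algo.lower(), data).hexdigest()
    directory = posixpath.dirname(index_path)
    return posixpath.join(directory, "by-hash", algo, digest)


content = b"example index content"
path = by_hash_path("main/binary-amd64/Packages.gz", content)
print(path)
```

Because the path is derived from the content, an updated index gets a new path, and the old one can stay available for clients that still hold the previous InRelease.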