AptByHash #795

adamsanaglo · 2023-06-08T18:47:44Z

Problem
Almost every apt user has seen a 'Hash Sum Mismatch' error during 'apt-get update'. Apt repos have an inherent race condition: if a repo is updated while a client is fetching metadata, the client may end up with mismatched files (e.g. an old copy of InRelease and a new copy of Packages.gz), which results in a checksum mismatch.

Solution
The proposed solution is to mitigate this by publishing the package data/metadata files under names that contain their checksums. Rather than downloading a file by its path and then using the checksum to validate its contents, the client takes the expected checksum from the InRelease file it already holds and downloads the file whose name matches that checksum.
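To illustrate the naming scheme, here is a minimal sketch, assuming the layout described on the AptByHash wiki page (a `by-hash/<HashName>/<digest>` directory next to the index file). The helper `by_hash_path` and the sample contents are hypothetical and are not taken from pulp_deb or apt.

```python
import hashlib
import posixpath

# Hash field names as they appear in a Release file, mapped to hashlib names.
ALGORITHMS = {"MD5Sum": "md5", "SHA1": "sha1", "SHA256": "sha256", "SHA512": "sha512"}


def by_hash_path(index_path: str, content: bytes, hash_name: str = "SHA256") -> str:
    """Return the by-hash location for a metadata file (hypothetical helper)."""
    digest = hashlib.new(ALGORITHMS[hash_name], content).hexdigest()
    directory = posixpath.dirname(index_path)
    return posixpath.join(directory, "by-hash", hash_name, digest)


# Two versions of the same index get two distinct names, so a client holding
# the old InRelease can still fetch the old index after the repo is updated.
path = "dists/bullseye/main/binary-amd64/Packages"
old_content = b"Package: foo\nVersion: 1.0\n"
new_content = b"Package: foo\nVersion: 1.1\n"
print(by_hash_path(path, old_content))
print(by_hash_path(path, new_content))
```

Because the name is derived from the content, both versions can be served side by side for as long as the old one is retained.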

mdellweg · 2023-06-09T10:08:53Z

This sounds like a good idea, but is it fully implemented by at least one widely used apt client?
Can you add a link to the corresponding specification?

Edit: As pulp is like a man-in-the-middle for software delivery, this may also be interesting on the consuming side of pulp (sync).

quba42 · 2023-06-12T06:54:16Z

There is this: https://wiki.ubuntu.com/AptByHash

And at least on the repo side, the official Debian and Ubuntu repos appear to support this, e.g.: http://ftp.de.debian.org/debian/dists/bullseye/main/by-hash/

Merely having publications serve everything referenced in their InRelease file "by hash" as well should be pretty straightforward. The big implementation challenge I see for supporting this within Pulp is that this specification wants a new Pulp publication (served at a particular base_path) to retain part of the old publication while loading the new publication. By Pulp design, publications exist very independently of each other, so that part might be quite tricky.
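As a rough illustration of the "straightforward" half, the sketch below derives the by-hash aliases a publication would need to serve from the checksum sections of a Release file. It assumes the standard Release layout (a hash field followed by indented `<digest> <size> <path>` lines, with paths relative to the dists/<suite>/ directory). The function `by_hash_aliases` and the sample digests are made up for the example, and no Pulp models are involved.

```python
import posixpath

HASH_FIELDS = ("MD5Sum", "SHA1", "SHA256", "SHA512")


def by_hash_aliases(release_text: str) -> dict[str, str]:
    """Map each by-hash path to the regular path it aliases (hypothetical helper)."""
    aliases = {}
    current = None  # hash field whose indented entries are being read
    for line in release_text.splitlines():
        if line.startswith(" ") and current:
            digest, _size, path = line.split()
            alias = posixpath.join(posixpath.dirname(path), "by-hash", current, digest)
            aliases[alias] = path
        else:
            field = line.split(":", 1)[0]
            current = field if field in HASH_FIELDS else None
    return aliases


# Shortened, invented digests; real ones are full hex strings.
RELEASE = """\
Suite: bullseye
Acquire-By-Hash: yes
SHA256:
 aaaa 10 main/binary-amd64/Packages
 bbbb 8 main/binary-amd64/Packages.gz
"""

for alias, target in by_hash_aliases(RELEASE).items():
    print(alias, "->", target)
```

The harder half, retaining aliases from an older publication, is not covered by this sketch.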

It would be great if we could get a workflow description for how this feature should work in the Pulp context, and clarify what level of support we want to achieve.

> this may also be interesting on the consuming side of pulp (sync).

I had not thought of that, but yes, having sync download by hash instead of by path (where supported by the remote repo) should also be fairly straightforward (at least in principle).
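A possible shape for the sync side, as a sketch only: check whether the remote's Release file advertises `Acquire-By-Hash: yes`, and if so try the by-hash URL first, with the regular path as a fallback. The function names and the example URL are assumptions for illustration. Nothing here reflects the actual pulp_deb sync code, and no download is performed.

```python
import posixpath


def advertises_by_hash(release_text: str) -> bool:
    """Return True if the Release file contains 'Acquire-By-Hash: yes'."""
    for line in release_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "acquire-by-hash":
            return value.strip().lower() == "yes"
    return False


def candidate_urls(dist_url: str, index_path: str, sha256: str, release_text: str) -> list[str]:
    """List URLs to try in order: by-hash first when supported, then by path."""
    base = dist_url.rstrip("/")
    plain = f"{base}/{index_path}"
    if not advertises_by_hash(release_text):
        return [plain]
    hashed = posixpath.join(posixpath.dirname(index_path), "by-hash", "SHA256", sha256)
    return [f"{base}/{hashed}", plain]


# Hypothetical remote and an invented, shortened digest.
urls = candidate_urls(
    "http://example.org/debian/dists/bullseye/",
    "main/binary-amd64/Packages.gz",
    "abc123",
    "Suite: bullseye\nAcquire-By-Hash: yes\n",
)
print(urls)
```

Keeping the by-path URL as a fallback means remotes without by-hash support behave exactly as before.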

daviddavis · 2023-06-12T18:32:04Z

> The big implementation challenge I see for supporting this within Pulp is that this specification wants a new Pulp publication (served at a particular base_path) to retain part of the old publication while loading the new publication. By Pulp design, publications exist very independently of each other, so that part might be quite tricky.

This is a good question. I think the simplest solution would be to put a cache in front of Pulp that proxies requests to it; in fact, this is what we're doing. Ideally you also want to keep the old files around for a set period of time, which a cache does naturally (unlike republishing them). Maybe documenting how to set this up in the pulp_deb docs would be a solution?

Alternatively, I think having the publish code look at the latest publication (or publications?) for the repository and copy over the necessary PublishedMetadata records shouldn't be too difficult.
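The copy-over idea could look roughly like the sketch below. It models each publication as a plain dict from relative path to an artifact identifier, which is purely an assumption for illustration: the real PublishedMetadata records and publish code are not shown here, and `carry_over_by_hash` and its `keep` parameter are hypothetical.

```python
def carry_over_by_hash(new_entries: dict, previous_publications: list, keep: int = 1) -> dict:
    """Merge by-hash entries from the most recent `keep` publications into a new one.

    `previous_publications` is ordered oldest to newest. Regular (by-path)
    entries from older publications are never carried over.
    """
    recent = previous_publications[-keep:] if keep > 0 else []
    merged = {}
    for old in recent:
        for path, artifact in old.items():
            if "/by-hash/" in path or path.startswith("by-hash/"):
                merged[path] = artifact
    merged.update(new_entries)  # the new publication always wins on conflicts
    return merged


# Invented paths and artifact identifiers.
pub1 = {"main/binary-amd64/Packages": "artifact-1",
        "main/binary-amd64/by-hash/SHA256/aaa": "artifact-1"}
pub2 = {"main/binary-amd64/Packages": "artifact-2",
        "main/binary-amd64/by-hash/SHA256/bbb": "artifact-2"}
pub3 = {"main/binary-amd64/Packages": "artifact-3",
        "main/binary-amd64/by-hash/SHA256/ccc": "artifact-3"}

merged = carry_over_by_hash(pub3, [pub1, pub2], keep=1)
for path in sorted(merged):
    print(path, "->", merged[path])
```

Limiting the carry-over to a fixed number of publications is one way to bound growth. A time-based retention window, as with the cache approach, would be another.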

> It would be great if we could get a workflow description for how this feature should work in the Pulp context, and clarify what level of support we want to achieve.

👍

> this may also be interesting on the consuming side of pulp (sync).

👍 I don't know that @adamsanaglo will have time during his internship to implement this but we'd certainly be happy to file a feature.

adamsanaglo added Feature Triage-Needed labels Jun 8, 2023

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 18, 2023

Resolving aptbyhash pr comments

5968ac0

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 19, 2023

Resolving aptbyhash pr comments

5ff2dff

pulp#795

adamsanaglo mentioned this issue Jul 19, 2023

Generating checksum named files for aptbyhash #833

Closed

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 19, 2023

resolving aptbyhash pr comments

67b7a6e

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 19, 2023

Generating checksum named metadata files for AptByHash (squashed)

0368039

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 19, 2023

Merge branch 'main' of github.com:adamsanaglo/pulp_deb

985befc

Squashing aptbyhash commits with correct commit message closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 20, 2023

Generating checksum named metadata files for AptByHash (squashed)

d3d7f21

closes pulp#795 resolving aptbyhash pr comments closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 20, 2023

Generating checksum named metadata files for AptByHash (squashed)

15df971

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 20, 2023

Ignoring files

971bb30

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 24, 2023

getting checksum from artifact model

a96a74c

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 24, 2023

Generating checksum named metadata files for AptByHash (squashed)

88c8829

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 24, 2023

Ignoring files

69b8c3c

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 24, 2023

getting checksum from artifact model

b15b0a1

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 25, 2023

cleaning settings file

be1a92a

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 25, 2023

Generating checksum named metadata files for AptByHash (squashed)

4a1ddf0

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 25, 2023

Ignoring files

1789c32

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 25, 2023

getting checksum from artifact model

9c02a0d

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 25, 2023

Clean up

b91745c

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 25, 2023

Using checksum map

0711fb2

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Jul 26, 2023

Modified functional test requirements

99fc84e

closes pulp#795

hstct removed the Triage-Needed label Aug 9, 2023

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Generating checksum named metadata files for AptByHash (squashed)

2ef1d29

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Ignoring files

d68c99b

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

getting checksum from artifact model

5c9ac13

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Clean up

ff83b6c

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Using checksum map

f6477e0

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Modified functional test requirements

9525e4a

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

supporting all allowed checksum types

6d1954a

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Resolving conflicts

be9b5dd

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Generating checksum named metadata files for AptByHash (squashed)

2c118f1

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Ignoring files

e748e4e

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

getting checksum from artifact model

88e84d8

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Clean up

5850dc1

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Using checksum map

f81df32

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Modified functional test requirements

ccb3310

closes pulp#795

adamsanaglo added a commit to adamsanaglo/pulp_deb that referenced this issue Aug 10, 2023

Supporting all allowed checksums

aae13a8

closes pulp#795

quba42 added .feature CHANGES/<issue_number>.feature and removed Feature labels Aug 23, 2023

acheng-01 mentioned this issue Aug 31, 2023

Continuing checksum named files for aptbyhash #883

Closed

acheng-01 pushed a commit to acheng-01/pulp_deb that referenced this issue Aug 31, 2023

Generating checksum named metadata files for AptByHash (squashed)

b625e6f

Closes pulp#795

acheng-01 mentioned this issue Sep 1, 2023

Continuing #833 RE: Generating Checksum Named Files for AptByHash #884

Merged

acheng-01 pushed a commit to acheng-01/pulp_deb that referenced this issue Sep 5, 2023

squashing previous commits

4d728f0

restored one new line for improved readability closes pulp#795

acheng-01 pushed a commit to acheng-01/pulp_deb that referenced this issue Sep 25, 2023

Add ability to publish package indices using AptByHash format

fe55235

closes pulp#795

acheng-01 pushed a commit to acheng-01/pulp_deb that referenced this issue Oct 9, 2023

Add ability to publish package indices using AptByHash format

0c497c6

closes pulp#795

acheng-01 pushed a commit to acheng-01/pulp_deb that referenced this issue Oct 9, 2023

Add ability to publish package indices using AptByHash format

e242b8c

closes pulp#795

quba42 closed this as completed in #884 Oct 23, 2023

daviddavis pushed a commit to daviddavis/pulp_deb that referenced this issue Nov 20, 2023

Add ability to publish package indices using AptByHash format

ced4f49

closes pulp#795

daviddavis pushed a commit to daviddavis/pulp_deb that referenced this issue Nov 20, 2023

Add ability to publish package indices using AptByHash format

c2fc28c

closes pulp#795

daviddavis mentioned this issue Mar 4, 2024

Use by-hash files when syncing content #1031

Open

acheng-01 mentioned this issue Apr 1, 2024

Allow source files to publish at by-hash paths #1059

Closed
