[DESIGN] A phasing effect for publications that were recently served by a distribution. #911

Closed
mdellweg opened this issue Oct 11, 2023 · 4 comments · Fixed by #925
Labels
Story For big issues that probably need to be split up into several tasks.

Comments

@mdellweg
Member

mdellweg commented Oct 11, 2023

Apt-by-hash is supposed to serve files at known locations for some time (order of days) after the corresponding publication has been replaced by a newer version. This can be accomplished by a distribution that can serve multiple publications like overlays, as long as it can be assured that the publications exist sufficiently long.

It won't. You'd need to retain something for this feature to work. But you need to anyway, because serving the by-hash files without the pool (of the same version) does not gain you much. So let's say you have a retain count of X, you add new versions no more than once a day, and the distribution serves files from the current publication and, if not found there, from the earlier publications it was attached to (in order of increasing age, up to X days).
Now, if you identify one bad file in the second-to-latest publication that you definitely need to stop serving right away, you can still delete that publication, and the distribution would immediately forget it (because the join model cascade-deletes with the publication).
If you delete the repository, all you are left with is an empty distribution without even a faint memory of any publication (as today).
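
The overlay behaviour described above can be sketched in plain Python. This is only an illustration under stated assumptions: the `Publication` and `Distribution` classes, the `attach`, `forget` and `serve` methods, and the file paths are all hypothetical stand-ins, not Pulp models or APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Publication:
    """Hypothetical stand-in: maps relative paths to file contents."""
    name: str
    files: Dict[str, str]


@dataclass
class Distribution:
    """Hypothetical stand-in that remembers publications, newest first."""
    history: List[Publication] = field(default_factory=list)

    def attach(self, publication: Publication) -> None:
        # The newest publication goes to the front of the overlay stack.
        self.history.insert(0, publication)

    def forget(self, publication: Publication) -> None:
        # Mimics the join row being cascade-deleted with the publication.
        self.history = [p for p in self.history if p is not publication]

    def serve(self, path: str, retain: int) -> Optional[str]:
        # Check the current publication first, then older ones, up to `retain`.
        for publication in self.history[:retain]:
            if path in publication.files:
                return publication.files[path]
        return None


old = Publication("v1", {"dists/stable/by-hash/SHA256/aaa": "Packages v1"})
new = Publication("v2", {"dists/stable/by-hash/SHA256/bbb": "Packages v2"})
dist = Distribution()
dist.attach(old)
dist.attach(new)

# The old by-hash file is still reachable while "v1" is within the retain count.
still_served = dist.serve("dists/stable/by-hash/SHA256/aaa", retain=2)

# Deleting the bad publication makes the distribution forget it immediately.
dist.forget(old)
after_delete = dist.serve("dists/stable/by-hash/SHA256/aaa", retain=2)
```

With a retain count of 1 the sketch only ever consults the current publication, which corresponds to today's behaviour.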

Originally posted by @mdellweg in #884 (comment)

quba42 added the Story label Oct 11, 2023
@daviddavis
Contributor

daviddavis commented Oct 20, 2023

I had a chance to think of a design. Feedback welcome.

I'm imagining this will be released as a tech preview and is independent of the APT_BY_HASH feature we introduced.

Setting

We'll introduce a new setting PUBLICATION_CACHE_DURATION to enable the content app to serve artifacts from previously distributed publications for a particular distribution for a period of time.

I think minutes probably make sense as the unit, with a default of 0 (at least initially). When using APT_BY_HASH, the recommendation should be 4320 minutes (3 days).

Model

DistributedPublication

Inherits from ModelBase.

  • publication - foreign key, cascade delete
  • distribution - foreign key, cascade delete

Whenever a distribution is created or updated and its publication_id is set or has changed, a new DistributedPublication is created for that publication_id.

Whenever a publication is created, it will check for any distributions where repo id matches the publication's repo id. For any match, a new DistributedPublication gets created.
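
To make the two creation rules concrete, here is a minimal in-memory sketch. It is not a Django model and not the actual implementation: the `Registry` class and its `save_distribution`, `create_publication` and `delete_publication` methods are hypothetical names invented for illustration, and the dataclasses only carry the fields mentioned above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Publication:
    id: int
    repository_id: int


@dataclass
class Distribution:
    id: int
    repository_id: Optional[int] = None
    publication_id: Optional[int] = None


@dataclass
class DistributedPublication:
    distribution_id: int
    publication_id: int
    pulp_created: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class Registry:
    """Hypothetical in-memory stand-in for the DistributedPublication table."""

    def __init__(self) -> None:
        self.rows: List[DistributedPublication] = []
        self.distributions: List[Distribution] = []

    def save_distribution(self, distribution, new_publication_id) -> None:
        if distribution not in self.distributions:
            self.distributions.append(distribution)
        # Rule 1: record a row when publication_id is set or has changed.
        if (new_publication_id is not None
                and new_publication_id != distribution.publication_id):
            self.rows.append(
                DistributedPublication(distribution.id, new_publication_id)
            )
        distribution.publication_id = new_publication_id

    def create_publication(self, publication) -> None:
        # Rule 2: record a row for every distribution pointing at the same repo.
        for distribution in self.distributions:
            if distribution.repository_id == publication.repository_id:
                self.rows.append(
                    DistributedPublication(distribution.id, publication.id)
                )

    def delete_publication(self, publication_id) -> None:
        # Mimics the cascade delete on the publication foreign key.
        self.rows = [r for r in self.rows if r.publication_id != publication_id]


registry = Registry()
by_repo = Distribution(id=1, repository_id=10)
pinned = Distribution(id=2)
registry.save_distribution(by_repo, None)   # no publication set: no row
registry.save_distribution(pinned, 100)     # publication set: one row
registry.save_distribution(pinned, 100)     # unchanged: no new row
registry.create_publication(Publication(id=101, repository_id=10))
pairs = [(r.distribution_id, r.publication_id) for r in registry.rows]
```

In a real Django model the cascade would come from the foreign key definitions rather than from an explicit `delete_publication` helper.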

Content handler

The distribution will have a content handler method to handle requests. When a request comes in, it'll search the artifacts for any completed publication for the particular distribution where distributed_publication.pulp_created >= datetime.now() - timedelta(minutes=PUBLICATION_CACHE_DURATION). The artifacts ought to be sorted by distributed_publication.pulp_created and the first match gets returned to the user.

If there are no matching distributed_publications or the code fails to find a matching artifact, the method returns letting the content app handle the request.
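
A rough sketch of that lookup, using plain Python objects instead of querysets. The `find_cached_artifact` function, the `complete` and `artifacts` fields, and the sample paths are assumptions made for illustration. The design only says the results are sorted by pulp_created, so trying the newest entry first is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

PUBLICATION_CACHE_DURATION = 4320  # minutes, i.e. 3 days


@dataclass
class DistributedPublication:
    """Hypothetical row, already scoped to a single distribution."""
    pulp_created: datetime
    complete: bool
    artifacts: Dict[str, str]  # relative path -> artifact


def find_cached_artifact(
    rows: List[DistributedPublication],
    path: str,
    now: datetime,
    duration_minutes: int = PUBLICATION_CACHE_DURATION,
) -> Optional[str]:
    cutoff = now - timedelta(minutes=duration_minutes)
    # Only completed publications inside the cache window are candidates.
    candidates = [r for r in rows if r.complete and r.pulp_created >= cutoff]
    # Assumption: newest first, so the most recent copy of a path wins.
    for row in sorted(candidates, key=lambda r: r.pulp_created, reverse=True):
        if path in row.artifacts:
            return row.artifacts[path]
    # Nothing found: the caller lets the content app handle the request.
    return None


now = datetime(2023, 10, 20, 12, 0, tzinfo=timezone.utc)
rows = [
    DistributedPublication(
        now - timedelta(days=1), True, {"by-hash/SHA256/aaa": "artifact-1d"}
    ),
    DistributedPublication(
        now - timedelta(days=4), True, {"by-hash/SHA256/old": "artifact-4d"}
    ),
    DistributedPublication(
        now - timedelta(hours=1), False, {"by-hash/SHA256/wip": "artifact-wip"}
    ),
]
recent = find_cached_artifact(rows, "by-hash/SHA256/aaa", now)
expired = find_cached_artifact(rows, "by-hash/SHA256/old", now)
```

With the proposed default of 0 the cutoff equals the current time, so no earlier publication qualifies and the feature is effectively off.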

@quba42
Collaborator

quba42 commented Oct 23, 2023

Do we need to somehow discriminate between artifacts that should still be served, and artifacts that should no longer be served?
My understanding was that APT by Hash should keep old metadata, but not necessarily old packages. Or did I get that wrong?

@daviddavis
Contributor

daviddavis commented Oct 23, 2023

My original thought was that pulp_deb would only serve old by-hash files but @mdellweg made the point that the publication should serve the old artifacts along with the old metadata.

What if we introduce a second setting PUBLICATION_CACHE_PATH_REGEX that defaults to .* and that users can set so they can decide which paths will be served by old publications? If users want to serve only by-hash files, they can set it to .*/by-hash/.*.
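
As a quick illustration of how such a pattern would filter paths, here is a small sketch. The `served_from_old_publications` helper and the sample paths are hypothetical, and matching the whole path with `re.fullmatch` is an assumption, since the proposal does not say how the regex would be applied.

```python
import re

DEFAULT_PATTERN = r".*"
BY_HASH_ONLY = r".*/by-hash/.*"


def served_from_old_publications(path: str, pattern: str = DEFAULT_PATTERN) -> bool:
    # Assumption: the pattern has to match the entire relative path.
    return re.fullmatch(pattern, path) is not None


metadata_path = "dists/stable/main/binary-amd64/by-hash/SHA256/abc123"
package_path = "pool/main/f/foo/foo_1.0_amd64.deb"

by_hash_allowed = served_from_old_publications(metadata_path, BY_HASH_ONLY)
package_allowed = served_from_old_publications(package_path, BY_HASH_ONLY)
```

With the default pattern both paths would be eligible, so old packages are served along with old metadata.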

@mdellweg
Member Author

This all sounds good to me.
However, let's not introduce any new global setting. I believe this "setting" should be on the distribution model (maybe not in the first cut anyway).

daviddavis added 35 commits to daviddavis/pulp_deb that referenced this issue between Nov 1, 2023 and Jan 30, 2024