Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addfileprovides_queue: return queues containing all added provides #531

Closed
wants to merge 1 commit into from

Conversation

kontura
Copy link
Contributor

@kontura kontura commented Jun 5, 2023

Change addfileprovides_queue() to return queues containing all added provides.

(They can be stored in repository meta section REPOSITORY_ADDEDFILEPROVIDES like usual:
https://github.com/openSUSE/libsolv/blob/master/examples/solv/fileprovides.c#L73)

Before this change it returned file dependencies and storing those can lead to problems:

  1. first load metadata without filelists
  2. call pool_addfileprovides_queue(...)
  3. and store queues in REPOSITORY_ADDEDFILEPROVIDES
  4. then on the next run filelists are loaded but during adding fileprovides none are added because stored file dependecies match required file dependencies.

This can be solved by returning added file provides instead. This also matches with the REPOSITORY_ADDEDFILEPROVIDES name.

This is changing API.
I am not sure if it is a good solution to the problem.

The queues returned can be stored in repository meta section
REPOSITORY_ADDEDFILEPROVIDES:
https://github.com/openSUSE/libsolv/blob/master/examples/solv/fileprovides.c#L73

Storing file dependencies can lead to problems:
- first load metadata without filelists
- call pool_addfileprovides_queue(...)
- and store queues in REPOSITORY_ADDEDFILEPROVIDES
then on the next run filelists are loaded but during adding fileprovides none
are added because stored file dependecies match required file dependencies.

This can be solved by returning added file provides instead.
This also matches with the REPOSITORY_ADDEDFILEPROVIDES name.
@kontura kontura changed the title addfileprovides_queue: return an array containing all added provides addfileprovides_queue: return queues containing all added provides Jun 27, 2023
@mlschroe
Copy link
Member

I don't really understand the problem you're trying to solv. The REPOSITORY_ADDEDFILEPROVIDES list should contain all the file list provides that have been added to the package provides. I.e. they need to include all the old entries.

I don't understand why dnf5 repoclosure does not work correctly in the dnf5 issue. If a file is in the addedfileprovides list of a repository, it has been added to the provides of the packages that contain it and thus should not be reported as missing. So the old on-disk solv file seems to be in an inconsistent state.

(Btw you shouldn't need to mess with the considered map in make_provides_ready() as you set the POOL_FLAG_WHATPROVIDESWITHDISABLED flag. Just saying...)

@kontura
Copy link
Contributor Author

kontura commented Jul 25, 2023

I don't really understand the problem you're trying to solv. The REPOSITORY_ADDEDFILEPROVIDES list should contain all the file list provides that have been added to the package provides. I.e. they need to include all the old entries.

It is quite possible we have somehow messed up the solvfile which is causing the issues.
However the way I understand the problem and what I am trying to solve is this:
First we load just primary.xml, for example:

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://linux.duke.edu/metadata/common" xmlns:rpm="http://linux.duke.edu/metadata/rpm" packages="4">
<package type="rpm">
  <name>agua</name>
  <arch>x86_64</arch>
  <version epoch="0" ver="1.0" rel="1.fc29"/>
  <checksum type="sha256" pkgid="YES">ade558f47b9c4ca4de9a53d90c48a84ef6c6007908e86bfb71f2e576b055ddb9</checksum>
  <summary>Made up package</summary>
  <description>planta description</description>
  <packager></packager>
  <url>None</url>
  <time file="1690201478" build="1690201478"/>
  <size package="6627" installed="0" archive="252"/>
  <location href="x86_64/agua-1.0-1.fc29.x86_64.rpm"/>
  <format>
    <rpm:license>GPLv3+</rpm:license>
    <rpm:vendor></rpm:vendor>
    <rpm:group>Unspecified</rpm:group>
    <rpm:buildhost>7e6d7c9478f2</rpm:buildhost>
    <rpm:sourcerpm>agua-1.0-1.fc29.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="4504" end="6525"/>
    <rpm:provides>
      <rpm:entry name="agua" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
      <rpm:entry name="agua(x86-64)" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
    </rpm:provides>
  </format>
</package>
<package type="rpm">
  <name>planta</name>
  <arch>x86_64</arch>
  <version epoch="0" ver="1.0" rel="1.fc29"/>
  <checksum type="sha256" pkgid="YES">7686cdfd416a33058736d4887ef7632dd653133f1643efe90967866c05849a5d</checksum>
  <summary>Made up package</summary>
  <description>planta description</description>
  <packager></packager>
  <url>None</url>
  <time file="1690201478" build="1690201478"/>
  <size package="6146" installed="0" archive="124"/>
  <location href="x86_64/planta-1.0-1.fc29.x86_64.rpm"/>
  <format>
    <rpm:license>GPLv3+</rpm:license>
    <rpm:vendor></rpm:vendor>
    <rpm:group>Unspecified</rpm:group>
    <rpm:buildhost>7e6d7c9478f2</rpm:buildhost>
    <rpm:sourcerpm>planta-1.0-1.fc29.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="4504" end="6097"/>
    <rpm:provides>
      <rpm:entry name="planta" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
      <rpm:entry name="planta(x86-64)" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
    </rpm:provides>
    <rpm:requires>
      <rpm:entry name="/a/path/to/water"/>
    </rpm:requires>
  </format>
</package>
</metadata>

During loading we call addfileprovides_queue and it fills the queue with /a/path/to/water which is stored in REPOSITORY_ADDEDFILEPROVIDES.
(From my point of view this is a file dependency not a provide. Because only planta requires the file but nothing provides it.)

Then on the next run we also load filelists:

<?xml version="1.0" encoding="UTF-8"?>
<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="4">
<package pkgid="ade558f47b9c4ca4de9a53d90c48a84ef6c6007908e86bfb71f2e576b055ddb9" name="agua" arch="x86_64">
  <version epoch="0" ver="1.0" rel="1.fc29"/>
  <file>/a/path/to/water</file>
</package>
<package pkgid="7686cdfd416a33058736d4887ef7632dd653133f1643efe90967866c05849a5d" name="planta" arch="x86_64">
  <version epoch="0" ver="1.0" rel="1.fc29"/>
</package>

Here is the missing file provide but once addfileprovides_queue is called again it doesn't add it. I think because here its already included. Which results in not satisfied dependency breaking the repoclosure.

(Btw you shouldn't need to mess with the considered map in make_provides_ready() as you set the POOL_FLAG_WHATPROVIDESWITHDISABLED flag. Just saying...)

Thanks, I will look into it.

@mlschroe
Copy link
Member

So it may happen that the full filelist is not available when you call pool_addfileprovides()? That won't work. You have to options:

  1. do not set REPOSITORY_ADDEDFILEPROVIDES in that case as the input data was not complete.

  2. set POOL_FLAG_ADDFILEPROVIDESFILTERED when creating the pool. This will make pool_addfileprovides only add the filtered file provides of the primary file, so the complete file list is not needed. It will also setup the provides data so that the entries not included in the primary file are added lazily on demand.

Option 2 makes sense if the filelist is only needed in rare cases. I don't know if this is true for your use cases.

@j-mracek
Copy link
Contributor

2. FILEPROVIDESFILTERED when creating the pool. This will make pool_addfileprovides only add the filtered file provides of the primary file, so the complete file list is not needed. It will also setup the provides data so that the entries not included in the primary file are added lazily on demand.

May I ask you what you mean by on demand? When transaction fails because file provide not available or the demand is triggered by API call?

Additionally I have to mention that DNF5 does not even download file lists by default. The download of file lists is optional and configurable.

@mlschroe
Copy link
Member

On demand means when libsolv needs to know what packages contain the file provides. I.e. when it is asked to resolve a file dependency.

The same mechanism is used when something like foo > 2.3 is resolved, it first checks if the result is cached and if not it goes through the packages providing 'foo' and matches the version part.

We could do that all the time for file provides, but it is much faster to first go through all packages and collect the needed file dependencies and then do one search through the filelist. That's exactly what pool_addfileprovides() does.

Now if only a few packages contain file dependencies outside of the primary data, it makes sense to use a hybrid scheme: resolve all file dependencies from primary right away and do searches for the missing ones. That way the complete filelist is not needed most of the time.

This can be used to solve the problem you have, because in this mode it does not matter if the complete filelist is available or not when pool_addfileprovides() is called.

@j-mracek
Copy link
Contributor

@mlschroe Thank you for the nice explanation

@kontura
Copy link
Contributor Author

kontura commented Jul 26, 2023

Yes, thanks for the help!
I made a dnf5 PR: rpm-software-management/dnf5#777 to add the POOL_FLAG_ADDFILEPROVIDESFILTERED.

@kontura kontura closed this Jul 26, 2023
github-merge-queue bot pushed a commit to rpm-software-management/dnf5 that referenced this pull request Aug 10, 2023
kontura added a commit to kontura/libdnf that referenced this pull request Jun 25, 2024
Since dnf4 now also conditionally load filelists it ran into the same
problem as dnf5 here: rpm-software-management/dnf5#520

Additional details in: openSUSE/libsolv#531
kontura added a commit to kontura/libdnf that referenced this pull request Jun 25, 2024
Since dnf4 now also conditionally load filelists it ran into the same
problem as dnf5 here: rpm-software-management/dnf5#520

Additional details in: openSUSE/libsolv#531
ppisar pushed a commit to rpm-software-management/libdnf that referenced this pull request Jul 2, 2024
Since dnf4 now also conditionally load filelists it ran into the same
problem as dnf5 here: rpm-software-management/dnf5#520

Additional details in: openSUSE/libsolv#531
ppisar pushed a commit to rpm-software-management/libdnf that referenced this pull request Aug 30, 2024
Since dnf4 now also conditionally load filelists it ran into the same
problem as dnf5 here: rpm-software-management/dnf5#520

Additional details in: openSUSE/libsolv#531
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants