Skip to content

Commit

Permalink
Support conflicting URL in separate forks
Browse files Browse the repository at this point in the history
## Introduction

We support forking the dependency resolution to support conflicting registry requirements for different platforms, say on package range is required for an older python version while a newer is required for newer python versions, or dependencies that are different per platform. We need to extend this support to direct URL requirements.

```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'"
]
```

This did not work because `Urls` was built on the assumption that there is a single allowed URL per package. We collect all allowed URL ahead of resolution by following direct URL dependencies (including path dependencies) transitively, i.e. a registry distribution can't require a URL.

## Registry and URL requirement in the same package

Consider the following two cases:

requirements.in:
```text
werkzeug==2.0.0
werkzeug @ https://files.pythonhosted.org/packages/ff/1d/960bb4017c68674a1cb099534840f18d3def3ce44aed12b5ed8b78e0153e/Werkzeug-2.0.0-py3-none-any.whl
```
pyproject.toml:
```toml
dependencies = [
  "iniconfig == 1.1.1 ; python_version < '3.12'",
  "iniconfig @ git+https://github.com/pytest-dev/iniconfig@93f5930e668c0d1ddf4597e38dd0dea4e2665e7a ; python_version >= '3.12'",
]
```

In the first case, we want the URL to override the registry dependency, in the second case we want to fork and have one branch use the registry and the other the URL. We have to know about this in `PubGrubRequirement::from_requirement`, but we only fork after the current method. Instead, we can check if the registry and the URL requirement for the same name are disjoint: If they are, they must end up in different branches on forking, if they aren't, we know the URL overrides the registry.

An alternative implementation would be splitting in the lookahead resolver so that all URLs are already tagged by the markers of their fork. This would require duplicating the forking logic while making sure that we follow the exact same precedence rules, which is why i didn't do it.

## Determining the URL of a Requirement

For a registry requirement:
* If there is a URL override: Use that override, done.
* If there is a fork-specific URL: Use that, done.
* If there is another requirement in the same package with a URL requirement find a matching URL in `Urls`, use it to canonicalize the current URL.
* After forking: Check that this URL is the only URL in this fork.

For a URL requirement:
* If there is a URL override: Use that override, done.
* Find a matching URL in `Urls`, use it to canonicalize the current URL.
* After forking: Check that this URL is the only URL in this fork.

## URL conflicts

pyproject.toml (invalid):
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/44/39/e96292c7f7068e58877f476908c5974dc76c37c623f1fa332fe4ed6dfbec/iniconfig-1.1.0.tar.gz",
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'",
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
]
```

On the fork state, we keep `ForkUrls` that check for conflicts after forking, rejecting the third case because we added two packages of the same name with different URLs.

We need to flatten out the requirements before transformation into pubgrub requirements to get the full list of other requirements which may contain a URL, which was changed in a previous PR: #4430.

## Complex Example

a:
```toml
dependencies = [
  # Force a split
  "anyio==4.3.0 ; python_version >= '3.12'",
  "anyio==4.2.0 ; python_version < '3.12'",
  # Include URLs transitively
  "b"
]
```
b:
```toml
dependencies = [
  # Only one is used in each split.
  "b1 ; python_version < '3.12'",
  "b2 ; python_version >= '3.12'",
  "b3 ; python_version >= '3.12'",
]
```
b1:
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl",
]
```
b2:
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl",
]
```
b3:
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/44/39/e96292c7f7068e58877f476908c5974dc76c37c623f1fa332fe4ed6dfbec/iniconfig-1.1.0.tar.gz",
]
```

In this example, all packages are url requirements (directory requirements) and the root package is `a`. We first split on `a`, `b` being in each split. In the first fork, we reach `b1`, the fork URLs are empty, we insert the iniconfig 1.1.1 URL, and then we skip over `b2` and `b3` since the mark is disjoint with the fork markers. In the second fork, we skip over `b1`, visit `b2`, insert the iniconfig 2.0.0 URL into the again empty fork URLs, then visit `b3` and try to insert the iniconfig 1.1.0 URL. At this point we find a conflict for the iniconfig URL and error.

## Closing

Best reviewed commit-by-commit, the tests are verbose.

The git tests are slow, but they make the best example for different URL types i could find.

Part of #3927. This PR does not handle `Locals` yet.
  • Loading branch information
konstin committed Jun 21, 2024
1 parent bc19961 commit 28f075c
Show file tree
Hide file tree
Showing 11 changed files with 511 additions and 294 deletions.
64 changes: 63 additions & 1 deletion crates/pypi-types/src/requirement.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@ use pep508_rs::{MarkerEnvironment, MarkerTree, RequirementOrigin, VerbatimUrl, V
use uv_git::{GitReference, GitSha};
use uv_normalize::{ExtraName, PackageName};

use crate::{ParsedUrl, VerbatimParsedUrl};
use crate::{
ParsedArchiveUrl, ParsedDirectoryUrl, ParsedGitUrl, ParsedPathUrl, ParsedUrl, VerbatimParsedUrl,
};

/// A representation of dependency on a package, an extension over a PEP 508's requirement.
///
Expand Down Expand Up @@ -251,6 +253,66 @@ impl RequirementSource {
}
}

pub fn to_verbatim_parsed_url(&self) -> Option<VerbatimParsedUrl> {
Some(match &self {
Self::Registry { .. } => {
return None;
}
Self::Url {
subdirectory,
location,
url,
} => VerbatimParsedUrl {
parsed_url: ParsedUrl::Archive(ParsedArchiveUrl::from_source(
location.clone(),
subdirectory.clone(),
)),
verbatim: url.clone(),
},
Self::Path {
install_path,
lock_path,
url,
} => VerbatimParsedUrl {
parsed_url: ParsedUrl::Path(ParsedPathUrl::from_source(
install_path.clone(),
lock_path.clone(),
url.to_url(),
)),
verbatim: url.clone(),
},
Self::Directory {
install_path,
lock_path,
editable,
url,
} => VerbatimParsedUrl {
parsed_url: ParsedUrl::Directory(ParsedDirectoryUrl::from_source(
install_path.clone(),
lock_path.clone(),
*editable,
url.to_url(),
)),
verbatim: url.clone(),
},
Self::Git {
repository,
reference,
precise,
subdirectory,
url,
} => VerbatimParsedUrl {
parsed_url: ParsedUrl::Git(ParsedGitUrl::from_source(
repository.clone(),
reference.clone(),
*precise,
subdirectory.clone(),
)),
verbatim: url.clone(),
},
})
}

/// Returns `true` if the source is editable.
pub fn is_editable(&self) -> bool {
matches!(self, Self::Directory { editable: true, .. })
Expand Down
17 changes: 12 additions & 5 deletions crates/uv-resolver/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ use rustc_hash::{FxHashMap, FxHashSet};

use distribution_types::{BuiltDist, IndexLocations, InstalledDist, SourceDist};
use pep440_rs::Version;
use pep508_rs::Requirement;
use pep508_rs::{MarkerTree, Requirement};
use uv_normalize::PackageName;

use crate::candidate_selector::CandidateSelector;
Expand Down Expand Up @@ -48,11 +48,18 @@ pub enum ResolveError {
#[error(transparent)]
PubGrubSpecifier(#[from] PubGrubSpecifierError),

#[error("Requirements contain conflicting URLs for package `{0}`:\n- {1}\n- {2}")]
ConflictingUrlsDirect(PackageName, String, String),
#[error("Overrides contain conflicting URLs for package `{0}`:\n- {1}\n- {2}")]
ConflictingOverrideUrls(PackageName, String, String),

#[error("There are conflicting URLs for package `{0}`:\n- {1}\n- {2}")]
ConflictingUrlsTransitive(PackageName, String, String),
#[error("Requirements contain conflicting URLs for package `{0}`:\n- {}", _1.join("\n- "))]
ConflictingUrls(PackageName, Vec<String>),

#[error("Requirements contain conflicting URLs for package `{package_name}` in split `{fork_markers}`:\n- {}", urls.join("\n- "))]
ConflictingUrlsInFork {
package_name: PackageName,
urls: Vec<String>,
fork_markers: MarkerTree,
},

#[error("Package `{0}` attempted to resolve via URL: {1}. URL dependencies must be expressed as direct requirements or constraints. Consider adding `{0} @ {1}` to your dependencies or constraints file.")]
DisallowedUrl(PackageName, String),
Expand Down
56 changes: 56 additions & 0 deletions crates/uv-resolver/src/fork_urls.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
use std::collections::hash_map::Entry;

use rustc_hash::FxHashMap;

use distribution_types::Verbatim;
use pep508_rs::MarkerTree;
use pypi_types::VerbatimParsedUrl;
use uv_normalize::PackageName;

use crate::ResolveError;

/// See [`crate::resolver::SolveState`].
#[derive(Default, Clone)]
pub(crate) struct ForkUrls(FxHashMap<PackageName, VerbatimParsedUrl>);

impl ForkUrls {
/// Get the URL previously used for a package in this fork.
pub(crate) fn get(&self, package_name: &PackageName) -> Option<&VerbatimParsedUrl> {
self.0.get(package_name)
}

/// Check that this is the only URL used for this package in this fork.
pub(crate) fn insert(
&mut self,
package_name: &PackageName,
url: &VerbatimParsedUrl,
fork_markers: &MarkerTree,
) -> Result<(), ResolveError> {
match self.0.entry(package_name.clone()) {
Entry::Occupied(previous) => {
if previous.get() != url {
let conflicting_url = vec![
previous.get().verbatim.verbatim().to_string(),
url.to_string(),
];
return if fork_markers == &MarkerTree::And(Vec::new()) {
Err(ResolveError::ConflictingUrls(
package_name.clone(),
conflicting_url,
))
} else {
Err(ResolveError::ConflictingUrlsInFork {
package_name: package_name.clone(),
urls: conflicting_url,
fork_markers: fork_markers.clone(),
})
};
}
}
Entry::Vacant(vacant) => {
vacant.insert(url.clone());
}
}
Ok(())
}
}
1 change: 1 addition & 0 deletions crates/uv-resolver/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ mod error;
mod exclude_newer;
mod exclusions;
mod flat_index;
mod fork_urls;
mod lock;
mod manifest;
mod marker;
Expand Down
38 changes: 32 additions & 6 deletions crates/uv-resolver/src/manifest.rs
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,16 @@ impl Manifest {
&'a self,
markers: Option<&'a MarkerEnvironment>,
mode: DependencyMode,
) -> impl Iterator<Item = &Requirement> + 'a {
self.requirements_no_overrides(markers, mode)
.chain(self.overrides(markers, mode))
}

/// Like [`Self::requirements`], but without the overrides.
pub fn requirements_no_overrides<'a>(
&'a self,
markers: Option<&'a MarkerEnvironment>,
mode: DependencyMode,
) -> impl Iterator<Item = &Requirement> + 'a {
match mode {
// Include all direct and transitive requirements, with constraints and overrides applied.
Expand All @@ -119,19 +129,35 @@ impl Manifest {
self.constraints
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
)
.chain(
self.overrides
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
),
// Include direct requirements, with constraints and overrides applied.
DependencyMode::Direct => Either::Right(
self.overrides
.apply(&self.requirements)
.chain(self.constraints.requirements())
.chain(self.overrides.requirements())
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
}
}

/// Only the overrides from [`Self::requirements`].
pub fn overrides<'a>(
&'a self,
markers: Option<&'a MarkerEnvironment>,
mode: DependencyMode,
) -> impl Iterator<Item = &Requirement> + 'a {
match mode {
// Include all direct and transitive requirements, with constraints and overrides applied.
DependencyMode::Transitive => Either::Left(
self.overrides
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
// Include direct requirements, with constraints and overrides applied.
DependencyMode::Direct => Either::Right(
self.overrides
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
}
Expand Down
Loading

0 comments on commit 28f075c

Please sign in to comment.