Skip to content

Commit

Permalink
Support conflicting URL in separate forks (#4435)
Browse files Browse the repository at this point in the history
Downstack PR: #4481

## Introduction

We support forking the dependency resolution to support conflicting
registry requirements for different platforms, say on package range is
required for an older python version while a newer is required for newer
python versions, or dependencies that are different per platform. We
need to extend this support to direct URL requirements.

```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'"
]
```

This did not work because `Urls` was built on the assumption that there
is a single allowed URL per package. We collect all allowed URL ahead of
resolution by following direct URL dependencies (including path
dependencies) transitively, i.e. a registry distribution can't require a
URL.

## The same package can have Registry and URL requirements

Consider the following two cases:

requirements.in:
```text
werkzeug==2.0.0
werkzeug @ https://files.pythonhosted.org/packages/ff/1d/960bb4017c68674a1cb099534840f18d3def3ce44aed12b5ed8b78e0153e/Werkzeug-2.0.0-py3-none-any.whl
```
pyproject.toml:
```toml
dependencies = [
  "iniconfig == 1.1.1 ; python_version < '3.12'",
  "iniconfig @ git+https://github.com/pytest-dev/iniconfig@93f5930e668c0d1ddf4597e38dd0dea4e2665e7a ; python_version >= '3.12'",
]
```

In the first case, we want the URL to override the registry dependency,
in the second case we want to fork and have one branch use the registry
and the other the URL. We have to know about this in
`PubGrubRequirement::from_registry_requirement`, but we only fork after
the current method.

Consider the following case too:

a:
```
c==1.0.0
b @ https://b.zip
```
b:
```
c @ https://c_new.zip ; python_version >= '3.12'",
c @ https://c_old.zip ; python_version < '3.12'",
```

When we convert the requirements of `a`, we can't know the url of `c`
yet. The solution is to remove the `Url` from `PubGrubPackage`: The
`Url` is redundant with `PackageName`, there can be only one url per
package name per fork. We now do the following: We track the urls from
requirements in `PubGrubDependency`. After forking, we call
`add_package_version_dependencies` where we apply override URLs, check
if the URL is allowed and check if the url is unique in this fork. When
we request a distribution, we ask the fork urls for the real URL. Since
we prioritize url dependencies over registry dependencies and skip
packages with `Urls` entries in pre-visiting, we know that when fetching
a package, we know if it has a url or not.

## URL conflicts

pyproject.toml (invalid):
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/44/39/e96292c7f7068e58877f476908c5974dc76c37c623f1fa332fe4ed6dfbec/iniconfig-1.1.0.tar.gz",
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl ; python_version < '3.12'",
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl ; python_version >= '3.12'",
]
```

On the fork state, we keep `ForkUrls` that check for conflicts after
forking, rejecting the third case because we added two packages of the
same name with different URLs.

We need to flatten out the requirements before transformation into
pubgrub requirements to get the full list of other requirements which
may contain a URL, which was changed in a previous PR: #4430.

## Complex Example

a:
```toml
dependencies = [
  # Force a split
  "anyio==4.3.0 ; python_version >= '3.12'",
  "anyio==4.2.0 ; python_version < '3.12'",
  # Include URLs transitively
  "b"
]
```
b:
```toml
dependencies = [
  # Only one is used in each split.
  "b1 ; python_version < '3.12'",
  "b2 ; python_version >= '3.12'",
  "b3 ; python_version >= '3.12'",
]
```
b1:
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/9b/dd/b3c12c6d707058fa947864b67f0c4e0c39ef8610988d7baea9578f3c48f3/iniconfig-1.1.1-py2.py3-none-any.whl",
]
```
b2:
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl",
]
```
b3:
```toml
dependencies = [
  "iniconfig @ https://files.pythonhosted.org/packages/44/39/e96292c7f7068e58877f476908c5974dc76c37c623f1fa332fe4ed6dfbec/iniconfig-1.1.0.tar.gz",
]
```

In this example, all packages are url requirements (directory
requirements) and the root package is `a`. We first split on `a`, `b`
being in each split. In the first fork, we reach `b1`, the fork URLs are
empty, we insert the iniconfig 1.1.1 URL, and then we skip over `b2` and
`b3` since the mark is disjoint with the fork markers. In the second
fork, we skip over `b1`, visit `b2`, insert the iniconfig 2.0.0 URL into
the again empty fork URLs, then visit `b3` and try to insert the
iniconfig 1.1.0 URL. At this point we find a conflict for the iniconfig
URL and error.

## Closing

The git tests are slow, but they make the best example for different URL
types i could find.

Part of #3927. This PR does not handle `Locals` or pre-releases yet.
  • Loading branch information
konstin authored Jun 26, 2024
1 parent ca92b55 commit d9dbb8a
Show file tree
Hide file tree
Showing 15 changed files with 1,206 additions and 606 deletions.
5 changes: 5 additions & 0 deletions crates/pep508-rs/src/marker.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1586,6 +1586,11 @@ impl MarkerTree {
parse_markers(markers, reporter)
}

/// Whether the marker is `MarkerTree::And(Vec::new())`.
pub fn is_universal(&self) -> bool {
self == &MarkerTree::And(Vec::new())
}

/// Does this marker apply in the given environment?
pub fn evaluate(&self, env: &MarkerEnvironment, extras: &[ExtraName]) -> bool {
self.evaluate_optional_environment(Some(env), extras)
Expand Down
63 changes: 62 additions & 1 deletion crates/pypi-types/src/requirement.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@ use pep508_rs::{MarkerEnvironment, MarkerTree, RequirementOrigin, VerbatimUrl, V
use uv_git::{GitReference, GitSha};
use uv_normalize::{ExtraName, PackageName};

use crate::{ParsedUrl, VerbatimParsedUrl};
use crate::{
ParsedArchiveUrl, ParsedDirectoryUrl, ParsedGitUrl, ParsedPathUrl, ParsedUrl, VerbatimParsedUrl,
};

/// A representation of dependency on a package, an extension over a PEP 508's requirement.
///
Expand Down Expand Up @@ -251,6 +253,65 @@ impl RequirementSource {
}
}

/// Convert the source to a [`VerbatimParsedUrl`], if it's a URL source.
pub fn to_verbatim_parsed_url(&self) -> Option<VerbatimParsedUrl> {
match &self {
Self::Registry { .. } => None,
Self::Url {
subdirectory,
location,
url,
} => Some(VerbatimParsedUrl {
parsed_url: ParsedUrl::Archive(ParsedArchiveUrl::from_source(
location.clone(),
subdirectory.clone(),
)),
verbatim: url.clone(),
}),
Self::Path {
install_path,
lock_path,
url,
} => Some(VerbatimParsedUrl {
parsed_url: ParsedUrl::Path(ParsedPathUrl::from_source(
install_path.clone(),
lock_path.clone(),
url.to_url(),
)),
verbatim: url.clone(),
}),
Self::Directory {
install_path,
lock_path,
editable,
url,
} => Some(VerbatimParsedUrl {
parsed_url: ParsedUrl::Directory(ParsedDirectoryUrl::from_source(
install_path.clone(),
lock_path.clone(),
*editable,
url.to_url(),
)),
verbatim: url.clone(),
}),
Self::Git {
repository,
reference,
precise,
subdirectory,
url,
} => Some(VerbatimParsedUrl {
parsed_url: ParsedUrl::Git(ParsedGitUrl::from_source(
repository.clone(),
reference.clone(),
*precise,
subdirectory.clone(),
)),
verbatim: url.clone(),
}),
}
}

/// Returns `true` if the source is editable.
pub fn is_editable(&self) -> bool {
matches!(self, Self::Directory { editable: true, .. })
Expand Down
33 changes: 26 additions & 7 deletions crates/uv-resolver/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,11 +9,12 @@ use rustc_hash::{FxHashMap, FxHashSet};

use distribution_types::{BuiltDist, IndexLocations, InstalledDist, SourceDist};
use pep440_rs::Version;
use pep508_rs::Requirement;
use pep508_rs::{MarkerTree, Requirement};
use uv_normalize::PackageName;

use crate::candidate_selector::CandidateSelector;
use crate::dependency_provider::UvDependencyProvider;
use crate::fork_urls::ForkUrls;
use crate::pubgrub::{
PubGrubPackage, PubGrubPackageInner, PubGrubReportFormatter, PubGrubSpecifierError,
};
Expand Down Expand Up @@ -48,11 +49,18 @@ pub enum ResolveError {
#[error(transparent)]
PubGrubSpecifier(#[from] PubGrubSpecifierError),

#[error("Requirements contain conflicting URLs for package `{0}`:\n- {1}\n- {2}")]
ConflictingUrlsDirect(PackageName, String, String),
#[error("Overrides contain conflicting URLs for package `{0}`:\n- {1}\n- {2}")]
ConflictingOverrideUrls(PackageName, String, String),

#[error("There are conflicting URLs for package `{0}`:\n- {1}\n- {2}")]
ConflictingUrlsTransitive(PackageName, String, String),
#[error("Requirements contain conflicting URLs for package `{0}`:\n- {}", _1.join("\n- "))]
ConflictingUrls(PackageName, Vec<String>),

#[error("Requirements contain conflicting URLs for package `{package_name}` in split `{fork_markers}`:\n- {}", urls.join("\n- "))]
ConflictingUrlsInFork {
package_name: PackageName,
urls: Vec<String>,
fork_markers: MarkerTree,
},

#[error("Package `{0}` attempted to resolve via URL: {1}. URL dependencies must be expressed as direct requirements or constraints. Consider adding `{0} @ {1}` to your dependencies or constraints file.")]
DisallowedUrl(PackageName, String),
Expand Down Expand Up @@ -156,8 +164,16 @@ fn collapse_proxies(
}
}

impl From<pubgrub::error::PubGrubError<UvDependencyProvider>> for ResolveError {
fn from(value: pubgrub::error::PubGrubError<UvDependencyProvider>) -> Self {
impl ResolveError {
/// Convert an error from PubGrub to a resolver error.
///
/// [`ForkUrls`] breaks the usual pattern used here since it's part of one the [`SolveState`],
/// not of the [`ResolverState`], so we have to take it from the fork that errored instead of
/// being able to add it later.
pub(crate) fn from_pubgrub_error(
value: pubgrub::error::PubGrubError<UvDependencyProvider>,
fork_urls: ForkUrls,
) -> Self {
match value {
// These are all never type variant that can never match, but never is experimental
pubgrub::error::PubGrubError::ErrorChoosingPackageVersion(_)
Expand All @@ -178,6 +194,7 @@ impl From<pubgrub::error::PubGrubError<UvDependencyProvider>> for ResolveError {
index_locations: None,
unavailable_packages: FxHashMap::default(),
incomplete_packages: FxHashMap::default(),
fork_urls,
})
}
pubgrub::error::PubGrubError::SelfDependency { package, version } => {
Expand All @@ -200,6 +217,7 @@ pub struct NoSolutionError {
index_locations: Option<IndexLocations>,
unavailable_packages: FxHashMap<PackageName, UnavailablePackage>,
incomplete_packages: FxHashMap<PackageName, BTreeMap<Version, IncompletePackage>>,
fork_urls: ForkUrls,
}

impl std::error::Error for NoSolutionError {}
Expand All @@ -222,6 +240,7 @@ impl std::fmt::Display for NoSolutionError {
&self.index_locations,
&self.unavailable_packages,
&self.incomplete_packages,
&self.fork_urls,
) {
write!(f, "\n\n{hint}")?;
}
Expand Down
62 changes: 62 additions & 0 deletions crates/uv-resolver/src/fork_urls.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
use std::collections::hash_map::Entry;

use rustc_hash::FxHashMap;

use distribution_types::Verbatim;
use pep508_rs::MarkerTree;
use pypi_types::VerbatimParsedUrl;
use uv_normalize::PackageName;

use crate::ResolveError;

/// See [`crate::resolver::SolveState`].
#[derive(Default, Debug, Clone)]
pub(crate) struct ForkUrls(FxHashMap<PackageName, VerbatimParsedUrl>);

impl ForkUrls {
/// Get the URL previously used for a package in this fork.
pub(crate) fn get(&self, package_name: &PackageName) -> Option<&VerbatimParsedUrl> {
self.0.get(package_name)
}

/// Whether we use a URL for this package.
pub(crate) fn contains_key(&self, package_name: &PackageName) -> bool {
self.0.contains_key(package_name)
}

/// Check that this is the only URL used for this package in this fork.
pub(crate) fn insert(
&mut self,
package_name: &PackageName,
url: &VerbatimParsedUrl,
fork_markers: &MarkerTree,
) -> Result<(), ResolveError> {
match self.0.entry(package_name.clone()) {
Entry::Occupied(previous) => {
if previous.get() != url {
let mut conflicting_url = vec![
previous.get().verbatim.verbatim().to_string(),
url.verbatim.verbatim().to_string(),
];
conflicting_url.sort();
return if fork_markers.is_universal() {
Err(ResolveError::ConflictingUrls(
package_name.clone(),
conflicting_url,
))
} else {
Err(ResolveError::ConflictingUrlsInFork {
package_name: package_name.clone(),
urls: conflicting_url,
fork_markers: fork_markers.clone(),
})
};
}
}
Entry::Vacant(vacant) => {
vacant.insert(url.clone());
}
}
Ok(())
}
}
1 change: 1 addition & 0 deletions crates/uv-resolver/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ mod error;
mod exclude_newer;
mod exclusions;
mod flat_index;
mod fork_urls;
mod lock;
mod manifest;
mod marker;
Expand Down
38 changes: 32 additions & 6 deletions crates/uv-resolver/src/manifest.rs
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,16 @@ impl Manifest {
&'a self,
markers: Option<&'a MarkerEnvironment>,
mode: DependencyMode,
) -> impl Iterator<Item = &Requirement> + 'a {
self.requirements_no_overrides(markers, mode)
.chain(self.overrides(markers, mode))
}

/// Like [`Self::requirements`], but without the overrides.
pub fn requirements_no_overrides<'a>(
&'a self,
markers: Option<&'a MarkerEnvironment>,
mode: DependencyMode,
) -> impl Iterator<Item = &Requirement> + 'a {
match mode {
// Include all direct and transitive requirements, with constraints and overrides applied.
Expand All @@ -120,19 +130,35 @@ impl Manifest {
self.constraints
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
)
.chain(
self.overrides
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
),
// Include direct requirements, with constraints and overrides applied.
DependencyMode::Direct => Either::Right(
self.overrides
.apply(&self.requirements)
.chain(self.constraints.requirements())
.chain(self.overrides.requirements())
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
}
}

/// Only the overrides from [`Self::requirements`].
pub fn overrides<'a>(
&'a self,
markers: Option<&'a MarkerEnvironment>,
mode: DependencyMode,
) -> impl Iterator<Item = &Requirement> + 'a {
match mode {
// Include all direct and transitive requirements, with constraints and overrides applied.
DependencyMode::Transitive => Either::Left(
self.overrides
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
// Include direct requirements, with constraints and overrides applied.
DependencyMode::Direct => Either::Right(
self.overrides
.requirements()
.filter(move |requirement| requirement.evaluate_markers(markers, &[])),
),
}
Expand Down
Loading

0 comments on commit d9dbb8a

Please sign in to comment.