Make empty/default fields in the index metadata optional #14506

Closed
kornelski opened this issue Sep 6, 2024 · 8 comments
Labels
A-registries Area: registries C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review

Comments

@kornelski
Contributor

Problem

It is a bit surprising that the index metadata needs explicit "optional": false and "default_features":true for every dependency, even though these fields have defaults and are optional in manifests.

It's also odd that features: {} must still be present in the serialized JSON even when using only features2 field.

Making these fields optional would make it a bit easier to serialize index metadata, and can make the index files a bit smaller (for packages with high enough MSRV).

Proposed Solution

#14491

--- a/src/cargo/sources/registry/index/mod.rs
+++ b/src/cargo/sources/registry/index/mod.rs
@@ -206,6 +206,7 @@ pub struct IndexPackage<'a> {
     #[serde(borrow)]
     pub deps: Vec<RegistryDependency<'a>>,
     /// Set of features defined for the package, i.e., `[features]` table.
+    #[serde(default)]
     pub features: BTreeMap<InternedString, Vec<InternedString>>,
     /// This field contains features with new, extended syntax. Specifically,
     /// namespaced features (`dep:`) and weak dependencies (`pkg?/feat`).
@@ -268,10 +269,13 @@ pub struct RegistryDependency<'a> {
     #[serde(borrow)]
     pub req: Cow<'a, str>,
     /// Set of features enabled for this dependency.
+    #[serde(default)]
     pub features: Vec<InternedString>,
     /// Whether or not this is an optional dependency.
+    #[serde(default)]
     pub optional: bool,
     /// Whether or not default features are enabled.
+    #[serde(default = "default_true")]
     pub default_features: bool,
     /// The target platform for this dependency.
     pub target: Option<Cow<'a, str>>,
@@ -292,6 +296,10 @@ pub struct RegistryDependency<'a> {
     pub lib: bool,
 }

+fn default_true() -> bool {
+    true
+}
+
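
The effect of the three `#[serde(default)]` attributes in the patch can be sketched without serde. This is only an illustration: `RawDep`, `ResolvedDep`, and `resolve` are hypothetical names, not Cargo types, and `None` stands in for a key that is absent from the JSON. Only `default_true` and the field names come from the diff above.

```rust
/// Hypothetical, simplified view of one dependency entry as read from an
/// index line. `None` means the key was absent from the JSON.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct RawDep {
    pub name: String,
    pub req: String,
    pub features: Option<Vec<String>>,
    pub optional: Option<bool>,
    pub default_features: Option<bool>,
}

/// The same entry after missing keys have been replaced by their defaults.
#[derive(Debug, Clone, PartialEq)]
pub struct ResolvedDep {
    pub name: String,
    pub req: String,
    pub features: Vec<String>,
    pub optional: bool,
    pub default_features: bool,
}

/// Same helper as in the patch: the default for `default_features`.
fn default_true() -> bool {
    true
}

/// Fill in defaults the way the patched struct would:
/// `features` -> empty, `optional` -> false, `default_features` -> true.
pub fn resolve(raw: RawDep) -> ResolvedDep {
    ResolvedDep {
        name: raw.name,
        req: raw.req,
        features: raw.features.unwrap_or_default(),
        optional: raw.optional.unwrap_or_default(),
        default_features: raw.default_features.unwrap_or_else(default_true),
    }
}
```

Under these assumptions, an entry that carries only `name` and `req` resolves to a required dependency with default features enabled and no extra features, which matches the manifest defaults mentioned in the problem statement.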

Notes

No response

@kornelski kornelski added C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage. labels Sep 6, 2024
@epage epage added A-registries Area: registries S-needs-team-input Status: Needs input from team on whether/how to proceed. and removed S-triage Status: This issue is waiting on initial triage. labels Sep 6, 2024
@epage
Contributor

epage commented Sep 6, 2024

> Making these fields optional would make it a bit easier to serialize index metadata,

What makes it easier?

> can make the index files a bit smaller (for packages with high enough MSRV).

This is where things get tricky.

We could only do this when we know the MSRV or a registry looks at Cargo.toml to infer an MSRV (e.g. from package.edition). That means this is opt-in which can limit the benefit registries would see from it.

This would also negatively affect error reporting, like #10623. When Cargo can't parse an Index entry, it ignores it and users get errors about not finding a version of a package that works for your version requirement. Ideally, we'd update the code to track the entries that failed to parse and report those as candidates that couldn't be used (and list their MSRV if we can at least extract that).
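
The idea of tracking entries that failed to parse, instead of silently dropping them, can be sketched as follows. This is not Cargo's code: `IndexScan`, `parse_line`, and `scan_index` are hypothetical names, and the toy `name version` line format is an assumption that stands in for real JSON parsing.

```rust
/// Outcome of reading an index file: the usable versions plus the lines
/// that could not be parsed, kept so an error message can mention them.
#[derive(Debug, Default)]
pub struct IndexScan {
    pub usable: Vec<String>,
    /// (1-based line number, reason the line was rejected)
    pub rejected: Vec<(usize, String)>,
}

/// Toy stand-in for real JSON parsing (an assumption for this sketch):
/// a line is accepted only if it has the form `name version`.
fn parse_line(line: &str) -> Result<String, String> {
    let mut parts = line.split_whitespace();
    match (parts.next(), parts.next(), parts.next()) {
        (Some(_name), Some(version), None) => Ok(version.to_string()),
        _ => Err(format!("expected `name version`, got `{}`", line)),
    }
}

/// Scan every line, remembering failures rather than ignoring them.
pub fn scan_index(contents: &str) -> IndexScan {
    let mut scan = IndexScan::default();
    for (i, line) in contents.lines().enumerate() {
        match parse_line(line) {
            Ok(version) => scan.usable.push(version),
            Err(reason) => scan.rejected.push((i + 1, reason)),
        }
    }
    scan
}
```

With the rejected lines retained, a resolver error could list them as candidates that exist but could not be used, rather than reporting that no matching version was found.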

I believe even older versions of Cargo fail even more catastrophically when an Index entry can't be parsed (e.g. #14237).

One route for handling that is we update the Cargo code to handle the fields being optional and let it sit that way for a significant period of time (1 year? 1 edition?) and then update crates.io to elide these fields only for new publishes and if the MSRV is high enough.

Speaking of those conditions for when to elide the fields, that makes this a bit more complicated for registries. They can't just use serde to handle this. Ideally, we share these data structures, so any complication that gets added to the data structures for these cases will also be applied to Cargo.
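
The two conditions for eliding fields (new publishes only, and a high enough MSRV) could be expressed as a small policy check that runs before serialization. This is a sketch under stated assumptions: `ElisionPolicy`, `first_tolerant_cargo`, and `may_elide` are hypothetical names, and the threshold version is a parameter because this thread does not state which Cargo release first accepts the omitted fields.

```rust
/// A `major.minor` toolchain version, e.g. `(1, 70)`.
pub type Version = (u32, u32);

/// Hypothetical policy inputs for a registry.
pub struct ElisionPolicy {
    /// Oldest Cargo assumed to accept entries with the fields omitted.
    /// Supplied by the caller; not taken from this discussion.
    pub first_tolerant_cargo: Version,
}

impl ElisionPolicy {
    /// Default-valued fields may be omitted only for a new publish whose
    /// declared MSRV is at least the first tolerant Cargo version.
    /// A package without a known MSRV always keeps the explicit fields.
    pub fn may_elide(&self, is_new_publish: bool, msrv: Option<Version>) -> bool {
        match msrv {
            // Tuples compare lexicographically: major first, then minor.
            Some(v) => is_new_publish && v >= self.first_tolerant_cargo,
            None => false,
        }
    }
}
```

Because the decision depends on data outside the entry being written, it cannot be captured by a serde attribute alone, which is the complication described above.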

@Eh2406
Contributor

Eh2406 commented Sep 6, 2024

Most of the existing fields are in fact optional, to handle index entries that meet the description provided by some old version of cargo. (Or more generally, for registries that do not retroactively update their index entries and were published by an old version of cargo.) The main exceptions are the ones that have been required since the very beginning. I would love for us to have clear documentation about what the impact of missing fields is across cargo versions. I think there are many more fields we could elide in crates.io without causing a problem. Third-party registries that have a significantly higher minimum compatible cargo could elide many more. I would love to see some of the original ones made optional, with appropriate documentation, even though I don't think it's likely that crates.io can take advantage of it. (We still do our best to support cargo from 1.0 - 1.19 and we keep the git and sparse index the same.)

Selfishly, in my own testing of resolution behavior I have a custom format based on the index where all fields are optional if they equal their default value. This makes it much easier to minimize a failing test case and only show the parts of the index entries that cause the problem. If making the fields optional in Cargo made the test code easier to write, that would be extremely helpful. As would a way to take a Cargo::Summary data structure and convert it to an index entry.
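
A format where fields are written only when they differ from their defaults can be sketched with a hand-written emitter. This is not the custom format referred to above: `Dep`, `quote`, and `to_minimal_json` are hypothetical names, and only the dependency fields discussed in this issue are modeled.

```rust
/// Hypothetical, simplified dependency entry with defaults already applied.
pub struct Dep {
    pub name: String,
    pub req: String,
    pub features: Vec<String>,
    pub optional: bool,
    pub default_features: bool,
}

/// Minimal JSON string quoting: escapes `"`, `\`, and control characters.
fn quote(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Emit a JSON object that omits every field equal to its default:
/// empty `features`, `optional == false`, `default_features == true`.
pub fn to_minimal_json(dep: &Dep) -> String {
    let mut fields = vec![
        format!("\"name\":{}", quote(&dep.name)),
        format!("\"req\":{}", quote(&dep.req)),
    ];
    if !dep.features.is_empty() {
        let items: Vec<String> = dep.features.iter().map(|f| quote(f)).collect();
        fields.push(format!("\"features\":[{}]", items.join(",")));
    }
    if dep.optional {
        fields.push("\"optional\":true".to_string());
    }
    if !dep.default_features {
        fields.push("\"default_features\":false".to_string());
    }
    format!("{{{}}}", fields.join(","))
}
```

A plain dependency then shrinks to its `name` and `req`, so a minimized test case shows only the fields that actually influence resolution.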

@kornelski
Contributor Author

kornelski commented Sep 6, 2024

> What makes it easier?

It's more consistent, and less picky about exact shape of the JSON. Currently there's an arbitrary mix of optional and non-optional fields.

> One route for handling that is we update the Cargo code to handle the fields being optional and let it sit that way for a significant period of time (1 year? 1 edition?) and then update crates.io to elide these fields only for new publishes and if the MSRV is high enough.

I think that's a good approach. The sooner Cargo makes these fields optional the easier it will be for crates.io to take advantage of it (eventually).

@ehuss
Contributor

ehuss commented Oct 8, 2024

The cargo team talked about this today and decided to accept this proposal, with the exception that it will not be done for crates.io for the indefinite future. This would potentially be useful for other registries, or for test files (such as those used in cargo's testsuite or pubgrub). The problem with crates.io is that older cargos ignore entries they cannot parse (the exact behaviors may need more investigation), which severely impacts error reporting. We could potentially revisit that in the distant future, possibly tied to msrv or edition, past the point where the number of people using old cargos is extremely small. This would also need to be very clearly documented to indicate the version where this support was added.

@ehuss ehuss added S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review and removed S-needs-team-input Status: Needs input from team on whether/how to proceed. labels Oct 8, 2024
kornelski added a commit to kornelski/cargo that referenced this issue Nov 19, 2024
github-merge-queue bot pushed a commit that referenced this issue Nov 19, 2024
Applies the patch from #14506 + documentation update.

The `kind` field has already been optional in the implementation, but
documented as possibly missing due to bugs. I've changed the
documentation to simply state it's optional.
@epage
Contributor

epage commented Nov 19, 2024

CC @Turbo87 as this proposes adjusting the Index schema (merged in #14838), though I don't think there is any immediate crates.io impact.

@Turbo87
Member

Turbo87 commented Nov 19, 2024

yeah, unfortunately it seems unlikely for us to be able to take advantage of this any time soon due to the reasons mentioned above, but thanks for the ping.

/cc @rust-lang/crates-io

@weihanglo
Member

Closed via #14838

@epage
Contributor

epage commented Dec 12, 2024

> The problem with crates.io is the way older cargos treat these entries as being ignored (and may need more investigation on exact behaviors), which severely impacts error reporting.

#14927 fixes the reporting and will start the (long) clock for potentially switching the index, provided that changing the index won't break very old versions of Cargo that aren't already broken.

epage added a commit to epage/cargo that referenced this issue Jan 8, 2025
This was inspired by a recent Cargo team discussion on whether we should
generally elide default values.
This will also help with https://rust-lang.github.io/rust-project-goals/2025h1/cargo-plumbing.html

Case studies in schema design:
- rust-lang#14506
- rust-lang#10543
epage added a commit to epage/cargo that referenced this issue Jan 9, 2025
This was inspired by a recent Cargo team discussion on whether we should
generally elide default values.
This will also help with https://rust-lang.github.io/rust-project-goals/2025h1/cargo-plumbing.html

Case studies in schema design:
- rust-lang#14506
- rust-lang#10543
github-merge-queue bot pushed a commit that referenced this issue Jan 9, 2025
### What does this PR try to resolve?

This was inspired by a recent Cargo team discussion on whether we should
generally elide default values.
This will also help with
https://rust-lang.github.io/rust-project-goals/2025h1/cargo-plumbing.html

Case studies in schema design:
- #14506
- #10543

### How should we test and review this PR?

### Additional information