Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0088] Nixpkgs Breaking Change Policy #88

Merged
merged 7 commits into from
Nov 17, 2021

Conversation

kevincox
Copy link
Contributor

@kevincox kevincox commented Mar 13, 2021

This specifies the expected workflow as a base for workflows and future automation.

Rendered

Copy link
Member

@FRidh FRidh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good initiative to take this further. Please do limit the line length to make it easier to comment.

The RFC does not go sufficiently into size/scope of PR's this RFC covers nor how it relates to the different development branches (master/staging/staging-next) we have. Motivations are missing for the specific steps in the procedure. Maybe they should be added there or in a separate section.

When talking about what is an important and unimportant change it would be very helpful to first have defined what is considered core and what not. I think it is difficult to define the policy well in this RFC without defining package tiers.

rfcs/0000-nixpkgs-breaking-change-policy.md Outdated Show resolved Hide resolved
rfcs/0000-nixpkgs-breaking-change-policy.md Outdated Show resolved Hide resolved
rfcs/0000-nixpkgs-breaking-change-policy.md Outdated Show resolved Hide resolved
# Unresolved questions
## Critical Packages and Tests

What if a breaking change breaks NixOS tests? There must be packages and tests so critical that we can not merge without them passing? In that case do we leave the PR open until fixed? or use the `staging` branch?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the scope of the change we are talking about? Staging is not for testing or dumping changes in the hope they get fixed; one is expected to follow their change and fix regressions especially when it has reached staging-next.

There are important NixOS tests and tests that are pretty unimportant.

Copy link
Contributor Author

@kevincox kevincox Jun 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are your suggestions? Should we have a list of tests that can't be marked broken? In that case I think we would need to block the initial PR.

@kevincox
Copy link
Contributor Author

Good initiative to take this further. Please do limit the line length to make it easier to comment.

If that is desired I can hard-wrap it on the next push.

The RFC does not go sufficiently into size/scope of PR's this RFC covers nor how it relates to the different development branches (master/staging/staging-next) we have. Motivations are missing for the specific steps in the procedure. Maybe they should be added there or in a separate section.

Right now I am focused on master and the -unstable channels which follow it. I am welcome to feedback on how this should apply to other branches but I am less familiar with them, (although IIUC they already require stability)

When talking about what is an important and unimportant change it would be very helpful to first have defined what is considered core and what not. I think it is difficult to define the policy well in this RFC without defining package tiers.

This is being discussed here: #88 (comment)

This specifies the expected workflow as a base for workflows and future automation.
@kevincox
Copy link
Contributor Author

Updated:

  • Hard wrap lines.
  • Remove many instances of master and refer to "target branch" according to existing policy for picking the target branch.
  • Update number in file name to the PR number.

dependent packages involved in the review to review if the breakage is
justified.
4. The maintainer will contact the maintainers of all dependent, broken
packages (herein called sub-maintainers).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
packages (herein called sub-maintainers).
packages (herein called sub-maintainers).
1. The sub-maintainers will be contacted preferably by the `github` field,
followed by the `email` field of [maintainer-list.nix
](https://github.com/NixOS/nixpkgs/blob/master/maintainers/maintainer-list.nix).
If neither of those fields are set than contacting the maintainer is not
required and the package should be marked as broken after 7 days.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we wait? Or remove this maintainer from the blocking list right away? What is the odds that they randomly notice that we are breaking their package?

Copy link
Member

@sternenseemann sternenseemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I'd say what this RFC lacks most of all is a clear definition of a “breaking change”. From reading it, I get the impression that it is used as “a change which causes a build failure in a reverse dependency”. Clarification of this would be important in my eyes.

Very important would be to clarify that this policy doesn't apply to the introduction of (partial) evaluation errors by e. g. changing an internal interface since we can't work around such a problem using meta attributes and the channel would be held back until all breakage would be fixed.

Additionally, I think we can not just ignore breakage for users, as outlined in a comment.

rfcs/0088-nixpkgs-breaking-change-policy.md Show resolved Hide resolved
6. The maintainer must merge the target branch into their PR branch.
7. The maintainer must mark all still-broken packages as
[broken](https://nixos.org/manual/nixpkgs/stable/#sec-standard-meta-attributes).
8. The maintainer can now merge to the target branch.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there should be someone with commit access assigned to managing this process if the maintainer of the derivation that causes the breakage doesn't have commit access, so we can guarantee these relatively strict time frames.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly hesitant to giving special privileges to changes that require dependent updates but maybe this is justified as packages with more dependents are likely more important.

I am also hesitant to put the full burden of merging these changes on an individual.

I guess this is blocking if this policy puts additional difficulties to getting the change merged. But as I see the additional burden is on the author/maintainer, not on the actual process of getting it merged.

rfcs/0088-nixpkgs-breaking-change-policy.md Show resolved Hide resolved
rfcs/0088-nixpkgs-breaking-change-policy.md Show resolved Hide resolved
rfcs/0088-nixpkgs-breaking-change-policy.md Show resolved Hide resolved
@lheckemann lheckemann added status: open for nominations Open for shepherding team nominations and removed status: new labels Apr 1, 2021
@edolstra edolstra changed the title Create RFC for nixpkgs Breaking Change Policy [RFC 0088] Create RFC for nixpkgs Breaking Change Policy Apr 29, 2021
@spacekookie
Copy link
Member

spacekookie commented May 27, 2021

The RFC is open for shepherd nominations btw

kevincox and others added 2 commits May 28, 2021 16:28
@lheckemann
Copy link
Member

Nominations for the shepherd team are welcome! @kevincox maybe you have someone in mind who you think could help move this forward?

@kevincox
Copy link
Contributor Author

@blaggacao would you be interested and able?

@blaggacao
Copy link
Contributor

blaggacao commented Jun 10, 2021

From RFC36 about Shepherd Team:

A team of 3-4 community members defined unanimously by the RFC Steering Committee, responsible for accepting or rejecting a specific RFC.

The resposibility of the team is to guide the discussion as long as it is constructive, new points are brought up and the RFC is iterated on and from time to time summarise the current state of discussion. If this is the case no longer, then the Shepherd Team shall step in with a motion for FCP.

I think I can do that and accept nomination.

@blaggacao
Copy link
Contributor

blaggacao commented Jun 10, 2021

@nrdxp you have valuable experience (bors, hydra) and also afaics aligned interests to further this RFC, would you accept nomination for Shepherd?

@Mic92
Copy link
Member

Mic92 commented Jun 24, 2021

We still need more nomination for this RFC to proceed.

@blaggacao
Copy link
Contributor

blaggacao commented Aug 29, 2021

Shepherds Meeting, Monday, 23th of August 2021

  • How to practically derive new failures? @jonringer
    • Open question for implementation of tooling. This RFC tries to agree on a policy that can then be implemented. @kevincox
  • Maybe start by suggesting a sample to build? @blaggacao
    • 3 hours of build time / 100 packages. Only very general guidelines when interpreting the policy. Effectiveness of this policy depends heavily on tooling support. @kevincox
  • Special role of passtru test? -> codify specific testing contracts in a complementary RFC. @jonringer
    • e.g. "Define passtru tests to secure detection of a "breaking" change according to this RFC"
  • Notification: Noise vs Signal ratio? @jonringer
    • Tooling will have to account for this problem in appropriate ways. The policy in this RFC can be interpreted as a "principle right to be notified" and a 7 day guarantee under certain scenarios. @kevincox
    • Maybe fundamentally conceptualize notification as a publisher endpoint that users can subscribe to and provide some optional additional notification middleware on top of that. @blaggacao
  • Automate detection of notifiable downstream packages shouldn't be too hard. @nrdxp
    • Though currently, none seem readily available.
  • Until tooling is in place, it might be impractical to hard-enforce this policy. However, that does not hinder adoption of this RFC. @kevincox
  • Do we observe different message classes, such as "build failure", "test failure" that we should react/notify differently to? @blaggacao
    • Possibly, but for the sake of simplicity they should all triggering the same 7 day grace period, if deemed eligible. @kevincox

In addition to the above discussion the existing feedback was summarily reviewed. Most of it was reported as addressed. I'd encourage the parties that have given feedback in the past to revise the document under the light of the above discussion and the recent changes and potentially renew their questions / observations to the author.

Otherwise, in my opinion, we might be slowly approaching FCP.

Please also consider expanding on the broader context of this RFC in the author's recent blog post: https://kevincox.ca/2021/08/21/nixpkgs-ci/

@jonringer
Copy link
Contributor

I think most contributors who have tried to push for large changes, and have needed to wait for input from many individuals will agree that some convention about a grace period is needed. In general, this should only affect PRs against staging, as most PRs against master can generally be tested more thoroughly, and the effects of the changes usually carry much lower risk of breaking downstream packages (because there's significantly less of them).

I think we should move this into FCP.

Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
@edolstra edolstra added status: in discussion and removed status: open for nominations Open for shepherding team nominations labels Sep 2, 2021
Co-authored-by: Théo Zimmermann <theo.zimmi@gmail.com>
@lheckemann
Copy link
Member

@jonringer has the shepherd team agreed to move this to FCP with disposition to accept? If so, let us know and we'll update the label.

@blaggacao
Copy link
Contributor

@lheckemann I confirm the shepherd team has agreed to move this to FCP with disposition to accept.

@jonringer
Copy link
Contributor

sorry, got buried under other notifications, yes. We would like to move forward, with disposition to accept.

@Mic92
Copy link
Member

Mic92 commented Nov 3, 2021

We are now in the Final Comment Period (FCP). Unless any major objections are raised, this will be accepted and merged in 10 days from now, on November 13th!

Please post any final comments on this pull request.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfc-0088-fcp-nixpkgs-breaking-change-policy/15876/1

Copy link

@rnhmjoj rnhmjoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with pretty much everything proposed here. It's good to have some formal guidelines and documentation to point users to. Two considerations:

  1. I think the grace period shouldn't be applied strictly. It should be a guideline or made flexible depending on the importance of the change and the sub-package: if a sub-package is not essential but important (some popular library, application, etc.) I'd rather delay a bit the update than to break it.

  2. I fear this could increase the burden on maintainers of mid to large packages (I mean those between being a leaf and core package). Those maintaining core packages are probably already well equipped and eat mass-rebuilds at breakfast, but not everyone else. It would be good if ofBorg, nixpkgs-review or some other new tool would automate doing some of the step in the proposed workflow before this RFC is enforced:

    • detecting broken packages
    • producing a list of contacts to ping the sub-maintainers
    • keeping track of who replied
    • marking packages as broken

5. The maintainer must provide a grace period of at least 7 days from when the
sub-maintainers were notified.
6. The maintainer must merge the target branch into their PR branch.
7. The maintainer must mark all still-broken packages as
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me if "still-broken" packages refers to all the packages or just those in the sample.
In any case, marking many broken packages (even if <100) for every PR seems like a lot of work: it would be good if this could be facilitated by a tool like nixpkgs-review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually it should be all packages. My not-so-secret goal is that one day every single package will be considered "channel blocking" unless it is marked broken, basically meaning that there are no unintentional breakages.

It is expected that tooling will manage this. This policy is not going to be enforced until we have made it reasonably easy to comply. This RFC is a first step to agree agree upon a policy first, then we can build up the tooling to implement the policy.

that a critical output file was not forgotten) Note that sometimes it **is**
necessary to break all dependent packages, and the maintainer is not required
to avoid this.
1. A sample of at least 100 packages or 3h of build time is recommend as a guideline.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a bit too demanding (for potato hardware like mine), but I guess it's ok for a guideline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with committers and active contributors to use my server: https://github.com/jonringer/server-configuration . If you're worried about compute. Since I'm not opening this up to the general public, it's not a solution, just a mitigation; but the offer is there.

Obviously, this is guideline is meant "within expectations" of a user's ability. Having some sort nixpkgs-review does make it a lot easier for reviewers to merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why we added the time limit and called it a guideline. The idea is even if you only own a potato we hope that you can spare 3h of your potato or find someone else to do so for big changes. If you think 3h is still to muck to ask I'd be happy to consider other thresholds, it was picked mostly arbitrarily to provide a rough target.

@7c6f434c
Copy link
Member

Sounds like nothing FCP-breaking has happened.


- Maintainers of dependencies have a clear framework for handling changes that
break dependants.
- Maintainers of dependants have a clear framework for how dependency breacking
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Maintainers of dependants have a clear framework for how dependency breacking
- Maintainers of dependants have a clear framework for how dependency breaking

- Avoid putting more burden than necessary on the dependency maintainer. If the
maintainers of core derivations face toil proportionally to the number of
transitive dependencies they will quickly become overloaded. These
maintainers are arguably the most critical to Nixpkgs and their load needs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
maintainers are arguably the most critical to Nixpkgs and their load needs
maintainers are arguably the most critical to Nixpkgs and their load needs

@spacekookie spacekookie merged commit a36c8e9 into NixOS:master Nov 17, 2021
@infinisil infinisil added status: accepted and removed status: FCP in Final Comment Period labels Aug 23, 2023
KAction pushed a commit to KAction/rfcs that referenced this pull request Apr 13, 2024
* Create RFC for nixpkgs Breaking Change Policy

This specifies the expected workflow as a base for workflows and future automation.

* Update rfcs/0088-nixpkgs-breaking-change-policy.md

Co-authored-by: asymmetric <lorenzo@mailbox.org>

* Reformat metadata

* Update and clarify.

- Highlight the high-level goal.
- Highlight the low-level goal of this specific RFC.
- Wording fixes.

* Implement suggestions from shepherds meeting.

* Set shepherd leader

Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>

* Typo fix

Co-authored-by: Théo Zimmermann <theo.zimmi@gmail.com>

Co-authored-by: asymmetric <lorenzo@mailbox.org>
Co-authored-by: Linus Heckemann <git@sphalerite.org>
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
Co-authored-by: Théo Zimmermann <theo.zimmi@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.