-
-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC 0172] Mergebot #172
base: master
Are you sure you want to change the base?
[RFC 0172] Mergebot #172
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
**Feature:** Enhance package meta fields to dictate mergebot behavior | ||
**Start Date:** `[Today's Date in YYYY-MM-DD format]` | ||
**Author:** Lassulus | ||
**Co-authors:** Scriptkiddi (additional co-author to be determined) | ||
**Shepherd Team:** `[To be nominated and accepted by the RFC Steering Committee]` | ||
**Shepherd Leader:** `[To be appointed by the RFC Steering Committee]` | ||
**Related Issues:** `[Links to implementation PRs will be added here]` | ||
|
||
## Summary | ||
|
||
This RFC proposes the introduction of four new merge strategies for `nixpkgs`, aiming to provide more nuanced merge permissions. | ||
One of the main goals of this RFC is to extend the capabilities of the mergebot beyond PRs of just r-ryantm. | ||
These strategies facilitate automatic updates and backports from trusted sources and allow maintainers the choice between requiring full consensus or just a single approval for merges. | ||
|
||
## Motivation | ||
|
||
![](https://pad.lassul.us/uploads/f24e0ff3-117a-4167-8f60-79ca508b8c1c.png) | ||
|
||
The current `nixpkgs` contribution model relies heavily on a group of *committers* (approximately 200 at the time of writing), who possess exclusive rights to merge PRs. | ||
This model has remained unchanged since its inception and poses limitations on community contributions and maintainer workload. | ||
Implementing granular merge permissions will enable a broader contribution base and lessen the burden on existing committers by empowering package maintainers with more control over their respective packages. | ||
|
||
## Detailed Design | ||
|
||
The current design of the mergebot is limited in functionality. To tackle this issue we propose to implement per package strategies. These would be set inside a meta attribute in the package. | ||
|
||
We propose augmenting package meta fields with a new attribute set that specifies a list of merge strategies. | ||
The mergebot will execute all listed strategies in parallel and will proceed with the merge upon one successful strategy, contingent on passing ofborg checks as well. | ||
|
||
![](https://pad.lassul.us/uploads/965545d8-8575-41a8-9746-6bcc66f8bb0e.png) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would like for this section/diagram to discuss/handle the situation of the PR contents changing, e.g. if the owner of the PR force-pushes. Currently the diagram is written under the assumption that the PR contents don't change at all after
In general, the potential threat should be acknowledged that the merge bot makes a decision on a PR state, and then a malicious PR author force-pushes a malicious change to trick the bot to merge something else. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The current implementation takes the current revision of the PR at the point in time when the merge is triggered. So for example: someone creates a PR, someones triggers the merge action, then after the merge action is triggered but before the merge is done, the author force pushes to the PR again with some malicious changes. In that case the merge would fail and buildbot would report that (github has this feature, that you can specify a revision in a merge action) |
||
|
||
### Proposed `mergeBot` Attribute Structure | ||
|
||
```nix | ||
meta = { | ||
mergeBot.strategies = ["automerge-updates", "full_consensus", "..."]; | ||
} | ||
``` | ||
### Merge Strategies | ||
Currently supported strategies by nixpkgs-merge-bot as of the time of writing: | ||
- **maintainer_update:** Currently, this strategy allows package maintainers to merge updates from r-ryantm, triggered by mention of the bot. | ||
|
||
Proposed strategies are: | ||
|
||
- **automerge_updates:** Merges PRs from r-ryantm automatically if ofBorg checks pass, triggered by PR creation or mention. We strongly recommend to have at least one passthrough test. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it make sense to fail the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is something we can decide by a case by case basis? like if there is no passthrough test defined, why was the strategy enabled for this package? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That's exactly my point - is there any case where this would be ok? If not, then making that strategy fail by design if no passthrough test is defined seems like a good choice to avoid this kind of mistake happening. Given that this only ever automerges r-ryantm updates, I'm ok with the current design, but I think being stricter here helps avoid mistakes and gives an incentive for package maintainers to make Nixpkgs more automated by contributing passthrough tests (this is a good thing IMO). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think there are certain packages where a delayed update would be more of a liability than an automerge. Like maybe the package is just some icon theme where there really is no testing needed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sometimes the tests of the package itself are exhaustive enough too. So I think it makes sense to not require anything extra per say. |
||
- **automerge_backport:** Automatically merges backport PRs if ofBorg checks pass, triggered by PR creation or mention. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think backports could often use an extra pair of eyes. However, they're also often pretty neglected... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. They should still follow the criteria for what is acceptable to backport. I think that requires a person to check. https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#changes-acceptable-for-releases There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. for certain packages it seems to be fine? like security relevant leave packages could always auto backport |
||
- **full_consensus:** Requires approval from all maintainers before merging. The bot will notify which maintainers' approvals are pending, triggered by PR approval. | ||
- **single_maintainer:** Allows a single maintainer's approval to trigger a merge, facilitated by a mention of the bot. | ||
Comment on lines
+41
to
+48
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These need to be specified more clearly. From just these it sounds like only There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes the idea is to not restrict the author for the other strategies. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh that is very easy to miss, you have to read between the lines to even notice it. I don't think it should be done like this and would definitely need to be justified. Honestly this RFC should primarily be just about the lifting of that restriction, because it's the only truly controversial part. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah, okay, I thought that was pretty clear. But yeah, that's the part worth discussing the most. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added now a sentence to the summary to make the goal clearer There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure how to feel about this. Wouldn't Comparatively, to become a maintainer, you have to "just" get a package accepted into nixpkgs. Once that's done, what's to stop anyone from slipping in a bad change unnoticed, due to the huge number of PRs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. well the maintainer has to introduce that code into his own package. So not sure what the attack vector would be here. like the maintainer probably maintains that package alone (otherwise there would be probably a different merge strategy) and probably introduced it to nixpkgs, so slipping in code is possible through other a bit more obfuscated means. In my opinion the attack vector from outdated packages is far bigger than the ones from maintainers backdooring the packages they maintain. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Worst case scenario I could think of from the top off my head is someone submits a package for a new crypto wallet, then once it's adopted by a number of users, "secretly" (in a way that no one notices, anyway) introduces code that exposes private keys or siphons off funds. Something similar has happened before in the snap store, so it's not a very far-fetched scenario. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We discussed the requirements to become a package maintainer as well. They will probably also need to prove some track record. But their scope will be also more limited than full committers, so they can actual less damage than us having to give out commit rights. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that we currently have no policy in place to prevent people from adding themselves to maintainer entries, even if they make no other meaningful changes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure. This policies we need is what this RFC is about. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think such "vetted maintainers" should be in addition to the type of maintainers we have now, not in replacement. It should stay possible to be a random driveby contributor, as those are also very useful |
||
|
||
### Limitation to `pkgs/by-name` Directory | ||
|
||
To ensure accuracy in processing PRs, only those within the `pkgs/by-name` directory will be considered, due to the challenge of verifying if a PR solely affects the intended files. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should be stated somewhere: Does the nix language guarantee that a limitation to a certain file (or subtree) prevents me from overriding packages somewhere else in the evaluated package attribute tree? Or could a malicious users pull tricks to write magic trickery ala { pkgs, lib, ...}:
(some Nix builtin that allows injection into the final global attrset here, or some nixpkgs magic attributes that nixpkgs interprets)
(mkDerivation
# ...
) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. afaik pkgs/by-name limits exactly that. so there are no side effects on other packages. |
||
|
||
### Future extensions | ||
|
||
For now we propose that new merge strategies or significant changes to current ones go through their own RFCs. The mergebot team is deciding what is deemed significat in that case. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe add that even minor changes of interpretations of the existing strategies should give some amount of notice (repository issues/thread on the dominant discussion platform) before going live? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, probably new releases on the mergebot repository would be a good idea for this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I mention Discourse because it reduces the need to consciously subscribe to yet another thing elsewhere to get notifications. Another thing is — I think a tracking issue+heads-up thread when the development is starting is better because you can promise to give ten days for feedback with less slowdown (development and testing and preparations for deployment are not just an instant). Also if a change is not universally perceived as small, when the discussion is known to have time to happen we get a better chance to avoid high-tension discussions. |
||
|
||
## Examples and Interactions | ||
|
||
1. **Single Maintainer Mode:** A maintainer of the `nano` package, set with `strategies = ["single_maintainer"]`, can initiate a merge by commenting with a specific bot command. The bot merges the PR if the ofBorg checks successed. | ||
|
||
2. **AutoMerge Mode:** An update PR for `ttyplot` created by r-ryantm, with `meta.mergeBot.strategies` set to `[ "automerge_updates" ]`, gets automatically merged after successful ofBorg checks. | ||
|
||
## Drawbacks | ||
|
||
- **Security Risks:** There's a potential for malicious actors to manipulate the merge bot to authorize merges improperly. A mitigation for these pull requests could be running validity checks (e.g file size limits, valid nix files, ..) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this risk should require more careful consideration of how merges get done from authors other than r-ryantm. My initial proposal is:
The main concern is "this will make merges take longer!". My view is that the mergebot helps us increase throughput of changes merged, which also has the effect of decreasing latency (they're somewhat connected), and it's ok to have the latency at a minimum of 24h (due to the hold period) because the throughput is still increased and changes are still automatically getting merged. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we can create a label which the bot tags the PR with and then would do a merge after the holding period. But I think that also conflicts with other peoples workflow. Maybe this is something we want to enable on a per package basis? but not sure how to expose the interface to configure something like that without making it complex There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you elaborate on how this conflicts with existing workflows? From my perspective, this mergebot only creates entirely new workflows (non-committers being able to merge changes), so there's no way this can create a conflict. If a committer wants to get the change merged straight away without waiting for the holding period, they can just merge themselves without the help of the bot, which is the same as the current workflow. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I didn't mean existing workflows, but what people told me they would expect by the mergebot. So it wouldn't disrupt current operations but would be a point where opinions will differ :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Security will almost always be a trade-off with convenience. I think adding this holding period is a good approach to mitigate the risk that the RFC points out, while not being too complicated to implement. The suggestion of file size limits and valid nix files doesn't seem to be as flexible and cover as many cases. I'd like to get more opinions on this, hoping more people end up commenting. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hmm, depending on how the bot is manipulated the attacker could just merge the PR directly? So it would be better to have a review log of all the actions the bot has taken instead of relying on a waiting period which could have no effect in an attack? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you elaborate on what attack vector you have in mind, and what bot you're talking about (mergebot or r-ryantm)? I think if there's some case where the mergebot can be manipulated to merge the PR directly when it shouldn't, it should definitely be addressed! I had the following assumption during all my other comments: r-ryantm's PRs are "safe". There are ways that code upstream can be malicious, but this is a problem outside r-ryantm. r-ryantm itself introducing malicious changes that don't come from upstream is possible, but not something this RFC should worry about. As for other people introducing malicious changes:
I don't think that is better, but a complementary approach, since one concerns changes before they're merged and the other concerns them after they're merged, and the actual merging is what's the issue here.
I think most people would agree that delaying a possible attack by at least 24h seems like a pretty good mitigation. Unless you have something to back your claim, I think it is weak. Having said all of that, the strongest argument that I can see against the holding period is this: In case something malicious gets committed, I see two approaches to "fix" it:
Given this, I'd say anything we can do to prevent this from even happening is good. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not against the holding period, it just doesn't fix the security issue addressed in this part of the RFC. What you describe is another issue, that of maintainers merging malicious changes with this bot. What the issue describes is, for example, someone faking the PR in a way that the bot thinks it's a r-ryantm PR (with only pkgs/by-name changes) and merging it therefore directly. If maintainers merge malicious code, a 24h period could help against that, but then someone would have to check those PRs in the 24h period. The impact also shouldn't be that in big in that case. since the strategy has to be explicitly enabled for a package and if we see malicious action, we either would disable that strategy again on that package, remove the malicious actor from maintainers (and probably all of nixpkgs?) or both. Also we probably would enable maintainer strategies only on low impact leaf packages. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've scraped multiple attempts at a comment on this because none actually described what feels off about holding period.
I think this is the crux of the issue. If 99% PRs are totally benign and get merged with maintainer approval, that already means people are going to learn to ignore those, because there's nothing to see, really. In case of malicious maintainer, they'll have to make it look benign, possibly collude with upstream, at which point a curious person actively has to go look in the package changelog within the holding period. Even that is sure enough to fail because of the sheer number of times I've seen commit message with 'fix'. In short, I do not think a holding period is going to stop a malicious maintainer. It will be de-facto annoyance to the rest of the folks who are just waiting for the period to be over. What the mergebot is intended to do in this RFC, IMO does not get any help from a holding period, whether 1h or 1w. Just my 2c There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, my interpretation of that part of the RFC wasn't the interpretation you intended. I think calling out the issue you describe doesn't make sense. It's essentially saying "there could be a bug in the code". Obviously the code will try to be as robust as possible :) Regarding the holding period: I still think it has value, but my motivation to argue for it is gone. My last breath: everyone who will benefit from the non-r-ryantm merging strategies already waits a long time in their PRs. I don't have the data (and would love if someone else had), but from anecdotes all over, it already takes over 24h for people's PRs to get merged (which is what I argued before - reducing the >24h wait to ~24h will still make it fast enough and increase throughput). This bot isn't replacing committers, it's increasing merging throughput. Saying that this will be an annoyance given that this won't make any existing PR workflow slower is silly. |
||
- **Reliability Concerns:** A malfunction in the merge bot could disrupt numerous contributors' workflows. | ||
- **Code Quality** As no review is done by seasoned commiters, the code quality could drop. We want to face that by holding monthly meetings and review the merged PRs to see if the code quality drops and improving on the bot in that case. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe some public audit log could be useful for this purpose. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe we can use the github api to show every invocation of the mergebot. Or maybe we should collect all of them in the mergebot directly and show them on a webpage? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Showing them on a webpage or so could be handy, I think. It makes it easy to quickly review some of the actions taken. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would be very useful to do that in a way that it does not query the Github API live, but stores the audit log in some way independent of the nixpkgs repo. In the xz backdoor situation, Github took down the xz repo, which had the effect that people were hindered investigating the compromise. If the merge bot is abused, it would be great to be able to have those logs even if the Github live API is not available for this (just saving the audit log somewhere else in some reasonably frequent interval would be enough to do that). |
||
|
||
## Alternatives | ||
|
||
- **Maintain Current Model:** This approach may not be sustainable long-term due to scalability issues. | ||
- **External Tools:** Utilize third-party GitHub actions or apps, like Mergify, and adapt them to our needs. | ||
|
||
## Prior Art | ||
|
||
https://github.com/NixOS/rfcs/pull/50 | ||
https://github.com/NixOS/rfcs/pull/128 | ||
|
||
## Unresolved Questions | ||
|
||
*To be completed* | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If a rev-dep maintainer thinks that the dep's mergebot policy is too lax — do we recommend having a pin-like copy with a different policy, or tightening the policy for the main package? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would suggest tightening it for the main package? like all people depending on a package should be fine with the strategy enabled? and duplicating a package seems like overhead There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that this is probably the best trade-off, at least in the beginning, but then it needs to be spelled out in the written policy, as it basically says potentially many people have the right to unilaterally deprive a maintainer of a useful tool. |
||
|
||
## Future Work | ||
|
||
- **Strategy Expansion:** Developing additional merge strategies and drafting RFCs to introduce them. | ||
- **Enhanced Documentation:** Creating comprehensive documentation for the new merge strategies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your nominations!