[RFC 0087] Promote aarch64-linux to Tier 1 support #87
Suggested in [nixos-org-configurations#142](NixOS/infra#142). `aarch64-linux` gathers enough attention to receive a promotion to a supported architecture? Rendered: TBA
Thanks @grahamc Co-authored-by: Graham Christensen <graham@grahamc.com>
It's mostly perceived coverage due to channel bumps waiting for stuff as @samueldr noted
Co-authored-by: Graham Christensen <graham@grahamc.com>
Thank you for helping me with this RFC ✨
Co-authored-by: Graham Christensen <graham@grahamc.com>
rfcs/0087-aarch64-tier1.md
Outdated
to block channel advances in case of failures.

Merging this RFC should happen simultaneously with the merging of documentation and perhaps
a NixOS module for configuring qemu-binfmt as an aarch64 builder on x86_64 machines.
At one time in the past, building with qemu-user+binfmt was iffy. It might have been specific to armv6 though. I personally wouldn't consider suggesting qemu-user+binfmt until it has been shown that a full rebuild without nixos.org cache works for aarch64.
Sometimes it is iffy for me on aarch64 or armv7l IIRC. I remember not being able to build some packages with emulation, having to resort to natively building them. Sadly I don't remember which ones 😢
I can add a note about setting up some sort of sub-project to track down and debug these issues, but I'm afraid this might be out-of-scope for this RFC. Should I?
Maybe a recommendation that is unrelated to the RFC?
Relatedly, I would prefer that the wording doesn't suggest qemu-user+binfmt as an approved method of testing aarch64-linux. But I don't know how to phrase any of this :/.
Ugh, I'm also confused around precise phrasing for this, but I'll probably need to change the wording to decrease the importance of that binfmt bit in the RFC manuscript somehow. Let me try something...
No worries, I don't know either, let's hear what other people think about how to approach suggesting-but-not-ratifying-qemu+binfmt.
I added a note in Future work suggesting setting up an effort to track down emulation errors: 6c46f9e
Note that we have a strong precedent in favor of this: In 2017: In 2018: Which was changed to limited support due to technical issues with the Hydra evaluator. Limited support is our current status on the situation. Note that the Hydra evaluator has been rewritten since. It is extremely likely it will now evaluate fine as a supported system.
There are sightings of build failures when using emulation; these need to be tracked down (but that's out-of-scope for this RFC).
@samueldr I added the prior art to the Motivation section. Please tell me if it's more appropriate elsewhere.
I forgot it!!!
rfcs/0087-aarch64-tier1.md
Outdated
# Alternatives
[alternatives]: #alternatives

Create an aarch64-focused channel that would build same things current `unstable` does, but for aarch64 only. This has a significant drawback: it is possible for the x86_64 channel and the aarch64 channels to never pass on the same commit, making deployment to a heterogeneous cluster of x86_64 and aarch64 machines very challenging.
I don't find this argument very persuasive. Track `master` instead and let your own CI perform tests that are relevant for you. The amount of rebuilding you get on master will typically be small. Same goes for stable.
Where would one run this CI? Not everyone has a Raspberry Pi or a whole cluster connected to SSDs for fast builds (I don't build on my Raspberry Pi for this reason - the builds are slower than if I'd do it on my laptop via emulation, and Hydra would be simply a lot faster than any laptop in existence, speeding it up further), and those Raspberry Pi users who run just one or two would be part of the target audience of this RFC.
Asking people to run their own CI for what is supposedly still a Linux distribution for users is not a good argument. Many of us who are involved in the project do that for our personal stuff. I started the discussion about having the channel because someone who had just started out with NixOS (on a RPi) was having issues with lots and lots of rebuilds on their little machine.
Making adoption easier is one of the goals we should be aiming for. Asking for huge leaps (no FP background to full custom Nix CI...) is not a good way.
# Alternatives
[alternatives]: #alternatives
I am of the opinion that a separate channel should be the first step. Show there are enough resources and commitment to keep it green for half a year to a year. If that is the case, we can upgrade it to Tier 1. I don't think there is generally enough understanding of how much day-to-day effort goes into actually keeping the channels green. At the same time, I have no idea how little breakage there is with aarch64 nowadays.
I added your suggestion to the manuscript.
The channel already blocks on aarch64-linux since late 2018 for the limited support set. So I guess we're ready to upgrade it to Tier 1 since things were kept green for half a year to a year.
It's all about getting the jobs to be attempted at all, not about adding new blockers.
Might be worth specifying which channel we are talking about here. If you mean the unstable channel: it mostly works. If you mean the stable channel: It has been blocked for a few days when I opened the channel PR and probably nobody noticed yet.
A separate channel could let us gauge how much work we'll need to keep the channel green. Co-authored-by: Frederik Rietdijk <fridh@fridh.nl>
Added a note that we didn't fully disable builds, just demoted the architecture to partial support. Co-authored-by: Samuel Dionne-Riel @samueldr
I think the note in Detailed design about getting AWS instances in case we need burst capacity solves the question about having enough CPU power.
@dhess The "advance notice" is part of the amended manuscript I'm working on right now - I'm planning on adding a proposal to ping everyone on the PR that would implement the RFC, but we could also bring the people in charge on board right here in this pull request by pinging them, if this is acceptable. (My first thought would be to ping our maintainer team but then I realized it has 1630 people in it and quickly scrapped that idea - targeted pings on
A small detour into Nixpkgs maintainer lists has given me the following snippet that will print out a list of people we might need to ping first, comprised of maintainers of packages I remember being a part of stdenv and all the packages included with a default NixOS install (see Nix snippet (launch with
- Added suggestions on how to alleviate increased work on maintainers of critical packages in case of a platform-specific breakage based on prior art in RFCs
- Added an alternative way to solve the problem discussed in NixOS#87 (comment)
- Added counterarguments to the drawback section discussed in the first meeting
I'm certainly not the expert! Maybe @domenkozar or @grahamc can weigh in? My non-expert opinion: it seems likely that most of the packages you've identified there won't significantly be impacted by The ones that I would personally start with are compilers, languages, and associated tooling (looks like
First round of updates based on meeting notes seems to be done. Please pester me if I missed anything. @dhess language-specific components do seem like a good starting point. I will see if the list can be updated to include things like Rust and compilers for other languages I know of and Nixpkgs has in its collection.
Yeah, that's a good point. Even though Python, Rust, Go, & Haskell (just to name a few examples) aren't part of
Given the silence, may I suggest that the shepherd team (@samueldr @kloenk @dhess @grahamc) hold the next meeting sometime soon? On the agenda will probably be polishing the list of critical packages and their maintainers. I took the liberty of creating a when2meet page for it: https://www.when2meet.com/?13805346-cScpg. My availability is slightly uncertain and I will probably adjust if it turns out that people agree that some other time is preferable. I will also advertise the RFC in the relevant chatroom to bring slightly more attention to it - I'm afraid some might've forgotten about its existence due to my unfortunate radio silence (sorry!)
Oh. Additionally, I've had a brilliant idea and I need to write it down so that 1) I don't forget it and 2) we may discuss it! How about just before stabilizing, we'll hold a one-off ZHF targeted on the
To the shepherd team (@samueldr @grahamc @kloenk @dhess): the when2meet page seems to have some good options for a meeting on Dec 11, 12, 18 and 19, from 4:30 PM to 9:00 PM (all times are UTC+0300). Are any of the datetimes mentioned above OK for us? We probably need to pick one for the second meeting (not urgent, just would be nice to know for planning)
Any of those times work for me.
Did the meeting happen?
Nope.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/future-of-channels-and-channels-nixos-org-in-a-flakes-world/11563/13
Any updates?
If there's been any progress on this issue, I'm unaware of it.
I would like to step down as the main author of this RFC, as I am unable to fulfill my role due to personal circumstances.
Vika
If anyone is willing to take over this RFC please feel free to do so. Especially @grahamc as you are listed as a co-author. Note that there may be some permissions issues due to GitHub write permissions. If that is the case it may be easier to close this PR and open a new one. If no one steps up the best option is probably to close this PR for now and it can always be used as a base if someone finds time to drive it.
I think this RFC is really important, and I'm happy to continue to participate as a shepherd, but unfortunately I don't have the time to drive it.
Hi, we've decided to close this RFC for now. If somebody wants to step up to drive this RFC forward, please let us know and we can reopen it!
Suggested in nixos-org-configurations#142.
Did `aarch64-linux` gather enough attention to receive a promotion to a supported architecture? Maybe it should block channels now, since a lot of users are requesting a channel for aarch64 builds and support is improving. The only question is: is `aarch64-linux` stable, and are build times swift enough to not be a deadweight on `x86_64-linux`-based channels?
Rendered: https://github.com/kisik21/rfcs/blob/patch-1/rfcs/0087-aarch64-tier1.md
Accepting this RFC will make NixOS/infra#142 and NixOS/nixpkgs#83049 obsolete.