Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/stage-1: fix predictable interface names in initrd #79532

Merged
merged 5 commits into from
Mar 2, 2020

Conversation

fpletz
Copy link
Member

@fpletz fpletz commented Feb 8, 2020

Motivation for this change

This makes predictable interfaces names available as soon as possible with udev by adding the default network link units to initrd which are read by udev. Also adds some udev rules that are needed but which would normally loaded from the udev store path which is not included in the initrd.

Also ensures that the interfaces are unconfigured before switching to stage 2, switches to klibc's ipconfig due to more features and make boot.initrd.network.postCommands run regardless if network setup has succeeded.

Fixes #74328 #75314 #71447.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@fpletz fpletz requested review from flokli and arianvp February 8, 2020 12:47
@fpletz fpletz added 0.kind: enhancement Add something new 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS labels Feb 8, 2020
@fpletz fpletz added this to the 20.03 milestone Feb 8, 2020
@fpletz
Copy link
Member Author

fpletz commented Feb 8, 2020

@GrahamcOfBorg test predictable-interface-names initrdNetwork initrd-network-ssh

@ofborg ofborg bot added the 8.has: module (update) This PR changes an existing module in `nixos/` label Feb 8, 2020
This makes predictable interfaces names available as soon as possible
with udev by adding the default network link units to initrd which are read
by udev. Also adds some udev rules that are needed but which would normally
loaded from the udev store path which is not included in the initrd.
Depending on the network management backend being used, if the interface
configuration in stage 1 is not cleared, there might still be some old
addresses or routes from stage 1 present in stage 2 after network
configuration has finished.
@fpletz fpletz force-pushed the fix-predictable-ifnames-in-initrd branch from ab02ac4 to ea7d024 Compare February 8, 2020 13:05
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Feb 8, 2020
@flokli
Copy link
Contributor

flokli commented Feb 8, 2020

From a quick glace, this looks like the right way to do (at least until we have systemd in initrd).

Can somebody with a "push luks keys over ssh in initrd" setup test this? I don't currently have such a setup around, but we should make sure it now works as intended.

Also, we should add something to the release notes, so people who are using such a setup and implemented their workarounds are aware names have changed and don't lock themselves out.

Copy link
Member

@Ma27 Ma27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a setup which is fairly similar to nixos/tests/initrd-network-ssh and tested the first commit on multiple servers, didn't find any regressions :)

@flokli
Copy link
Contributor

flokli commented Feb 8, 2020

@petabyteboy @schmittlauch @erikarvstedt @emilazy could you also give this a very firm look?

This apparently has features that the version from Arch's
mkinitcpio-nfs-utils does not have. Fixes #75314.
As outlined in #71447, postCommands should always be run if networking
in initrd is enabled. regardless if the configuration actually
succeeded.
@fpletz
Copy link
Member Author

fpletz commented Feb 8, 2020

cc @mweinelt for #71447

@emilazy
Copy link
Member

emilazy commented Feb 8, 2020

I'll try and take a look at this soon -- for what it's worth, I personally successfully use this module and it's the closest thing I've seen to a Correct™ solution for this issue.

@fpletz
Copy link
Member Author

fpletz commented Feb 8, 2020

@emilazy Those changes are actually included here but with some more fixes for stuff I encountered while reviewing the code. Just didn't notice @arianvp's PR earlier unfortunately.

@emilazy
Copy link
Member

emilazy commented Feb 8, 2020

I'm referring to @ju1m's module here, not @arianvp's PR, which IIRC didn't quite work for me.

@flokli flokli added the 1.severity: blocker This is preventing another PR or issue from being completed label Feb 8, 2020
@flokli
Copy link
Contributor

flokli commented Feb 8, 2020

Marked this as a blocker, we should really try to get this fixed for 20.03.

@ofborg ofborg bot added 8.has: package (new) This PR adds a new package 11.by: package-maintainer This PR was created by the maintainer of the package it changes 10.rebuild-linux: 11-100 labels Feb 8, 2020
@flokli
Copy link
Contributor

flokli commented Mar 2, 2020 via email

@Ma27
Copy link
Member

Ma27 commented Mar 2, 2020

and I'd also have preferred if those were part of the initial PR

Sorry for that, hit the PR-button too early :/

I'd personally say that's fine to backport as well

You're right, just missed it!

@worldofpeace
Copy link
Contributor

Backported the following patches:

* [332d731](https://github.com/NixOS/nixpkgs/commit/332d731a7a48e79acda2e4ce79f1245999a5a2e4) (nixos/stage-1: fix predictable interfaces names)

* [3206aa9](https://github.com/NixOS/nixpkgs/commit/3206aa985af95d48fde39c3613461882106cc7d4) (klibc: 2.0.4 -> 2.0.7)

* [173c771](https://github.com/NixOS/nixpkgs/commit/173c7715dedc9c88f54e970d70464319fe93c04d) (nixos/initrd-network: use ipconfig from klibc)

* [0bdf352](https://github.com/NixOS/nixpkgs/commit/0bdf352a0589252cf63ebbf50944d213f4cd6447) (nixos/release-notes: mention fix for predictable network-interfaces in initrd)

After re-reading @fpletz previous comment, I figured that the ipconfig-related fix is also part of the bugfixes and the main breaking change that should not be backported is the flush of the interfaces after stage-1.

Btw, please PR larger backports.

@Ma27 Ma27 added 8.has: changelog 8.has: port to stable A PR already has a backport to the stable release. and removed 9.needs: changelog 9.needs: reporter feedback This issue needs the person who filed it to respond labels Mar 2, 2020
@Ma27
Copy link
Member

Ma27 commented Mar 4, 2020

Sorry for the delay, currently preparing a PR to backport 5897899.

Ma27 pushed a commit to Ma27/nixpkgs that referenced this pull request Mar 4, 2020
As outlined in NixOS#71447, postCommands should always be run if networking
in initrd is enabled. regardless if the configuration actually
succeeded.

(cherry picked from commit 5897899)

The backport of this patch has been requested in NixOS#79532[1]. The diff is
slightly off the original commit since some changes from
ea7d024 were needed, however this
commit shouldn't be backported as it potentially breaks existing setups.

[1] NixOS#79532 (comment)
@Frostman
Copy link
Member

@fpletz does it move building bridges to stage-1 as well? I have .useDHCP=true only for br0 and trying to understand if it'll continue working after switching to 20.03. Thanks.

@Nadrieril
Copy link
Member

Hum, this seems to impose predictable interface names in stage-1 even if the main OS has predictableInterfaceNames = false. This has just cause my server to fail to get the interface up in initrd because I expected eth0 to exist. This should at least be mentioned in the release notes I think, it's really breaking

@flokli
Copy link
Contributor

flokli commented Jul 11, 2020

@Nadrieril can you show us a minimal NixOS configuration reproducing this?

There's nixos/tests/predictable-interface-names.nix which should discover exactly this, and it's not failing…

@Nadrieril
Copy link
Member

Hmm, this happened on two of my servers so it's not just a fluke and probably not hardware-related either. But it's difficult to minimize, I can't just break my servers.
The one thing that's weird is that I did not have usePredictableInterfaceNames = false, yet the servers used to use eth0-like interfaces in normal use. So it could also be that I was relying on an old bug that has been fixed in the past year. If no one else has that problem then I won't bother trying to understand this mess.

@Ma27
Copy link
Member

Ma27 commented Jul 14, 2020

That's most likely the case: without this PR, NixOS had eth0-like interfaces when configuring a network in the initrd.

@Nadrieril
Copy link
Member

Ohh, you mean the interface set up in initrd would carry on to the booted system? That explains a lot!

@ju1m
Copy link
Contributor

ju1m commented Jul 14, 2020

@Nadrieril yes the interface names are kept from stage-1 to stage-2, AFAIU there's no more than a pivot_root between the two, no rmmod and modprobe that would reset the interface names. I've explained this in detail in this proposal #68953 (comment) ; note that I'm still using this code and it still works for me.

@flokli
Copy link
Contributor

flokli commented Jul 14, 2020

So this PR actually fixed a bug we didn't know about - even with networking.usePredictableInterfaceNames set to true (the default), doing things with them in initrd caused them to stick with their non-predictable name.

I don't think that's a bug we should consider a feature and apply workarounds for ;-)

We should however update the release note entry explaining the possibly changed behaviour, so people reading those are made aware.

@flokli
Copy link
Contributor

flokli commented Jul 14, 2020

Ah, nvm - we already refer to this PR in the release notes, and as this landed in 20.03 already, the amount of people still running into it while upgrading should be somewhat small.

@Ma27
Copy link
Member

Ma27 commented Jul 14, 2020

So this PR actually fixed a bug we didn't know about - even with networking.usePredictableInterfaceNames set to true (the default), doing things with them in initrd caused them to stick with their non-predictable name.

Well, actually this was one of the main issues before this PR (and one of the biggest challenges I had while trying to upgrade to 19.09).

The problem was that [they] did not have usePredictableInterfaceNames = false, so AFAICS everything works as expected now, right?

@leo60228
Copy link
Member

Am I doing something wrong or does this break setting interface names with systemd link units?

@mweinelt
Copy link
Member

mweinelt commented Aug 22, 2020

A common pitfall with using network interfaces in initrd is that you need to set them admin down, because you cannot rename interfaces that are admin up. Is the interface properly set to down?

Does enabling this block fix your issue? https://github.com/NixOS/nixpkgs/pull/79532/files#diff-5c6eca5d6ffea3065ed16561bec3ca9dR139

rnhmjoj added a commit that referenced this pull request Feb 20, 2021
systemd-udev-settle is a terrible hack[1] and should never[2] ever[3]
used, seriously it's very bad. It was used as a stop-gap solution for
issue #39069, but thanks to PR #79532 it can be removed now.

[1]: systemd/systemd#7293 (comment)
[2]: #73095
[3]: #107341
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: enhancement Add something new 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: changelog 8.has: module (update) This PR changes an existing module in `nixos/` 8.has: package (new) This PR adds a new package 8.has: port to stable A PR already has a backport to the stable release. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 11-100 11.by: package-maintainer This PR was created by the maintainer of the package it changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Initrd vs predictable interface names