Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/{networkd,dhcpcd}: remove udev-settle hack #107382

Merged
merged 6 commits into from
Feb 20, 2021

Conversation

rnhmjoj
Copy link
Contributor

@rnhmjoj rnhmjoj commented Dec 22, 2020

Motivation for this change

systemd-udev-settle is a terrible hack1 and should never2 ever3
used, seriously it's very bad. It was used as a stop-gap solution for
issue #39069, but thanks to PR #79532 it can be removed now.

Things done
  • Tested via:
    • nixosTests.networking.networkd
    • nixosTests.networking.scripted
    • nixosTests.predictable-interface-names
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@rnhmjoj rnhmjoj requested review from fpletz, flokli, srhb and xeji December 22, 2020 10:03
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/boot-faster-by-disabling-udev-settle-and-nm-wait-online/6339/4

@ofborg ofborg bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Dec 22, 2020
@flokli
Copy link
Contributor

flokli commented Dec 22, 2020

Shower thought: Do we copy and apply all udev rules into initrd?

I'm a bit worried about 70-persistent-net.rules-style udev rules not getting applied before dhcpcd starts doing its thing. Such a rule is mentioned as an example in nixos/modules/services/hardware/udev.nix for services.udev.extraRules

The systemd-udev-settle.service might be what's preventing this from happening for now.

We should add a test for this, to ensure it keeps working.

@peterhoeg
Copy link
Member

There is some additional information here: #25311

@rnhmjoj
Copy link
Contributor Author

rnhmjoj commented Jan 7, 2021

I run more tests today and everything looks fine:

  • nixosTests.containers-restart_networking
  • nixosTests.initrdNetwork
  • nixosTests.installer.bcache
  • nixosTests.installer.btrfsSimple
  • nixosTests.installer.btrfsSubvolDefault
  • nixosTests.installer.btrfsSubvols
  • nixosTests.installer.encryptedFSWithKeyfile
  • nixosTests.installer.grub1
  • nixosTests.installer.luksroot
  • nixosTests.installer.luksroot-format1
  • nixosTests.installer.luksroot-format2
  • nixosTests.installer.lvm
  • nixosTests.installer.separateBoot
  • nixosTests.installer.separateBootFat
  • nixosTests.installer.simple
  • nixosTests.installer.simpleLabels
  • nixosTests.installer.simpleProvided
  • nixosTests.installer.simpleSpecialised
  • nixosTests.installer.simpleUefiGrub
  • nixosTests.installer.simpleUefiGrubSpecialisation
  • nixosTests.installer.simpleUefiSystemdBoot (fails as of 7b72e86, pass after rebase)
  • nixosTests.installer.swraid
  • nixosTests.installer.zfsroot
  • nixosTests.networking.networkd.bond
  • nixosTests.networking.networkd.bridge
  • nixosTests.networking.networkd.dhcpOneIf
  • nixosTests.networking.networkd.dhcpSimple
  • nixosTests.networking.networkd.link
  • nixosTests.networking.networkd.loopback
  • nixosTests.networking.networkd.privacy
  • nixosTests.networking.networkd.routes
  • nixosTests.networking.networkd.sit
  • nixosTests.networking.networkd.static
  • nixosTests.networking.networkd.virtual
  • nixosTests.networking.networkd.vlan
  • nixosTests.networking.scripted.bond
  • nixosTests.networking.scripted.bridge
  • nixosTests.networking.scripted.dhcpOneIf
  • nixosTests.networking.scripted.dhcpSimple
  • nixosTests.networking.scripted.link
  • nixosTests.networking.scripted.loopback
  • nixosTests.networking.scripted.macvlan
  • nixosTests.networking.scripted.privacy
  • nixosTests.networking.scripted.routes
  • nixosTests.networking.scripted.sit
  • nixosTests.networking.scripted.static
  • nixosTests.networking.scripted.virtual
  • nixosTests.networking.scripted.vlan

@flokli
Copy link
Contributor

flokli commented Jan 9, 2021

@rnhmjoj can you add a test for the services.udev.extraRules rename example into nixos/tests/networking.nix? I assume you could just add another interface (via the vlan argument), and ensure it gets renamed via the udev rule.

@rnhmjoj
Copy link
Contributor Author

rnhmjoj commented Jan 9, 2021

@flokli Thank you for the suggestion: there is indeed a problem here. I wrote this small test:

 {
   rename = {
      name = "RenameInterface";
      machine = { pkgs, ... }: {
        virtualisation.vlans = [ 1 ];
        networking = {
          useNetworkd = networkd;
          useDHCP = false;
        };
        services.udev.extraRules = ''
          SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:12:01:01", KERNEL=="eth*", NAME="custom_name"
        '';
      };
      testScript = ''
        machine.succeed("udevadm settle")
        print(machine.succeed("ip link show dev custom_name"))
      '';
    };
}

The "scripted" variant works, but the renaming fails when netword is enabled:

systemd-udevd[481]: eth1: Failed to rename network interface 3 from 'eth1' to 'custom_name': Device or resource busy

I'm not sure what this means...

@flokli
Copy link
Contributor

flokli commented Jan 10, 2021

The linked #79532 kinda does something similar this:

We copy ${udev}/lib/systemd/network/*.link into the initrd, and also some udev rules related to interface renaming, so the renaming happens early, before interfaces are up (and we don't race with networkd or scripted networking also configuring interfaces).

I feel like we need to copy more of the .link files into initrd, and the udev extraRules as well. cc-ing @fpletz, @Ma27, @emilazy and @mweinelt from #79532 for opinions.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-in-distress/3604/39

systemd-udev-settle is a terrible hack[1] and should never[2] ever[3]
used, seriously it's very bad. It was used as a stop-gap solution for
issue NixOS#39069, but thanks to PR NixOS#79532 it can be removed now.

[1]: systemd/systemd#7293 (comment)
[2]: NixOS#73095
[3]: NixOS#107341
Renaming an interface must be done in stage-1: otherwise udev will
report the interface as ready and network daemons (networkd, dhcpcd,
etc.) will bring it up. Once up the interface can't be changed and the
renaming will fail.

Note: link files are read directly by udev, so they can be used even
without networkd enabled.
@ofborg ofborg bot added 8.has: changelog 8.has: documentation This PR adds or changes documentation labels Feb 18, 2021
nixos/doc/manual/configuration/renaming-interfaces.xml Outdated Show resolved Hide resolved
nixos/doc/manual/configuration/renaming-interfaces.xml Outdated Show resolved Hide resolved
nixos/doc/manual/configuration/renaming-interfaces.xml Outdated Show resolved Hide resolved
nixos/doc/manual/configuration/renaming-interfaces.xml Outdated Show resolved Hide resolved
nixos/doc/manual/release-notes/rl-2105.xml Outdated Show resolved Hide resolved
nixos/modules/services/hardware/udev.nix Outdated Show resolved Hide resolved
nixos/modules/services/hardware/udev.nix Show resolved Hide resolved
@rnhmjoj rnhmjoj force-pushed the no-udev-settle branch 3 times, most recently from b85dce1 to 9d34123 Compare February 19, 2021 00:02
Note: this moves the example rule used to rename network interfaces in
the new udev.initrdRules option, which is required since 115cdd1c.
@rnhmjoj rnhmjoj merged commit 19d715c into NixOS:master Feb 20, 2021
@erictapen
Copy link
Member

erictapen commented Feb 21, 2021

@rnhmjoj I'm using this for a while now ontop of nixos-20.09. What do you think about backporting this?

Edit: Nevermind, I just read the addition to the release notes. Doesn't sound safe to backport.

@rnhmjoj
Copy link
Contributor Author

rnhmjoj commented Feb 21, 2021

@erictapen Yes, unfortunately it can't be easily backported.
If you're not using gdm or zfs, it should be mostly safe to do systemd-udev-settle.enable = false and disable udev-settle entirely.

@@ -202,12 +202,26 @@ in
'';
};

extraRules = mkOption {
initrdRules = mkOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmph not particularily fond of having this option defined here. This module is for configuring systemd-udevd in stage 2; not stage 1

how about instead of having an initrdRules here; we move this option to stage-1.nix and name it boot.initrd.extraUdevRules ? It's a bit weird to have the the option for initrd defined here as udevd in initrd and udevd in stage-2 might be two completely different udevd implementations at this moment.

I'm working on an alternative initrd implementation using systemd. But it's hard to introduce this change as now the udev module in stage-2 has a hard dependency on the stage-1.nix module. Because now the stage-2 udev module calls out to stage-1.nix I can not disable stage-1.nix in disabledModules

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll send a draft PR with these suggested changes and i'll backlink it here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think renaming this option is fine - as long as we do this before the 20.05 branchoff, it probably doesn't even need the mkRenamedOption alias.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: changelog 8.has: documentation This PR adds or changes documentation 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants