nixos: losing network after "big" nixos-rebuild switch #198267

Open
bjornfor opened this issue Oct 28, 2022 · 5 comments
Labels
0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS

Comments

@bjornfor
Contributor

Describe the bug

Running sudo nixos-rebuild switch for a "big" nixpkgs upgrade (e.g. from ac20a86 to c132d08) leaves my server without network connectivity. When in this state, I can bring the network back up with sudo systemctl restart network-setup.

Steps To Reproduce

Steps to reproduce the behavior:

  1. sudo nixos-rebuild switch with "big" nixpkgs update (see above).
  2. Network interface no longer has IP address.
  3. systemctl status dhcpcd says no valid interfaces found. (Normally it's clear that it finds an interface and obtains an IP address.)

Expected behavior

Running nixos-rebuild switch shouldn't take down network interfaces, leave them without an IP address, and require manually running sudo systemctl restart network-setup to fix it.

Additional context

I'm using an ethernet bridge on my server, if that matters:

  # Add a bridge interface to be able to put libvirt/QEMU/KVM VMs directly on
  # the LAN.
  networking.bridges = {
    br0 = { interfaces = [ "lan0" ]; };
  };
  # TODO: shouldn't have to turn off useDHCP just because dhcpcd doesn't enable
  # dhcp for bridges by default (that should be handled by the next line).
  # Ref. https://github.com/NixOS/nixpkgs/pull/82295
  networking.useDHCP = lib.mkForce false;
  networking.interfaces.br0.useDHCP = true;

Notify maintainers

Metadata

NixOS 22.05

@bjornfor bjornfor added the 0.kind: bug Something is broken label Oct 28, 2022
@bjornfor
Contributor Author

Maybe this is another instance of #182449?

@bjornfor
Contributor Author

bjornfor commented Nov 4, 2022

X-ref: #180175

@veprbl veprbl added the 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS label Nov 5, 2022
@bjornfor
Contributor Author

Next time this happens, try #195777 (comment), as it should narrow down the issue compared to the sudo systemctl restart network-setup workaround mentioned in the original post.

@ToxicFrog
Contributor

I think the issue here (which I just ran face-first into) is that dhcpcd does not assign addresses to bridges by default, and networking.interfaces.foo.useDHCP relies on dhcpcd's interface autodiscovery, which means it doesn't work on bridges.

Specifically:

  • including lan0 in the bridge disables DHCP for lan0, because the addresses are meant to get assigned to the bridge
  • setting networking.interfaces.br0.useDHCP = true requires dhcpcd to be enabled, even if system-wide DHCP is off
  • if system-wide DHCP is off, br0 will get explicitly listed in dhcpcd.conf
  • if it's on, it won't get explicitly listed and dhcpcd is assumed to do the right thing, except—
  • per dhcpcd(8), "Non-ethernet interfaces and some virtual ethernet interfaces such as TAP and bridge are ignored by default", even if they aren't listed in denyInterfaces

I think the solution is probably to emit interface foo\n\n at the end of dhcpcd.conf for each interface with DHCP explicitly enabled (note: the double newline is load-bearing). In the meantime, a workaround is:

networking.dhcpcd.extraConfig = ''
  interface br0

'';
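
The proposal above could be sketched as a NixOS module fragment roughly like the one below. This is only an illustration, not what nixpkgs currently does: the binding name `dhcpInterfaces` is made up for this sketch, and it assumes that `networking.interfaces.<name>.useDHCP` is `true` only where DHCP was explicitly requested (unset entries are skipped).

```nix
{ config, lib, ... }:
let
  # Hypothetical helper: names of all interfaces whose per-interface
  # useDHCP option is explicitly set to true.
  dhcpInterfaces = lib.attrNames
    (lib.filterAttrs (_name: iface: iface.useDHCP == true)
      config.networking.interfaces);
in
{
  # Emit one "interface <name>" stanza per interface, each followed by
  # the blank line that the comment above describes as load-bearing.
  networking.dhcpcd.extraConfig =
    lib.concatMapStrings (name: "interface ${name}\n\n") dhcpInterfaces;
}
```

With the configuration from the original post, this should produce the same `interface br0` stanza as the manual workaround, just derived from the interface options instead of written by hand.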

@ToxicFrog
Contributor

ToxicFrog commented Oct 25, 2023

Upon further testing, this does not work reliably; to actually get it to work you sometimes need to run dhcpcd -n br0 after the rebuild.

Explicitly telling it to enable dhcp on the interface (not just listing it as the man page says) seems to work, so far:

networking.dhcpcd.extraConfig = ''
       interface br0
       dhcp
       dhcp6
       ipv4
       ipv6

'';

Along with making sure that dhcpcd doesn't start until after br0 is created:

  systemd.services.dhcpcd.after = [ "br0-netdev.service" ];

but we'll see if that remains working long-term.
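
The ordering line hard-codes `br0`. A hedged generalization might look like the sketch below. It assumes that every bridge declared under `networking.bridges` gets a unit named `<bridge>-netdev.service` in the same way `br0` does; that assumption is based only on the unit name used above.

```nix
{ config, lib, ... }:
{
  # Order dhcpcd after the netdev unit of every declared bridge.
  # Assumption: the unit for bridge "foo" is called "foo-netdev.service".
  systemd.services.dhcpcd.after =
    map (name: "${name}-netdev.service")
      (lib.attrNames config.networking.bridges);
}
```

Note that `after` only affects start-up ordering. It does not by itself cause dhcpcd to be restarted when a bridge unit is restarted during a switch.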

Update: I've been doing a lot of big updates, which means lots of testing this, which means I can now say this doesn't work reliably either.

Explicitly telling dhcpcd about the bridge with dhcpcd -n br0 doesn't work either. systemctl restart network-setup still does, though.
