Add systemd-networkd packages back into FCOS #574

Closed
wants to merge 1 commit into from

jdoss
Contributor

@jdoss jdoss commented Aug 21, 2020

This PR re-adds the systemd-networkd packages on FCOS. It is totally fine that the project wants to use NetworkManager as the de facto networking configuration implementation so there is a single supported way to configure the network on boot via Ignition. This PR won't change that fact.

Removing the systemd-networkd packages limits end user choices on how they want to configure the network on their FCOS deployments after the system is booted. Fedora doesn't remove systemd-networkd and it can live beside NetworkManager without any issues. Adding it back in allows people to consume FCOS as they see fit for their use cases. We shouldn't be dictating how users set up and configure their networks post boot.

systemd-networkd was removed because NetworkManager was chosen as the de-facto
networking configuration implementation as discussed in
coreos/fedora-coreos-tracker#24 but removing it entirely
restricts end user choices to fit their use cases.

Since systemd-networkd can live alongside NetworkManager, this PR adds it back in.

@jamescassell jamescassell left a comment

Sounds good to me, especially since it's not a separate package.

@bgilbert
Contributor

Removing the systemd-networkd packages limits end user choices on how they want to configure the network on their FCOS deployments after the system is booted. Fedora doesn't remove systemd-networkd and it can live beside NetworkManager without any issues. Adding it back in allows people to consume FCOS as they see fit for their use cases. We shouldn't be dictating how users setup and configure their networks post boot.

Removing networkd does limit end-user choices, which is intentional. Fedora Workstation, Server, etc., are intended as more general-purpose distros, where choice is absolutely a benefit. On the other hand, FCOS is aiming for a more opinionated model, where ideally we'd deliver one mechanism for each thing someone might want to do in the host OS. (There are exceptions such as container runtimes, where, since that's the whole purpose of the distro, we should be as flexible as possible.) This helps reduce the API surface we're on the hook for keeping stable; if we ship networkd as a non-default option and it regresses, automatic updates could still break people who are using it.

Networking, in particular, has so many hooks into the rest of the distro that maintaining two sets of network management tools isn't practical. For example, Afterburn would need to support configuring both NetworkManager and networkd (we still haven't ported that functionality over from Container Linux), and coreos-installer would need to support copying configs for both services. There are also NetworkManager assumptions in our Dracut modules. And networkd works differently enough from NetworkManager that compatibility glue would be non-trivial. See coreos/fedora-coreos-tracker#24 for a long discussion on this topic.

@jamescassell

In that case, I'd suggest splitting the systemd package to avoid stripping the files by path.

@jdoss
Contributor Author

jdoss commented Aug 21, 2020

This helps reduce the API surface we're on the hook for keeping stable; if we ship networkd as a non-default option and it regresses, automatic updates could still break people who are using it.

Let the end users be on the hook for this. We can make it clear in the documentation that NetworkManager is the blessed method for configuring networks and end users will accept this fact. Why should FCOS be on the hook for their choice to use systemd-networkd over NetworkManager? Why limit a set of features that ship within systemd across the other Fedora distros without any major issues? What does this dogmatic philosophy of limiting choice gain the FCOS project as a whole?

Networking, in particular, has so many hooks into the rest of the distro that maintaining two sets of network management tools isn't practical. For example, Afterburn would need to support configuring both NetworkManager and networkd (we still haven't ported that functionality over from Container Linux), and coreos-installer would need to support copying configs for both services. There are also NetworkManager assumptions in our Dracut modules. And networkd works differently enough from NetworkManager that compatibility glue would be non-trivial. See coreos/fedora-coreos-tracker#24 for a long discussion on this topic.

Afterburn doesn't have to support anything with systemd-networkd right now or ever. FCOS as a project doesn't have to do anything to support systemd-networkd on boot right now for these packages to be added back in. systemd-networkd can be totally configured after boot by the end users if they wish. Adding these packages back in allows for users to have a choice. Why is the FCOS project imposing limits on users that want to consume this distro for its other features and use tools that come with other Fedora builds?

I totally understand the desire to keep the total package count as slim as possible, but making cuts to packages that already ship in other Fedora distros without major problems doesn't make a lot of sense. This PR doesn't try to replace NetworkManager at all. The users that want to use systemd-networkd over NetworkManager understand the tradeoffs and risks involved with using it.

Also, if the answer is that we should just layer this in if we want systemd-networkd, we can't easily do that because it's bundled with systemd, and adding in packages is actively discouraged right now because it risks breaking updates. There have been plenty of other cases of end users asking for systemd-networkd, and that's why I feel this PR should be merged so we can use FCOS to fit our use cases.

@bgilbert
Contributor

What does this dogmatic philosophy of limiting choice gain the FCOS project as a whole?

We aren't trying to replicate Fedora Workstation or Server; we're trying to test a different hypothesis about building and shipping a Linux distro. Part of that hypothesis is that features and choices aren't always good, and so we try to be selective about the components and functionality we deliver. That also helps us focus development effort on the parts of the OS (e.g. Ignition) where we think we can bring something new to the table.

As a result, Fedora CoreOS is not a general-purpose operating system and will not meet everyone's needs. If it doesn't meet your needs or those of any other user, we'd always like to hear about it! But we might choose not to address the issue, in which case you may be better served by a different Edition or a different OS.

Afterburn doesn't have to support anything with systemd-networkd right now or ever.

Native images on DigitalOcean and Packet, when we eventually implement those, will use Afterburn to query the platform metadata service for network addressing and configure the network service.

Also if the answer is we should just layer this in if we want systemd-networkd we can't easily do that because its bundled with systemd and adding in packages is actively discouraged right now because it risks breaking updates.

If the systemd package maintainers would accept it, I think splitting systemd-networkd to its own subpackage would be an excellent solution. As you point out, layering packages is discouraged because it risks breaking updates (though see coreos/fedora-coreos-tracker#401), which is exactly the sort of messaging we'd want for something like this.

Let's discuss at the next community meeting. It'd be useful to hear more about potential use cases, since there might be another way to address your issue.

@dustymabe
Member

+1 for bringing this up in the context of the issue tracker (broader audience) and tagging with meeting so we can hash things out there.

@bgilbert
Contributor

bgilbert commented Aug 21, 2020

Filed coreos/fedora-coreos-tracker#610, which redirects discussion back here.

@cgwalters
Member

We also just went through a huge exercise in chrony/NetworkManager integration over here: #412
Almost everything done in networking for OpenShift for example is focused on NM too.

I think indeed networkd should be sub-packaged as systemd-networkd as a first step though; rpm-ostree doesn't support "redownload package from the base image without exclusions" today although it might not be actually hard to do.

Can you elaborate on exactly why you want this? Is there a missing feature or is it more of e.g. "I like .link files"?

@jdoss
Contributor Author

jdoss commented Aug 21, 2020

We also just went through a huge exercise in chrony/NetworkManager integration over here: #412
Almost everything done in networking for OpenShift for example is focused on NM too.

I totally understand that FCOS works on OpenShift needs as a top priority, but not everyone wants to use NM to fit their needs, and I can tell you I am not using FCOS at all with OpenShift or OKD.

Can you elaborate on exactly why you want this? Is there a missing feature or is it more of e.g. "I like .link files"?

It is easier to use systemd-networkd for WireGuard related things for me personally than NetworkManager and even wg-quick@.service. Other colleagues that have tried to make the CoreOS Container Linux switch ended right back on Flatcar Linux because figuring out NetworkManager was not high on their list of things to do. I would like this to be a distro that is easy for non-OpenShift users to consume and having systemd-networkd helps with that.

@Conan-Kudo
Contributor

As a result, Fedora CoreOS is not a general-purpose operating system

This is a very unusual statement to make, @bgilbert. The literal marketing around Fedora CoreOS does not indicate that it isn't a general-purpose operating system beyond the preference for running containerized workloads on it.

Fedora CoreOS is an automatically-updating, minimal operating system for running containerized workloads securely and at scale.

This statement does not indicate that you can't use it for whatever purpose you want. From my perspective, if I want to run containerized workloads with an orchestration layer that takes advantage of networkd, I would want to be able to do so. What you're telling me with your statement is that FCOS isn't suitable for containerized workloads, just your idea of containerized workloads, which is fine, but there's nothing that indicates what that specifically is.

Now, I personally don't care much for systemd-networkd, but I have used it, and it's... fine for simpler network setups. I see no value in chainsawing that component out of Fedora CoreOS, especially for enabling more adoption of Fedora CoreOS for more containerized and hypervisor workloads.

@goochjj

goochjj commented Aug 22, 2020

CoreOS, before it was bought out, was completely based on systemd-networkd.
CoreOS was already opinionated in several regards.
CoreOS, for instance, didn't let you overlay packages on top of the OS, like rpm-ostree does.

If you're going to be fully opinionated, you shouldn't allow pollution w/ rpm-ostree. You should also be fully prepared to alienate the entire community that built up CoreOS to where it is. If your only intention was to provide the Fedora and Red Hat opinionated model in a containerized, atomic-update fashion, more people would have used Atomic Container Linux. If that failed and required you to purchase CoreOS, I'm not sure why that same tactic would now work again. Don't you want to make it easy for people already using CoreOS to switch? Or do you think it's fine that their ignition scripts won't work, the user management is different, and their entire network configuration needs to be completely redone?

Linux has always been about choice. If you don't want to support it out of the box, the LEAST you could do is split your rpms apart. If Fedora is going to be that opinionated, why isn't the F32 RPM ALREADY split apart?

Going from the opinionated model, why did you bother implementing NetworkManager at all, when you already had a perfectly serviceable networking stack in the os? Instead you deleted what was already provided and added more clutter and dead weight.

Personally, I'm good enough that I can redo a fedora-coreos config myself, I can rip out the opinionated stuff that's completely wrong, and implement what I want, even if it means ripping apart your RPMs to do it myself. But I'm not your target market, that's going to take me time, and that means right now I wouldn't touch FCOS with a 10 ft pole, I'm sticking with Flatcar. Otherwise you're forcing me to retool your distro because you can't be bothered to provide systemd tools, which your company already spent YEARS making the de facto standard in the Linux world, AND already provide in F32, which means you've broken the contract of "You can just install Fedora 32 RPMs over top" because... well, which RPM would we install? Systemd is "supposedly" already installed, it's just kneecapped.

@dustymabe
Member

all - please keep comments productive and not accusatory. It's the best way to get things done.

@Conan-Kudo

As a result, Fedora CoreOS is not a general-purpose operating system

This is a very unusual statement to make, @bgilbert. The literal marketing around Fedora CoreOS does not indicate that it isn't a general-purpose operating system beyond the preference for running containerized workloads on it.

Fedora CoreOS is an automatically-updating, minimal operating system for running containerized workloads securely and at scale.

This statement does not indicate that you can't use it for whatever purpose you want. From my perspective, if I want to run containerized workloads with an orchestration layer that takes advantage of networkd, I would want to be able to do so.

I think this is slightly flawed. You can't take Fedora Workstation today and run with sysV init. You could maybe build your own Frankenstein that did use sysV init, but it wouldn't be Fedora Workstation.

@bgilbert
Contributor

@Conan-Kudo I'd say it's included in the line you quoted: FCOS is a minimal OS for running containerized workloads. Of course you're free to do whatever you want with it, but we won't necessarily include functionality for every use case.

@jlebon
Member

jlebon commented Aug 24, 2020

If the systemd package maintainers would accept it, I think splitting systemd-networkd to its own subpackage would be an excellent solution. As you point out, layering packages is discouraged because it risks breaking updates (though see coreos/fedora-coreos-tracker#401), which is exactly the sort of messaging we'd want for something like this.

Yeah, I think this is the cleanest solution. Our stance today is already that we don't/can't support arbitrary package layering. So IMO this is mostly a packaging issue (or to say another way: if it were a separate subpackage, we wouldn't go out of our way to try to prevent layering it). The only reason we had to resort to remove-from-packages was because it's confusing to bake in a whole other network stack from the one we chose.

@jdoss
Contributor Author

jdoss commented Aug 24, 2020

The only reason we had to resort to remove-from-packages was because it's confusing to bake in a whole other network stack from the one we chose.

How is it confusing? Fedora Workstation and Server both leave it as part of systemd. Users can choose to use it or not. FCOS can choose to make NetworkManager their go-to networking stack for Afterburn, Ignition, etc. and they don't have to do anything on their end to support it.

I would argue that leaving networkd alone would help people moving from Container Linux over to FCOS and that would be a net win for the project. Instead we are making more work for the systemd people and reducing choice for users.

@goochjj

goochjj commented Aug 24, 2020

If the systemd package maintainers would accept it, I think splitting systemd-networkd to its own subpackage would be an excellent solution. As you point out, layering packages is discouraged because it risks breaking updates (though see coreos/fedora-coreos-tracker#401), which is exactly the sort of messaging we'd want for something like this.

Yeah, I think this is the cleanest solution. Our stance today is already that we don't/can't support arbitrary package layering. So IMO this is mostly a packaging issue (or to say another way: if it were a separate subpackage, we wouldn't go out of our way to try to prevent layering it). The only reason we had to resort to remove-from-packages was because it's confusing to bake in a whole other network stack from the one we chose.

You could have

  1. Left the binaries, and masked the systemd units. The binaries are less than 1M.
  2. Split the packages in upstream
  3. Made the systemd package relocatable, and relocated the binaries somewhere else (or just mv'd them)
  4. Provided/spun off FCOS-only packages for systemd - perhaps in a separate repo that does nothing but use CI to pull and repackage the already-built systemd binary RPM into separate RPMs (also doing things like stripping out the unneeded documentation and man pages)

Any one of those would have given people the ability to choose what they want to use.
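
Option 1 above (leave the binaries, mask the units) could look roughly like the following. This is only a sketch: the unit names are an assumption about which networkd units systemd ships, `mask_units` is a hypothetical helper, and the `root` argument exists so it can be tried against a scratch directory instead of a live system. It relies on the fact that a unit is masked when its name under /etc/systemd/system is a symlink to /dev/null, which is what `systemctl mask` creates.

```python
import os
import tempfile
from pathlib import Path

# Assumed unit names; adjust to whatever the systemd build actually ships.
NETWORKD_UNITS = [
    "systemd-networkd.service",
    "systemd-networkd.socket",
    "systemd-networkd-wait-online.service",
]


def mask_units(root, units=NETWORKD_UNITS):
    """Mask each unit under <root>/etc/systemd/system by linking it to /dev/null."""
    unit_dir = Path(root) / "etc" / "systemd" / "system"
    unit_dir.mkdir(parents=True, exist_ok=True)
    masked = []
    for unit in units:
        link = unit_dir / unit
        # Replace any existing file or link so the call is repeatable.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to("/dev/null")
        masked.append(link)
    return masked


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than the real /etc.
    with tempfile.TemporaryDirectory() as tmp:
        for link in mask_units(tmp):
            print(link.name, "->", os.readlink(link))
```

A user who wants networkd back would then only need to remove the links (or run `systemctl unmask`), since the binaries would still be in the image.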

Instead what I'd have to do is either

  1. Fork your rpm-ostree profiles and retool them
  2. tar.gz the binaries out of the rpm and convince ostree to accept a tar.gz,
  3. Make my own RPM putting BACK the stuff you removed (i.e., option 4 above)
  4. Dump the binaries in /usr/local/bin and resurrect copies of the base files (which means I don't get updates)

Which ultimately seems a lot of work to get back to where I already am.

I FULLY EXPECTED that the units would be masked - you use networkmanager. I get it. Delete the units for all I care. But to delete the binaries is just... well, disappointing.

@dustymabe
Member

@goochjj - thanks for participating in the discussion. We're planning to bring this topic up at the weekly meeting on September 2nd. Unfortunately we already have plans for tomorrow's meeting.

If you'd like to be pinged in the #fedora-coreos channel on freenode when we start the meeting you can add your name to https://github.com/coreos/fedora-coreos-tracker/blob/master/meeting-people.txt.

@cgwalters
Member

Left the binaries, and masked the systemd units. The binaries are less than 1M.

True. There is the "workstation/server do this" angle, though they also simply don't have the ability to remove the binaries without subpackaging (everything's a package to them).

@Conan-Kudo
Contributor

Conan-Kudo commented Aug 26, 2020

Left the binaries, and masked the systemd units. The binaries are less than 1M.

True. There is the "workstation/server do this" angle, though they also simply don't have the ability to remove the binaries without subpackaging (everything's a package to them).

That's a lie. All variants of Fedora have a way to chainsaw stuff if need be. But by policy, we don't, because chain-sawing stuff out does not enable a good user experience.

@dustymabe
Member

@Conan-Kudo. There is a difference between "that's a lie" and the statement being not true. I for one wasn't aware of the ability to chain-saw stuff out in other variants, so I could have easily made the same statement. Either way, you could have the same effect by just removing "That's a lie" from what you wrote, which can be interpreted as provocative.

For this topic whether or not other variants do or don't allow "chainsawing" doesn't progress the discussion so let's drop it for now. I'm still aiming to have the discussion about this ticket amongst the group in the meeting next Wednesday. Sorry it didn't happen today because of the previously planned material.

@Conan-Kudo
Contributor

@dustymabe I do expect @cgwalters to know, given how much research into the matter he did before creating RPM-OSTree. That's a different expectation than I would have from you or someone else who I would not expect to know the full capabilities of Fedora's image creation pipeline.

@goochjj

goochjj commented Aug 26, 2020

@dustymabe I do expect @cgwalters to know, given how much research into the matter he did before creating RPM-OSTree. That's a different expectation than I would have from you or someone else who I would not expect to know the full capabilities of Fedora's image creation pipeline.

I think it's fair to say anyone can do anything given enough time, ability and inclination to do so. I can overlay mount over top of /usr/lib/systemd... but I wouldn't recommend it nor would I expect you to support it. So we're not talking capability. We're talking about something a normal user can do.

The mechanics as far as a normal user doing something seems to be

  1. Overlay a package using rpm-ostree, which you don't want to provide support for. (Which I can understand up to a point)
  2. Dig into the internals of ostree (note ostree, NOT rpm-ostree) to place individual files
  3. Place the binaries manually somewhere else, not part of the immutable directory tree.

Note that 3 is problematic. I dropped F32 systemd-networkd binaries in /usr/local/bin but /usr/local isn't even mounted yet at the time systemd is trying to run it, AND updating the ostree changes the dynamic libraries and can break the binary, unless I compile it static.

Overlaying packages implies those packages will be updated (per rpm dependencies) when the underlying packages are updated and prevents this... I did a proof of concept (circuitously done with alien --to-deb -gv, rip out the binaries I don't want, rename to systemd-networkd, debuild, alien --to-rpm -gv and then removing all the errant %dirs) which worked perfectly, just reinforcing the options I last stated.

Considering removing the chainsaw from the FCOS config for the binaries, if not the unit files, just solves the problem. Sure, the user has to put the unit files in /etc/systemd/system (or use ignition to do so). That's way easier than any of the other options above.

@goochjj

goochjj commented Aug 26, 2020

To be clear, at the minimum I feel this PR should remove these two lines

              /usr/lib/systemd/systemd-networkd,
              /usr/lib/systemd/systemd-networkd-wait-online,

Even if the others are left in place. I'd consider the others (/etc files, network/ files and units) "opinionated" and part of "Our system needs to boot and initialize the way we envision and support it".

Someone who wants networkd can certainly put those things in etc, or instead of chainsawing, just move them to another folder somewhere where they can be copied/resurrected.
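
As a rough illustration of that narrower change, in Python rather than the manifest's own format: the list below is a hypothetical stand-in for the systemd removal entry. It assumes the entry names the package first and the paths to strip after it, and the unit-file path is only a placeholder for "the others", not a copy of the real manifest. `keep_binaries` is likewise a made-up helper.

```python
# Hypothetical stand-in for the manifest's removal entry for systemd:
# package name first, then the paths stripped from it during compose.
REMOVE_FROM_SYSTEMD = [
    "systemd",
    "/usr/lib/systemd/systemd-networkd",
    "/usr/lib/systemd/systemd-networkd-wait-online",
    "/usr/lib/systemd/system/systemd-networkd.service",  # placeholder for "the others"
]

# The two lines this comment proposes to stop removing.
KEEP = {
    "/usr/lib/systemd/systemd-networkd",
    "/usr/lib/systemd/systemd-networkd-wait-online",
}


def keep_binaries(entry, keep):
    """Return the removal entry with the paths in `keep` no longer stripped."""
    package, *paths = entry
    return [package] + [path for path in paths if path not in keep]


if __name__ == "__main__":
    for item in keep_binaries(REMOVE_FROM_SYSTEMD, KEEP):
        print(item)
```

The units and config files would still be removed; only the two binaries would survive the compose.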

@goochjj

goochjj commented Aug 26, 2020

Or we could muddy the waters further, and talk about using netplan.io topside with different renderers, any takers? :-D

@Conan-Kudo
Contributor

@goochjj Actually, I had packaged Netplan for Fedora a while ago, but never bothered to submit it for inclusion since nobody was interested in it...

@LorbusChris
Contributor

Personally, I'd prefer to solve this by splitting up the systemd RPM into more fine-grained subpackages, with networkd being one of them. It could then be installed manually as an overlay with rpm-ostree like any other package.

The current way we're excluding the files from the ostree seems to make it exceptionally hard for users that want to use networkd.

@dustymabe
Member

NOTE: cross posted with a comment in the tracker issue

We discussed this during the community meeting today.

The two options that were discussed were:

  • A. Ship networkd in the base layer. Stop removing the binary files during compose.
    • Also disable it by default using a systemd dropin and write documentation on how to enable it, along with accompanying documentation on the possible things that might not work as a result of enabling it.
  • B. Break out networkd into its own subpackage (if upstream systemd will accept the split) so it can be package layered easily.
    • Add a FAQ entry about using networkd.

There was lots of great discussion and creative ideas on how to solve the problem and the implications of both options. For now we have settled on:

13:35:07 dustymabe | #agreed we'll reach out to the systemd team to see how they
                   | feel about making a systemd-networkd subpackage. If they
                   | refuse or are not interested we will explore option A
                   | (including systemd-networkd in the base layer) but with a
                   | dropin that disables it by default.

The next step here is to reach out to the systemd team about the package split and base our future work on the outcome of that discussion.

@arizvisa

arizvisa commented Sep 7, 2020

Came to this late because I'd planned to do the research for transitioning from former CoreOS to FCOS or Flatcar at the beginning of this month.

But, +10 for enabling users to have a choice. Right now the transition from the former CoreOS to FCOS or Flatcar (which we're targeting in a month or two) is starting to look like a choice between Flatcar, which means missing out on dropping rkt for libpod+friends and some other great features, and FCOS, which means development effort to transition to NetworkManager instead of systemd-networkd.

Systemd is pretty consistent with regards to configuring all of its components due to having a unified configuration, and thus we use the same system for configuring all of our production state. (That was one of the initial benefits of systemd to begin with I'd thought).

Historically NetworkManager has always been very non-explicit due to its flexibility with regards to its api, and its plugin capabilities. This is perfect on the desktop or other situations where it's necessary for applications to play with the network, but on a server the very first thing everyone does is the assignment of setting plugins="". Applying this to FCOS has unfortunately resulted in the "network" configuration section being pulled entirely from ignition and then an extra commandline option ended up being hacked into coreos-installer as a workaround.

NetworkManager's original goals and I quote from Fedora's own wiki:

NetworkManager provides automatic network detection and configuration for the system. Once enabled, the NetworkManager service also monitors the network interfaces, and may automatically switch to the best connection at any given time.

Then for servers:

... We are trying to make NetworkManager as suitable for this task as possible...

I get this is totally a religious debate, and OpenShift needs a powerful API to accomplish its needs. But it's also important to recognize that there's a whole other category of users that don't depend on OpenShift/k8s. The "kneecapping" of systemd-networkd is explicitly excluding this whole subset and not giving users an easy way to work around it (short of rebuilding). CoreOS was originally a minimalistic distribution for handling any kind of containerized work "optimized for Kubernetes but also great without it". At the very least docs could be updated to cull out the last 3 words if that's what's intended.

Nonetheless, I'm looking forward to seeing some of the suggested solutions applied, and this (what should be a simple issue) resolved.

@bgilbert
Contributor

bgilbert commented Sep 8, 2020

@arizvisa Note that even if networkd became available in Fedora CoreOS via some mechanism (such as package layering), it would not generally work and we'd strongly advise against using it. The network management system has many integration points with the rest of the distro and we're not prepared to maintain those across multiple systems. Unfortunately, migrating from CoreOS Container Linux to FCOS does require migration effort in a variety of areas, and we don't expect that to change.

Applying this to FCOS has unfortunately resulted in the "network" configuration section being pulled entirely from ignition and then an extra commandline option ended up being hacked into coreos-installer as a workaround.

As it turns out, we would have removed that section anyway. The networkd section, unlike the systemd section, turned out to be entirely sugar over the files section.

You can use the files section to write a NetworkManager key file to configure NetworkManager in the real root. The coreos-installer option addresses a different case — applying custom network configuration to the initramfs on first boot — which was never addressed in CoreOS Container Linux.
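
A minimal sketch of the files-section approach, assuming an Ignition spec 3.1.0 config and a plain DHCP ethernet connection. The connection id, interface name, and both helper functions are illustrative, not something this project prescribes. It renders a NetworkManager keyfile and embeds it as a `data:` URL in a `storage.files` entry with mode 0600, which NetworkManager expects for keyfiles.

```python
import configparser
import io
import json
from urllib.parse import quote


def build_keyfile(conn_id, iface):
    """Render a minimal NetworkManager keyfile for a DHCP ethernet connection."""
    cp = configparser.ConfigParser()
    cp.optionxform = str  # keep key names exactly as written
    cp["connection"] = {"id": conn_id, "type": "ethernet", "interface-name": iface}
    cp["ipv4"] = {"method": "auto"}
    buf = io.StringIO()
    cp.write(buf, space_around_delimiters=False)
    return buf.getvalue()


def ignition_config(conn_id, iface):
    """Wrap the keyfile in an Ignition config (spec version is an assumption)."""
    keyfile = build_keyfile(conn_id, iface)
    return {
        "ignition": {"version": "3.1.0"},
        "storage": {
            "files": [
                {
                    "path": f"/etc/NetworkManager/system-connections/{conn_id}.nmconnection",
                    "mode": 0o600,  # serialized as decimal 384 in JSON
                    "contents": {"source": "data:," + quote(keyfile)},
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(ignition_config("eth0", "eth0"), indent=2))
```

This only covers the real root; it does not address initramfs networking on first boot, which is the separate case the coreos-installer option handles.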

The "kneecapping" of systemd-networkd is explicitly excluding this whole subset and not giving users an easy way to workaround it (short of rebuilding).

What's your use case that can be handled by networkd but not NetworkManager? Such cases do exist, but NetworkManager can handle many server scenarios without issue.

@arizvisa

@arizvisa Note that even if networkd became available in Fedora CoreOS via some mechanism (such as package layering), it would not generally work and we'd strongly advise against using it. The network management system has many integration points with the rest of the distro and we're not prepared to maintain those across multiple systems. Unfortunately, migrating from CoreOS Container Linux to FCOS does require migration effort in a variety of areas, and we don't expect that to change.

Does this imply that you're not going to restore the existence of the systemd-networkd binaries? If so, there's no reason to keep this PR open. On a related note, what other places (other than initramfs) does network management integrate itself with? Is it a fedora-ism, or more than just moving network config around between stages?

Applying this to FCOS has unfortunately resulted in the "network" configuration section being pulled entirely from ignition and then an extra commandline option ended up being hacked into coreos-installer as a workaround.

As it turns out, we would have removed that section anyway. The networkd section, unlike the systemd section, turned out to be entirely sugar over the files section.

You can use the files section to write a NetworkManager key file to configure NetworkManager in the real root. The coreos-installer option addresses a different case — applying custom network configuration to the initramfs on first boot — which was never addressed in CoreOS Container Linux.

Yep. Was just pointing out that the copy-networking option to the installer seemed pretty out-of-band. Not debating this anyways.

The "kneecapping" of systemd-networkd is explicitly excluding this whole subset and not giving users an easy way to workaround it (short of rebuilding).

What's your use case that can be handled by networkd but not NetworkManager? Such cases do exist, but NetworkManager can handle many server scenarios without issue.

Specifically being able to have one universal configuration that can change depending on the virtualization platform that it's been deployed on. I have a project that generates clusters for fuzzing that can be distributed and is independent of platforms (aws, vmware, etc.). Each manager (which runs CoreOS) links up with all the managers that it can communicate with (across platform). All their configuration is being generated from json using systemd dropins to add customizations on top of the base systemd units which happens at build time. The managers are also responsible for all networking (via dhcp-server) for each client in their respective network.

I guess in short, the specific options for systemd-network are in "Match", where there's "Virtualization="; in "DHCPv4", there's "ClientIdentifier=" and "DUIDType="; and in "Network", there's "DHCPServer=", "Domains=", and "IPForward=". At some point I'm supposed to add IPsec too.
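To make the options listed above concrete, here is a minimal sketch of how they might fit together in networkd unit files. The file names, the `br-clients` interface, the `vmware` match value, the addresses, and the domain are all hypothetical, and the files go into a scratch directory rather than `/etc/systemd/network`. The option names reflect systemd as of this discussion and may differ in later releases.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for /etc/systemd/network (assumption).
net_dir="$(mktemp -d)"

# Uplink profile that only applies when systemd detects VMware.
# The interface glob and the virtualization value are placeholders.
cat > "$net_dir/10-uplink-vmware.network" <<'EOF'
[Match]
Name=en*
Virtualization=vmware

[Network]
DHCP=ipv4

[DHCPv4]
ClientIdentifier=duid
DUIDType=link-layer
EOF

# Client-facing profile: the manager hands out leases and forwards traffic.
cat > "$net_dir/20-clients.network" <<'EOF'
[Match]
Name=br-clients

[Network]
Address=10.0.0.1/24
DHCPServer=yes
IPForward=yes
Domains=cluster.example
EOF

# Layering: a drop-in adds settings without touching the base file.
mkdir -p "$net_dir/20-clients.network.d"
cat > "$net_dir/20-clients.network.d/50-pool.conf" <<'EOF'
[DHCPServer]
PoolOffset=100
PoolSize=50
EOF
```

A per-platform variant would be another `.network` file with a different `Virtualization=` value, which is the kind of conditional layering described here.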

NetworkManager has plugins, so I'm sure what I mentioned is possible. But right now, with the little bit that I know about NetworkManager, it looks like this will be more of a refactor because I can no longer layer the config, unless I could maybe include another file in a keyfile to allow for layering. Then I'd maybe need to write a systemd unit that symlinks config files into NetworkManager if certain conditionals fire.

I had expected this to be just a simple migration to cool tech like rpm-ostree and podman, yet it definitely seems like it's more of a refactor to special-case the generation of the network configuration, containerize things, and really just fragment something that already was simple and elegant.

@Conan-Kudo
Contributor

While NetworkManager supports limited configuration layering, it does not, to the best of my knowledge, support fuzzy network configuration in the same way that networkd does.

It is important to note that NetworkManager's configuration paradigm is oriented around connections rather than devices, and theoretically this can work the same way as networkd's configuration does. But it's limited and cannot be set up with the same kind of specificity that networkd match rules permit.

So I suppose it depends on how flexible you need fuzzy networking to be. Depending on how far you want to go, you may only be able to pull it off with networkd.

Regardless of all this, I still think that chainsawing binaries out of the image is a bad thing and reduces the potential of where Fedora CoreOS could be used to support hyperscale environments.

@bgilbert
Contributor

Does this imply that you're not going to restore the existence of the systemd-networkd binaries?

The current status of this PR is in #574 (comment).

On a related note, what other places (other than initramfs) does network management integrate itself with? Is it a fedora-ism, or more than just moving network config around between stages?

CoreOS-specific pieces, off the top of my head:

  • coreos-installer install --copy-network and corresponding glue in the initramfs
  • Afterburn code for configuring the network from platform-specific metadata services
  • Chrony integration
  • initramfs glue to tear down the network when exiting the initramfs
  • Config fragments to set behavior for things like DHCP client IDs, link-local address generation, and ignored network interfaces

We also rely on NetworkManager's support for the legacy dracut ip= command-line parameters, which networkd doesn't have. And we continue to add more integration over time.

I guess in short, the specific options for systemd-network are in "Match"

Yup, that's a known shortcoming of NM right now.

@LorbusChris
Contributor

I have a PR on the rpm spec to split networkd off into a sub-package: https://src.fedoraproject.org/rpms/systemd/pull-request/25

I'll post to fedora-devel and coreos-devel about this shortly.

@goochjj

goochjj commented Sep 11, 2020

CoreOS-specific pieces, off the top of my head:

  • coreos-installer install --copy-network and corresponding glue in the initramfs
  • Afterburn code for configuring the network from platform-specific metadata services
  • Chrony integration
  • initramfs glue to tear down the network when exiting the initramfs
  • Config fragments to set behavior for things like DHCP client IDs, link-local address generation, and ignored network interfaces

We also rely on NetworkManager's support for the legacy dracut ip= command-line parameters, which networkd doesn't have. And we continue to add more integration over time.

I'm going to chime in here and say absolutely none of the above NetworkManager-specific integrations matter to me in the slightest. I can configure Chrony. My interfaces aren't going to transition up and down. Most of my net config is static, I neither need nor want Desktop-style automagic in my network configs, and even if I did do a DHCP based config, I'm sure I could configure chrony with ignition, or myself. And I see no reason for initramfs to need a network - especially with FCOS. Config should be coming from ignition, any custom packages I'd roll into my own image before deployment.

What I DO have is a significant investment in systemd-networkd configurations, generated by netplan.io or myself, with complicated scenarios including ipip tunnels, multiple routing tables and policy rules. I want them initialized, and then I want the network left alone as much as possible.

I'm sure it's possible, with time and development, for me to figure out the "NetworkManager Way". But I'm not interested. I can drop my systemd-networkd configs onto ANY system with systemd (except yours) and they'll work. THAT's what I need.

@bgilbert
Contributor

And I see no reason for initramfs to need a network - especially with FCOS. Config should be coming from ignition, any custom packages I'd roll into my own image before deployment.

Ignition runs in the initramfs and fetches configs (and resources referenced by the configs) over the network.

@arizvisa

On a related note, what other places (other than initramfs) does network management integrate itself with? Is it a fedora-ism, or more than just moving network config around between stages?

CoreOS-specific pieces, off the top of my head:
* coreos-installer install --copy-network and corresponding glue in the initramfs
* Afterburn code for configuring the network from platform-specific metadata services
* Chrony integration
* initramfs glue to tear down the network when exiting the initramfs
* Config fragments to set behavior for things like DHCP client IDs, link-local address generation, and ignored network interfaces

We also rely on NetworkManager's support for the legacy dracut ip= command-line parameters, which networkd doesn't have. And we continue to add more integration over time.

Thank you, @bgilbert, for listing these. It'd be good to know in advance what other things it might interfere with after bootup. Although I'd believe that after a server boots up, the last thing you'd want is something tinkering around with the network anyways. At the moment, with the items that were presented, this definitely seems doable without many (if any) changes once the systemd-networkd binaries are available again.

What I DO have is a significant investment in systemd-networkd configurations, generated by netplan.io or myself, with complicated scenarios including ipip tunnels, multiple routing tables and policy rules. I want them initialized, and then I want the network left alone as much as possible.

Similarly, because systemd was guaranteed to be used on all Linux-based platforms, a significant amount of time was invested in developing a "unified" configuration. Having support for systemd's conditionals and layering made this unification possible. Thus, seeing the systemd package change like it did was very surprising, especially because systemd already implemented a number of things that were replaced with some desktop-related software.

I'm sure it's possible, with time and development, for me to figure out the "NetworkManager Way". But I'm not interested. I can drop my systemd-networkd configs onto ANY system with systemd (except yours) and they'll work. THAT's what I need.

I find it strange that so much work was done for some of these capabilities to avoid using systemd components, rather than improving systemd so that it satisfies everyone's needs better. Chrony itself appears to be replaceable with systemd-timesyncd even. It seems that NetworkManager was chosen because of its D-Bus API, and the logic was that it can be incrementally improved so that it works in the way that it's needed.

However, can this not be done with systemd as well? Due to its youth, you'd think that it'd be more malleable as adding required features will likely end up with the addition of code, rather than a modification or removal of code. There's likely some legitimate history here wrt the avoidance of systemd components that I'm not aware of, and that's fine I guess...but I really hope that it's not political.

@arizvisa

I'm going to chime in here and say absolutely none of the above NetworkManager-specific integrations matter to me in the slightest. I can configure Chrony. My interfaces aren't going to transition up and down. Most of my net config is static, I neither need nor want Desktop-style automagic in my network configs, and even if I did do a DHCP based config, I'm sure I could configure chrony with ignition, or myself. And I see no reason for initramfs to need a network - especially with FCOS. Config should be coming from ignition, any custom packages I'd roll into my own image before deployment.

Actually, the last thing you'd want in an airgapped network even is for somebody to be able to trigger a network-up/network-down transition post-configuration. I don't know if NetworkManager does this by default, but on desktops it's _very_ good at doing things by itself which is particularly concerning for a use-case where everything should be predictable.

First-boot makes sense to use NetworkManager for ignition, since the instance is still figuring out how it should act on said network, but afterwards nothing else should really matter as it should be "locked-in" until it's re-ignitioned.

@bgilbert
Contributor

I find it strange that so much work was done for some of these capabilities to avoid using systemd components, rather than improving systemd so that it satisfies everyone's needs better. [...] There's likely some legitimate history here wrt the avoidance of systemd components that I'm not aware of, and that's fine I guess...but I really hope that it's not political.

My perspective here; others on the team may have different views.

In the end I'd say it was mainly a question of resources. The integration points I listed would have been necessary either way; some would have been easier under networkd, some harder. But we had to choose a network management system that can handle whatever we end up needing from it, for perhaps the next decade. As much as I like networkd, it's way too static for the modern world; we were already encountering that problem in the Container Linux days. Red Hat has a lot of NetworkManager experience and a team to maintain it, while (at the time of the decision, as far as I know,) networkd was neither widely used nor actively maintained. We could have chosen to dedicate some of our own resources to improving networkd as you suggest — we actually discussed it — but some other part of the distro would have suffered as a result. FCOS already shipped later than we wanted, on fewer platforms than we wanted, with a weaker migration plan than we wanted, so I don't think that would have been the right call.

I'm sympathetic to the extra work this imposes on people with complex configs. I know NetworkManager can't currently support the highly-conditionalized configs that networkd can. And I'm well aware that the migration process from Container Linux is so involved that some users may opt for another distro instead. But for better or worse, this is where we are. I'm hopeful that over time NetworkManager will grow to support some of the missing use cases; the effort required will benefit users far beyond Fedora.

If we can get networkd moved out to a subpackage for advanced users to layer in, great. If we end up shipping the binary and requiring an override dropin to enable it, okay. But without solid integration into the rest of the distro, I don't think we can recommend in good conscience that networkd be used in Fedora CoreOS. If the result is that FCOS doesn't meet the needs of some set of users, those users should switch to another Linux distro with my blessing — or, if so inclined, help us meet their needs within the practical constraints we have.

Hopefully that clarifies the situation a bit. #574 (comment) is still the plan of record.

@keszybz

keszybz commented Oct 1, 2020

We also rely on NetworkManager's support for the legacy dracut ip= command-line parameters, which networkd doesn't have.

Minor correction here: https://www.freedesktop.org/software/systemd/man/systemd-network-generator.html was added in v245.

@dustymabe
Member

Cross posted with coreos/fedora-coreos-tracker#610 (comment)

As part of the Fedora 33 rebase users will be able to package layer the systemd-networkd rpm on top of Fedora CoreOS because the package was split into a subpackage for Fedora 33. This work landed in the next stream in 33.20201006.1.0:

[core@fedora ~]$ sudo rpm-ostree install systemd-networkd
Checking out tree 13a9eed... done
Enabled rpm-md repositories: fedora-cisco-openh264 updates-testing updates fedora
Updating metadata for 'fedora-cisco-openh264'... done
rpm-md repo 'fedora-cisco-openh264'; generated: 2020-08-25T19:10:34Z
Updating metadata for 'updates-testing'... done
rpm-md repo 'updates-testing'; generated: 2020-10-12T22:32:57Z
Updating metadata for 'updates'... done
rpm-md repo 'updates'; generated: 2018-02-20T19:18:14Z
Updating metadata for 'fedora'... done
rpm-md repo 'fedora'; generated: 2020-10-12T10:43:21Z
Importing rpm-md... done
Resolving dependencies... done
Will download: 1 package (478.4 kB)
Downloading from 'updates-testing'... done
Importing packages... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
Added:
  systemd-networkd-246.6-3.fc33.x86_64
Run "systemctl reboot" to start a reboot
[core@fedora ~]$ 
[core@fedora ~]$ sudo rpm-ostree status
State: idle
Deployments:
  ostree://fedora:fedora/x86_64/coreos/next
                   Version: 33.20201006.1.0 (2020-10-07T17:48:23Z)
                BaseCommit: 13a9eedf16110c6b7b52cbd384bc8ad34501a8ee7fb2b9141d3cc64263f611ce
              GPGSignature: Valid signature by 963A2BEB02009608FE67EA4249FD77499570FF31
                      Diff: 1 added
           LayeredPackages: systemd-networkd

● ostree://fedora:fedora/x86_64/coreos/next
                   Version: 33.20201006.1.0 (2020-10-07T17:48:23Z)
                    Commit: 13a9eedf16110c6b7b52cbd384bc8ad34501a8ee7fb2b9141d3cc64263f611ce
              GPGSignature: Valid signature by 963A2BEB02009608FE67EA4249FD77499570FF31

@dustymabe dustymabe closed this Oct 13, 2020
@arizvisa
Copy link

Awesome.

c4rt0 pushed a commit to c4rt0/fedora-coreos-config that referenced this pull request Mar 27, 2023
overlay: combine reboots for FIPS and Ignition kargs