Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0108] NixOS Container rewrite #108

Merged
merged 20 commits into from
Jan 12, 2022
Merged

Conversation

Ma27
Copy link
Member

@Ma27 Ma27 commented Oct 5, 2021

rfcs/0000-nixos-containers.md Outdated Show resolved Hide resolved
@roosemberth
Copy link

There's something itching my brain: Maybe we could explore link-local addressing (especially when using zones). Instead of requiring a central network manager (be dhcpcd, systemd-networkd, radvd, ...), containers could simply use a scope (e.g. the interface with the name of the zone) and llmnr.

I'm unable to formalize this as a section (also, simply creating a veth pair does not seem to automatically assign ll addresses).
Perhaphs it would be better to discuss the actual implementation of something like this in a separate RFC (or probably just an MR, since I don't expect much semantics to discuss), but I do believe this document should mention it (or at least discuss this here).

@Ericson2314
Copy link
Member

Ericson2314 commented Oct 6, 2021

The plan sounds good to me (though I am really not up to speed with the details).

One thing I'd like to keep in mind, however, is that we should be able to do more immutable NixOS images than we can today. For example, there is my little NixOS/nixpkgs#120565 to disable Nix in the NixOS image, and @grahamc's more substantial work with netboots, initramfs-as-root (IIUC). I would to make sure that whatever we plan on today is forwards-comparable with such things --- ideally with some text in the RFC explicitly saying so.

@Ma27
Copy link
Member Author

Ma27 commented Oct 6, 2021

There's something itching my brain: Maybe we could explore link-local addressing (especially when using zones). Instead of requiring a central network manager (be dhcpcd, systemd-networkd, radvd, ...), containers could simply use a scope (e.g. the interface with the name of the zone) and llmnr.

We need systemd-networkd since it's possible then to let systemd configure the network while the container is starting up. Otherwise you'll end up with a non-functional uplink inside the container until it's fully up.

I'm unable to formalize this as a section (also, simply creating a veth pair does not seem to automatically assign ll addresses).

Not sure if I'm misunderstanding you, but this is not the case. This only happens because of Address=0.0.0.0/28 & DHCPServer=yes on the host-side. I haven't checked it in this case, but perhaps it's sufficient for your use-case to enable link-local IPv6 addressing on both veth interfaces and you're done.

Anyways, the whole point of this RFC is to provide sufficient flexibility for the user to implement what they want. Yes, there are some standard-usecases with some shorthands, but in the end everything is rendered to network/nspawn units that can be modified with the module-system. Does this make sense to you? :)

One thing I'd like to keep in mind, however, is that we should be able to do more immutable NixOS images than we can today. For example, there is my little NixOS/nixpkgs#120565 to disable Nix in the NixOS image, and @grahamc's more substantial work with netboots, initramfs-as-root (IIUC)

What you're referring to should be totally possible. By default, you'll need a read-only bind-mount of the store and you should be good to go. For both the declarative and imperative use-case the building happens outside of the container currently.

As a side-note, there's also this messy thing https://github.com/NixOS/nixpkgs/pull/140669/files#diff-d4796876ca68a9600f3d553d2009fe9fc0df69f46f84ba565d3a53654bb8f3f5R195-R209 which generates bind-mounts for only the store-paths that are really needed. I'm not sure how bad this idea really is which is why I left it out in the RFC on purpose and will probably vanish from this PR before it'll be merged, but I still thought I'd share it with you to get some more qualified opinions on it :)

I would to make sure that whatever we plan on today is forwards-comparable with such things --- ideally with some text in the RFC explicitly saying so.

Thanks, noted.

It's sufficient to have the store-paths available.
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
### Default Mode

If nothing else is specified, the [default settings of `systemd-nspawn`](https://github.com/systemd/systemd/blob/v247/network/80-container-ve.network) will
be used for networking. To briefly summarize, this means:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should be too specific with the networkd config format internals here, and just describe how it'll look like from a birds-eye perspective.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is this about linking the default network unit for containers here? While I made some changes regarding this paragraph, I think that linking this isn't too bad since this RFC assumes networkd network stack.

rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
Co-authored-by: Kevin Cox <kevincox@kevincox.ca>
@spacekookie spacekookie added status: open for nominations Open for shepherding team nominations and removed status: new labels Oct 7, 2021
@spacekookie
Copy link
Member

Excited about this RFC! It's now open for shepherd nominations!

@kloenk
Copy link
Member

kloenk commented Oct 7, 2021

I nominate @yu-re-ka for knowledge in systemd-nspawn and networkd/scripts.

@Mic92 Mic92 removed the status: open for nominations Open for shepherding team nominations label Dec 15, 2021
@Lassulus
Copy link
Member

Lassulus commented Dec 16, 2021

alright, seems like we have a date in the new year:

2022-01-02 13:00:00 CET

I suggest we meet on my jitsi instance (if its not broken on that day) https://jitsi.lassul.us/rfc-108

EDIT: please thumbs up this post as ACK, otherwise I will message you as a reminder!

Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
@Lassulus
Copy link
Member

Lassulus commented Jan 2, 2022

alright, seems like we have a date in the new year:

2022-01-02 13:00:00 CET

I suggest we meet on my jitsi instance (if its not broken on that day) https://jitsi.lassul.us/rfc-108

EDIT: please thumbs up this post as ACK, otherwise I will message you as a reminder!

@Ma27 @Mic92 @lheckemann @SuperSandro2000 @arianvp
this is today in ~2 hours

Copy link
Member

@lheckemann lheckemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, just some phrasing, nits and minor improvements which I'd suggest.

rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
rfcs/0108-nixos-containers.md Outdated Show resolved Hide resolved
@arianvp
Copy link
Member

arianvp commented Jan 2, 2022

@Ma27 would this allow us to do this, i.e.:

  • create network NS
  • control in which NS the container is created

There was a PR a while ago enabling this for the existing setup.

Btw systemd/systemd#14915 (the follow-up of systemd/systemd#11710) hasn't been merged yet - is this relevant?

It's not relevant. That is for when networkd is not running in the container. e.g. when the container is not a full OS container but a more classic application container. Given we have a full NixOS instance running in the container we can configure the network namespace from inside the container using networkd.

@Lassulus
Copy link
Member

Lassulus commented Jan 2, 2022

so all shepherds agreed this ready for FCP
@lheckemann

thanks for everyone involved, keep up the good work!

@Mic92 Mic92 added status: FCP in Final Comment Period and removed status: in discussion labels Jan 2, 2022
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/fcp-for-rfc-108-nixos-container-rewrite/16936/1

Ma27 and others added 2 commits January 3, 2022 22:18
Comment on lines +400 to +401
Additionally, the way how the container's new config will be activated can be specified
via `--reload` or `--restart` passed to `nixos-nspawn update`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From looking at the functionality within the POC, I think the "reload" and "none" strategies may be a little overloaded in the context of imperative containers managed like full systems of their own.

It's not clear if reload will be reloading configuration inside or outside of the container. I would expect it to be both, however the real behaviour is to switch-to-configuration within the container, and update the nspawn configuration on the host without actually applying any changes, since that would require a restart of the container. Also, since it's doing switch-to-configuration, all the actual services in the container will be reloaded or restarted irrespective of this strategy.

I had a couple of suggestions initially, but in typing this out I think the easiest thing that could be done here is to rename reload to switch, in reference to nixos-rebuild switch. The consequences of this are now much more obvious and predictable given you know how nixos-rebuild works, and it maintains at least some consistency between the two commands (even though it's a flag in this case).

With that clarification, "none" does make a bit more sense as an activation strategy, but it is still a little misleading. Something is being done to activate the config, it's just not immediately taking place. I think a name like "next-restart" or "boot" (again to be consistent with nixos-rebuild) would make more sense IMO.

I'm interested to hear your thoughts on this and how the strategy names came about in the first place :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm interested to hear your thoughts on this and how the strategy names came about in the first place :)

  • reload was to remain consistent with the underlying operation (systemctl reload) for declarative containers and I decided to use the same things for imperative ones. switch is IMHO a bit ambiguous as it isn't clear how it differs from restart.
  • none was used because switch-to-configuration.pl is effectively doing nothing here. If everyone else agrees, we may want to switch to boot though, that sounds reasonable as well :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reload was to remain consistent with the underlying operation (systemctl reload) for declarative containers and I decided to use the same things for imperative ones. switch is IMHO a bit ambiguous as it isn't clear how it differs from restart.

Ok yeah, I hadn't considered reading them as synonyms of systemd reload/restart, which does make far more sense now :) On that note, scratch my suggestion for that, however I would still be a fan of none -> boot, or possibly "onboot" to imply the event that will trigger activation.

@Ma27
Copy link
Member Author

Ma27 commented Jan 4, 2022

As discussed in Sunday's meeting, 58ab950 makes it explicit that the old implementation will be deprecated (previously, it wasn't that explicit).

The changes before were mostly about nitpicks & naming. If any of that means that the FCP-phase should be longer, please let me know :)

@edolstra edolstra merged commit 5fa45c9 into NixOS:master Jan 12, 2022
@edolstra
Copy link
Member

@Ma27 I don't think it warrants extending the RFC, so we've merged it. Thanks!

@edolstra edolstra added status: accepted and removed status: FCP in Final Comment Period labels Jan 12, 2022
@Ma27 Ma27 deleted the container-rfc branch January 12, 2022 14:46
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixcon-governance-workshop/32705/9

KAction pushed a commit to KAction/rfcs that referenced this pull request Apr 13, 2024
* Initial nixos container commit because I want to view diffs between changes

* Updates / Improvements

* Note regarding static networking

* Updates

* update

* Update.

* Update.

* Update

Thanks Linus for the proof-read :)

* Update.

Mention that nixops container backend is also affected as pointed out by
fpletz.

* fixup! wording, spelling

* Move document according to assigned RFC number

* Linkify related issues

As it's done in https://github.com/NixOS/rfcs/blob/master/rfcs/0004-replace-unicode-quotes.md

* Bind-mounting Nix state isn't strictly needed

It's sufficient to have the store-paths available.

* Wording

Co-authored-by: Kevin Cox <kevincox@kevincox.ca>

* Clarify "default mode", remove a few unnecessary implementation details

* rfc108: add shepherds

Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>

* Apply wording suggestions from @lheckemann

Co-authored-by: Linus Heckemann <git@sphalerite.org>

* Usage of `nsenter` to update a container's configuration is not relevant for the RFC

* Rename `config` to `system-config` to avoid ambiguities in the module system

* Explicitly talk about deprecation of current implementation

Co-authored-by: Erik Arvstedt <erik.arvstedt@gmail.com>
Co-authored-by: Kevin Cox <kevincox@kevincox.ca>
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
Co-authored-by: Linus Heckemann <git@sphalerite.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.