Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

networking: Inject implicit constraints on CNI plugins when using bridge mode #15473

Merged
merged 4 commits into from
Mar 27, 2024

Conversation

shoenig
Copy link
Member

@shoenig shoenig commented Dec 5, 2022

This PR adds a job mutator which injects constraints on the job taskgroups
that make use of bridge networking. Creating a bridge network makes use of the
CNI plugins: bridge, firewall, host-local, loopback, and portmap. Starting
with Nomad 1.5 these plugins are fingerprinted on each node, and as such we
can ensure jobs are correctly scheduled only on nodes where they are available,
when needed.

Targets v1.5.0

Closes #14022

@angrycub
Copy link
Contributor

angrycub commented Dec 5, 2022

Looks good to me.

Copy link
Contributor

@lgfa29 lgfa29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! It could be helpful to have a comment next to nomadCNIConfigTemplate with a reminder to add or remove these implicit constraints if the template changes.

This also makes me thing about mixed version clusters 🤔

If you happen to have clients with different nomadCNIConfigTemplate, these constraints may not necessarily be true since they are applied in the servers. But that's a very edge case.

It does make me wonder if we could generate these constraints dynamically from nomadCNIConfigTemplate 😅

@shoenig
Copy link
Member Author

shoenig commented Dec 6, 2022

Ooh the upgrade path, yeah that gets interesting. When we have added implicit constraints in the past it was for stuff like Consul version, where clients had long already been fingerprinted, or native service discovery, which required new clients anyway. This is a case where we need to maintain compatibility with old clients not yet being fingerprinted.

Not sure we can merge this until like 1.5.x, or if we backport #15473 first and wait a while.

@shoenig
Copy link
Member Author

shoenig commented Dec 1, 2023

Kinda missed the boat on 1.7; pushing to 1.8

@shoenig shoenig modified the milestones: 1.7.0, 1.8.0 Dec 1, 2023
…dge mode

This PR adds a job mutator which injects constraints on the job taskgroups
that make use of bridge networking. Creating a bridge network makes use of the
CNI plugins: bridge, firewall, host-local, loopback, and portmap. Starting
with Nomad 1.5 these plugins are fingerprinted on each node, and as such we
can ensure jobs are correctly scheduled only on nodes where they are available,
when needed.
@tgross
Copy link
Member

tgross commented Mar 27, 2024

I've rebased this PR on main, added handling of the early exit we introduced for NUMA, and added an upgrade note.

Copy link
Contributor

@lgfa29 lgfa29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Maybe worth adding a note to the bridge config that these constraints may need to be updated if the bridge spec is modified.

// Update website/content/docs/networking/cni.mdx when the bridge configuration
// is modified.
const nomadCNIConfigTemplate = `{

nomad/structs/job.go Outdated Show resolved Hide resolved
@tgross
Copy link
Member

tgross commented Mar 27, 2024

Maybe worth adding a note to the bridge config that these constraints may need to be updated if the bridge spec is modified.

Done!

@tgross tgross merged commit 6ad648b into main Mar 27, 2024
21 checks passed
@tgross tgross deleted the f-cni-plugins-constraints branch March 27, 2024 20:11
philrenaud pushed a commit that referenced this pull request Apr 18, 2024
…dge mode (#15473)

This PR adds a job mutator which injects constraints on the job taskgroups
that make use of bridge networking. Creating a bridge network makes use of the
CNI plugins: bridge, firewall, host-local, loopback, and portmap. Starting
with Nomad 1.5 these plugins are fingerprinted on each node, and as such we
can ensure jobs are correctly scheduled only on nodes where they are available,
when needed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fingerprint: should be aware of available cointainernetworking plugin
4 participants