docs: expand documentation on node pools #18109
This is looking great @lgfa29! I'm about halfway thru but wanted to leave comments and then come back to finish the rest of my read-thru later today.
Using affinities and constraints has the downside of only allowing allocations
to gravitate towards certain nodes, but it does not prevent placements of jobs
where the rules don't apply.
Using affinities and constraints has the downside of only allowing allocations
to gravitate towards certain nodes, but it does not prevent placements of other
jobs for which the rules don't apply.
I think?
I'm a little confused here: constraints are hard requirements and "Job placement fails if a constraint cannot be satisfied", but at the same time "it does not prevent placements of jobs where the rules don't apply"?
> I think?
Yup, that's right. Thanks!
> I'm a little confused here: constraints are hard requirements and "Job placement fails if a constraint cannot be satisfied", but at the same time "it does not prevent placements of jobs where the rules don't apply"?
Yeah, it's kind of weird to describe in text, so maybe I should put an example.
But imagine you have two jobs, one with a constraint and one without any constraints:
job "app" {
  constraint {
    attribute = "${meta.env}"
    value     = "prod"
  }
  # ...
}
job "db" {
  # ...
}
The `app` job is only allowed to run in clients with metadata `env=prod`, but the `db` is free to run anywhere, including clients with `env=prod`.
So if you want to reserve the `env=prod` clients to only run `app` then you need a negative constraint in all other jobs:
job "db" {
  constraint {
    attribute = "${meta.env}"
    operator  = "!="
    value     = "prod"
  }
  # ...
}
So constraints are kind of unidirectional in this sense: you can restrict a job to specific nodes, but you can't restrict a node to specific jobs.
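For contrast, here is a minimal sketch of how node pools could express the reverse direction. It assumes a hypothetical pool named `prod`, and the file names in the comments are only illustrative. The `node_pool` parameter in the `client` block and in the job reflects the Nomad 1.6 syntax as I understand it.

```hcl
# client.hcl (agent configuration on the machines to reserve)
client {
  enabled   = true
  node_pool = "prod" # hypothetical pool name
}

# app.nomad.hcl (job specification)
job "app" {
  node_pool = "prod" # opt in to the reserved pool
  # ...
}

# db.nomad.hcl (job specification)
job "db" {
  # No node_pool is set, so this job uses the built-in "default" pool
  # and is not placed on clients that belong to the "prod" pool.
  # ...
}
```

With this arrangement the `db` job needs no negative constraint, because a job is only placed on nodes in its own pool.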
Does this help?
Updated this paragraph as I moved this section to the `Scheduler` section:
nomad/website/content/docs/concepts/scheduling/placement.mdx
Lines 36 to 38 in 4f4ba9f
One restriction of using affinities and constraints is that they only express
relationships from jobs to nodes, so it is not possible to use them to restrict
a node to only receive allocations for specific jobs.
Is this better or should I add an example to help illustrate the problem?
This looks great @lgfa29! I've left a few more comments but I think if we resolve those we're good-to-go here.
A more generic term used to refer to machines running Nomad agents in client
mode. Unless noted otherwise, it may be used interchangeably with
[client](#client).
when can it be different?
I kind of disagree with noting that it may be used interchangeably (even though it often is). Node is intended to refer to the machine (and operating system, etc.) while Client refers to the Nomad agent running in client mode. We should avoid conflating the terms in our own docs.
> when can it be different?
"Node" is a very generic term that different people use to mean different things 😅
For example, a node in a network context can be anything sending or receiving traffic, like a computer, a router, a switch, or maybe even an allocation, like a sidecar proxy. So if you're talking about Nomad's network topology, "node" may not necessarily mean this specific definition.
But to @shoenig's point, I think I didn't express myself well here. I meant more like "you may see or hear people using them interchangeably", which I have in informal conversations, issues, community blog posts etc.
Maybe this sentence would be better like this:
Despite being different concepts, you may find the term "node" being used interchangeably with "client" in some materials.
> Node is intended to refer to the machine (and operating system, etc.)
To the machine running a Nomad client, right? I don't recall us using "node" to refer to machines running servers. For example, the `nomad node` commands and `/v1/node/` endpoints are only about clients.
I ping-ponged a bit on how to phrase this and ended up here:
nomad/website/content/docs/concepts/architecture.mdx
Lines 116 to 120 in efc2e37
#### Node
A more generic term used to refer to machines running Nomad agents in client
mode. Despite being different concepts, you may find "node" being used
interchangeably with "client" in some materials and informal content.
Is this a better description? Or maybe I should just leave this part out?
operators and job submitters to achieve more control over where allocations
are placed.

The steps to achieve this control consists of setting certain values in
If this paragraph is used to introduce Affinities and Constraints, Datacenter, Nodepools... it would be worth mentioning them as the settings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum...I don't quite follow. I think the confusion may be that I'm using "setting" as the gerund of the verb "to set", not as "configuration".
But this paragraph is gone once I moved this section to the new Scheduling -> Placements page 😅
## Multi-region Clusters

In federated multi-region clusters, node pools are automatically replicated
from the authoritative region to all non-authoritative regions, and requests to
Not directly related, but I couldn't find an explanation of what authoritative regions are. Maybe add a short note about it to the glossary entry for regions?
Ah good point, I added an entry to the glossary about authoritative and non-authoritative regions. I used the same entry because they're opposites so it was kind of weird to explain one without explaining the other 😅
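As a rough illustration of the distinction, the sketch below shows server agent configuration for two federated regions. The region names `east` and `west` are hypothetical, and the two fragments would live in separate agent configuration files. `authoritative_region` is the server parameter that names the region other regions replicate from.

```hcl
# Server agent in the authoritative region (hypothetical name "east")
region = "east"

server {
  enabled              = true
  authoritative_region = "east"
}

# Server agent in a non-authoritative region (hypothetical name "west"),
# kept in a separate configuration file
region = "west"

server {
  enabled              = true
  authoritative_region = "east" # replicate from "east"
}
```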
Node pools and namespaces share some similarities, with both providing a way to
group resources in isolated logical units. Jobs are grouped into namespaces and
clients into node pools.
This phrase is very clear and useful, I would add it in the node pools description
Thanks! Which node pool description are you referring to?
Add new Concepts page about node pools and expand the Architecture page to note planning client configuration. Also include some smaller changes where node pools may affect the scheduling outcome.
Preview links for the biggest changes:
https://nomad-git-docs-node-pools-hashicorp.vercel.app/nomad/docs/concepts/architecture#client-organization
https://nomad-git-docs-node-pools-hashicorp.vercel.app/nomad/docs/concepts/node-pools