docs: expand documentation on node pools #18109

Merged
merged 5 commits into from
Aug 16, 2023

Contributor

@lgfa29 commented Aug 1, 2023

Add new Concepts page about node pools and expand the Architecture page to note planning client configuration. Also include some smaller changes where node pools may affect the scheduling outcome.

Preview links for the biggest changes:
https://nomad-git-docs-node-pools-hashicorp.vercel.app/nomad/docs/concepts/architecture#client-organization
https://nomad-git-docs-node-pools-hashicorp.vercel.app/nomad/docs/concepts/node-pools

@lgfa29 added the backport/website (this will backport PR changes to `stable-website` && the latest release-branch) and backport/1.6.x (backport to 1.6.x release line) labels Aug 1, 2023
@lgfa29 requested a review from a team August 1, 2023 00:37

Member

@tgross left a comment

This is looking great @lgfa29! I'm about halfway thru but wanted to leave comments and then come back to finish the rest of my read-thru later today.

Comment on lines 209 to 211
Using affinities and constraints has the downside of only allowing allocations
to gravitate towards certain nodes, but it does not prevent placements of jobs
where the rules don't apply.

Member

Suggested change
- Using affinities and constraints has the downside of only allowing allocations
- to gravitate towards certain nodes, but it does not prevent placements of jobs
- where the rules don't apply.
+ Using affinities and constraints has the downside of only allowing allocations
+ to gravitate towards certain nodes, but it does not prevent placements of other
+ jobs for which the rules don't apply.

I think?

Member

@Juanadelacuesta Aug 4, 2023

I'm a little confused here: constraints are hard requirements and "Job placement fails if a constraint cannot be satisfied", but at the same time "it does not prevent placements of jobs where the rules don't apply"?

Contributor Author

> I think?

Yup, that's right. Thanks!

> I'm a little confused here: constraints are hard requirements and "Job placement fails if a constraint cannot be satisfied", but at the same time "it does not prevent placements of jobs where the rules don't apply"?

Yeah, it's kind of weird to describe in text, so maybe I should put an example.

But imagine you have two jobs, one with a constraint and one without any constraints:

```hcl
job "app" {
  constraint {
    attribute = "${meta.env}"
    value     = "prod"
  }
  # ...
}

job "db" {
  # ...
}
```

The app job is only allowed to run on clients with metadata env=prod, but the db job is free to run anywhere, including on clients with env=prod.

So if you want to reserve the env=prod clients to run only app, then you need a negative constraint in all other jobs:

```hcl
job "db" {
  constraint {
    attribute = "${meta.env}"
    operator  = "!="
    value     = "prod"
  }
  # ...
}
```

So constraints are kind of unidirectional in this sense: you can restrict a job to specific nodes, but you can't restrict a node to specific jobs.
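
For contrast, here is a minimal sketch of how a node pool could express the reverse relationship, restricting a set of nodes to specific jobs. The pool name `prod` is a hypothetical value chosen for illustration, and the `client` block is assumed to live in the agent configuration file, separate from the job files.

```hcl
# Client agent configuration (separate file): registers this node in the
# hypothetical "prod" node pool.
client {
  enabled   = true
  node_pool = "prod"
}

# Job file: this job opts in to the "prod" pool, so it is only placed on
# nodes that belong to that pool.
job "app" {
  node_pool = "prod"
  # ...
}

# Job file: no node_pool is set, so this job uses the "default" pool and
# needs no negative constraint to stay off the "prod" nodes.
job "db" {
  # ...
}
```

With this layout the restriction lives on the node side, so other jobs do not each need a `!=` constraint.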

Does this help?

Contributor Author

Updated this paragraph as I moved this section to the Scheduler section

One restriction of using affinities and constraints is that they only express
relationships from jobs to nodes, so it is not possible to use them to restrict
a node to only receive allocations for specific jobs.

Is this better or should I add an example to help illustrate the problem?

Member

@tgross left a comment

This looks great @lgfa29! I've left a few more comments but I think if we resolve those we're good-to-go here.


A more generic term used to refer to machines running Nomad agents in client
mode. Unless noted otherwise, it may be used interchangeably with
[client](#client).

Member

when can it be different?

Member

I kind of disagree with noting that it may be used interchangeably (even though it often is). Node is intended to refer to the machine (and operating system, etc.) while Client refers to the Nomad agent running in client mode. We should avoid conflating the terms in our own docs.

Contributor Author

> when can it be different?

"Node" is a very generic term that different people use to mean different things 😅

For example, a node in a network context can be anything sending or receiving traffic, like a computer, a router, a switch, or maybe even an allocation, like a sidecar proxy. So if you're talking about Nomad's network topology, "node" may not necessarily mean this specific definition.

But to @shoenig's point, I think I didn't express myself well here. I meant more like "you may see or hear people using them interchangeably", which I have in informal conversations, issues, community blog posts etc.

Maybe this sentence would be better like this:

Despite being different concepts, you may find the term "node" being used interchangeably with "client" in some materials.

> Node is intended to refer to the machine (and operating system, etc.)

To the machine running a Nomad client, right? I don't recall us using "node" to refer to machines running servers. For example, the `nomad node` commands and `/v1/node/` endpoints are only about clients.

Contributor Author

I ping-ponged a bit on how to phrase this and ended up here:

#### Node
A more generic term used to refer to machines running Nomad agents in client
mode. Despite being different concepts, you may find "node" being used
interchangeably with "client" in some materials and informal content.

Is this a better description? Or maybe I should just leave this part out?

operators and job submitters to achieve more control over where allocations
are placed.

The steps to achieve this control consists of setting certain values in

Member

@Juanadelacuesta Aug 4, 2023

If this paragraph is meant to introduce affinities and constraints, datacenters, node pools, and so on, it would be worth naming them as the settings.

Contributor Author

Hum...I don't quite follow. I think the confusion may be that I'm using "setting" as the gerund of the verb "to set", not as "configuration".

But this paragraph is gone once I moved this section to the new Scheduling -> Placements page 😅

## Multi-region Clusters

In federated multi-region clusters, node pools are automatically replicated
from the authoritative region to all non-authoritative regions, and requests to

Member

Not directly related, but I couldn't find what authoritative regions are. Maybe add a short note about them to the region entry in the glossary?

Contributor Author

Ah good point, I added an entry to the glossary about authoritative and non-authoritative regions. I used the same entry because they're opposites so it was kind of weird to explain one without explaining the other 😅

https://github.com/hashicorp/nomad/pull/18109/files#diff-81af9e6ca20054c572009d730ea51e6a6563ef34befcc20a398fac0c3a6d68d9R96-R106
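
As a rough sketch of where this distinction shows up in configuration, servers name the authoritative region through the `authoritative_region` parameter of the `server` block. The region names `us` and `eu` below are hypothetical and only serve the illustration.

```hcl
# Server agent configuration for a server in the hypothetical "eu" region.
region = "eu"

server {
  enabled = true

  # "us" is assumed to be the authoritative region in this example, which
  # makes "eu" a non-authoritative region that replicates from it.
  authoritative_region = "us"
}
```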


Node pools and namespaces share some similarities, with both providing a way to
group resources in isolated logical units. Jobs are grouped into namespaces and
clients into node pools.

Member

This phrase is very clear and useful, I would add it in the node pools description

Contributor Author

Thanks! Which node pool description are you referring to?

@lgfa29 merged commit 01d71ca into main Aug 16, 2023
3 checks passed
@lgfa29 deleted the docs-node-pools branch August 16, 2023 15:16
@lgfa29 removed the backport/website and backport/1.6.x labels Aug 16, 2023
@lgfa29 added the backport/website and backport/1.6.x labels Aug 16, 2023
4 participants