RFD: Job Name #2248

Closed
sean- opened this issue Jan 26, 2017 · 4 comments

@sean-
Contributor

sean- commented Jan 26, 2017

Problem: Nomad's Job ID, not the Job Name, has become the canonical lookup mechanism and is an integral part of Nomad's UX. For background: the Job ID is the name at the top of the job file. For example, in the following the Job ID is abcd-pgbouncer-123 and the Job Name is pgbouncer:

job "abcd-pgbouncer-123" {
  region = "global"
  datacenters = ["dc1"]
  type = "service"
  name = "pgbouncer" # Name would be automatically set to `abcd-pgbouncer-123` if it were omitted
...

Nomad supports a Job Name, which is automatically populated from the Job ID when the name is omitted. At present, however, nothing uses the Job Name, which makes the parameter largely useless. That's unfortunate, because its existence and use could solve a number of problems in release engineering and job management. We've been lucky to avoid problems so far because nomad status ${JOB_ID} actually does a prefix match, and most of our Job IDs are (incorrectly) either just the Job Name or begin with it.
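To make the current behaviour concrete, here is a minimal sketch of the two rules described above: the name falls back to the ID when omitted, and lookups match only on an ID prefix. The `Job` class and `find_by_id_prefix` helper are hypothetical names for illustration, not Nomad's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a registered job; not Nomad's real struct."""
    id: str
    name: str = ""

    def __post_init__(self) -> None:
        # An omitted name falls back to the job ID.
        if not self.name:
            self.name = self.id


def find_by_id_prefix(jobs: list[Job], prefix: str) -> list[Job]:
    """Return jobs whose ID starts with `prefix`; the name is never consulted."""
    return [job for job in jobs if job.id.startswith(prefix)]


jobs = [
    Job(id="circonus-broker-bef89"),                  # name omitted
    Job(id="abcd-pgbouncer-123", name="pgbouncer"),
]

print(jobs[0].name)                                               # circonus-broker-bef89
print([j.id for j in find_by_id_prefix(jobs, "abcd-pgbouncer")])  # ['abcd-pgbouncer-123']
print(find_by_id_prefix(jobs, "pgbouncer"))                       # []
```

The last lookup comes back empty because "pgbouncer" is the name, not a prefix of the ID.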

As an exaggerated and unrealistic test case, the software should remain usable even if the Job ID were a UUID generated by a software deployment bot.

Right now nomad status emits Job IDs and similarly does a prefix match on the Job ID. For example:

$ nomad status
ID                                      Type     Priority  Status
circonus-broker-bef89                   service  50        running
abcd-pgbouncer-123                      service  50        running
$ nomad status pgbouncer
No job(s) with prefix or id "pgbouncer" found
$ nomad status abcd-pgbouncer
ID            = abcd-pgbouncer-123
Name          = pgbouncer
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
txn         0       3         0        0       0         0

Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
03c73c58  bd115bb9  496c9767  txn         run      pending  01/26/17 20:41:02 UTC
3fa9cab1  bd115bb9  b7d7483e  txn         run      pending  01/26/17 20:41:02 UTC
e9005b25  bd115bb9  60b30973  txn         run      pending  01/26/17 20:41:02 UTC

What we care about and want is continuity of the Job Name, not the Job ID. The Job ID currently contains both the Job Name and a Git SHA, which makes the Job ID semi-friendly but also random when viewed as a whole. In an idealized world, the UX would be Job Name-centric, not Job ID-centric. For example:

$ nomad status
Name         ID                            Type     Priority  Status
pgbouncer    abcd-pgbouncer-123            service  50        running
pgbouncer    abcd-pgbouncer-124            service  50        running
$ nomad status pgbouncer
Job Summary
ID                    Task Group  Queued  Starting  Running  Failed  Complete  Lost
abcd-pgbouncer-123    txn         0       0         1        0       3         0
abcd-pgbouncer-124    txn         0       0         3        0       3         0

Allocations
Job ID                Alloc ID  Eval ID   Node ID   Task Group  Desired  Status    Created At
abcd-pgbouncer-123    319f3f59  9c9a6317  496c9767  txn         run      running   01/26/17 20:58:57 UTC
abcd-pgbouncer-124    51e1675a  96b7d813  d6b60eb1  txn         run      running   01/26/17 20:53:57 UTC
abcd-pgbouncer-124    cad4e76f  a9a7b0ef  70ba3d96  txn         run      running   01/26/17 20:48:57 UTC
abcd-pgbouncer-123    03c73c58  bd115bb9  496c9767  txn         stop     complete  01/26/17 20:41:02 UTC
abcd-pgbouncer-123    3fa9cab1  bd115bb9  b7d7483e  txn         stop     complete  01/26/17 20:41:02 UTC
abcd-pgbouncer-122    e9005b25  bd115bb9  60b30973  txn         stop     complete  01/26/17 20:41:02 UTC
$ nomad status -id abcd-pgbouncer-123
Job Summary
ID                    Task Group  Queued  Starting  Running  Failed  Complete  Lost
abcd-pgbouncer-123    txn         0       0         1        0       3         0

Allocations
Job ID                Alloc ID  Eval ID   Node ID   Task Group  Desired  Status    Created At
abcd-pgbouncer-123    319f3f59  9c9a6317  496c9767  txn         run      running   01/26/17 20:58:57 UTC
abcd-pgbouncer-123    03c73c58  bd115bb9  496c9767  txn         stop     complete  01/26/17 20:41:02 UTC
abcd-pgbouncer-123    3fa9cab1  bd115bb9  b7d7483e  txn         stop     complete  01/26/17 20:41:02 UTC
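The lookup semantics in the mock-up above could be sketched roughly as follows. This is only an illustration under assumptions: the `status` function, its `by_id` parameter (standing in for the proposed `-id` flag), and the choice of exact matching on the ID are all hypothetical.

```python
from collections import defaultdict

# (job_id, job_name) pairs taken from the mock-up; the broker job has no
# explicit name, so its name is assumed to equal its ID.
REGISTERED = [
    ("abcd-pgbouncer-123", "pgbouncer"),
    ("abcd-pgbouncer-124", "pgbouncer"),
    ("circonus-broker-bef89", "circonus-broker-bef89"),
]


def index_by_name(jobs):
    """Group job IDs under the name they share."""
    by_name = defaultdict(list)
    for job_id, name in jobs:
        by_name[name].append(job_id)
    return dict(by_name)


def status(jobs, query, by_id=False):
    """Resolve `query` as a Job Name by default, or as an exact Job ID if by_id is set."""
    if by_id:
        return [job_id for job_id, _name in jobs if job_id == query]
    return index_by_name(jobs).get(query, [])


print(status(REGISTERED, "pgbouncer"))                       # ['abcd-pgbouncer-123', 'abcd-pgbouncer-124']
print(status(REGISTERED, "abcd-pgbouncer-123", by_id=True))  # ['abcd-pgbouncer-123']
```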

The Job Name-centric view of the world solves a few problems, notably:

  • nomad status ${JOB_NAME} will return stopped allocs and stopped Job IDs until all stopped Job IDs have been reaped.
  • Developers can see the status of multiple variants of a single job in parallel (think blue/green deployments and experiments, but expanded to a larger scale of concurrency and deployments).
  • The output is grep(1) friendly.
  • The development of automation around deployments becomes much easier to reason about.
  • A Job Name becomes the identifier used to provide service continuity across multiple and concurrent releases.
  • A Job ID becomes the lookup mechanism to find information about a particular release contributing to the overall service referred to by the Job Name.
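As one example of the automation this would enable, a deployment bot could take the allocations returned for a Job Name and work out which releases are still serving. The sketch below uses the rows from the mock-up output; `running_per_release` is a hypothetical helper, not an existing Nomad command or API.

```python
from collections import Counter

# (job_id, alloc_id, status) rows copied from the mocked-up
# `nomad status pgbouncer` allocation table.
ALLOCS = [
    ("abcd-pgbouncer-123", "319f3f59", "running"),
    ("abcd-pgbouncer-124", "51e1675a", "running"),
    ("abcd-pgbouncer-124", "cad4e76f", "running"),
    ("abcd-pgbouncer-123", "03c73c58", "complete"),
    ("abcd-pgbouncer-123", "3fa9cab1", "complete"),
    ("abcd-pgbouncer-122", "e9005b25", "complete"),
]


def running_per_release(allocs):
    """Count running allocations per Job ID; stopped releases appear with 0."""
    counts = Counter()
    for job_id, _alloc_id, status in allocs:
        counts[job_id] += 1 if status == "running" else 0
    return dict(counts)


print(running_per_release(ALLOCS))
# {'abcd-pgbouncer-123': 1, 'abcd-pgbouncer-124': 2, 'abcd-pgbouncer-122': 0}
```

Stopped releases stay visible with a count of zero, which matches the first bullet: they remain in the output until they are reaped.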

There's room for more refinement in those mocked-up UIs, but the important takeaway from this RFD is that the Job Name, not the Job ID, becomes the primary point of aggregation and the forward lookup mechanism for a group of Jobs.

@nanoz
Contributor

nanoz commented Jan 30, 2017

Why don't you use task groups to manage versions inside a single job spec?

@sean-
Contributor Author

sean- commented Jan 30, 2017

@nanoz That presents a different set of challenges with regard to rolling updates and continuity of service.

@tgross
Member

tgross commented Sep 30, 2022

Doing some issue cleanup. Filter queries enable most of this today. It'd be possible to add logic to query by name and then ID on the write path as well, but realistically the value doesn't justify the investment for us. I'm going to close this one as a wontfix.
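For readers who want to try the filter-query route, here is a rough sketch of building such a request URL. It assumes the job list endpoint (`/v1/jobs`) accepts a `filter` query parameter and that job entries expose a `Name` field; the helper name and the local agent address are illustrative, so check the API documentation for your Nomad version before relying on it.

```python
from urllib.parse import urlencode


def jobs_by_name_url(base_url: str, name: str) -> str:
    """Build a job-list URL that filters on the job name.

    Assumption: the expression syntax is `Name == "<value>"`. No escaping is
    done here, so `name` must not itself contain double quotes.
    """
    expression = f'Name == "{name}"'
    return f"{base_url}/v1/jobs?{urlencode({'filter': expression})}"


# 127.0.0.1:4646 is assumed to be a local agent address.
print(jobs_by_name_url("http://127.0.0.1:4646", "pgbouncer"))
# http://127.0.0.1:4646/v1/jobs?filter=Name+%3D%3D+%22pgbouncer%22
```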

tgross closed this as not planned on Sep 30, 2022
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Jan 29, 2023