
Missing score info in nomad alloc status -verbose #11117

Closed
martinmcnulty opened this issue Sep 1, 2021 · 2 comments · Fixed by #11128
Labels
stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/cli type/bug

Comments

@martinmcnulty

Nomad version

Nomad v1.1.2 (60638a086ef9630e2a9ba1e237e8426192a44244)

Operating system and Environment details

CentOS 7

Issue

The "Placement Metrics" table in the output of nomad alloc status -verbose does not always include all the relevant scoring information. The list of scorerNames (and therefore the table header) appears to be built from the first ScoreMetaData object only. As a result, if a later ScoreMetaData object contains a score that is not present in the first one, that score is never printed.
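To illustrate the suspected bug and its fix, here is a minimal Python sketch (not Nomad's actual Go source; the data structure mirrors the ScoreMetaData objects shown in the API response later in this issue, and score_columns is a hypothetical helper):

```python
def score_columns(score_meta):
    """Union of score names across ALL ScoreMetaData entries, in first-seen order.

    Using only score_meta[0] (the suspected bug) drops any score, such as
    node-affinity, that first appears on a later node.
    """
    names = []
    for meta in score_meta:
        for name in meta.get("Scores", {}):
            if name not in names:
                names.append(name)
    return names

# Abbreviated from the API response quoted in this issue: only the
# second node carries a node-affinity score.
score_meta = [
    {"NodeID": "44624543", "Scores": {
        "binpack": 0.911, "job-anti-affinity": 0,
        "node-reschedule-penalty": 0}},
    {"NodeID": "b98989ee", "Scores": {
        "job-anti-affinity": 0, "node-reschedule-penalty": 0,
        "node-affinity": 1, "binpack": 0.631}},
]

buggy = list(score_meta[0]["Scores"])  # header built from the first entry only
fixed = score_columns(score_meta)      # header built from all entries

print("node-affinity" in buggy)  # False: column missing from the table
print("node-affinity" in fixed)  # True: column present
```

With the union approach, the table header always covers every score that appears for any candidate node; nodes that lack a given score can simply show 0 or a blank cell.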

Reproduction steps

  • Create a job with an affinity for a particular node, but not one strong enough to actually place the allocation on that node.
  • Run nomad alloc status -verbose <allocID>

Expected Result

The "Placement Metrics" table should include a node-affinity column, as per the docs.

Actual Result

The "Placement Metrics" table only includes that column if the node the allocation was placed on is the one that the job had the affinity for.

Job file (if appropriate)

job "test-job" {

  type        = "service"
  datacenters = ["dc1"]

  affinity {
    attribute = "${node.unique.name}"
    value     = "node1.xyz.com"
    weight    = 50
  }

  group "test-job" {

    task "test-job" {

      driver = "exec"

      config {
        command = "/usr/bin/sleep"
        args    = ["2200"]
      }
    }
  }
}

Nomad Server logs (if appropriate)

Nomad Client logs (if appropriate)

Notice there is no node-affinity column in the nomad alloc status output, but there is a node-affinity score in the second ScoreMetaData returned from the API:

$ nomad job run affinity.hcl
==> 2021-09-01T17:43:03+01:00: Monitoring evaluation "efcacf9a"
    2021-09-01T17:43:03+01:00: Evaluation triggered by job "test-job"
==> 2021-09-01T17:43:04+01:00: Monitoring evaluation "efcacf9a"
    2021-09-01T17:43:04+01:00: Evaluation within deployment: "04b3112a"
    2021-09-01T17:43:04+01:00: Allocation "6e9ff5c3" created: node "44624543", group "test-job"
    2021-09-01T17:43:04+01:00: Evaluation status changed: "pending" -> "complete"
==> 2021-09-01T17:43:04+01:00: Evaluation "efcacf9a" finished with status "complete"
==> 2021-09-01T17:43:04+01:00: Monitoring deployment "04b3112a"
  ⠙ Deployment "04b3112a" successful

    2021-09-01T17:43:22+01:00
    ID          = 04b3112a
    Job ID      = test-job
    Job Version = 5
    Status      = successful
    Description = Deployment completed successfully

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    test-job    1        1       1        0          2021-09-01T17:53:21+01:00

$ nomad alloc status -verbose 6e9ff5c3-236a-329d-b5fb-035e05a83e52 | tail -n7
Placement Metrics
Node                                  binpack  job-anti-affinity  node-reschedule-penalty  final score
44624543-3927-cf70-c008-4ba93710edb3  0.911    0                  0                        0.911
b98989ee-f3d9-c7cc-da12-6a41985a9668  0.631    0                  0                        0.816
45517c29-f6b3-476f-5b7b-8d8ac648febf  0.81     0                  0                        0.81
1bc708ec-3396-c246-c629-3bcf74531b8f  0.667    0                  0                        0.667
f95cfa1f-4dbc-c0ab-d925-7ca38085a88d  0.338    0                  0                        0.338

$ curl -s -H "X-Nomad-Token: $NOMAD_TOKEN" http://localhost:4646/v1/allocation/6e9ff5c3-236a-329d-b5fb-035e05a83e52 | jq '.Metrics.ScoreMetaData'
[
  {
    "NodeID": "44624543-3927-cf70-c008-4ba93710edb3",
    "Scores": {
      "binpack": 0.9110766451979904,
      "job-anti-affinity": 0,
      "node-reschedule-penalty": 0
    },
    "NormScore": 0.9110766451979904
  },
  {
    "NodeID": "b98989ee-f3d9-c7cc-da12-6a41985a9668",
    "Scores": {
      "job-anti-affinity": 0,
      "node-reschedule-penalty": 0,
      "node-affinity": 1,
      "binpack": 0.6311708586442051
    },
    "NormScore": 0.8155854293221025
  },
  {
    "NodeID": "45517c29-f6b3-476f-5b7b-8d8ac648febf",
    "Scores": {
      "node-reschedule-penalty": 0,
      "binpack": 0.8101144236237597,
      "job-anti-affinity": 0
    },
    "NormScore": 0.8101144236237597
  },
  {
    "NodeID": "1bc708ec-3396-c246-c629-3bcf74531b8f",
    "Scores": {
      "job-anti-affinity": 0,
      "node-reschedule-penalty": 0,
      "binpack": 0.6674780218828612
    },
    "NormScore": 0.6674780218828612
  },
  {
    "NodeID": "f95cfa1f-4dbc-c0ab-d925-7ca38085a88d",
    "Scores": {
      "binpack": 0.33813019787067666,
      "job-anti-affinity": 0,
      "node-reschedule-penalty": 0
    },
    "NormScore": 0.33813019787067666
  }
]
@lgfa29 lgfa29 added stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/cli labels Sep 2, 2021
@lgfa29 lgfa29 added this to Needs Triage in Nomad - Community Issues Triage via automation Sep 2, 2021
@lgfa29 lgfa29 moved this from Needs Triage to In Progress in Nomad - Community Issues Triage Sep 2, 2021
@lgfa29
Contributor

lgfa29 commented Sep 2, 2021

Thanks for the report @martinmcnulty, your observation was spot-on 🙂

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 16, 2022