Autoscaler Policy Overrides Count each time a job is deployed #9839

Closed
idrennanvmware opened this issue Jan 18, 2021 · 3 comments
Labels
stage/waiting-reply theme/autoscaling Issues related to supporting autoscaling type/question

Comments

Contributor

idrennanvmware commented Jan 18, 2021

Output from nomad version
1.0.2

Issue

First, apologies if this is in the wrong repo. I wasn't sure if this fell under Nomad, or the Nomad autoscaler.

We have been experimenting with the scaling stanza in preparation for some work around the autoscaling features in Nomad, and we found some behavior we didn't expect. Whenever our CI/CD pipeline redeployed a Nomad job (due to other changes), the job's count would revert to the min of the scaling stanza. We tried both omitting the count at the group level and setting it; both scenarios had undesirable results. If we set a count for the group, the job would always be reset to that count, regardless of any scaling that had been applied. We weren't sure if this was due to us having enabled = false (which is actually what we want, as we're using the UI to manually scale groups while we experiment).

Reproduction steps

  1. Create a job with a scaling stanza (disabled so the UI can be used to scale manually):
    scaling {
      enabled = false
      min = 1
      max = 5
      policy {
      }
    }
  2. Deploy the job

  3. Scale the job group to 2 in the UI

  4. Deploy the original job file (again)

Job scales back to the original count of 1.

Member

jrasell commented Jan 18, 2021

Hi @idrennanvmware.

When count is omitted from a task group specification, Nomad sets a default value of 1. What I believe is happening is that when you register the job again in step 4, the count defaults to 1 and therefore overwrites the scaled value of the running job. To avoid this, and instead keep the count currently held by the Nomad cluster, you can use the preserve counts option of the job registration API or the -preserve-counts flag of the nomad job run CLI command.

Testing this locally I got the behaviour you desire with the following steps:

  1. create the example job using nomad job init and add the scaling stanza from your comment
  2. deploy the job and wait for it to complete successfully
  3. scale the job to 2
  4. modify the Docker image tag
  5. deploy the job again using the following command: nomad job run -preserve-counts example.nomad
  6. observe the completed deployment produces 2 running allocations
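
To make the difference between the two redeploy paths concrete, here is a minimal Python sketch of the count resolution described above. It is a toy model, not Nomad's implementation: the function `resolve_count` and the constant `DEFAULT_COUNT` are hypothetical names, and the only rules it encodes are the ones stated in this thread (an omitted count defaults to 1, and preserve-counts keeps the count the cluster already holds).

```python
from typing import Optional

# Assumption taken from the reply above: an omitted group count defaults to 1.
DEFAULT_COUNT = 1


def resolve_count(spec_count: Optional[int],
                  running_count: Optional[int],
                  preserve_counts: bool = False) -> int:
    """Toy model of which group count wins when a job is registered.

    spec_count      -- count written in the job file (None if omitted)
    running_count   -- count currently held by the cluster (None on first deploy)
    preserve_counts -- models `nomad job run -preserve-counts`
    """
    # With preserve-counts set, the count already held by the cluster wins.
    if preserve_counts and running_count is not None:
        return running_count
    # Otherwise the job file is authoritative; an omitted count uses the default.
    return DEFAULT_COUNT if spec_count is None else spec_count


# Walk through the reproduction: deploy, scale to 2 in the UI, redeploy.
count = resolve_count(spec_count=None, running_count=None)  # first deploy
count = 2                                                   # manual scale in the UI
plain_redeploy = resolve_count(None, count)                 # nomad job run
preserved_redeploy = resolve_count(None, count, preserve_counts=True)
print(plain_redeploy, preserved_redeploy)
```

In this model a plain redeploy falls back to the job file (or the default), which matches the reported revert to 1, while the preserve-counts path keeps the scaled value of 2.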

I hope this helps. Please let me know if you have any follow up questions or comments.

@jrasell jrasell added stage/waiting-reply theme/autoscaling Issues related to supporting autoscaling labels Jan 18, 2021
Contributor Author

idrennanvmware commented Jan 18, 2021

@jrasell - thank you! I had no idea about the -preserve-counts flag. I'm sure that will do the trick. I do have some ideas I'd like to discuss like being able to preserve counts at a group level (but I'll save that for the discuss forum). Closing this for now.

Thanks again

tgross pushed a commit to fredwangwang/nomad that referenced this issue Jan 22, 2021
resolves hashicorp#9839
resolves hashicorp#6929
resolves hashicorp#6910

e2e: template env interpolation path testing
backspace pushed a commit that referenced this issue Jan 22, 2021
resolves #9839
resolves #6929
resolves #6910

e2e: template env interpolation path testing
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2022