Reload process via sighup when autoscale policy files are added or changed #306

Closed
josegonzalez opened this issue Nov 2, 2020 · 5 comments

@josegonzalez

Changing a policy file does not reload the Autoscaler config, which isn't great if you're deploying Nomad in a way that automatically creates or changes policies when autoscale groups (or similar) change underneath.

@lgfa29
Contributor

lgfa29 commented Nov 3, 2020

Hi @josegonzalez,

The Autoscaler should reload file policies on SIGHUP:

2020-11-02T19:54:58.641-0500 [INFO]  agent: caught signal: signal=hangup
2020-11-02T19:54:58.641-0500 [INFO]  file_policy_source: file policy source ID monitor received reload signal
2020-11-02T19:54:58.642-0500 [DEBUG] file_policy_source: starting file policy monitor: file=bin/policies/cluster.hcl policy_id=cf0c9617-92df-6960-834d-0e6a8711297a
2020-11-02T19:54:58.643-0500 [DEBUG] file_policy_source: starting file policy monitor: file=bin/policies/cluster2.hcl policy_id=2e9e2992-7296-cb13-34c4-66559419cfce

If this is not happening for you then it's a bug.

Would you be able to provide some logs from the Autoscaler around the time the SIGHUP signal is sent?
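
For readers unfamiliar with the pattern, the explicit-reload behaviour shown in the logs above can be sketched roughly as follows. This is only an illustration of the general "reload on SIGHUP" technique, not the Autoscaler's actual implementation (which is written in Go). `PolicyStore`, `load_policies`, and the `*.hcl` glob are hypothetical names chosen for the example.

```python
import glob
import os
import signal


def load_policies(directory):
    """Return the sorted list of policy files; a stand-in for real parsing."""
    return sorted(glob.glob(os.path.join(directory, "*.hcl")))


class PolicyStore:
    """Holds the policy file list and refreshes it only when told to."""

    def __init__(self, directory):
        self.directory = directory
        self.files = load_policies(directory)

    def handle_sighup(self, signum, frame):
        # Files changed on disk are ignored until this handler runs.
        self.files = load_policies(self.directory)

    def install(self):
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGHUP, self.handle_sighup)
```

After `PolicyStore("policies").install()`, running `kill -HUP <pid>` against the process would refresh `files`; edits to the directory alone would not.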

@josegonzalez
Author

That works if you send an explicit SIGHUP after changing the file (or adding one, I guess), but not if a file is added, changed, or removed on its own, with no signal sent.

If you are deploying the nomad-autoscaler via nomad - I'm assuming that is the preferred method? - then you'd probably want a single job to handle templating policies. Since we cannot have multiple policies in a single template - see #307 - using consul-template for more dynamic environments[1] is out.

The alternative would be for nomad-autoscaler to automatically watch a directory for policy changes. That would allow us to have nomad-autoscaler read policies from $NOMAD_ALLOC_DIR/policies and have a sidecar process write those policies out as appropriate. As far as I know, sidecar tasks won't be able to send signals to other tasks - maybe I'm wrong - which means we'd want nomad-autoscaler to SIGHUP itself if there are any changes in the directory holding policies.


[1] You might think, "this is dumb, why would anyone ever want this?". The use case is when you have a golden AMI and roll out new versions of that AMI via autoscale groups (ASGs) that are managed via CloudFormation. You could architect it such that rolling out the new AMI creates a new ASG, and you would want nomad-autoscaler to scale up the new version when it gets created. A consul-template call would watch a key that holds the new ASG name for a given class, and then you'd start draining/terminating old nodes, which - via the magic of whatever query you are running - would trigger nomad-autoscaler to scale up the new ASG without needing to redeploy nomad-autoscaler with changes to the job spec or associated policies.
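
The directory-watching idea above could be approximated outside the Autoscaler with a small polling helper, sketched here under several assumptions: the helper runs somewhere it is allowed to signal the Autoscaler process (which, as noted, may not be true for a Nomad sidecar task), the PID is obtained by some means not shown, and policy files end in `.hcl`. The function names are hypothetical.

```python
import hashlib
import os
import signal


def snapshot(directory):
    """Map each policy file name to a SHA-256 digest of its contents."""
    digests = {}
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.endswith(".hcl"):
            with open(entry.path, "rb") as fh:
                digests[entry.name] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def notify_if_changed(directory, previous, pid, send=os.kill):
    """Send SIGHUP to pid if files were added, changed, or removed.

    Returns the new snapshot so the caller can pass it in next time.
    `send` is injectable so the logic can be exercised without signalling.
    """
    current = snapshot(directory)
    if current != previous:
        send(pid, signal.SIGHUP)
    return current
```

A caller would take an initial `snapshot(...)` and then invoke `notify_if_changed(...)` in a loop with a sleep between iterations. Comparing content digests rather than modification times means rewriting a file with identical contents does not trigger a reload.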

@lgfa29
Contributor

lgfa29 commented Nov 5, 2020

I think your use case is very valid, but the requirement for an explicit SIGHUP was a deliberate choice.

Having an explicit signal prevents the Autoscaler from reading files that are being edited, for example. It wouldn't happen in your example because you are doing the right thing 😄

But imagine editing a live policy file while your editor auto-saves as you type. This would trigger the Autoscaler to reload its policies constantly and to read (potentially) invalid or unfinished policies. So having an explicit signal gives more control over when the reload happens.
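
One common way to soften the auto-save problem described above is a quiet period: a file is only considered for reload once it has gone unmodified for some time. The sketch below shows that check in isolation. It is not something proposed in this thread or implemented by the Autoscaler, `is_settled` and `quiet_seconds` are hypothetical names, and it does nothing about a policy that is finished but still invalid.

```python
import os
import time


def is_settled(path, quiet_seconds, now=None):
    """Return True if path has not been modified for quiet_seconds.

    `now` defaults to the current time and exists mainly for testing.
    """
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime >= quiet_seconds
```

A watcher could skip any file for which `is_settled(path, 5)` is false and check it again on the next pass, so a burst of auto-saves results in a single reload once typing stops.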

For the use case you described, I think the API I mentioned in #307 would work as well. When you create the ASG you would also create a new policy in this API.

@josegonzalez
Author

In what cases are folks SSH'ing onto a box in production to edit an autoscaling policy? I think it's more of a theoretical concern - maybe something you've seen in local testing - and unlikely in production if you have these templated out.

@lgfa29
Contributor

lgfa29 commented Nov 23, 2020

@josegonzalez with #312 and #313 merged I am going to close this one. Auto-reload for policy files is not something we plan on working on right now. But thanks for the idea, it generated some great discussion and new features 🎉

@lgfa29 lgfa29 closed this as completed Nov 23, 2020