
Expose consul-template configuration parameters for operators to tune. #11707

Closed
DerekStrickland opened this issue Dec 20, 2021 · 1 comment · Fixed by #11606
Labels
stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/template type/enhancement

Comments


DerekStrickland commented Dec 20, 2021

Use-cases

When Nomad clusters are experiencing Consul/Vault system degradation, or in situations where Operators have a higher tolerance for latency or connectivity loss (edge), a lack of control over consul-template fault tolerance behaviors can lead to a significant amount of allocation churn. consul-template exposes configuration options for fine-tuning retries, blocking queries, and startup fault tolerance. Having the ability to configure these options would be useful for Nomad Operators, especially when experiencing latency or instability when communicating with Consul.

Proposal

Expose consul-template configuration parameters for operators to tune. The consul-template configuration options to be exposed, and their usage, are:

  • max_stale - This is the maximum interval to allow "stale" data. By default, only the Consul leader will respond to queries. Requests to a follower or local agent will be forwarded to the leader. In large clusters with many requests, this is not as scalable, so this option allows any follower to respond to a query, so long as the last-replicated data is within these bounds. Higher values result in less cluster load but are more likely to have outdated data.
  • block_query_wait - This is the amount of time to perform a blocking query. Many endpoints in Consul support a feature known as "blocking queries". A blocking query is used to wait for a potential change using long polling. This reduces the load on Consul by avoiding making new requests to Consul when nothing has changed.
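Taken together, the two client-level knobs above could be set in a Nomad agent configuration roughly as follows. This is a sketch only: the placement under a client-level `template` block is an assumption based on this proposal, and the values are illustrative.

```hcl
# Hypothetical Nomad client agent configuration (placement assumed).
client {
  template {
    # Let any Consul follower answer queries whose replicated data is
    # no more than 87 seconds behind the leader.
    max_stale = "87s"

    # Cap each Consul blocking query at 60 seconds before re-issuing it.
    block_query_wait = "60s"
  }
}
```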
  • wait - This defines the minimum and maximum amount of time to wait for the cluster to reach a consistent state before rendering a template. This is useful to enable in systems that are experiencing a lot of flapping because it will reduce the number of times a template is rendered. This is configurable at both the client and the task level. The task-level setting can be used to override the global setting.
```hcl
wait {
  enabled = true
  min     = "5s"
  max     = "90s"
}
```
  • wait_bounds - This is a Nomad-specific configuration that enables Nomad Operators to set client level constraints that set bounds on individual jobspec configuration. This setting defines lower and upper bounds for per-template wait configuration on a given client. If the individual template configuration has a min lower than wait_bounds.min or a max greater than the wait_bounds.max, the bounds will be enforced, and the template wait will be adjusted before being sent to consul-template.
```hcl
wait_bounds {
  enabled = true
  min     = "5s"
  max     = "10s"
}
```
  • consul_retry - This controls the retry behavior when an error is returned from Consul. By default, Nomad will fail and reschedule an alloc when a template fails to render. This can lead to a significant eval -> alloc -> template render failure cycle in clusters where Consul is unstable.
```hcl
consul_retry {
  enabled = true

  # This specifies the number of attempts to make before giving up. Each
  # attempt adds the exponential backoff sleep time. Setting this to
  # zero will implement an unlimited number of retries.
  attempts = 12

  # This is the base amount of time to sleep between retry attempts. Each
  # retry sleeps for an exponent of 2 longer than this base. For 5 retries,
  # the sleep times would be: 250ms, 500ms, 1s, 2s, then 4s.
  backoff = "250ms"

  # This is the maximum amount of time to sleep between retry attempts.
  # When max_backoff is set to zero, there is no upper limit to the
  # exponential sleep between retry attempts.
  # If max_backoff is set to 10s and backoff is set to 1s, sleep times
  # would be: 1s, 2s, 4s, 8s, 10s, 10s, ...
  max_backoff = "1m"
}
```

  • `vault_retry` - This controls the retry behavior when an error is returned from Vault. By default, Nomad will fail and reschedule an alloc when a template fails to render. This can lead to a significant `eval` -> `alloc` -> `template` render failure cycle in clusters where Vault is unstable.

```hcl
vault_retry {
  # Same explanations as the consul_retry options above.
  enabled     = true
  attempts    = 12
  backoff     = "250ms"
  max_backoff = "1m"
}
```

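Since the `wait` setting is also configurable at the task level, a jobspec override might look like the following. This is a sketch: the `source` and `destination` fields are standard `template` block parameters, but the nested `wait` syntax shown here is an assumption based on this proposal.

```hcl
job "example" {
  group "web" {
    task "app" {
      template {
        source      = "local/config.ctmpl"
        destination = "local/config.ini"

        # Task-level wait, overriding the client-level default
        # (subject to any wait_bounds enforced by the client).
        wait {
          min = "10s"
          max = "60s"
        }
      }
    }
  }
}
```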
Related Issues

Closes #3866
Closes #2623

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 12, 2022