diff --git a/website/source/docs/telemetry/index.html.md b/website/source/docs/telemetry/index.html.md
index 3d33c27f2201..3f95ab7bf90b 100644
--- a/website/source/docs/telemetry/index.html.md
+++ b/website/source/docs/telemetry/index.html.md
@@ -3,812 +3,22 @@ layout: "docs"
page_title: "Telemetry"
sidebar_current: "docs-telemetry"
description: |-
- Learn about the telemetry data available in Nomad.
+ Telemetry docs home page
---
# Telemetry
-The Nomad agent collects various runtime metrics about the performance of
-different libraries and subsystems. These metrics are aggregated on a ten
-second interval and are retained for one minute.
+The Nomad client and server agents collect a wide range of runtime metrics
+related to the performance of the system. Operators can use this data to gain
+real-time visibility into their cluster and improve performance. Additionally,
+Nomad operators can set up monitoring and alerting based on these metrics in
+order to respond to any changes in the cluster state.
-This data can be accessed via an HTTP endpoint or by sending a signal to the
-Nomad process.
+Please refer to the documentation listed below or in the sidebar to learn more
+about how you can leverage the telemetry Nomad exposes.
-Via HTTP, as of Nomad version 0.7, this data is available at `/metrics`. See
-[Metrics](/api/metrics.html) for more information.
+* [Overview][overview]
+* [Metrics][metrics]
-
-To view this data by sending a signal to the Nomad process: on Unix the
-signal is `USR1`, while on Windows it is `BREAK`. Once Nomad receives the
-signal, it will dump the current telemetry information to the agent's `stderr`.
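-
-For example, assuming a local agent listening on the default HTTP address,
-both access methods could be exercised roughly as follows (the address and the
-PID lookup are illustrative):
-
-```shell
-# Query the metrics API endpoint (available as of Nomad 0.7)
-$ curl http://127.0.0.1:4646/v1/metrics
-
-# Signal a running agent to dump telemetry to its stderr (Unix)
-$ kill -USR1 $(pgrep -x nomad)
-```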
-
-This telemetry information can be used for debugging or otherwise
-getting a better view of what Nomad is doing.
-
-Telemetry information can be streamed to both [statsite](https://github.com/armon/statsite)
-and statsd by providing the appropriate configuration options.
-
-To configure the telemetry output, please see the [agent
-configuration](/docs/configuration/telemetry.html).
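-
-As a minimal sketch, a `telemetry` block that streams metrics to a statsite or
-statsd sink might look like the following (the sink addresses are
-placeholders):
-
-```hcl
-telemetry {
-  # Stream metrics to a statsite instance
-  statsite_address = "statsite.example.com:8125"
-
-  # Or stream metrics to a statsd instance
-  statsd_address = "statsd.example.com:8125"
-}
-```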
-
-Below is sample output of a telemetry dump:
-
-```text
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_blocked': 0.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.plan.queue_depth': 0.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.malloc_count': 7568.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_runs': 8.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_ready': 0.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.num_goroutines': 56.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.sys_bytes': 3999992.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.heap_objects': 4135.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.heartbeat.active': 1.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_unacked': 0.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_waiting': 0.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.alloc_bytes': 634056.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.free_count': 3433.000
-[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_pause_ns': 6572135.000
-[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.memberlist.msg.alive': Count: 1 Sum: 1.000
-[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.serf.member.join': Count: 1 Sum: 1.000
-[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.barrier': Count: 1 Sum: 1.000
-[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.apply': Count: 1 Sum: 1.000
-[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.nomad.rpc.query': Count: 2 Sum: 2.000
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Query': Count: 6 Sum: 0.000
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.fsm.register_node': Count: 1 Sum: 1.296
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Intent': Count: 6 Sum: 0.000
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.runtime.gc_pause_ns': Count: 8 Min: 126492.000 Mean: 821516.875 Max: 3126670.000 Stddev: 1139250.294 Sum: 6572135.000
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.leader.dispatchLog': Count: 3 Min: 0.007 Mean: 0.018 Max: 0.039 Stddev: 0.018 Sum: 0.054
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcileMember': Count: 1 Sum: 0.007
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcile': Count: 1 Sum: 0.025
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.fsm.apply': Count: 1 Sum: 1.306
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.get_allocs': Count: 1 Sum: 0.110
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.worker.dequeue_eval': Count: 29 Min: 0.003 Mean: 363.426 Max: 503.377 Stddev: 228.126 Sum: 10539.354
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Event': Count: 6 Sum: 0.000
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.commitTime': Count: 3 Min: 0.013 Mean: 0.037 Max: 0.079 Stddev: 0.037 Sum: 0.110
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.barrier': Count: 1 Sum: 0.071
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.register': Count: 1 Sum: 1.626
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.eval.dequeue': Count: 21 Min: 500.610 Mean: 501.753 Max: 503.361 Stddev: 1.030 Sum: 10536.813
-[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.memberlist.gossip': Count: 12 Min: 0.009 Mean: 0.017 Max: 0.025 Stddev: 0.005 Sum: 0.204
-```
-
-# Key Metrics
-
-When telemetry is being streamed to statsite or statsd, `interval` is defined to
-be their flush interval. Otherwise, the interval can be assumed to be 10 seconds
-when retrieving metrics using the signals described above.
-
-
-| Metric | Description | Unit | Type |
-|--------|-------------|------|------|
-| `nomad.runtime.num_goroutines` | Number of goroutines and general load pressure indicator | # of goroutines | Gauge |
-| `nomad.runtime.alloc_bytes` | Memory utilization | # of bytes | Gauge |
-| `nomad.runtime.heap_objects` | Number of objects on the heap. General memory pressure indicator | # of heap objects | Gauge |
-| `nomad.raft.apply` | Number of Raft transactions | Raft transactions / `interval` | Counter |
-| `nomad.raft.lastIndex` | Index of the last log in stable storage | Sequence number | Gauge |
-| `nomad.raft.appliedIndex` | Index of the last applied log | Sequence number | Gauge |
-| `nomad.raft.replication.appendEntries` | Raft transaction commit time | ms / Raft Log Append | Timer |
-| `nomad.raft.leader.lastContact` | Time since last contact to leader. General indicator of Raft latency | ms / Leader Contact | Timer |
-| `nomad.broker.total_ready` | Number of evaluations ready to be processed | # of evaluations | Gauge |
-| `nomad.broker.total_unacked` | Evaluations dispatched for processing but incomplete | # of evaluations | Gauge |
-| `nomad.broker.total_blocked` | Evaluations that are blocked until an existing evaluation for the same job completes | # of evaluations | Gauge |
-| `nomad.plan.queue_depth` | Number of scheduler Plans waiting to be evaluated | # of plans | Gauge |
-| `nomad.plan.submit` | Time to submit a scheduler Plan. Higher values cause lower scheduling throughput | ms / Plan Submit | Timer |
-| `nomad.plan.evaluate` | Time to validate a scheduler Plan. Higher values cause lower scheduling throughput. Similar to `nomad.plan.submit` but does not include RPC time or time in the Plan Queue | ms / Plan Evaluation | Timer |
-| `nomad.state.snapshotIndex` | Latest index in the server's in-memory state store | Sequence number | Gauge |
-| `nomad.worker.invoke_scheduler.<type>` | Time to run the scheduler of the given type | ms / Scheduler Run | Timer |
-| `nomad.worker.wait_for_index` | Time waiting for Raft log replication from leader. High delays result in lower scheduling throughput | ms / Raft Index Wait | Timer |
-| `nomad.heartbeat.active` | Number of active heartbeat timers. Each timer represents a Nomad Client connection | # of heartbeat timers | Gauge |
-| `nomad.heartbeat.invalidate` | The length of time it takes to invalidate a Nomad Client due to failed heartbeats | ms / Heartbeat Invalidation | Timer |
-| `nomad.rpc.query` | Number of RPC queries | RPC Queries / `interval` | Counter |
-| `nomad.rpc.request` | Number of RPC requests being handled | RPC Requests / `interval` | Counter |
-| `nomad.rpc.request_error` | Number of RPC requests being handled that result in an error | RPC Errors / `interval` | Counter |
-
-# Client Metrics
-
-The Nomad client emits metrics related to the resource usage of the allocations
-and tasks running on it, as well as the node itself. Operators have to explicitly
-turn on publishing host and allocation metrics by setting the values of
-`publish_allocation_metrics` and `publish_node_metrics` to `true`.
-
-
-By default the collection interval is 1 second, but it can be changed by
-changing the value of the `collection_interval` key in the `telemetry`
-configuration block.
-
-Please see the [agent configuration](/docs/configuration/telemetry.html)
-page for more details.
-
-As of Nomad 0.9, Nomad will emit additional labels for [parameterized](/docs/job-specification/parameterized.html) and
-[periodic](/docs/job-specification/periodic.html) jobs. Nomad
-emits the parent job ID as a new label `parent_id`. Also, the labels `dispatch_id`
-and `periodic_id` are emitted, containing the ID of the specific invocation of the
-parameterized or periodic job respectively. For example, a dispatch job with the ID
-`myjob/dispatch-1312323423423` will have a `parent_id` label of `myjob` and a
-`dispatch_id` label identifying that specific dispatched invocation.
-
-
-| Metric | Description | Unit | Type | Labels |
-|--------|-------------|------|------|--------|
-| `nomad.client.allocated.cpu` | Total amount of CPU shares the scheduler has allocated to tasks | MHz | Gauge | node_id, datacenter |
-| `nomad.client.unallocated.cpu` | Total amount of CPU shares free for the scheduler to allocate to tasks | MHz | Gauge | node_id, datacenter |
-| `nomad.client.allocated.memory` | Total amount of memory the scheduler has allocated to tasks | Megabytes | Gauge | node_id, datacenter |
-| `nomad.client.unallocated.memory` | Total amount of memory free for the scheduler to allocate to tasks | Megabytes | Gauge | node_id, datacenter |
-| `nomad.client.allocated.disk` | Total amount of disk space the scheduler has allocated to tasks | Megabytes | Gauge | node_id, datacenter |
-| `nomad.client.unallocated.disk` | Total amount of disk space free for the scheduler to allocate to tasks | Megabytes | Gauge | node_id, datacenter |
-| `nomad.client.allocated.network` | Total amount of bandwidth the scheduler has allocated to tasks on the given device | Megabits | Gauge | node_id, datacenter, device |
-| `nomad.client.unallocated.network` | Total amount of bandwidth free for the scheduler to allocate to tasks on the given device | Megabits | Gauge | node_id, datacenter, device |
-| `nomad.client.host.memory.total` | Total amount of physical memory on the node | Bytes | Gauge | node_id, datacenter |
-| `nomad.client.host.memory.available` | Total amount of memory available to processes which includes free and cached memory | Bytes | Gauge | node_id, datacenter |
-| `nomad.client.host.memory.used` | Amount of memory used by processes | Bytes | Gauge | node_id, datacenter |
-| `nomad.client.host.memory.free` | Amount of memory which is free | Bytes | Gauge | node_id, datacenter |
-| `nomad.client.uptime` | Uptime of the host running the Nomad client | Seconds | Gauge | node_id, datacenter |
-| `nomad.client.host.cpu.total` | Total CPU utilization | Percentage | Gauge | node_id, datacenter, cpu |
-| `nomad.client.host.cpu.user` | CPU utilization in the user space | Percentage | Gauge | node_id, datacenter, cpu |
-| `nomad.client.host.cpu.system` | CPU utilization in the system space | Percentage | Gauge | node_id, datacenter, cpu |
-| `nomad.client.host.cpu.idle` | Idle time spent by the CPU | Percentage | Gauge | node_id, datacenter, cpu |
-| `nomad.client.host.disk.size` | Total size of the device | Bytes | Gauge | node_id, datacenter, disk |
-| `nomad.client.host.disk.used` | Amount of space which has been used | Bytes | Gauge | node_id, datacenter, disk |
-| `nomad.client.host.disk.available` | Amount of space which is available | Bytes | Gauge | node_id, datacenter, disk |
-| `nomad.client.host.disk.used_percent` | Percentage of disk space used | Percentage | Gauge | node_id, datacenter, disk |
-| `nomad.client.host.disk.inodes_percent` | Disk space consumed by the inodes | Percent | Gauge | node_id, datacenter, disk |
-| `nomad.client.allocs.start` | Number of allocations starting | Integer | Counter | node_id, job, task_group |
-| `nomad.client.allocs.running` | Number of allocations starting to run | Integer | Counter | node_id, job, task_group |
-| `nomad.client.allocs.failed` | Number of allocations failing | Integer | Counter | node_id, job, task_group |
-| `nomad.client.allocs.restart` | Number of allocations restarting | Integer | Counter | node_id, job, task_group |
-| `nomad.client.allocs.complete` | Number of allocations completing | Integer | Counter | node_id, job, task_group |
-| `nomad.client.allocs.destroy` | Number of allocations being destroyed | Integer | Counter | node_id, job, task_group |
-
-Nomad 0.9 adds an additional "node_class" label from the client's
-`NodeClass` attribute. This label is set to the string "none" if empty.
-
-## Host Metrics (deprecated post Nomad 0.7)
-
-The following metrics are emitted by Nomad in versions prior to 0.7. These
-metrics can still be emitted in this format post-0.7 (as well as in the new
-format detailed above), but any new metrics will only be available in the new format.
-
-
-| Metric | Description | Unit | Type |
-|--------|-------------|------|------|
-| `nomad.client.allocated.cpu.<HostID>` | Total amount of CPU shares the scheduler has allocated to tasks | MHz | Gauge |
-| `nomad.client.unallocated.cpu.<HostID>` | Total amount of CPU shares free for the scheduler to allocate to tasks | MHz | Gauge |
-| `nomad.client.allocated.memory.<HostID>` | Total amount of memory the scheduler has allocated to tasks | Megabytes | Gauge |
-| `nomad.client.unallocated.memory.<HostID>` | Total amount of memory free for the scheduler to allocate to tasks | Megabytes | Gauge |
-| `nomad.client.allocated.disk.<HostID>` | Total amount of disk space the scheduler has allocated to tasks | Megabytes | Gauge |
-| `nomad.client.unallocated.disk.<HostID>` | Total amount of disk space free for the scheduler to allocate to tasks | Megabytes | Gauge |
-| `nomad.client.allocated.network.<Device-Name>.<HostID>` | Total amount of bandwidth the scheduler has allocated to tasks on the given device | Megabits | Gauge |
-| `nomad.client.unallocated.network.<Device-Name>.<HostID>` | Total amount of bandwidth free for the scheduler to allocate to tasks on the given device | Megabits | Gauge |
-| `nomad.client.host.memory.<HostID>.total` | Total amount of physical memory on the node | Bytes | Gauge |
-| `nomad.client.host.memory.<HostID>.available` | Total amount of memory available to processes which includes free and cached memory | Bytes | Gauge |
-| `nomad.client.host.memory.<HostID>.used` | Amount of memory used by processes | Bytes | Gauge |
-| `nomad.client.host.memory.<HostID>.free` | Amount of memory which is free | Bytes | Gauge |
-| `nomad.client.uptime.<HostID>` | Uptime of the host running the Nomad client | Seconds | Gauge |
-| `nomad.client.host.cpu.<HostID>.<CPU-Core>.total` | Total CPU utilization | Percentage | Gauge |
-| `nomad.client.host.cpu.<HostID>.<CPU-Core>.user` | CPU utilization in the user space | Percentage | Gauge |
-| `nomad.client.host.cpu.<HostID>.<CPU-Core>.system` | CPU utilization in the system space | Percentage | Gauge |
-| `nomad.client.host.cpu.<HostID>.<CPU-Core>.idle` | Idle time spent by the CPU | Percentage | Gauge |
-| `nomad.client.host.disk.<HostID>.<Device-Name>.size` | Total size of the device | Bytes | Gauge |
-| `nomad.client.host.disk.<HostID>.<Device-Name>.used` | Amount of space which has been used | Bytes | Gauge |
-| `nomad.client.host.disk.<HostID>.<Device-Name>.available` | Amount of space which is available | Bytes | Gauge |
-| `nomad.client.host.disk.<HostID>.<Device-Name>.used_percent` | Percentage of disk space used | Percentage | Gauge |
-| `nomad.client.host.disk.<HostID>.<Device-Name>.inodes_percent` | Disk space consumed by the inodes | Percent | Gauge |
-
-## Allocation Metrics
-
-
-| Metric | Description | Unit | Type |
-|--------|-------------|------|------|
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.rss` | Amount of RSS memory consumed by the task | Bytes | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.cache` | Amount of memory cached by the task | Bytes | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.swap` | Amount of memory swapped by the task | Bytes | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.max_usage` | Maximum amount of memory ever used by the task | Bytes | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.kernel_usage` | Amount of memory used by the kernel for this task | Bytes | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.kernel_max_usage` | Maximum amount of memory ever used by the kernel for this task | Bytes | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.total_percent` | Total CPU resources consumed by the task across all cores | Percentage | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.system` | Total CPU resources consumed by the task in the system space | Percentage | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.user` | Total CPU resources consumed by the task in the user space | Percentage | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.throttled_time` | Total time that the task was throttled | Nanoseconds | Gauge |
-| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.total_ticks` | CPU ticks consumed by the process in the last collection interval | Integer | Gauge |
-
-# Job Metrics
-
-Job metrics are emitted by the Nomad leader server.
-
-
+
+| Metric | Description | Unit | Type |
+|--------|-------------|------|------|
+| `nomad.runtime.num_goroutines` | Number of goroutines and general load pressure indicator | # of goroutines | Gauge |
+| `nomad.runtime.alloc_bytes` | Memory utilization | # of bytes | Gauge |
+| `nomad.runtime.heap_objects` | Number of objects on the heap. General memory pressure indicator | # of heap objects | Gauge |
+| `nomad.raft.apply` | Number of Raft transactions | Raft transactions / `interval` | Counter |
+| `nomad.raft.replication.appendEntries` | Raft transaction commit time | ms / Raft Log Append | Timer |
+| `nomad.raft.leader.lastContact` | Time since last contact to leader. General indicator of Raft latency | ms / Leader Contact | Timer |
+| `nomad.broker.total_ready` | Number of evaluations ready to be processed | # of evaluations | Gauge |
+| `nomad.broker.total_unacked` | Evaluations dispatched for processing but incomplete | # of evaluations | Gauge |
+| `nomad.broker.total_blocked` | Evaluations that are blocked until an existing evaluation for the same job completes | # of evaluations | Gauge |
+| `nomad.plan.queue_depth` | Number of scheduler Plans waiting to be evaluated | # of plans | Gauge |
+| `nomad.plan.submit` | Time to submit a scheduler Plan. Higher values cause lower scheduling throughput | ms / Plan Submit | Timer |
+| `nomad.plan.evaluate` | Time to validate a scheduler Plan. Higher values cause lower scheduling throughput. Similar to `nomad.plan.submit` but does not include RPC time or time in the Plan Queue | ms / Plan Evaluation | Timer |
+| `nomad.worker.invoke_scheduler.<type>` | Time to run the scheduler of the given type | ms / Scheduler Run | Timer |
+| `nomad.worker.wait_for_index` | Time waiting for Raft log replication from leader. High delays result in lower scheduling throughput | ms / Raft Index Wait | Timer |
+| `nomad.heartbeat.active` | Number of active heartbeat timers. Each timer represents a Nomad Client connection | # of heartbeat timers | Gauge |
+| `nomad.heartbeat.invalidate` | The length of time it takes to invalidate a Nomad Client due to failed heartbeats | ms / Heartbeat Invalidation | Timer |
+| `nomad.rpc.query` | Number of RPC queries | RPC Queries / `interval` | Counter |
+| `nomad.rpc.request` | Number of RPC requests being handled | RPC Requests / `interval` | Counter |
+| `nomad.rpc.request_error` | Number of RPC requests being handled that result in an error | RPC Errors / `interval` | Counter |
+
+## Client Metrics
+
+The Nomad client emits metrics related to the resource usage of the allocations
+and tasks running on it, as well as the node itself. Operators have to explicitly
+turn on publishing host and allocation metrics by setting the values of
+`publish_allocation_metrics` and `publish_node_metrics` to `true`.
+
+
+By default the collection interval is 1 second, but it can be changed by
+changing the value of the `collection_interval` key in the `telemetry`
+configuration block.
+
+Please see the [agent configuration](/docs/configuration/telemetry.html)
+page for more details.
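+
+As a minimal sketch, a `telemetry` block that enables both allocation and node
+metrics and sets the collection interval explicitly might look like the
+following (the values shown are illustrative):
+
+```hcl
+telemetry {
+  # Publish per-allocation and per-node resource usage metrics
+  publish_allocation_metrics = true
+  publish_node_metrics       = true
+
+  # How often the client collects these metrics (1 second is the default)
+  collection_interval = "1s"
+}
+```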
+
+As of Nomad 0.9, Nomad will emit additional labels for [parameterized](/docs/job-specification/parameterized.html) and
+[periodic](/docs/job-specification/periodic.html) jobs. Nomad
+emits the parent job ID as a new label `parent_id`. Also, the labels `dispatch_id`
+and `periodic_id` are emitted, containing the ID of the specific invocation of the
+parameterized or periodic job respectively. For example, a dispatch job with the ID
+`myjob/dispatch-1312323423423` will have a `parent_id` label of `myjob` and a
+`dispatch_id` label identifying that specific dispatched invocation.
+
+
+| Metric | Description | Unit | Type | Labels |
+|--------|-------------|------|------|--------|
+| `nomad.client.allocated.cpu` | Total amount of CPU shares the scheduler has allocated to tasks | MHz | Gauge | node_id, datacenter |
+| `nomad.client.unallocated.cpu` | Total amount of CPU shares free for the scheduler to allocate to tasks | MHz | Gauge | node_id, datacenter |
+| `nomad.client.allocated.memory` | Total amount of memory the scheduler has allocated to tasks | Megabytes | Gauge | node_id, datacenter |
+| `nomad.client.unallocated.memory` | Total amount of memory free for the scheduler to allocate to tasks | Megabytes | Gauge | node_id, datacenter |
+| `nomad.client.allocated.disk` | Total amount of disk space the scheduler has allocated to tasks | Megabytes | Gauge | node_id, datacenter |
+| `nomad.client.unallocated.disk` | Total amount of disk space free for the scheduler to allocate to tasks | Megabytes | Gauge | node_id, datacenter |
+| `nomad.client.allocated.network` | Total amount of bandwidth the scheduler has allocated to tasks on the given device | Megabits | Gauge | node_id, datacenter, device |
+| `nomad.client.unallocated.network` | Total amount of bandwidth free for the scheduler to allocate to tasks on the given device | Megabits | Gauge | node_id, datacenter, device |
+| `nomad.client.host.memory.total` | Total amount of physical memory on the node | Bytes | Gauge | node_id, datacenter |
+| `nomad.client.host.memory.available` | Total amount of memory available to processes which includes free and cached memory | Bytes | Gauge | node_id, datacenter |
+| `nomad.client.host.memory.used` | Amount of memory used by processes | Bytes | Gauge | node_id, datacenter |
+| `nomad.client.host.memory.free` | Amount of memory which is free | Bytes | Gauge | node_id, datacenter |
+| `nomad.client.uptime` | Uptime of the host running the Nomad client | Seconds | Gauge | node_id, datacenter |
+| `nomad.client.host.cpu.total` | Total CPU utilization | Percentage | Gauge | node_id, datacenter, cpu |
+| `nomad.client.host.cpu.user` | CPU utilization in the user space | Percentage | Gauge | node_id, datacenter, cpu |
+| `nomad.client.host.cpu.system` | CPU utilization in the system space | Percentage | Gauge | node_id, datacenter, cpu |
+| `nomad.client.host.cpu.idle` | Idle time spent by the CPU | Percentage | Gauge | node_id, datacenter, cpu |
+| `nomad.client.host.disk.size` | Total size of the device | Bytes | Gauge | node_id, datacenter, disk |
+| `nomad.client.host.disk.used` | Amount of space which has been used | Bytes | Gauge | node_id, datacenter, disk |
+| `nomad.client.host.disk.available` | Amount of space which is available | Bytes | Gauge | node_id, datacenter, disk |
+| `nomad.client.host.disk.used_percent` | Percentage of disk space used | Percentage | Gauge | node_id, datacenter, disk |
+| `nomad.client.host.disk.inodes_percent` | Disk space consumed by the inodes | Percent | Gauge | node_id, datacenter, disk |
+| `nomad.client.allocs.start` | Number of allocations starting | Integer | Counter | node_id, job, task_group |
+| `nomad.client.allocs.running` | Number of allocations starting to run | Integer | Counter | node_id, job, task_group |
+| `nomad.client.allocs.failed` | Number of allocations failing | Integer | Counter | node_id, job, task_group |
+| `nomad.client.allocs.restart` | Number of allocations restarting | Integer | Counter | node_id, job, task_group |
+| `nomad.client.allocs.complete` | Number of allocations completing | Integer | Counter | node_id, job, task_group |
+| `nomad.client.allocs.destroy` | Number of allocations being destroyed | Integer | Counter | node_id, job, task_group |
+
+Nomad 0.9 adds an additional `node_class` label from the client's
+`NodeClass` attribute. This label is set to the string "none" if empty.
+
+## Host Metrics (deprecated post Nomad 0.7)
+
+The following metrics are emitted by Nomad in versions prior to 0.7. These
+metrics can still be emitted in this format post-0.7 (as well as in the new
+format detailed above), but any new metrics will only be available in the new format.
+
+
+| Metric | Description | Unit | Type |
+|--------|-------------|------|------|
+| `nomad.client.allocated.cpu.<HostID>` | Total amount of CPU shares the scheduler has allocated to tasks | MHz | Gauge |
+| `nomad.client.unallocated.cpu.<HostID>` | Total amount of CPU shares free for the scheduler to allocate to tasks | MHz | Gauge |
+| `nomad.client.allocated.memory.<HostID>` | Total amount of memory the scheduler has allocated to tasks | Megabytes | Gauge |
+| `nomad.client.unallocated.memory.<HostID>` | Total amount of memory free for the scheduler to allocate to tasks | Megabytes | Gauge |
+| `nomad.client.allocated.disk.<HostID>` | Total amount of disk space the scheduler has allocated to tasks | Megabytes | Gauge |
+| `nomad.client.unallocated.disk.<HostID>` | Total amount of disk space free for the scheduler to allocate to tasks | Megabytes | Gauge |
+| `nomad.client.allocated.network.<Device-Name>.<HostID>` | Total amount of bandwidth the scheduler has allocated to tasks on the given device | Megabits | Gauge |
+| `nomad.client.unallocated.network.<Device-Name>.<HostID>` | Total amount of bandwidth free for the scheduler to allocate to tasks on the given device | Megabits | Gauge |
+| `nomad.client.host.memory.<HostID>.total` | Total amount of physical memory on the node | Bytes | Gauge |
+| `nomad.client.host.memory.<HostID>.available` | Total amount of memory available to processes which includes free and cached memory | Bytes | Gauge |
+| `nomad.client.host.memory.<HostID>.used` | Amount of memory used by processes | Bytes | Gauge |
+| `nomad.client.host.memory.<HostID>.free` | Amount of memory which is free | Bytes | Gauge |
+| `nomad.client.uptime.<HostID>` | Uptime of the host running the Nomad client | Seconds | Gauge |
+| `nomad.client.host.cpu.<HostID>.<CPU-Core>.total` | Total CPU utilization | Percentage | Gauge |
+| `nomad.client.host.cpu.<HostID>.<CPU-Core>.user` | CPU utilization in the user space | Percentage | Gauge |
+| `nomad.client.host.cpu.<HostID>.<CPU-Core>.system` | CPU utilization in the system space | Percentage | Gauge |
+| `nomad.client.host.cpu.<HostID>.<CPU-Core>.idle` | Idle time spent by the CPU | Percentage | Gauge |
+| `nomad.client.host.disk.<HostID>.<Device-Name>.size` | Total size of the device | Bytes | Gauge |
+| `nomad.client.host.disk.<HostID>.<Device-Name>.used` | Amount of space which has been used | Bytes | Gauge |
+| `nomad.client.host.disk.<HostID>.<Device-Name>.available` | Amount of space which is available | Bytes | Gauge |
+| `nomad.client.host.disk.<HostID>.<Device-Name>.used_percent` | Percentage of disk space used | Percentage | Gauge |
+| `nomad.client.host.disk.<HostID>.<Device-Name>.inodes_percent` | Disk space consumed by the inodes | Percent | Gauge |
+
+## Allocation Metrics
+
+
+| Metric | Description | Unit | Type |
+|--------|-------------|------|------|
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.rss` | Amount of RSS memory consumed by the task | Bytes | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.cache` | Amount of memory cached by the task | Bytes | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.swap` | Amount of memory swapped by the task | Bytes | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.max_usage` | Maximum amount of memory ever used by the task | Bytes | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.kernel_usage` | Amount of memory used by the kernel for this task | Bytes | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.memory.kernel_max_usage` | Maximum amount of memory ever used by the kernel for this task | Bytes | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.total_percent` | Total CPU resources consumed by the task across all cores | Percentage | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.system` | Total CPU resources consumed by the task in the system space | Percentage | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.user` | Total CPU resources consumed by the task in the user space | Percentage | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.throttled_time` | Total time that the task was throttled | Nanoseconds | Gauge |
+| `nomad.client.allocs.<Job>.<TaskGroup>.<AllocID>.<Task>.cpu.total_ticks` | CPU ticks consumed by the process in the last collection interval | Integer | Gauge |
+
+## Job Metrics
+
+Job metrics are emitted by the Nomad leader server.
+
+