diff --git a/.github/ISSUE_TEMPLATE/bug_report.yaml b/.github/ISSUE_TEMPLATE/bug_report.yaml index b09b4102e4..8a7e8aee67 100644 --- a/.github/ISSUE_TEMPLATE/bug_report.yaml +++ b/.github/ISSUE_TEMPLATE/bug_report.yaml @@ -49,6 +49,7 @@ body: - area:gen-ai - area:go - area:graphql + - area:hardware - area:heroku - area:host - area:http diff --git a/.github/ISSUE_TEMPLATE/change_proposal.yaml b/.github/ISSUE_TEMPLATE/change_proposal.yaml index e30d0a4630..d924b6fe58 100644 --- a/.github/ISSUE_TEMPLATE/change_proposal.yaml +++ b/.github/ISSUE_TEMPLATE/change_proposal.yaml @@ -41,6 +41,7 @@ body: - area:gen-ai - area:go - area:graphql + - area:hardware - area:heroku - area:host - area:http diff --git a/.github/ISSUE_TEMPLATE/new-conventions.yaml b/.github/ISSUE_TEMPLATE/new-conventions.yaml index 8301d9b1bc..ca0f864b07 100644 --- a/.github/ISSUE_TEMPLATE/new-conventions.yaml +++ b/.github/ISSUE_TEMPLATE/new-conventions.yaml @@ -50,6 +50,7 @@ body: - area:gen-ai - area:go - area:graphql + - area:hardware - area:heroku - area:host - area:http diff --git a/docs/attributes-registry/README.md b/docs/attributes-registry/README.md index 003b21c5b4..f4a53171a0 100644 --- a/docs/attributes-registry/README.md +++ b/docs/attributes-registry/README.md @@ -61,6 +61,7 @@ Currently, the following namespaces exist: - [Gen AI](gen-ai.md) - [Go](go.md) - [GraphQL](graphql.md) +- [Hardware](hardware.md) - [Heroku](heroku.md) - [Host](host.md) - [HTTP](http.md) diff --git a/docs/attributes-registry/hardware.md b/docs/attributes-registry/hardware.md new file mode 100644 index 0000000000..84f4042f4c --- /dev/null +++ b/docs/attributes-registry/hardware.md @@ -0,0 +1,48 @@ + + + + + +# Hardware + +## Hardware Attributes + +Attributes for hardware. 
+ +| Attribute | Type | Description | Examples | Stability | +| ----------- | ------ | ---------------------------------------------------------------------------------------------------------------- | ----------------------------------- | ---------------------------------------------------------------- | +| `hw.id` | string | An identifier for the hardware component, unique within the monitored host | `win32battery_battery_testsysa33_1` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `hw.name` | string | An easily-recognizable name for the hardware component | `eth0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `hw.parent` | string | Unique identifier of the parent component (typically the `hw.id` attribute of the enclosure, or disk controller) | `dellStorage_perc_0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `hw.state` | string | The current state of the component | `ok`; `degraded`; `failed` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `hw.type` | string | Type of the component [1] | `battery`; `cpu`; `disk_controller` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** Describes the category of the hardware component for which `hw.state` is being reported. For example, `hw.type=temperature` along with `hw.state=degraded` would indicate that the temperature of the hardware component has been reported as `degraded`. + +`hw.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. 
+ +| Value | Description | Stability | +| ---------- | ----------- | ---------------------------------------------------------------- | +| `degraded` | Degraded | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `failed` | Failed | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `ok` | Ok | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +`hw.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +| ----------------- | --------------- | ---------------------------------------------------------------- | +| `battery` | Battery | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `cpu` | CPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `disk_controller` | Disk controller | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `enclosure` | Enclosure | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `fan` | Fan | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gpu` | GPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `logical_disk` | Logical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `memory` | Memory | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `network` | Network | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `physical_disk` | Physical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `power_supply` | Power supply | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `tape_drive` | Tape drive | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `temperature` | Temperature | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `voltage` | Voltage | 
![Experimental](https://img.shields.io/badge/-experimental-blue) | diff --git a/docs/hardware/README.md b/docs/hardware/README.md new file mode 100644 index 0000000000..6b7f098cbe --- /dev/null +++ b/docs/hardware/README.md @@ -0,0 +1,17 @@ + + +# Semantic Conventions for Hardware + +**Status**: [Experimental][DocumentStatus] + +This document describes instruments and attributes for common hardware level +metrics in OpenTelemetry. Consider the [general metric semantic conventions](/docs/general/metrics.md#general-metric-semantic-conventions) +when creating instruments not explicitly defined in the specification. + +Semantic conventions for hardware are defined as follows: + +* [Common Hardware Metrics](common.md): Semantic Conventions for *common* hardware metrics. + +[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status diff --git a/docs/hardware/common.md b/docs/hardware/common.md new file mode 100644 index 0000000000..050f91086d --- /dev/null +++ b/docs/hardware/common.md @@ -0,0 +1,362 @@ + + +# Semantic Conventions for Common Hardware Metrics + +**Status**: [Experimental][DocumentStatus] + + + +- [Common hardware metrics](#common-hardware-metrics) + - [Metric: `hw.energy`](#metric-hwenergy) + - [Metric: `hw.errors`](#metric-hwerrors) + - [Metric: `hw.power`](#metric-hwpower) + - [Metric: `hw.status`](#metric-hwstatus) + + + +Hardware metrics do not include attributes that identify the device, machine, or host they are reported for. This +information is expected to be provided via resource attributes configured by user applications. +Application developers are encouraged to configure [Host](/docs/resource/host.md) resource attributes. + +## Common hardware metrics + +The below metrics apply to any type of hardware component. 
+ +These common `hw.` metrics include the below attributes to describe the +monitored component: + + + + + + + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`hw.id`](/docs/attributes-registry/hardware.md) | string | An identifier for the hardware component, unique within the monitored host | `win32battery_battery_testsysa33_1` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.type`](/docs/attributes-registry/hardware.md) | string | Type of the component [1] | `battery`; `cpu`; `disk_controller` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.name`](/docs/attributes-registry/hardware.md) | string | An easily-recognizable name for the hardware component | `eth0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.parent`](/docs/attributes-registry/hardware.md) | string | Unique identifier of the parent component (typically the `hw.id` attribute of the enclosure, or disk controller) | `dellStorage_perc_0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** Describes the category of the hardware component for which `hw.state` is being reported. For example, `hw.type=temperature` along with `hw.state=degraded` would indicate that the temperature of the hardware component has been reported as `degraded`. + + + +`hw.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. 
+ +| Value | Description | Stability | +|---|---|---| +| `battery` | Battery | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `cpu` | CPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `disk_controller` | Disk controller | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `enclosure` | Enclosure | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `fan` | Fan | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gpu` | GPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `logical_disk` | Logical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `memory` | Memory | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `network` | Network | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `physical_disk` | Physical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `power_supply` | Power supply | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `tape_drive` | Tape drive | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `temperature` | Temperature | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `voltage` | Voltage | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + +### Metric: `hw.energy` + +This metric is [recommended][MetricRecommended]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | +| -------- | --------------- | ----------- | -------------- | --------- | +| `hw.energy` | Counter | `J` | Energy consumed by the component | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + + + + + + + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`hw.id`](/docs/attributes-registry/hardware.md) | string | An identifier for the hardware component, unique within the monitored host | `win32battery_battery_testsysa33_1` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.type`](/docs/attributes-registry/hardware.md) | string | Type of the component [1] | `battery`; `cpu`; `disk_controller` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.name`](/docs/attributes-registry/hardware.md) | string | An easily-recognizable name for the hardware component | `eth0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.parent`](/docs/attributes-registry/hardware.md) | string | Unique identifier of the parent component (typically the `hw.id` attribute of the enclosure, or disk controller) | `dellStorage_perc_0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** Describes the category of the hardware component for which `hw.state` is being reported. For example, `hw.type=temperature` along with `hw.state=degraded` would indicate that the temperature of the hardware component has been reported as `degraded`. + + + +`hw.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. 
+ +| Value | Description | Stability | +|---|---|---| +| `battery` | Battery | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `cpu` | CPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `disk_controller` | Disk controller | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `enclosure` | Enclosure | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `fan` | Fan | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gpu` | GPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `logical_disk` | Logical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `memory` | Memory | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `network` | Network | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `physical_disk` | Physical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `power_supply` | Power supply | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `tape_drive` | Tape drive | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `temperature` | Temperature | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `voltage` | Voltage | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + +### Metric: `hw.errors` + +This metric is [recommended][MetricRecommended]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | +| -------- | --------------- | ----------- | -------------- | --------- | +| `hw.errors` | Counter | `{error}` | Number of errors encountered by the component | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + + + + + + + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`hw.id`](/docs/attributes-registry/hardware.md) | string | An identifier for the hardware component, unique within the monitored host | `win32battery_battery_testsysa33_1` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.type`](/docs/attributes-registry/hardware.md) | string | Type of the component [1] | `battery`; `cpu`; `disk_controller` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`error.type`](/docs/attributes-registry/error.md) | string | The type of error encountered by the component [2] | `uncorrected`; `zero_buffer_credit`; `crc`; `bad_sector` | `Conditionally Required` if and only if an error has occurred | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`hw.name`](/docs/attributes-registry/hardware.md) | string | An easily-recognizable name for the hardware component | `eth0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.parent`](/docs/attributes-registry/hardware.md) | string | Unique identifier of the parent component (typically the `hw.id` attribute of the enclosure, or disk controller) | `dellStorage_perc_0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** Describes the category of the hardware component for which `hw.state` is being reported. 
For example, `hw.type=temperature` along with `hw.state=degraded` would indicate that the temperature of the hardware component has been reported as `degraded`. + +**[2]:** The `error.type` SHOULD match the error code reported by the component, the canonical name of the error, or another low-cardinality error identifier. Instrumentations SHOULD document the list of errors they report. + + + +`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + + +`hw.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `battery` | Battery | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `cpu` | CPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `disk_controller` | Disk controller | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `enclosure` | Enclosure | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `fan` | Fan | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gpu` | GPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `logical_disk` | Logical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `memory` | Memory | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `network` | Network | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `physical_disk` | Physical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `power_supply` | Power supply | 
![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `tape_drive` | Tape drive | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `temperature` | Temperature | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `voltage` | Voltage | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + +### Metric: `hw.power` + +This metric is [recommended][MetricRecommended]. + + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | +| -------- | --------------- | ----------- | -------------- | --------- | +| `hw.power` | Gauge | `W` | Instantaneous power consumed by the component [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + +**[1]:** It is recommended to report `hw.energy` instead of `hw.power` when possible. + + + + + + + + + + + + + + + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`hw.id`](/docs/attributes-registry/hardware.md) | string | An identifier for the hardware component, unique within the monitored host | `win32battery_battery_testsysa33_1` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.type`](/docs/attributes-registry/hardware.md) | string | Type of the component [1] | `battery`; `cpu`; `disk_controller` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.name`](/docs/attributes-registry/hardware.md) | string | An easily-recognizable name for the hardware component | `eth0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.parent`](/docs/attributes-registry/hardware.md) | string | Unique identifier of the parent component (typically the `hw.id` attribute of the enclosure, or disk controller) | `dellStorage_perc_0` | `Recommended` | 
![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** Describes the category of the hardware component for which `hw.state` is being reported. For example, `hw.type=temperature` along with `hw.state=degraded` would indicate that the temperature of the hardware component has been reported as `degraded`. + + + +`hw.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `battery` | Battery | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `cpu` | CPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `disk_controller` | Disk controller | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `enclosure` | Enclosure | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `fan` | Fan | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gpu` | GPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `logical_disk` | Logical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `memory` | Memory | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `network` | Network | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `physical_disk` | Physical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `power_supply` | Power supply | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `tape_drive` | Tape drive | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `temperature` | Temperature | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `voltage` | Voltage | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + +### Metric: `hw.status` + +This metric is [recommended][MetricRecommended]. 
+ + + + + + + + +| Name | Instrument Type | Unit (UCUM) | Description | Stability | +| -------- | --------------- | ----------- | -------------- | --------- | +| `hw.status` | UpDownCounter | `1` | Operational status: `1` (true) or `0` (false) for each of the possible states [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + +**[1]:** `hw.status` is currently specified as an *UpDownCounter* but would ideally be represented using a [*StateSet* as defined in OpenMetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#stateset). This semantic convention will be updated once *StateSet* is specified in OpenTelemetry. This planned change is not expected to have any consequence on the way users query their timeseries backend to retrieve the values of `hw.status` over time. + + + + + + + + + + + + + + + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`hw.id`](/docs/attributes-registry/hardware.md) | string | An identifier for the hardware component, unique within the monitored host | `win32battery_battery_testsysa33_1` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.state`](/docs/attributes-registry/hardware.md) | string | The current state of the component | `ok`; `degraded`; `failed` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.type`](/docs/attributes-registry/hardware.md) | string | Type of the component [1] | `battery`; `cpu`; `disk_controller` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`hw.name`](/docs/attributes-registry/hardware.md) | string | An easily-recognizable name for the hardware component | `eth0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| 
[`hw.parent`](/docs/attributes-registry/hardware.md) | string | Unique identifier of the parent component (typically the `hw.id` attribute of the enclosure, or disk controller) | `dellStorage_perc_0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** Describes the category of the hardware component for which `hw.state` is being reported. For example, `hw.type=temperature` along with `hw.state=degraded` would indicate that the temperature of the hardware component has been reported as `degraded`. + + + +`hw.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `degraded` | Degraded | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `failed` | Failed | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `ok` | Ok | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + +`hw.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. 
+ +| Value | Description | Stability | +|---|---|---| +| `battery` | Battery | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `cpu` | CPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `disk_controller` | Disk controller | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `enclosure` | Enclosure | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `fan` | Fan | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gpu` | GPU | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `logical_disk` | Logical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `memory` | Memory | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `network` | Network | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `physical_disk` | Physical disk | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `power_supply` | Power supply | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `tape_drive` | Tape drive | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `temperature` | Temperature | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `voltage` | Voltage | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + +[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status +[MetricRecommended]: /docs/general/metric-requirement-level.md#recommended diff --git a/docs/system/hardware-metrics.md b/docs/system/hardware-metrics.md index 76e69590ee..a7b7039cc1 100644 --- a/docs/system/hardware-metrics.md +++ b/docs/system/hardware-metrics.md @@ -10,11 +10,16 @@ This document describes instruments and attributes for common hardware level metrics in OpenTelemetry. 
Consider the [general metric semantic conventions](/docs/general/metrics.md#general-metric-semantic-conventions) when creating instruments not explicitly defined in the specification. - +This document is being converted to specific hardware metrics. Parts of this document that have already been +converted are now located in the [Hardware](/docs/hardware/README.md) folder and are no longer present in this file. -- [Common hardware attributes](#common-hardware-attributes) +Please note that this is an [ongoing process](https://github.com/open-telemetry/semantic-conventions/issues/1309) and may take some time to complete. + + + +- [Common hardware attributes](/docs/attributes-registry/hardware.md) - [Metric Instruments](#metric-instruments) - - [`hw.` - Common hardware metrics](#hw---common-hardware-metrics) + - [`hw.` - Common hardware metrics](/docs/hardware/common.md) - [`hw.host.` - Physical host metrics](#hwhost---physical-host-metrics) - [`hw.battery.` - Battery metrics](#hwbattery---battery-metrics) - [`hw.cpu.` - Physical processor metrics](#hwcpu---physical-processor-metrics) @@ -31,8 +36,6 @@ when creating instruments not explicitly defined in the specification. - [`hw.temperature.` - Temperature sensor metrics](#hwtemperature---temperature-sensor-metrics) - [`hw.voltage.` - Voltage sensor metrics](#hwvoltage---voltage-sensor-metrics) - - > **Warning** > Existing instrumentations and collector that are using > [v1.21.0 of this document](https://github.com/open-telemetry/semantic-conventions/blob/v1.21.0/docs/system/hardware-metrics.md) @@ -44,43 +47,8 @@ when creating instruments not explicitly defined in the specification. > * SHOULD introduce a control mechanism to allow users to opt-in to the new > conventions once the migration plan is finalized. -## Common hardware attributes - -All metrics in `hw.` instruments should be attached to a [Host Resource](/docs/resource/host.md) -and therefore inherit its attributes, like `host.id` and `host.name`. 
- -Additionally, all metrics in `hw.` instruments have the following attributes: - -| Attribute Key | Description | Example | Requirement Level | -| ------------- | ------------------------------------------------------------------------------------------------------------- | ----------------------------------- | ----------------- | -| `id` | An identifier for the hardware component, unique within the monitored host | `win32battery_battery_testsysa33_1` | **Required** | -| `name` | An easily-recognizable name for the hardware component | `eth0` | Recommended | -| `parent` | Unique identifier of the parent component (typically the `id` attribute of the enclosure, or disk controller) | `dellStorage_perc_0` | Recommended | - ## Metric Instruments -### `hw.` - Common hardware metrics - -The below metrics apply to any type of hardware component. - -| Name | Description | Units | Instrument Type ([*](/docs/general/metrics.md#instrument-types)) | Value Type | Attribute Key(s) | Attribute Values | -| ----------- | ---------------------------------------------------------------------------------- | ------- | ------------------------------------------------- | ---------- | ----------------------------- | -------------------------- | -| `hw.energy` | Energy consumed by the component, in joules | J | Counter | Int64 | | | -| `hw.errors` | Number of errors encountered by the component | {error} | Counter | Int64 | `hw.error.type` (Recommended) | | -| `hw.power` | Instantaneous power consumed by the component, in Watts (`hw.energy` is preferred) | W | Gauge | Double | | | -| `hw.status` | Operational status: `1` (true) or `0` (false) for each of the possible states | | UpDownCounter | Int | `state` (**Required**) | `ok`, `degraded`, `failed` | - -These common `hw.` metrics must include the below attributes to describe the -monitored component: - -| Attribute Key | Description | Example | Requirement Level | -| ------------- | --------------------- | 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- | -| `hw.type` | Type of the component | `battery`, `cpu`, `disk_controller`, `enclosure`, `fan`, `gpu`, `logical_disk`, `memory`, `network`, `physical_disk`, `power_supply`, `tape_drive`, `temperature`, `voltage` | **Required** | - -> **Warning** -> -> `hw.status` is currently specified as an *UpDownCounter* but would ideally be represented using a [*StateSet* as defined in OpenMetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#stateset). This semantic convention will be updated once *StateSet* is specified in OpenTelemetry. This planned change is not expected to have any consequence on the way users query their timeseries backend to retrieve the values of `hw.status` over time. - ### `hw.host.` - Physical host metrics **Description:** Physical system as opposed to a virtual system or a container. 
diff --git a/model/hardware-common.yaml b/model/hardware-common.yaml new file mode 100644 index 0000000000..6abf99809a --- /dev/null +++ b/model/hardware-common.yaml @@ -0,0 +1,12 @@ +groups: + - id: hardware.attributes.common + type: attribute_group + stability: experimental + brief: 'Common hardware attributes' + attributes: + - ref: hw.id + requirement_level: required + - ref: hw.name + requirement_level: recommended + - ref: hw.parent + requirement_level: recommended diff --git a/model/metrics/hardware/common.yaml b/model/metrics/hardware/common.yaml new file mode 100644 index 0000000000..b975f53e18 --- /dev/null +++ b/model/metrics/hardware/common.yaml @@ -0,0 +1,65 @@ +groups: + # COMMON METRICS + - id: metric.hardware.attributes + type: attribute_group + brief: "Attributes for hardware metrics" + extends: hardware.attributes.common + attributes: + - ref: hw.type + requirement_level: required + + - id: metric.hardware.energy + type: metric + metric_name: hw.energy + stability: experimental + brief: "Energy consumed by the component" + instrument: counter + unit: "J" + extends: metric.hardware.attributes + + - id: metric.hardware.errors + type: metric + metric_name: hw.errors + stability: experimental + brief: "Number of errors encountered by the component" + instrument: counter + unit: "{error}" + extends: metric.hardware.attributes + attributes: + - ref: error.type + brief: "The type of error encountered by the component" + examples: ['uncorrected', 'zero_buffer_credit', 'crc', 'bad_sector'] + requirement_level: + conditionally_required: if and only if an error has occurred + note: > + The `error.type` SHOULD match the error code reported by the component, the canonical name of the error, + or another low-cardinality error identifier. Instrumentations SHOULD document the list of errors they report. 
+
+  - id: metric.hardware.power
+    type: metric
+    metric_name: hw.power
+    stability: experimental
+    brief: "Instantaneous power consumed by the component"
+    note: >
+      It is recommended to report `hw.energy` instead of `hw.power` when possible.
+    instrument: gauge
+    unit: "W"
+    extends: metric.hardware.attributes
+
+  - id: metric.hardware.status
+    type: metric
+    metric_name: hw.status
+    stability: experimental
+    brief: "Operational status: `1` (true) or `0` (false) for each of the possible states"
+    instrument: updowncounter
+    unit: "1"
+    extends: metric.hardware.attributes
+    note: >
+      `hw.status` is currently specified as an *UpDownCounter* but would ideally be represented using a
+      [*StateSet* as defined in OpenMetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#stateset).
+      This semantic convention will be updated once *StateSet* is specified in OpenTelemetry. This planned change
+      is not expected to have any consequence on the way users query their timeseries backend to retrieve the
+      values of `hw.status` over time.
+    attributes:
+      - ref: hw.state
+        requirement_level: required
diff --git a/model/registry/hardware.yaml b/model/registry/hardware.yaml
new file mode 100644
index 0000000000..248484b193
--- /dev/null
+++ b/model/registry/hardware.yaml
@@ -0,0 +1,108 @@
+groups:
+  - id: registry.hardware
+    type: attribute_group
+    brief: >
+      Attributes for hardware.
+    attributes:
+      - id: hw.id
+        type: string
+        stability: experimental
+        brief: >
+          An identifier for the hardware component, unique within the monitored host
+        examples: ['win32battery_battery_testsysa33_1']
+      - id: hw.name
+        type: string
+        stability: experimental
+        brief: >
+          An easily-recognizable name for the hardware component
+        examples: ['eth0']
+      - id: hw.parent
+        type: string
+        stability: experimental
+        brief: >
+          Unique identifier of the parent component (typically the `hw.id` attribute of the enclosure, or disk controller)
+        examples: ['dellStorage_perc_0']
+      - id: hw.type
+        type:
+          members:
+            - id: battery
+              value: 'battery'
+              brief: "Battery"
+              stability: experimental
+            - id: cpu
+              value: 'cpu'
+              brief: 'CPU'
+              stability: experimental
+            - id: disk_controller
+              value: 'disk_controller'
+              brief: 'Disk controller'
+              stability: experimental
+            - id: enclosure
+              value: 'enclosure'
+              brief: 'Enclosure'
+              stability: experimental
+            - id: fan
+              value: 'fan'
+              brief: 'Fan'
+              stability: experimental
+            - id: gpu
+              value: 'gpu'
+              brief: 'GPU'
+              stability: experimental
+            - id: logical_disk
+              value: 'logical_disk'
+              brief: 'Logical disk'
+              stability: experimental
+            - id: memory
+              value: 'memory'
+              brief: 'Memory'
+              stability: experimental
+            - id: network
+              value: 'network'
+              brief: 'Network'
+              stability: experimental
+            - id: physical_disk
+              value: 'physical_disk'
+              brief: 'Physical disk'
+              stability: experimental
+            - id: power_supply
+              value: 'power_supply'
+              brief: 'Power supply'
+              stability: experimental
+            - id: tape_drive
+              value: 'tape_drive'
+              brief: 'Tape drive'
+              stability: experimental
+            - id: temperature
+              value: 'temperature'
+              brief: 'Temperature'
+              stability: experimental
+            - id: voltage
+              value: 'voltage'
+              brief: 'Voltage'
+              stability: experimental
+        stability: experimental
+        brief: >
+          Type of the component
+        note: >
+          Describes the category of the hardware component for which `hw.state` is being reported. For example,
+          `hw.type=temperature` along with `hw.state=degraded` would indicate that the temperature of the hardware
+          component has been reported as `degraded`.
+      - id: hw.state
+        type:
+          members:
+            - id: ok
+              value: 'ok'
+              brief: "Ok"
+              stability: experimental
+            - id: degraded
+              value: 'degraded'
+              brief: 'Degraded'
+              stability: experimental
+            - id: failed
+              value: 'failed'
+              brief: 'Failed'
+              stability: experimental
+        stability: experimental
+        brief: >
+          The current state of the component