Add semantic conventions for Elasticsearch client instrumentation #23

Merged 27 commits on Jun 29, 2023
Changes from 23 commits
09d901e
Add Elasticsearch semantic conventions and specification
estolfo May 15, 2023
a5d9965
Update to latest http method reference
estolfo May 15, 2023
45bf2f4
Elasticsearch semantic conventions attributes table
estolfo May 15, 2023
093d6a0
Updates based on latest url.* and http.* changes
estolfo May 15, 2023
5b0aec1
Udpates after review
estolfo May 30, 2023
0e63d33
Use READCTED in example
estolfo May 30, 2023
fa5df4a
Minor updates
estolfo May 31, 2023
9aa0bdf
Remove reference to http span
estolfo Jun 2, 2023
e474c4b
Add db.operation to span attributes
estolfo Jun 2, 2023
c5d9d00
add CHANGELOG again
estolfo Jun 2, 2023
b64fddd
Minor wording update
estolfo Jun 2, 2023
82c6371
Rearrange span attributes with required at the top
estolfo Jun 2, 2023
ad96374
Use path_parts for the dynamic url path values
estolfo Jun 13, 2023
5da011a
Update span name of ES spans
estolfo Jun 13, 2023
fdb2063
Better example for path parts
estolfo Jun 13, 2023
311cb1f
Fix linting errors
estolfo Jun 27, 2023
1bd79a1
Update note for db.statement
estolfo Jun 28, 2023
42464f1
Re-generate the table
estolfo Jun 28, 2023
bc3211e
Only define Recommended description
estolfo Jun 28, 2023
02cf2dd
Merge branch 'main' into elasticsearch
jsuereth Jun 28, 2023
ff6f615
Add Elasticsearch example span
estolfo Jun 29, 2023
5b009eb
Add TOC entry
estolfo Jun 29, 2023
424b0f3
Update example query
estolfo Jun 29, 2023
cbdfd90
Move example to elasticsearch.md
estolfo Jun 29, 2023
ddc4ee1
Fix formatting
estolfo Jun 29, 2023
09b155c
Use example without sanitization
estolfo Jun 29, 2023
6d73602
Merge branch 'main' into elasticsearch
arminru Jun 29, 2023
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -100,6 +100,8 @@ Note: This is the first release of Semantic Conventions separate from the Specif
  ([#57](https://github.com/open-telemetry/semantic-conventions/pull/57))
- Add container `image.id`, `command`, `command_line` and `command_args` resource attributes.
  ([#39](https://github.com/open-telemetry/semantic-conventions/pull/39))
- Add Elasticsearch client semantic conventions.
  ([#23](https://github.com/open-telemetry/semantic-conventions/pull/23))

## v1.20.0 (2023-04-07)

26 changes: 26 additions & 0 deletions semantic_conventions/trace/database.yaml
@@ -382,6 +382,32 @@ groups:
          The collection being accessed within the database stated in `db.name`.
        examples: [ 'customers', 'products' ]

  - id: db.elasticsearch
    prefix: db.elasticsearch
    type: span
    extends: db
    brief: >
      Call-level attributes for Elasticsearch
    attributes:
      - ref: http.request.method
        requirement_level: required
      - ref: db.operation
        requirement_level: required
        brief: The endpoint identifier for the request.
        examples: [ 'search', 'ml.close_job', 'cat.aliases' ]
      - ref: url.full
        requirement_level: required
        examples: [ 'https://localhost:9200/index/_search?q=user.id:kimchy' ]
      - ref: db.statement
        requirement_level:
          recommended: >
            Should be collected by default for search-type queries and only if there is sanitization that excludes
            sensitive information.
        brief: The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string.
        examples: [ '"{\"name\":\"TestUser\",\"password\":\"REDACTED\"}"' ]
      - ref: server.address
      - ref: server.port

  - id: db.sql
    prefix: 'db.sql'
    type: span
1 change: 1 addition & 0 deletions specification/trace/semantic_conventions/README.md
@@ -29,6 +29,7 @@ The following library-specific semantic conventions are defined:

* [AWS Lambda](instrumentation/aws-lambda.md): For AWS Lambda spans.
* [AWS SDK](instrumentation/aws-sdk.md): For AWS SDK spans.
* [Elasticsearch](instrumentation/elasticsearch.md): For Elasticsearch spans.
* [GraphQL](instrumentation/graphql.md): For GraphQL spans.

Apart from semantic conventions for traces and [metrics](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.21.0/specification/metrics/semantic_conventions/README.md),
15 changes: 15 additions & 0 deletions specification/trace/semantic_conventions/database.md
@@ -18,6 +18,7 @@
* [Redis](#redis)
* [MongoDB](#mongodb)
* [Microsoft Azure Cosmos DB](#microsoft-azure-cosmos-db)
* [Elasticsearch](#elasticsearch)

<!-- tocstop -->

@@ -364,4 +365,18 @@ Furthermore, `db.name` is not specified as there is no database name in Redis an
| `db.cosmosdb.sub_status_code` | `0` |
| `db.cosmosdb.request_charge` | `7.43` |

### Elasticsearch

| Key | Value |
|:------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------|
| Span name | `"search"` |
| `db.system` | `"elasticsearch"` |
| `server.address` | `"elasticsearch.mydomain.com"` |
| `server.port` | `9200` |
| `http.request.method` | `"GET"` |
| `db.statement` | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` |
| `db.operation` | `"search"` |
| `url.full` | `"https://elasticsearch.mydomain.com:9200/my-index-000001/_search?from=40&size=20"` |
| `db.elasticsearch.path_parts.index` | `"my-index-000001"` |

[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.21.0/specification/document-status.md
@@ -0,0 +1,64 @@
# Semantic conventions for Elasticsearch

**Status**: [Experimental][DocumentStatus]

This document defines semantic conventions to apply when creating a span for requests to Elasticsearch.

## Span Name

The **span name** SHOULD be of the format `<endpoint id>`.

The Elasticsearch endpoint identifier is used instead of the URL path to reduce the cardinality of the span
name, because the path can contain dynamic values. The endpoint id is the `name` field in the
[Elasticsearch schema](https://raw.githubusercontent.com/elastic/elasticsearch-specification/main/output/schema/schema.json).
If the endpoint id is not available, the span name SHOULD be the `http.request.method`.
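
The naming rule above can be sketched as follows. This is an illustrative helper only, not part of the convention: the function name `span_name` and its parameters are hypothetical, and it assumes the instrumentation has already resolved the endpoint id (or found none).

```python
from typing import Optional


def span_name(endpoint_id: Optional[str], http_method: str) -> str:
    """Pick a span name for an Elasticsearch request (hypothetical helper)."""
    # Prefer the low-cardinality endpoint id, e.g. "search" or "ml.close_job".
    if endpoint_id:
        return endpoint_id
    # Fall back to the HTTP request method when no endpoint id is available.
    return http_method
```

For example, `span_name("search", "GET")` yields `"search"`, while `span_name(None, "GET")` falls back to `"GET"`.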

## URL path parts

Many Elasticsearch URL paths allow dynamic values. These SHOULD be recorded in span attributes in the format
`db.elasticsearch.path_parts.<key>`, where `<key>` is the URL path part name. The implementation SHOULD
reference the [Elasticsearch schema](https://raw.githubusercontent.com/elastic/elasticsearch-specification/main/output/schema/schema.json)
in order to map the path part values to their names.

| Attribute | Type | Description | Examples | Requirement Level |
|-------------------------------------|---|---------------------------------------|------------------------------------------------------------------------------------------|---|
| `db.elasticsearch.path_parts.<key>` | string | A dynamic value in the URL path. | `db.elasticsearch.path_parts.index=test-index`; `db.elasticsearch.path_parts.doc_id=123` | Conditionally Required: [1] |

**[1]:** When the URL has dynamic values.
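
One way to derive these attributes is to match the request path against a path template. The sketch below assumes templates written as `/{index}/_doc/{id}`, with one `{name}` placeholder per path part; that template format, the function name `path_parts_attributes`, and the `PREFIX` constant are assumptions made for illustration and are not defined by this convention.

```python
import re

# Attribute prefix taken from the convention text above.
PREFIX = "db.elasticsearch.path_parts."


def path_parts_attributes(template: str, path: str) -> dict:
    """Map dynamic path values to span attributes (hypothetical helper).

    `template` is assumed to look like "/{index}/_doc/{id}".
    """
    # Build a regex with one named group per "{name}" placeholder.
    pattern = ""
    for piece in re.split(r"(\{\w+\})", template):
        if piece.startswith("{") and piece.endswith("}"):
            pattern += "(?P<%s>[^/]+)" % piece[1:-1]
        else:
            pattern += re.escape(piece)
    match = re.fullmatch(pattern, path)
    if match is None:
        # The path does not fit this template, so there is nothing to record.
        return {}
    return {PREFIX + key: value for key, value in match.groupdict().items()}
```

With the template `/{index}/_search` and the path `/my-index-000001/_search`, the helper returns a single attribute, `db.elasticsearch.path_parts.index`, set to `my-index-000001`. A path with no dynamic values produces no attributes, which matches the conditional requirement level.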

## Span attributes

<!-- semconv db.elasticsearch -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`db.operation`](../database.md) | string | The endpoint identifier for the request. [1] | `search`; `ml.close_job`; `cat.aliases` | Required |
| [`db.statement`](../database.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | Recommended: [2] |
| `http.request.method` | string | HTTP request method. [3] | `GET`; `POST`; `HEAD` | Required |
| [`server.address`](../span-general.md) | string | Logical server hostname, matches server FQDN if available, and IP or socket address if FQDN is not known. | `example.com` | See below |
| [`server.port`](../span-general.md) | int | Logical server port number | `80`; `8080`; `443` | Recommended |
| `url.full` | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [4] | `https://localhost:9200/index/_search?q=user.id:kimchy` | Required |

**[1]:** When setting this to an SQL keyword, it is not recommended to attempt any client-side parsing of `db.statement` just to get this property, but it should be set if the operation name is provided by the library being instrumented. If the SQL statement has an ambiguous operation, or performs more than one operation, this value may be omitted.

**[2]:** Should be collected by default for search-type queries and only if there is sanitization that excludes sensitive information.

**[3]:** HTTP request method value SHOULD be "known" to the instrumentation.
By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods)
and the PATCH method defined in [RFC5789](https://www.rfc-editor.org/rfc/rfc5789.html).

If the HTTP request method is not known to instrumentation, it MUST set the `http.request.method` attribute to `_OTHER` and, except if reporting a metric, MUST
set the exact method received in the request line as value of the `http.request.method_original` attribute.

If the HTTP instrumentation could end up converting valid HTTP request methods to `_OTHER`, then it MUST provide a way to override
the list of known HTTP methods. If this override is done via environment variable, then the environment variable MUST be named
OTEL_INSTRUMENTATION_HTTP_KNOWN_METHODS and support a comma-separated list of case-sensitive known HTTP methods
(this list MUST be a full override of the default known method, it is not a list of known methods in addition to the defaults).

HTTP method names are case-sensitive and `http.request.method` attribute value MUST match a known HTTP method name exactly.
Instrumentations for specific web frameworks that consider HTTP methods to be case insensitive, SHOULD populate a canonical equivalent.
Tracing instrumentations that do so, MUST also set `http.request.method_original` to the original value.

**[4]:** For network calls, URL usually has `scheme://host[:port][path][?query][#fragment]` format, where the fragment is not transmitted over HTTP, but if it is known, it should be included nevertheless.
`url.full` MUST NOT contain credentials passed via URL in form of `https://username:password@www.example.com/`. In such case username and password should be redacted and attribute's value should be `https://REDACTED:REDACTED@www.example.com/`.
`url.full` SHOULD capture the absolute URL when it is available (or can be reconstructed) and SHOULD NOT be validated or modified except for sanitizing purposes.
<!-- endsemconv -->
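
Notes [3] and [4] describe two normalization steps: mapping unknown HTTP methods to `_OTHER` and redacting credentials from `url.full`. A minimal sketch of both follows, using only the Python standard library. The function names and the `DEFAULT_KNOWN_METHODS` constant are hypothetical; the attribute names, the `_OTHER` value, the `REDACTED` placeholder, and the environment variable name come from the notes above.

```python
import os
from urllib.parse import urlsplit, urlunsplit

# Methods listed in RFC 9110 plus PATCH from RFC 5789 (see note [3]).
DEFAULT_KNOWN_METHODS = frozenset(
    {"CONNECT", "DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT", "TRACE"}
)


def known_methods(environ=os.environ):
    """Return the known-method set, honoring the full-override variable."""
    raw = environ.get("OTEL_INSTRUMENTATION_HTTP_KNOWN_METHODS")
    if not raw:
        return DEFAULT_KNOWN_METHODS
    # The list replaces the defaults entirely; it is not merged with them.
    return frozenset(m.strip() for m in raw.split(",") if m.strip())


def method_attributes(method, known=DEFAULT_KNOWN_METHODS):
    """Build the method attributes for a span; matching is case-sensitive."""
    if method in known:
        return {"http.request.method": method}
    return {
        "http.request.method": "_OTHER",
        "http.request.method_original": method,
    }


def redact_credentials(url):
    """Replace any userinfo in the URL with REDACTED:REDACTED."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # No credentials present: leave the URL unmodified.
    host = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc="REDACTED:REDACTED@" + host))
```

This sketch redacts both fields even when only a username is present, which is one reading of note [4]; an instrumentation might choose differently. It also omits the canonicalization that note [3] allows for frameworks treating methods as case-insensitive.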