ECS and OpenTelemetry Semantic Conventions are merging, which is great.
So far, we've mostly been working on adding ECS fields to SemConv, and are discussing how to resolve discrepancies between fields with different names but similar semantics.
As also noted in the contribution guidelines, we expect more and more of the schema work to be done in OpenTelemetry Semantic Conventions. However, we haven't yet defined a mechanism for deciding how SemConv fields should be mapped in Elasticsearch. For example, we need to decide whether a SemConv string field should be mapped as `keyword`, as `*text`, or as a combination of both (via multi-fields). This is inherently Elasticsearch-specific, so not really appropriate for upstream SemConv.
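To make the three choices concrete, here is a minimal sketch of what each candidate mapping could look like. The field name `example.attribute`, the `ignore_above` value, and the choice of `match_only_text` as the `*text` variant are assumptions for illustration, not a proposal for any specific SemConv attribute.

```python
import json

# Candidate Elasticsearch mappings for a single SemConv string attribute.
# The concrete parameters (ignore_above, match_only_text) are illustrative assumptions.
KEYWORD_ONLY = {"type": "keyword", "ignore_above": 1024}
TEXT_ONLY = {"type": "match_only_text"}
KEYWORD_WITH_TEXT = {
    "type": "keyword",
    "ignore_above": 1024,
    # Multi-field: the same value is also indexed for full-text search.
    "fields": {"text": {"type": "match_only_text"}},
}


def properties_for(field_name: str, mapping: dict) -> dict:
    """Wrap a single field mapping in the `properties` object of an index mapping."""
    return {"properties": {field_name: mapping}}


if __name__ == "__main__":
    # "example.attribute" is a hypothetical field name.
    print(json.dumps(properties_for("example.attribute", KEYWORD_WITH_TEXT), indent=2))
```

The multi-field variant costs extra index space, which is one reason the decision probably can't be the same for every string attribute.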
I think this is what ECS should evolve to: ECS can provide us with a workflow, and with the tooling, to decide how a field that's added to SemConv should be mapped in Elasticsearch.
We should think about how we can streamline that process as much as possible, and build automation to propose a mapping given the field type and name, as defined by SemConv. This could alleviate some of the manual burden to decide on the most appropriate field type in ES and ensure that we can also deal well with data that's not (yet) part of ECS or SemConv. Some options that come to mind for this:
- [...] `ecs@mappings` is doing. See also "Making all *.name fields be multi-field" (#2118).
- OTLP doesn't define complex types like IP or geo location. This makes it more difficult to choose the most appropriate ES field type. While naming conventions like `*_ip` and `*.ip` can help, there's a risk of both false positives and false negatives. We could discuss adding these types to OTel SemConv, or adding some kind of type hint to the string type.
- For metrics, rely on OTLP metadata to map metrics dynamically, without having to manually create index templates that define `time_series_metric` and `time_series_dimension`, and the type (`histogram`, `long`, `float`, `aggregate_metric_double`, ...).
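The naming-convention and OTLP-metadata options above could be sketched roughly as follows. Both function names, the `kind` strings, and the specific type choices (for example `double` for floating-point data points, or `aggregate_metric_double` for OTLP summaries) are assumptions made for illustration; the sketch also ignores aggregation temporality, which a real implementation would need to consider.

```python
import ipaddress
from typing import Optional

KEYWORD = {"type": "keyword", "ignore_above": 1024}


def propose_attribute_mapping(name: str, sample: Optional[str] = None) -> dict:
    """Propose a mapping for a string attribute from its name (hypothetical helper).

    A sample value, if available, guards against false positives of the
    `*_ip` / `*.ip` naming heuristic. False negatives (an IP stored under an
    unrelated name) are not detected.
    """
    last_segment = name.rsplit(".", 1)[-1]
    name_suggests_ip = last_segment == "ip" or last_segment.endswith("_ip")
    if not name_suggests_ip:
        return dict(KEYWORD)
    if sample is not None:
        try:
            ipaddress.ip_address(sample)
        except ValueError:
            return dict(KEYWORD)  # the name matched, but the value is not an IP
    return {"type": "ip"}


def propose_metric_mapping(kind: str, is_monotonic: bool = False, is_int: bool = False) -> dict:
    """Propose a mapping for a metric from OTLP metadata (hypothetical helper).

    `kind` is the OTLP data type in lower case: gauge, sum, histogram, summary.
    """
    number_type = "long" if is_int else "double"
    if kind == "gauge":
        return {"type": number_type, "time_series_metric": "gauge"}
    if kind == "sum":
        # Assumption: only monotonic sums are treated as counters.
        metric = "counter" if is_monotonic else "gauge"
        return {"type": number_type, "time_series_metric": metric}
    if kind == "histogram":
        return {"type": "histogram"}
    if kind == "summary":
        return {
            "type": "aggregate_metric_double",
            "metrics": ["sum", "value_count"],
            "default_metric": "value_count",
        }
    raise ValueError(f"unsupported OTLP metric kind: {kind}")


def propose_dimension_mapping() -> dict:
    """Map a metric attribute as a time series dimension."""
    return {"type": "keyword", "time_series_dimension": True}
```

For instance, `propose_attribute_mapping("gateway_ip", sample="not-an-ip")` falls back to `keyword`, while an attribute such as `client.address` that happens to hold an IP is still mapped as `keyword`, which illustrates why explicit type hints in SemConv would be more reliable than names alone.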
To differentiate between "actual" ECS fields and those coming from SemConv, we could introduce a `semconv` level, next to `core` and `extended`.
This somewhat overlaps with how we should map OTel attributes that are unknown (generic/custom/ad-hoc schema) or well-known but not defined by SemConv. For example, receivers from collector-components or shared processing templates that extract fields from plain-text logs.
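One way such unknown attributes could be handled on the Elasticsearch side is with dynamic templates that act as a fallback when no explicit mapping exists. The following is only a sketch: the template names and the rule set are assumptions, and it is not meant to describe what any existing component template contains.

```python
import json

# Fallback rules for attributes without an explicit mapping.
# Elasticsearch applies the first matching template, so more specific rules come first.
FALLBACK_DYNAMIC_TEMPLATES = [
    # Full dotted path ends in ".ip" (template names are hypothetical).
    {"ip_by_path": {
        "match_mapping_type": "string",
        "path_match": "*.ip",
        "mapping": {"type": "ip"},
    }},
    # Leaf field name ends in "_ip".
    {"ip_by_name": {
        "match_mapping_type": "string",
        "match": "*_ip",
        "mapping": {"type": "ip"},
    }},
    # Everything else that arrives as a string.
    {"strings_as_keyword": {
        "match_mapping_type": "string",
        "mapping": {"type": "keyword", "ignore_above": 1024},
    }},
]

FALLBACK_MAPPINGS = {"mappings": {"dynamic_templates": FALLBACK_DYNAMIC_TEMPLATES}}

if __name__ == "__main__":
    print(json.dumps(FALLBACK_MAPPINGS, indent=2))
```

Because the name-based IP rules can misfire on custom data, a production setup would probably also want to tolerate values that fail to parse as the guessed type.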
There are other related questions about the long-term future of ECS, like whether the name ECS still makes sense if its purpose is to define ES mappings for SemConv, or what other purposes ECS should serve, such as defining aliasing/conversion between ECS and SemConv, or providing Elastic-specific fields on top of SemConv. But these questions deserve their own, separate discussion.