Feature Request: Restructure Semantic Convention markdown to be topical vs. by signal #137

jsuereth · 2023-06-26T14:06:22Z

Today, the semantic conventions are split by signal (Resource, Trace, Logs, Metrics). Instead we should have topical semantic conventions, e.g. Http, Database, etc.

This bug tracks the idea and work associated with refactoring.

AlexanderWert · 2023-06-27T08:13:12Z

I started working on this.

I came up with the following proposal for the structure:

┬ specification
├─ README.md                      <- entry point to all of the semantic conventions (this could be used to embed into the docs page?)
├─ general                        <- general semantic conventions
│   ├─ README.md
│   ├─ spans-general.md
│   ├─ metrics-general.md
│   ├─ logs-general.md
│   ├─ events-general.md
│   └─ technology-general
│      ├─ README.md
│      ├─ general-aws-sdk.md      <- proposing to split existing aws-sdk.md into general, DynamocDB, s3 specific specs and place them under the corresponding topics (e.g. DynamoDB under database)
│      ├─ graphql.md
│      └─ ...
├─ cloudevents
│   └─ README.md
├─ database
│   ├─ README.md
│   ├─ database-spans.md
│   ├─ database-metrics.md
│   ├─ cassandra.md               <- tech-specific db semconv (including spans, metrics, logs)
│   ├─ dynamodb.md                <- tech-specific db semconv (including spans, metrics, logs)
│   ├─ cosmosdb.md                <- tech-specific db semconv (including spans, metrics, logs)
│   └─ ...                        <- tech-specific db semconv (including spans, metrics, logs)
├─ exceptions
│   ├─ README.md
│   ├─ exceptions-logs.md
│   └─ exceptions-spans.md
├─ faas
│   ├─ README.md
│   ├─ faas-spans.md
│   ├─ faas-metrics.md
│   ├─ aws-lambda.md              <- tech-specific faas semconv (including spans, metrics, logs)
│   └─ ...
├─ feature-flags
│   ├─ README.md
│   ├─ feature-flags-spans.md
│   └─ feature-flags-logs.md
├─ http
│   ├─ README.md
│   ├─ http-spans.md
│   └─ http-metrics.md
├─ messaging
│   ├─ README.md
│   ├─ messaging-spans.md
│   └─ kafka.md                   <- tech-specific messaging semconv (including spans, metrics, logs)
├─ object-stores
│   ├─ README.md
│   └─ aws-s3.md
├─ rpc
│   ├─ README.md
│   ├─ rpc-spans.md
│   └─ rpc-metrics.md
├─ resource
│   ├─ README.md
│   ├─ resource-attributes
│   │  └─ ...                     <- the resource semantic attributes as is today
│   ├─ resource-metrics           <- resource-related metrics
│   │   ├─ README.md
│   │   ├─ hardware-metrics.md
│   │   ├─ process-metrics.md
│   │   ├─ runtime-environment-metrics.md
│   │   └─ system-metrics.md
├─ compatibility                  <- compatibility specs (e.g. OpenTracing, AWS X-Ray, ...)
│   ├─ README.md
│   ├─ open-tracing.md
│   └─ aws.md

@jsuereth, @reyang WDYT?

Will also create a (draft) PR with the proposed structure.

pyohannes · 2023-06-27T09:50:04Z

There has been a similar attempt in the past, see this PR: open-telemetry/opentelemetry-specification#1977

If I remember correctly, one big open question was how to also keep the YAML structure in line with the markdown changes (open-telemetry/opentelemetry-specification#2019).

joaopgrassi · 2023-06-27T10:17:39Z

If I remember correctly, one big open question was how to also keep the YAML structure in line with the markdown changes (open-telemetry/opentelemetry-specification#2019).

In the semconv meeting from yesterday (26.06.2023), we discussed that the idea would be to keep the YAML as they are, and just modify the rendering part for easy of consumption by readers. The YAML as they are are important for code generation apart from generating the markdown. I guess we could if we want later on to make the YAML match the structure of the markdown, but I don't think it's required. @jsuereth did I got it right?

AlexanderWert · 2023-06-27T10:40:15Z

Thoughts on whether we should do the change as one big PR vs. moving things around in multiple smaller PRs?

pyohannes · 2023-06-27T10:41:27Z

Here's another issue, which seems to overlaps with this one: open-telemetry/opentelemetry-specification#2397

It contains this paragraph:

Many of the upcoming conventions for logs will need to be shared with conventions for traces. The current structure complete isolates the signals into separate directories which makes such sharing of conventions difficult, requiring duplication which is hard to maintain.

Is it a goal for this feature request to tackle this issue of duplication? Currently shared attributes need to be duplicated in YAML files, and consequently they are duplicated in the resulting markdown.

AlexanderWert · 2023-06-27T10:49:36Z

Is it a goal for this feature request to tackle this issue of duplication? Currently shared attributes need to be duplicated in YAML files, and consequently they are duplicated in the resulting markdown.

I think some of the yaml files are already created to be signal-agnostic (i.e. the ones in the root directory here: https://github.com/open-telemetry/semantic-conventions/tree/main/semantic_conventions). Those attributes are then referenced in other yaml files (rather than duplicating them). IMHO we need to make this the default (i.e. defining attributes as being signal-agnostic and only referencing them from the specific contexts / signal definitions). Still, I think it can be done as a separate step after restructuring the markdown (so my understanding is, it should not be part of this issue).

Oberon00 · 2023-06-27T11:07:16Z

The tooling supports specifying a (primary) type for conventions independent of directory structure, and @lmolkova recently added improvements for filtering for that in code generators open-telemetry/build-tools#157

joaopgrassi · 2023-06-27T11:45:37Z

Thoughts on whether we should do the change as one big PR vs. moving things around in multiple smaller PRs?

Maybe moving "one signal at a time" is better? No idea if that really helps much

AlexanderWert · 2023-06-27T11:58:28Z

Thoughts on whether we should do the change as one big PR vs. moving things around in multiple smaller PRs?

Maybe moving "one signal at a time" is better? No idea if that really helps much

I was rather thinking of doing it by topic (e.g. starting with http, database, etc.).

With a bigger change I fear it would be difficult to resolve conflicts with other things going on (as it would imply that files are being moved around, deleted, created)

jsuereth · 2023-06-27T12:07:55Z

I guess we could if we want later on to make the YAML match the structure of the markdown, but I don't think it's required. @jsuereth did I got it right?

Yes I'd like to do the migration separately here. Our primary goal right now is the following:

Ensure we have a good structure of semantic conventions for two audiences: Instrumentation Authors + Instrumentation Consumers (users, dashboard/alert creators, etc.)
Update our specification/ folder to make it easier to consume in the website (see: Fixes #2907 - Schema files are served from semantic-conventions. opentelemetry.io#2908)

Maybe moving "one signal at a time" is better? No idea if that really helps much

I'd actually suggest move one category at a time. From the proposal here, I think we should do the following:

Scaffold the new layout
- Create specification/README.md
- Move one category, e.g. http, to the new structure.
Rinse and repeat for different categories.

AlexanderWert · 2023-06-27T12:09:49Z

I'd actually suggest move one category at a time. From the proposal here, I think we should do the following:

Scaffold the new layout

Create specification/README.md

Move one category, e.g. http, to the new structure.

Rinse and repeat for different categories.

Working on that. Will create a PR for that first step today.

jsuereth · 2023-06-27T12:18:58Z

I love the proposal, but some feedback:

Keep supplementary guidelines outside of the specification for now. These are non-normative and we want them clearly indicated as such. We need to figure out how to handle them better going forward.
I half-like, half-dislike including System metrics with Resource attributes. I think it makes sense, but let's keep that on a separate PR so folks can feel that change out. Maybe start pulling Resource attributes into the new location then merge metrics into it and see how folks like that layout.
Regarding "technology general", I think perhaps these need to be split.
Regarding reosurces, I think calling the section "Resource attributes" isn't quite right. There's some odd interactions between Resource "types" and Resource detectors in practice. We essentially need named "types" of resources with a set of attributes that type hollistically creates or not. This is different from metrics. If you wanted to blend them, I still think you'd have one hierarchy, e.g. "process" would have attributes required for process resource and the set of metrics. I'm personally not convinced we want to blend these yet because:
- Unlike metric/log/trace instrumentation, Resource detection is always a completely separate code path from other signals.
- These metrics are not reported against the resource they are discussing. E.g. process metrics are reported against a host, not the process itself. We still struggle with defining a "best practice" for these today.

My counter-proposal:

┬ specification
├─ README.md                      <- entry point to all of the semantic conventions (this could be used to embed into the docs page?)
├─ general                        <- general semantic conventions
│   ├─ README.md
│   ├─ spans-general.md
│   ├─ metrics-general.md
│   ├─ logs-general.md
│   ├─ events-general.md
│   └─ cloud-provider-general
│      ├─ README.md
│      ├─ general-aws-sdk.md      <- proposing to split existing aws-sdk.md into general, DynamocDB, s3 specific specs and place them under the corresponding topics (e.g. DynamoDB under database)
├─ cloudevents
│   └─ README.md
├─ database
│   ├─ README.md
│   ├─ database-spans.md
│   ├─ database-metrics.md
│   ├─ graphql.md                   <- graph specific query semconv
│   ├─ cassandra.md               <- tech-specific db semconv (including spans, metrics, logs)
│   ├─ dynamodb.md                <- tech-specific db semconv (including spans, metrics, logs)
│   ├─ cosmosdb.md                <- tech-specific db semconv (including spans, metrics, logs)
│   └─ ...                        <- tech-specific db semconv (including spans, metrics, logs)
├─ exceptions
│   ├─ README.md
│   ├─ exceptions-logs.md
│   └─ exceptions-spans.md
├─ faas
│   ├─ README.md
│   ├─ faas-spans.md
│   ├─ faas-metrics.md
│   ├─ aws-lambda.md              <- tech-specific faas semconv (including spans, metrics, logs)
│   └─ ...
├─ feature-flags
│   ├─ README.md
│   ├─ feature-flags-spans.md
│   └─ feature-flags-logs.md
├─ http
│   ├─ README.md
│   ├─ http-spans.md
│   └─ http-metrics.md
├─ messaging
│   ├─ README.md
│   ├─ messaging-spans.md
│   └─ kafka.md                   <- tech-specific messaging semconv (including spans, metrics, logs)
├─ object-stores
│   ├─ README.md
│   └─ aws-s3.md
├─ rpc
│   ├─ README.md
│   ├─ rpc-spans.md
│   └─ rpc-metrics.md
├─ resource
│   ├─ README.md
│   └─ ...                     <- the resource semantic attributes as is today
├─ system           <- system-related metrics (see new working group proposal)
│   ├─ README.md
│   ├─ hardware-metrics.md
│   ├─ process-metrics.md
│   ├─ runtime-environment-metrics.md
│   └─ system-metrics.md

reyang · 2023-06-27T15:03:03Z

Regarding "technology-general/cloud-provider-general", why are they under "general"? Shouldn't these have their own folders? (only logs-general/metrics-general/spans-general make sense to me)
How do we determine what's general vs. not?

┬ specification
├─ README.md
├─ general
│   ├─ README.md
│   ├─ logs-general.md
│   ├─ metrics-general.md
│   ├─ spans-general.md
├─ cloud-provider
│   ├─ README.md
│   ├─ aws.md

AlexanderWert · 2023-06-27T15:48:25Z

Regarding "technology-general/cloud-provider-general", why are they under "general"? Shouldn't these have their own folders? (only logs-general/metrics-general/spans-general make sense to me)
How do we determine what's general vs. not?

@reyang

Right now we have one example, that's somehow unclear where to put: aws-sdk:

aws-sdk.md contains general AWS SDK attributes (that apply to DynamoDB, S3, etc.) and then the AWS service-specific attributes (DynamoDB, S3, etc.).

It's sort of clear where to put DynamoDB, S3 attributes, etc. But where to put the general AWS SDK attributes?

So that was the idea behind: general/cloud-provider-general/general-aws-sdk.md

reyang · 2023-06-28T16:18:45Z

Regarding "technology-general/cloud-provider-general", why are they under "general"? Shouldn't these have their own folders? (only logs-general/metrics-general/spans-general make sense to me)
How do we determine what's general vs. not?

@reyang

Right now we have one example, that's somehow unclear where to put: aws-sdk:

aws-sdk.md contains general AWS SDK attributes (that apply to DynamoDB, S3, etc.) and then the AWS service-specific attributes (DynamoDB, S3, etc.).

It's sort of clear where to put DynamoDB, S3 attributes, etc. But where to put the general AWS SDK attributes?

Shove them under "aws"?

So that was the idea behind: general/cloud-provider-general/general-aws-sdk.md

I feel these are "general" for AWS clients and services, I view "general" as:

vendor agnostic
domain agnostic

AlexanderWert · 2023-06-30T07:59:20Z

reyang · 2023-07-03T20:05:15Z

Closing since the restructuring is done. Some new issues were discovered and tracked separately, e.g. #157.

AlexanderWert · 2023-07-04T05:56:00Z

@reyang I created a final "cleanup" PR for this: #158

jsuereth mentioned this issue Jun 26, 2023

Semantic Conventions 1.21.0 Release Tracker #136

Closed

github-actions bot assigned reyang Jun 26, 2023

AlexanderWert mentioned this issue Jun 27, 2023

Moved HTTP semantic conventions to a dedicated topic-based structure #141

Merged

4 tasks

chalin mentioned this issue Jun 27, 2023

Adjust to move Semantic Conventions out of OTel spec open-telemetry/opentelemetry.io#2721

Closed

33 tasks

chalin mentioned this issue Jun 27, 2023

Add SemConv spec under docs/specs (preview) open-telemetry/opentelemetry.io#2920

Merged

This was referenced Jun 28, 2023

Moved general semantic conventions markdown to a topic folder. #143

Merged

Moved database semantic conventions #144

Merged

Moved RPC semantic conventions #145

Merged

reyang closed this as completed Jul 3, 2023

AlexanderWert mentioned this issue Jul 4, 2023

Final restructuring cleanup #158

Merged

AlexanderWert mentioned this issue Jul 6, 2023

Moved resource semconv directly under the resource directory #165

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature Request: Restructure Semantic Convention markdown to be topical vs. by signal #137

Feature Request: Restructure Semantic Convention markdown to be topical vs. by signal #137

jsuereth commented Jun 26, 2023

AlexanderWert commented Jun 27, 2023 •

edited

Loading

pyohannes commented Jun 27, 2023

joaopgrassi commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023

pyohannes commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023 •

edited

Loading

Oberon00 commented Jun 27, 2023

joaopgrassi commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023 •

edited

Loading

jsuereth commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023

jsuereth commented Jun 27, 2023

reyang commented Jun 27, 2023 •

edited

Loading

AlexanderWert commented Jun 27, 2023 •

edited

Loading

reyang commented Jun 28, 2023

AlexanderWert commented Jun 30, 2023 •

edited

Loading

reyang commented Jul 3, 2023

AlexanderWert commented Jul 4, 2023

Feature Request: Restructure Semantic Convention markdown to be topical vs. by signal #137

Feature Request: Restructure Semantic Convention markdown to be topical vs. by signal #137

Comments

jsuereth commented Jun 26, 2023

AlexanderWert commented Jun 27, 2023 • edited Loading

pyohannes commented Jun 27, 2023

joaopgrassi commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023

pyohannes commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023 • edited Loading

Oberon00 commented Jun 27, 2023

joaopgrassi commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023 • edited Loading

jsuereth commented Jun 27, 2023

AlexanderWert commented Jun 27, 2023

jsuereth commented Jun 27, 2023

reyang commented Jun 27, 2023 • edited Loading

AlexanderWert commented Jun 27, 2023 • edited Loading

reyang commented Jun 28, 2023

AlexanderWert commented Jun 30, 2023 • edited Loading

reyang commented Jul 3, 2023

AlexanderWert commented Jul 4, 2023

AlexanderWert commented Jun 27, 2023 •

edited

Loading

AlexanderWert commented Jun 27, 2023 •

edited

Loading

AlexanderWert commented Jun 27, 2023 •

edited

Loading

reyang commented Jun 27, 2023 •

edited

Loading

AlexanderWert commented Jun 27, 2023 •

edited

Loading

AlexanderWert commented Jun 30, 2023 •

edited

Loading