Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define instrumentation configuration API #4128

Merged
merged 8 commits into from
Aug 15, 2024

Conversation

jack-berg
Copy link
Member

@jack-berg jack-berg commented Jul 3, 2024

Resolves #3535.

This introduces an API component to file configuration, which has been limited to SDK (i.e. end user facing) up until this point.

The configuration model recently added the first surface area related to instrumentation configuration properties in open-telemetry/opentelemetry-configuration#91.

The API proposed in this PR is collectively called the "Instrumentation config API", and provides a mechanism for instrumentation libraries to participate in file configuration and read relevant properties during initialization. The intent is for both OpenTelemetry-authored and native instrumentation alike to be able to be configured by users in a standard way. New API surface area is necessary to accomplish this to avoid instrumentation libraries from needing to take a dependency on SDK artifacts.

The following summarizes the additions:

  • Introduce ConfigProvider, the instrumentation config API analog of TracerProvider, MeterProvider, LoggerProvider. This is the entry point to the API.
  • Define "Get instrumentation config" operation for ConfigProvider. This returns something called ConfigProperties, which is a programmatic representation of a YAML mapping node. The ConfigProperties returned by "Get instrumentation config" represents the .instrumentation node defined in a config file.
  • Rebrand "file configuration" to "declarative configuration". This expresses the intent without coupling to the file representation, which although will be the most popular way to consume these features is just one possible way to represent the configuration model and use these tools.
  • Break out dedicated api.md, data-model.md, and sdk.md files for respective API, data model, and SDK portions of declarative configuration. This aligns with other portions of the spec. The separation should improve clarity regarding what should and should not be exposed in the API.

I've prototyped this new API in opentelemetry-java here: open-telemetry/opentelemetry-java#6549

cc @open-telemetry/configuration-maintainers, @open-telemetry/specs-semconv-maintainers

@jack-berg jack-berg requested review from a team July 3, 2024 18:59
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the text needs to be sanitized to avoid references to "file", especially in the API sections. API has nothing to do with files, it is only concerned with the configuration data model. Whether that configuration comes from a file or from a REST endpoint is irrelevant to the API.

Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some specific comments on configuration API.

I believe instrumentation API does not belong in file-configuration.md and should be moved to api-configuration.md and should be written without any assumption on where configuration properties come from.

specification/configuration/file-configuration.md Outdated Show resolved Hide resolved
specification/configuration/file-configuration.md Outdated Show resolved Hide resolved
specification/configuration/file-configuration.md Outdated Show resolved Hide resolved
@jack-berg
Copy link
Member Author

I think the text needs to be sanitized to avoid references to "file", especially in the API sections. API has nothing to do with files, it is only concerned with the configuration data model. Whether that configuration comes from a file or from a REST endpoint is irrelevant to the API.

What do folks think about breaking up file-configuration.md into 3 files to mirror the structure of the other signals?

  • ./configuration/data-model.md - talk about the configuration data model and the file representation of it
  • ./configuration/api.md - talk about the new instrumentation config API introduced in this PR
  • ./configuration/sdk.md - talk about the implementation of the API, and SDK tooling including parse, create operations, ComponentProvider

Do we think the spec would benefit from this decoupling? Or does the benefit of having all the information co-located in a single page outweigh the benefit of breaking out separate pages?

@lmolkova
Copy link
Contributor

What do folks think about breaking up file-configuration.md into 3 files to mirror the structure of the other signals?

I support breaking things down with a caveat:

  • ./configuration/data-model.md - talk about the configuration data model and the file representation of it

talks about property structure/names without specifying where properties came from (file or not) in the same way as Spring configuration properties are not tight to any specific source.

Having all things (data model, api, sdk) in one file could also be an option if file-based configuration details are defined in a different file.

@jack-berg
Copy link
Member Author

I've pushed a few commits which restructure the config docs. See updated PR description for details.

specification/configuration/README.md Outdated Show resolved Hide resolved
specification/configuration/api.md Outdated Show resolved Hide resolved
Copy link

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature is near and dear to the LLM work to me, as there are some configuration some vendors would like to be able to set, knowing there is a moratorium on OTEL_ values. For example, disabling of prompt/completion (think request/response) logging

I'm not an approver, but I'm going to play as one. I haven't reviewed fully the implementation, but I agree with the direction and consider my comments in the nit/as you wish category

spec-compliance-matrix.md Show resolved Hide resolved
specification/configuration/README.md Show resolved Hide resolved
specification/configuration/README.md Show resolved Hide resolved
specification/configuration/api.md Outdated Show resolved Hide resolved
specification/configuration/api.md Show resolved Hide resolved
specification/configuration/sdk.md Outdated Show resolved Hide resolved
specification/configuration/sdk.md Outdated Show resolved Hide resolved
specification/configuration/sdk.md Outdated Show resolved Hide resolved
specification/configuration/sdk.md Outdated Show resolved Hide resolved
specification/configuration/sdk.md Outdated Show resolved Hide resolved
@codefromthecrypt
Copy link

After reviewing the impl something came to mind. We have our semantics, makes sense. Do we consider things that aren't generic enough to have their own semantics but ought to be common for all langs? e.g. openai specific config which may be an extension of genai/llm? or postgres config which ought to be the same regardless of java or whatever?

Just clarifying end state scope interest.

@jack-berg
Copy link
Member Author

I'm not an approver, but I'm going to play as one.

This is highly encouraged and very helpful 🙂

A handful of the comments were on sections of code which were relocated rather than introduced in this PR, but I addressed them anyhow because they were editorial.

We have our semantics, makes sense. Do we consider things that aren't generic enough to have their own semantics but ought to be common for all langs? e.g. openai specific config which may be an extension of genai/llm? or postgres config which ought to be the same regardless of java or whatever?

I think so. Another example of this is configuration for opentelemetry-sqlcommenter - #3560 is tracking adding standardizing configuration for the various language implementations. Over in opentelemetry-configuration we define the scope of the schema here. We may need to rephrase to capture these types of scenarios. Or perhaps semantic-conventions can hold definitions can define these more specific instrumentation configuration concerns.

Copy link

github-actions bot commented Aug 6, 2024

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 6, 2024
Copy link
Member

@zeitlinger zeitlinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work 😄

specification/configuration/README.md Show resolved Hide resolved
specification/configuration/api.md Outdated Show resolved Hide resolved
specification/configuration/api.md Show resolved Hide resolved
specification/configuration/data-model.md Show resolved Hide resolved
specification/configuration/data-model.md Show resolved Hide resolved
specification/configuration/data-model.md Show resolved Hide resolved
specification/configuration/sdk.md Show resolved Hide resolved
specification/configuration/sdk.md Show resolved Hide resolved
specification/configuration/sdk.md Show resolved Hide resolved
@zeitlinger
Copy link
Member

I wonder about the autoconfig bridge from the java sdk to the spring starter and how it relates to this spec.

Is such bridge function a may or should - or maybe something else entirely?

@jack-berg
Copy link
Member Author

I wonder about the autoconfig bridge from the java sdk to the spring starter and how it relates to this spec.
Is such bridge function a may or should - or maybe something else entirely?

My gut feeling is that this is a java specific concern, but I think we'll need to do some prototyping and see if anything emerges that seems like it will recur across languages.

@zeitlinger
Copy link
Member

I wonder about the autoconfig bridge from the java sdk to the spring starter and how it relates to this spec.
Is such bridge function a may or should - or maybe something else entirely?

My gut feeling is that this is a java specific concern, but I think we'll need to do some prototyping and see if anything emerges that seems like it will recur across languages.

@brunobat is quarkus planning to use the config bridge?

@jack-berg
Copy link
Member Author

This PR has been open for over a month now and has enough approvals to merge. Will merge tomorrow unless someone communicates an intent to provide additional review / feedback.

Thanks.

@jack-berg jack-berg merged commit 9e1aeb7 into open-telemetry:main Aug 15, 2024
6 checks passed
@brunobat
Copy link
Contributor

I wonder about the autoconfig bridge from the java sdk to the spring starter and how it relates to this spec.
Is such bridge function a may or should - or maybe something else entirely?

My gut feeling is that this is a java specific concern, but I think we'll need to do some prototyping and see if anything emerges that seems like it will recur across languages.

@brunobat is quarkus planning to use the config bridge?

No, Quarkus has no plans to use the yaml based config

undefined_key: ${UNDEFINED_KEY} # Invalid reference, UNDEFINED_KEY is not defined and is replaced with ""
${STRING_VALUE}: value # Invalid reference, substitution is not valid in mapping keys and reference is ignored
recursive_key: ${REPLACE_ME} # Valid reference to REPLACE_ME
# invalid_identifier_key: ${STRING_VALUE:?error} # If uncommented, this is an invalid identifier, it would fail to parse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jack-berg why isn't :?error syntax supported? It's standard shell syntax, just like :-default. Was there a discussion?

Bigger point - if we're using some convention, such as shell var syntax, it's very surprising when that convention is only partially supported. Context: https://github.com/open-telemetry/opentelemetry-collector/pull/10907/files#r1722093779

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's open an issue to track. This syntax wasn't introduced in this PR, it was just moved around. At the time when env var substitution syntax was needed, we were solving a very targeted problem and used shell syntax prior art to avoid reinventing something. You make a good point that partially supporting shell syntax would be surprising to some users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -1,56 +0,0 @@
<!--- Hugo front matter used to generate the website version of this page:
linkTitle: SDK
aliases: [/docs/reference/specification/sdk-configuration]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Hugo front matter should have been moved to the top of the new sdk.md file, otherwise we lose the alias, which can result in 404s.

/cc @open-telemetry/docs-maintainers

carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
Resolves open-telemetry#3535.

This introduces an API component to file configuration, which has been
limited to SDK (i.e. end user facing) up until this point.

The configuration model recently added the first surface area related to
instrumentation configuration properties in
open-telemetry/opentelemetry-configuration#91.

The API proposed in this PR is collectively called the "Instrumentation
config API", and provides a mechanism for instrumentation libraries to
participate in file configuration and read relevant properties during
initialization. The intent is for both OpenTelemetry-authored and native
instrumentation alike to be able to be configured by users in a standard
way. New API surface area is necessary to accomplish this to avoid
instrumentation libraries from needing to take a dependency on SDK
artifacts.

The following summarizes the additions:

- Introduce ConfigProvider, the instrumentation config API analog of
TracerProvider, MeterProvider, LoggerProvider. This is the entry point
to the API.
- Define "Get instrumentation config" operation for ConfigProvider. This
returns something called ConfigProperties, which is a programmatic
representation of a YAML mapping node. The ConfigProperties returned by
"Get instrumentation config" represents the
[`.instrumentation`](https://github.com/open-telemetry/opentelemetry-configuration/blob/670901762dd5cce1eecee423b8660e69f71ef4be/examples/kitchen-sink.yaml#L438-L439)
node defined in a config file.
- Rebrand "file configuration" to "declarative configuration". This
expresses the intent without coupling to the file representation, which
although will be the most popular way to consume these features is just
one possible way to represent the configuration model and use these
tools.
- Break out dedicated `api.md`, `data-model.md`, and `sdk.md` files for
respective API, data model, and SDK portions of declarative
configuration. This aligns with other portions of the spec. The
separation should improve clarity regarding what should and should not
be exposed in the API.

I've prototyped this new API in `opentelemetry-java` here:
open-telemetry/opentelemetry-java#6549

cc @open-telemetry/configuration-maintainers,
@open-telemetry/specs-semconv-maintainers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Specify configuration mechanism for native instrumentations
10 participants