Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add semantic conventions for Elasticsearch client instrumentation #23

Merged
merged 27 commits into from
Jun 29, 2023

Conversation

estolfo
Copy link
Contributor

@estolfo estolfo commented May 15, 2023

This PR originated as a PR in the opentelemetry-specification repo.
The outstanding discussion point from that PR was regarding the path forward for the existing Elasticsearch client instrumentations:

  • .NET: No concerns about updating their instrumentation to adopt these semantic conventions
  • Python: No concerns about updating their instrumentation to adopt these semantic conventions
  • Java: request to understand the complexity of updating the instrumentation by opening a PR with the changes

We are in the process of figuring out the path forward for the Java instrumentation and I'll update here when we have a plan.

Two instrumentations in progress that will use these semantic conventions:

semantic_conventions/trace/database.yaml Outdated Show resolved Hide resolved
semantic_conventions/trace/database.yaml Outdated Show resolved Hide resolved
semantic_conventions/trace/database.yaml Outdated Show resolved Hide resolved
@estolfo estolfo requested a review from a team May 30, 2023 12:19
@estolfo
Copy link
Contributor Author

estolfo commented May 30, 2023

Hi @lmolkova, we've discussed what changes to make based on your comments/questions.

We agree it makes sense to have url.full instead of the url.path and url.query. We think it's still important to have the components extracted out, so we'll have doc_id, target, url.full and the span name containing the path pattern with placeholders that will together provide a complete set of information on the api queried and with what data.
We've added a note that the server address and port are recommended.

We are also assuming from the spec that there will also be an http span.

Lastly, we've changed the db.statement to capture the request body for search-type queries. The list of Search APIs can be found here.

A config option could also be used that allows users to opt-in for capturing bodies of every request.

Copy link
Member

@joaopgrassi joaopgrassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to get some insights on what is the intended span structure such conventions want to achieve. The initial paragraph about an HTTP span already existing is a bit unclear if an additional span for Elasticsearch should always be there, or attributes should be added to the underlying http span.

semantic_conventions/trace/database.yaml Outdated Show resolved Hide resolved
semantic_conventions/trace/database.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@estolfo thank you for addressing my comments! Please also add a changelog entry.

@estolfo
Copy link
Contributor Author

estolfo commented Jun 14, 2023

@lmolkova and @joaopgrassi I've updated the span name and attributes after a discussion with some more people internally.
There are two main changes:

  • span name is the endpoint id of the request (the name field here, ex: search), rather than the http request method and request template (ex GET /{target}/_doc/{id}) . The Elasticsearch clients team said people would more likely want to identify ES requests by the endpoint id and that the request template would be less natural for them to see as a request identifier. We also have the http method anyway in the span attributes so we don't need to repeat it in the span name.
  • the dynamic parts of the url path are set at span attributes like they are for http headers, as db.elasticsearch.path_parts.<key>

What do you think?

Copy link
Member

@joaopgrassi joaopgrassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given you are the subject-matter expert here I think the span name change makes sense (makes more sense to customers). But the path_parts change I'm not sure. I left a common with my concerns there

@joaopgrassi
Copy link
Member

@open-telemetry/specs-semconv-maintainers could you please take a look? Thanks!

@jsuereth
Copy link
Contributor

Generally looks good to me, but have some concerns around the implications of sensitive data handling.

@estolfo estolfo force-pushed the elasticsearch branch 2 times, most recently from 40a5fcb to 0a4cd2f Compare June 28, 2023 10:58
Copy link
Member

@joaopgrassi joaopgrassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending when this merges, it might be affected by #143

@estolfo
Copy link
Contributor Author

estolfo commented Jun 28, 2023

I think @carlosalberto said in the SIG last night that he'd approve it and then we can merge.

Copy link

@Mpdreamz Mpdreamz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants