Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Links have to be deserialized all the time: instrumentation perf concern #305

Closed
lmolkova opened this issue Oct 11, 2019 · 9 comments
Closed
Milestone

Comments

@lmolkova
Copy link
Contributor

lmolkova commented Oct 11, 2019

In batching scenarios, when multiple messages are received e.g. from Kafka, the instrumentation library has to extract all the links from all messages, populate array of links and set it on the span before sampling decision is made. Or without even knowing if instrumentation is operational.

This is the cost that the instrumentation library pays for all users in all cases. This could be blocking for some libraries (like messaging services SDKs) to instrument themselves directly without requiring explicit user choice/additional plugins or bridges.

It would make sense to give the instrumentation library a legit and very fast hint that instrumentation is operational. This at least eliminates concern around noop case.

Regarding sampling, I believe what sampling otep means is that if any link is sampled in, it should be treated as ParentSampledFlag = true.
So in this case, deserializing all the links is absolutely required before any span is created and this is inevitable perf hit on each message.

Is my understanding correct?

@bogdandrutu
Copy link
Member

You have a "LazyAddLink" option on the builder, so the extraction will happen only if needed. But I do agree that will at least force one object allocation for each Link.

@lmolkova
Copy link
Contributor Author

lmolkova commented Oct 11, 2019

LazyAddLink indeed solves noop deserialization issue. You could also avoid allocation by providing static lambda function to lazy add.

In an operational case, the funny thing with LazyAddLink that if all links are sampled out, you'll deserialize all of them anyway. And allocations are just part of the problem - access to the properties on the message could be another issues (this sometimes is done in a lazy way too).

@bogdandrutu
Copy link
Member

You could also avoid allocation by providing static lambda function to lazy add.

You are talking too much .net :))

Not sure I understand the last part of the comment, can you clarify exactly what is the problem?

@lmolkova
Copy link
Contributor Author

lmolkova commented Oct 11, 2019

You are talking too much .net :))

sorry :)

Not sure I understand the last part of the comment, can you clarify exactly what is the problem?

So imagine you do LazyAddLink in whatever language. If there is an SDK, it will have to execute all lazy adds, right? It needs to do it to get all links, SDK cannot optimize and stop (might be able to have a hard stop on number of links though). So the deserialization of all links must happen.
There are costs that come from just parsing traceparent/tracestate and creating SpanContext.

There is an additional cost that depends on how library works and just accessing message properties may cost - regardless of what you parse, because the library could be optimized for never-access-properties scenarios and initializes properties on first access.

There is exactly the same concern in ASP.NET (sorry, talking .NET again) which initializes headers lazily, I.e. if you just run perf test without instrumentation and with it (with sampling set to 0) you'll get significant perf hit, because you still have to access headers to parse context before you make sampling decision. This is very synthetic though as normally an app would want to access headers even implicitly and this is amortized cost.

I don't know how to solve this problem and if it is feasible to solve it. But I want to make it explicit that

  • libraries should know if there is no SDK before creating spans and should be able to optimize based on this
  • sampling does not help address certain instrumentation perf concerns

@bogdandrutu
Copy link
Member

We do not execute the "LazyFormatters" unless is necessary. To count the number of links and enforce a limit is not necessary to decode the Link. Am I missing something?

So you can avoid executing all the formatting if the SDK knows that there is no chance in sampling this request.

@lmolkova
Copy link
Contributor Author

yes, and we should have lazy extract too

@carlosalberto
Copy link
Contributor

Hey @lmolkova is this solved? Is there any work left to do?

@lmolkova
Copy link
Contributor Author

lmolkova commented Jan 28, 2020

@carlosalberto this is partially solved with lazy formatters to the extent possible. There is no way to optimize it further for the default sampling and existing filtering approach. I'm closing the issue as there are no remaining work items left.

@carlosalberto
Copy link
Contributor

@lmolkova Thanks for the clarification, truly appreciated.

TuckTuckFloof pushed a commit to TuckTuckFloof/opentelemetry-specification that referenced this issue Oct 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants