Limitation on number of attributes, links, events and behavior #182

lmolkova · 2019-07-13T15:13:46Z

In Java SDK implementation, there is a limit and behavior is to discard in FIFO.

The limitations are not part of the spec and behavior when there are more than max is not defined.

austinlparker · 2019-07-13T19:40:22Z

Seems like there’s a couple of options:

Spec defines a hard max limit, unconfigurable.
Spec defines a default max, user-adjustable.
Spec defines no limit, exporters can select a limit (config of this limit is an exporter-specific concern)

The third option is the most similar to OpenTracing imo. I think the second option might be the best because there’s always going to be some sort of maximum, might as well put it in a well-known place.

discostu105 · 2019-07-16T12:18:01Z

I'd have a slight preference for the third option:

If the limits are user-adjustable via API, aren't we mixing different personas? An application developer (who would set the limit) might not know the exporter and ultimatively the backend that uses the data. The backend might have difficulties with 10 or with 10k attributes. It should be a concern of the exporter/backend only.

austinlparker · 2019-07-16T12:24:30Z

That's a good point.

mtwo · 2019-07-16T23:07:17Z

3 also seems like the most reasonable choice to me, though I'm not extremely opinionated about this topic

bogdandrutu · 2019-09-30T18:42:13Z

3 is not a good option because the intent for this is to not grow memory unbounded, this is not to protect the exporter or the backend, is to help Application owners to control how much memory their instrumentation uses.

I think the best compromise is 2, the limit is adjustable via the SDK not via the API (so an instrumentation dev, as you pointed does not know the backend), but the Application owner knows the backend in almost all the situations.

SergeyKanzhelev · 2019-09-30T20:06:14Z

2 and 3 are almost identical in a sense that if exporter can configure this SDK setting, than it is the same. Multiple exporters will just need to set the max of options and then do additional trimming on their side. There is no way to prevent exporters to configure SDK as long as this setting is available for end user.

jmacd · 2020-01-22T16:38:05Z

Removed this from the 0.3 milestone as it's not a blocker. We can add this in the 0.4 milestone.

iNikem · 2020-07-27T11:28:44Z

Do we also want to impose/allow to configure some limits on the size of attributes values?

Oberon00 · 2020-07-27T11:30:57Z

For attribute and label keys, there is discussion about a maximum length (among other things) on #504.

toumorokoshi · 2020-08-11T04:47:25Z

2 and 3 are almost identical in a sense that if exporter can configure this SDK setting, than it is the same. Multiple exporters will just need to set the max of options and then do additional trimming on their side. There is no way to prevent exporters to configure SDK as long as this setting is available for end user.

I would argue that 2 provides a significant advantage by also enabling limiting of events enforceable by the SDK itself, which ensures that one cannot accidentally flood memory with hundreds of thousands of attributes before the trace reaches the exporter.

For that reason alone I would argue for #2. Even if exporters end up configuring it, it also covers the don't-cause-memory-issues use case.

Other idea: ulimit-style limits.

As another idea too, you could make the maximum count work similar to ulimits in Linux: the value starts at infinite, you can set it to be lower, but can't set it higher. This ensures that the most attributes that will ever be passed to an exporter is a count that all exporters support.

But reading it through, it sounds like it's just overcomplicating things significantly. Most likely someone can configure their SDK appropriately with complete knowledge of their backends.

tigrannajaryan · 2020-09-08T18:12:44Z

I suggest to remove release:required-for-ga label since this is an additive, non-breaking change.

carlosalberto · 2020-09-09T14:32:54Z

+1 on making this release:after-ga .

andrewhsu · 2020-09-09T15:25:21Z

From the issue triage mtg today, i'm changing the label to release:after-ga since it looks like from the comments this can be punted.

arminru · 2020-09-10T12:48:29Z

@tigrannajaryan
It depends on the outcome of this discussion if this is an additive, non-breaking change.
If some users are used to recording 10k events on a span and after updating the SDK suddenly 9k of them are dropped because a "sanity limit" of 1k is introduced to the SDK (to limit memory usage in case of usage errors as @bogdandrutu pointed out, for example), this would be a breaking change in my opinion.
If the spec would clearly state, that at least n number of attributes, links and events (each) are supported to be recorded and exported, this would already give users some kind of guarantee and consumers know what they will have to expect at least.
Having a user-configurable limit to go beyond (or above) the sanity limit imposed by the SDK would be additive and non-breaking.

toumorokoshi · 2020-09-10T13:23:18Z

Would setting some relatively arbitrary limit be acceptable, in that case? I do agree that having a sane limit will save some real production headaches, and loss of telemetry (e.g. span attributes) is more desirable than an app crashing due to OOM.

Based on what I've seen 128 would be a good threshold (i've never seen someone go beyond that), but it's very anecdotal.

arminru · 2020-09-10T13:45:08Z

@toumorokoshi Exactly. I have no idea on what number can be deemed safe for "all" use cases since I fear we know our users not well enough, however. I've seen requests on our Gitter that people intend to use spans for representing long-running actions that span over multiple hours (days?) and I could expect a higher number of events on these. We should probably aim high to not prevent "valid" (who's to decide what's valid anyway) edge cases and only have it as a fallback for error conditions to prevent production crashes. If the limit we decide on turns out to be too low for certain use cases we haven't thought of, we can always go higher or allow overriding by users in a truly non-breaking manner.

@tigrannajaryan @carlosalberto @SergeyKanzhelev since you voted for moving this out of GA - what do you think?

tigrannajaryan · 2020-09-10T16:59:17Z

since you voted for moving this out of GA - what do you think?

@arminru I am certainly not against having this capability and believe it is useful but we need to make some sacrifices to cut the scope and deliver the GA.

If some users are used to recording 10k events on a span and after updating the SDK suddenly 9k of them are dropped because a "sanity limit" of 1k is introduced to the SDK (to limit memory usage in case of usage errors as @bogdandrutu pointed out, for example), this would be a breaking change in my opinion.

Some behavioral changes are likely inevitable when we release new versions. Formally, one can call this a breaking change but it is not the sort of the breaking change I think we should worry the most. If we can avoid hard breaking changes that make it impossible to recompile your app with a new version of OpenTelemetry because function signatures are different I think that will likely be good enough.

If we want to absolutely guarantee no changes in the behavior we can set the default to infinity when we introduce the Span data limiting and trimming capability.

[UPDATE] Just to make it clear: I don't advocate for weak compatibility guarantees, but I think we need to strike the right balance between being careful and being able to postpone functionality to after GA. It is very difficult to be completely certain that we will not need to change the behavior after the GA. If we set the absolute certainty of non-breaking behavioral changes after GA as the criteria for the GA then we will slow down a lot and I don't think we can meet our GA timelines.

toumorokoshi · 2020-09-11T05:28:51Z

I would say if we can't come to a fast consensus, then postponing makes sense.

Although I think this could be summed up into two changes to the spec:

set a common limit across events, links, attributes
add the need for logging if any of the above are added and discarded.

And to @arminru's point,we just choose a number that's well beyond 99.99% of use cases. Don't want to throw numbers out without data, but 1000 seems well above the needed amount.

toumorokoshi · 2020-09-11T05:32:10Z

If that would be acceptable, I'm happy to file a PR.

tigrannajaryan · 2020-09-11T16:22:43Z

If that would be acceptable, I'm happy to file a PR.

Sounds good, please do it.

Fixes open-telemetry#182 To help safeguard against a common error, adding default limits to the number of events per span in the SDK.

* SDK Trace: Adding span collection limits Fixes #182 To help safeguard against a common error, adding default limits to the number of events per span in the SDK. * Adding to the changelog * Addressing feedback Adding optional additional configurations for span limits. Modifying the unmlimited value to -1 to enable restricting collections to 0 elements. * Addressing feedback - modifying logging messaging to SHOULD, to imply configurability - clarifying collections types rather than using "each" * Update specification/trace/sdk.md Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com> * Clarifying limits are in elements, not bytes Ensuring there is no ambiguation between limits being in bytes, or numbers of elements in a collection. * Removing customizable span collection limit Addressing feedback arguing for a non-configurable limit, as is the case with other limits present in OpenTelemetry. * fixing lint * Update specification/trace/sdk.md Co-authored-by: Nikita Salnikov-Tarnovski <gnikem@gmail.com> * Update specification/trace/sdk.md Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com> * adding collection limits for attributes, links Adding collection limits for spans in the compliance matrix for attributes and links as well as events. * Making collection limits optional Addressing feedback to make collection limits optional for now. The motivation is to allow innovation around safe SDK behavior, while providing a guideline and awareness for the issue. Reducing the logging on discarded attributes, events, or links to once per process to further reduce spam. * Updating log recommendations. Giving the SDK implementors more freedom in how they choose to log, while setting guidelines on the frequency. * adding back limits to spec-compliance-matrix To help make consumers aware of which SDKS implement span collection limits. * Apply suggestions from code review Co-authored-by: Nikita Salnikov-Tarnovski <gnikem@gmail.com> Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com> Co-authored-by: Nikita Salnikov-Tarnovski <gnikem@gmail.com> Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>

* Introduce remote-parent OTEP

* SDK Trace: Adding span collection limits Fixes open-telemetry#182 To help safeguard against a common error, adding default limits to the number of events per span in the SDK. * Adding to the changelog * Addressing feedback Adding optional additional configurations for span limits. Modifying the unmlimited value to -1 to enable restricting collections to 0 elements. * Addressing feedback - modifying logging messaging to SHOULD, to imply configurability - clarifying collections types rather than using "each" * Update specification/trace/sdk.md Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com> * Clarifying limits are in elements, not bytes Ensuring there is no ambiguation between limits being in bytes, or numbers of elements in a collection. * Removing customizable span collection limit Addressing feedback arguing for a non-configurable limit, as is the case with other limits present in OpenTelemetry. * fixing lint * Update specification/trace/sdk.md Co-authored-by: Nikita Salnikov-Tarnovski <gnikem@gmail.com> * Update specification/trace/sdk.md Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com> * adding collection limits for attributes, links Adding collection limits for spans in the compliance matrix for attributes and links as well as events. * Making collection limits optional Addressing feedback to make collection limits optional for now. The motivation is to allow innovation around safe SDK behavior, while providing a guideline and awareness for the issue. Reducing the logging on discarded attributes, events, or links to once per process to further reduce spam. * Updating log recommendations. Giving the SDK implementors more freedom in how they choose to log, while setting guidelines on the frequency. * adding back limits to spec-compliance-matrix To help make consumers aware of which SDKS implement span collection limits. * Apply suggestions from code review Co-authored-by: Nikita Salnikov-Tarnovski <gnikem@gmail.com> Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com> Co-authored-by: Nikita Salnikov-Tarnovski <gnikem@gmail.com> Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>

* Introduce remote-parent OTEP

bogdandrutu added the area:sdk Related to the SDK label Jul 16, 2019

c24t mentioned this issue Jul 18, 2019

Span and SpanContext implementation open-telemetry/opentelemetry-python#58

Merged

iredelmeier added the needs discussion Need more information before all suitable labels can be applied label Jul 30, 2019

bogdandrutu added this to the Alpha v0.3 milestone Sep 30, 2019

jmacd removed this from the Alpha v0.3 milestone Jan 22, 2020

lmolkova mentioned this issue Jan 23, 2020

Make SpanData tiny again open-telemetry/opentelemetry-dotnet#465

Merged

bogdandrutu added the spec:trace Related to the specification/trace directory label Jun 12, 2020

carlosalberto added the release:required-for-ga Must be resolved before GA release, or nice to have before GA label Jul 2, 2020

andrewhsu added the priority:p2 Medium priority level label Jul 17, 2020

iNikem mentioned this issue Jul 30, 2020

Allow to configure maximum attribute value size open-telemetry/opentelemetry-java#1478

Closed

andrewhsu added release:after-ga Not required before GA release, and not going to work on before GA and removed priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Sep 9, 2020

toumorokoshi mentioned this issue Sep 13, 2020

SDK Trace: Adding span collection limits #942

Merged

2 tasks

SergeyKanzhelev closed this as completed in #942 Oct 14, 2020

TuckTuckFloof pushed a commit to TuckTuckFloof/opentelemetry-specification that referenced this issue Oct 15, 2020

Meter provider sdk (open-telemetry#182)

a95e2a6

Oberon00 mentioned this issue Feb 10, 2021

Update default for span limits to 128, 1000 seems off #1419

Merged

carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 21, 2024

Export SpanContext.IsRemote in OTLP (open-telemetry#182)

b1d602b

* Introduce remote-parent OTEP

carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 23, 2024

Export SpanContext.IsRemote in OTLP (open-telemetry#182)

a003f41

* Introduce remote-parent OTEP

carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 31, 2024

Export SpanContext.IsRemote in OTLP (open-telemetry#182)

acb040c

* Introduce remote-parent OTEP

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Limitation on number of attributes, links, events and behavior #182

Limitation on number of attributes, links, events and behavior #182

lmolkova commented Jul 13, 2019

austinlparker commented Jul 13, 2019

discostu105 commented Jul 16, 2019

austinlparker commented Jul 16, 2019

mtwo commented Jul 16, 2019 •

edited by bogdandrutu

Loading

bogdandrutu commented Sep 30, 2019

SergeyKanzhelev commented Sep 30, 2019

jmacd commented Jan 22, 2020

iNikem commented Jul 27, 2020

Oberon00 commented Jul 27, 2020

toumorokoshi commented Aug 11, 2020

tigrannajaryan commented Sep 8, 2020

carlosalberto commented Sep 9, 2020

andrewhsu commented Sep 9, 2020

arminru commented Sep 10, 2020

toumorokoshi commented Sep 10, 2020

arminru commented Sep 10, 2020

tigrannajaryan commented Sep 10, 2020 •

edited

Loading

toumorokoshi commented Sep 11, 2020

toumorokoshi commented Sep 11, 2020

tigrannajaryan commented Sep 11, 2020

Limitation on number of attributes, links, events and behavior #182

Limitation on number of attributes, links, events and behavior #182

Comments

lmolkova commented Jul 13, 2019

austinlparker commented Jul 13, 2019

discostu105 commented Jul 16, 2019

austinlparker commented Jul 16, 2019

mtwo commented Jul 16, 2019 • edited by bogdandrutu Loading

bogdandrutu commented Sep 30, 2019

SergeyKanzhelev commented Sep 30, 2019

jmacd commented Jan 22, 2020

iNikem commented Jul 27, 2020

Oberon00 commented Jul 27, 2020

toumorokoshi commented Aug 11, 2020

Other idea: ulimit-style limits.

tigrannajaryan commented Sep 8, 2020

carlosalberto commented Sep 9, 2020

andrewhsu commented Sep 9, 2020

arminru commented Sep 10, 2020

toumorokoshi commented Sep 10, 2020

arminru commented Sep 10, 2020

tigrannajaryan commented Sep 10, 2020 • edited Loading

toumorokoshi commented Sep 11, 2020

toumorokoshi commented Sep 11, 2020

tigrannajaryan commented Sep 11, 2020

mtwo commented Jul 16, 2019 •

edited by bogdandrutu

Loading

tigrannajaryan commented Sep 10, 2020 •

edited

Loading