[SDK] Add a test covering exception thrown in custom sampler #4072
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4072      +/-   ##
==========================================
- Coverage   85.57%   85.56%   -0.02%
==========================================
  Files         289      289
  Lines       11261    11261
==========================================
- Hits         9637     9635       -2
- Misses       1624     1626       +2
A related question - what is the expected behavior if the sampler throws? Here are some possible options:
I personally would vote for leaving it as undefined behavior, and not handle it in the SDK. This also applies to the following situations:
{
    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        throw new InvalidOperationException("ThrowingSampler");
I think the direction about who is responsible for catching/not throwing the exception should come from the spec, similar to how they have specified it for the Processor OnStart and OnEnd methods here.
If we have a particular preference, we should try to get the spec updated for ShouldSample first.
I think the direction about who is responsible for catching the exception should come from the spec. Similar to how they have specified it for Processor OnStart and OnEnd methods here
The spec doesn't seem to say who should be responsible if an exception is thrown from a Processor's OnStart/OnEnd?
OnEnd is called after a span is ended (i.e., the end timestamp is already set). This method MUST be called synchronously within the Span.End() API, therefore it should not block or throw an exception.
I interpret this statement as "Processor.OnEnd should not throw an exception", which makes it the user's responsibility to provide a spec-compliant processor to the SDK. If the processor logic can throw an exception, it's the user's responsibility to ensure that it is not thrown (they could simply wrap a try-catch around their processor logic).
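To illustrate the "wrap a try-catch around their processor logic" option, here is a minimal sketch of what a plugin author could do. It assumes the OpenTelemetry .NET `BaseProcessor<Activity>` base class; `GuardedProcessor` and its delegate parameter are hypothetical names used only for this example, and writing to stderr stands in for whatever diagnostic channel the author prefers.

```csharp
using System;
using System.Diagnostics;
using OpenTelemetry;

// Hypothetical processor (not part of the SDK): runs user-supplied logic in
// OnEnd and makes sure no exception escapes into Activity.Stop()/Span.End().
public sealed class GuardedProcessor : BaseProcessor<Activity>
{
    private readonly Action<Activity> onEnd;

    public GuardedProcessor(Action<Activity> onEnd)
    {
        this.onEnd = onEnd ?? throw new ArgumentNullException(nameof(onEnd));
    }

    public override void OnEnd(Activity data)
    {
        try
        {
            // Plugin logic that might throw.
            this.onEnd(data);
        }
        catch (Exception ex)
        {
            // Assumption: stderr is an acceptable place to report the failure.
            Console.Error.WriteLine($"GuardedProcessor.OnEnd swallowed: {ex}");
        }
    }
}
```

With this shape the decision to swallow lives in the plugin, so the SDK's behavior stays the same whichever way the spec question is resolved.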
Another related question: why do we not have a try-catch around opentelemetry-dotnet/src/OpenTelemetry/BatchExportProcessor.cs (lines 279 to 296 in f98f8fe)?
We have try-catch around opentelemetry-dotnet/src/OpenTelemetry/SimpleExportProcessor.cs (lines 44 to 54 in f98f8fe) and opentelemetry-dotnet/src/OpenTelemetry/Metrics/BaseExportingMetricReader.cs (lines 90 to 110 in f98f8fe).
See #1640
I was just poking around the spec. The docs for the different extension points have varying degrees of clarity, but the error handling doc seems crystal clear 😄
The spirit of this doc says to me the primary goal is to not interrupt the function of the code being instrumented, a la we should try/catch user code.
What would you consider as "user code"? E.g. if someone is writing a processor/exporter, do we consider them as SDK developers or SDK plugin developers?
Let me amend my statement: The spirit of this doc says to me the primary goal is to not interrupt the function of the code being instrumented, a la we should try/catch. To say that another way, what I feel the spec is trying to say is: once started, the SDK should never throw, regardless of whether it is itself throwing the exception or bubbling up an exception from some plugin it executed.
Putting these together, how would you phrase the implementation principles/guidelines? I think either way is fine as long as we have clear principles, and the worst situation is that we don't have principles + we never thought about it + we just do things randomly without clarity. Take an example:
Here are more examples:
My personal opinion: I agree with @utpilla that it would be best for the spec to prescribe the behavior. Ambiguity in specifications is bad. What should the spec say? IMO, keep it simple: OnStart ... should not block or throw exceptions. However, if OnStart throws exceptions, the SDK should swallow the exception and write the details to whatever diagnostic mechanism is in place for the SDK. My reasoning is that the spec has been ambiguous and our code has been unprotected thus far, and the world hasn't ended, so we shouldn't over-engineer something 😄
I feel this simple answer is starting to scratch the surface of the design, which leads to a couple more questions:
This is just my opinion but for the sampler I would say a throw is a drop. I would not prescribe back-off from the spec or anything more perverse. It could leave the door open for that should an SDK want it. My reasoning is a sampler can implement its own retry/backoff/fallback/whatever if it is doing complex things. The goal is just to fulfil the spec's mission to be transparent and not bring down the process/interfere with useful code the app is trying to execute. |
I don't feel this would solve any problems or make things better, and it is more likely to make things worse. I'll take one example - a very bad sampler which triggers a blocking I/O operation:
I consider making blocking calls, making recursive calls, leaking memory, and not handling exceptions properly to be equally bad, and from a consistency perspective it makes more sense to treat them the same way. Thinking more about the ecosystem, I completely agree with:
I think a simple "solution" is to do nothing (I guess LoggingProvider is doing the same, i.e. not over-protecting). We know we won't be able to protect against everything, and over-protecting might encourage bad behavior (e.g. a plugin developer might decide to throw an exception from deep in the call stack for "simplicity" instead of returning SamplingDecision.Drop).
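If the SDK stays out of the way as suggested, an application that still wants "a throw is a drop" can opt in on its own. The following is a rough sketch of that idea, assuming the OpenTelemetry .NET `Sampler`, `SamplingResult`, `SamplingParameters` and `SamplingDecision` types; `DropOnExceptionSampler` and `AlwaysThrowingSampler` are hypothetical names introduced only for this example.

```csharp
using System;
using OpenTelemetry.Trace;

// Hypothetical stand-in for a misbehaving custom sampler.
public sealed class AlwaysThrowingSampler : Sampler
{
    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        throw new InvalidOperationException("sampler failure");
    }
}

// Hypothetical wrapper (not part of the SDK): delegates to an inner sampler
// and converts any exception into a Drop decision.
public sealed class DropOnExceptionSampler : Sampler
{
    private readonly Sampler inner;

    public DropOnExceptionSampler(Sampler inner)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public override SamplingResult ShouldSample(in SamplingParameters samplingParameters)
    {
        try
        {
            return this.inner.ShouldSample(in samplingParameters);
        }
        catch (Exception)
        {
            // Assumption: dropping the span is the desired fallback.
            return new SamplingResult(SamplingDecision.Drop);
        }
    }
}
```

The wrapper would be registered in place of the inner sampler, for example via `SetSampler(new DropOnExceptionSampler(mySampler))` on the tracer provider builder. It does nothing about blocking or otherwise badly behaved samplers, which is the consistency concern raised above.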
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
Reminds me of this old conversation #1081 (comment). At that point in time, my understanding of the spec was that the SDK should suppress exceptions even from custom components.
I'm not as convinced anymore 😆. From the error handling spec
The MUST NOT in this statement does not imply to me that the SDK must seek to swallow unexpected exceptions raised from all components, custom or otherwise. The second statement suggests that it is the exporter's responsibility to not throw exceptions.
I agree that the exporters hosted in this repo must not throw exceptions. If they did, we would certainly treat this as a bug and fix it. Just as it is our responsibility to ensure our components do not throw exceptions, I think authors of custom exporters, samplers, processors, etc. share this responsibility.
If the intent of the spec is that SDKs must suppress all exceptions that may crash an application, then I think it should clearly state this.
Specifically with regards to samplers, I asked Java. Turns out an exception thrown from a custom sampler in Java will not be swallowed by the SDK. Just another data point... another example within the .NET ecosystem: if you write a custom ILogger logging provider that throws an exception, this will not be caught and may cause an app to crash.
@alanwest good points. |
// TODO: Discuss: An exception thrown in custom sampler probably
// should not blow up like this?
Remove TODO assuming we agree this is the behavior we want.
or leave a comment saying this is by design, and link to this PR/issues, so a future reader can easily find all the prior discussions on this topic.
Updated
It'd be good to (separately) update the docs meant for custom plugin writers to strongly warn them of the implications of throwing from their components, and to be explicit, e.g. "if you throw, it might crash the app" or similar wording.
…ntelemetry-dotnet into sampler-exception
Changes
/cc @alanwest @utpilla @cijothomas