What is intended to be reported as an "error" from the HTTP plugin? #297
Comments
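Right now you are correct that only network errors are considered as errors. I'm actually not sure what the best behavior would be. For example, from an HTTP client standpoint:
- The source of network errors is usually unknown from the client perspective, so it probably should be an error.
- 5xx errors are actually errors upstream, not errors from the client. It may or may not cause the caller to decide to fail. These can also be retried.
- 4xx are definitely errors from the client. Retrying makes no sense since the same invalid request would simply be resent.
I'd say maybe it makes sense to make all of these errors as you've proposed. At the same time, I'd like to keep the behavior consistent across all our tracers so we'll have to discuss this internally first. In any case, there should be a configuration for this, similar to `validateStatus` on HTTP server integrations, which could be added in the meantime while we reconsider the defaults. |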
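As a rough illustration of the three cases weighed in this thread (network errors, 5xx responses, 4xx responses), the distinction could be sketched as below. The function name `classifyOutcome`, the shape of `outcome`, and the returned fields are all hypothetical and are not part of dd-trace-js.

```javascript
// Hypothetical classifier for the outcome of one outgoing HTTP request.
// Assumed input shape: { networkError?: Error, statusCode?: number }.
function classifyOutcome(outcome) {
  if (outcome.networkError) {
    // No response was received, so the client cannot tell where the fault lies.
    return { kind: 'network', source: 'unknown' };
  }
  const code = outcome.statusCode;
  if (code >= 500 && code <= 599) {
    // The upstream service failed; the caller may retry or decide to fail.
    return { kind: '5xx', source: 'upstream', retryMayHelp: true };
  }
  if (code >= 400 && code <= 499) {
    // The request itself was invalid; resending it unchanged would fail again.
    return { kind: '4xx', source: 'client', retryMayHelp: false };
  }
  return { kind: 'ok', source: null };
}
```

Whether each `kind` should mark the span as an error is exactly the policy question being discussed; the sketch only separates the cases.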
My gut feeling is that for most use cases, recording 4XXs and 5XXs as errors would be valuable. Most of the time, an elevated rate of these would indicate a problem with the health of my application, which I want to be able to see at a glance, receive alerts for, etc. That said, there are cases where (at least for certain requests) some 4XX or 5XX errors are an expected part of the ordinary operation of the application and don't represent a problem, so having the flexibility to decide what is and isn't an error would be valuable. (On the other hand, it seems like you would always want to consider timeouts and network errors an "error".) Nevertheless, it feels like recording these as errors would be a reasonable default. Can I suggest that we?:
|
Thinking about it, you might well have logic to retry connection failures
too (if the request is idempotent), and despite having handled the error,
you’d still probably want it recorded as an error. I think the same goes
for 500s.
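A minimal sketch of that idea, using a plain array as a stand-in for recorded spans. Everything here, including `requestWithRetry` and the span shape, is hypothetical and not part of dd-trace-js. Each failed attempt stays flagged as an error even though a later retry succeeds.

```javascript
// Hypothetical retry wrapper: every attempt gets its own "span" record.
// `attemptFn` is assumed to throw on failure and return a value on success.
function requestWithRetry(attemptFn, maxAttempts) {
  const spans = [];
  let lastError = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const span = { attempt, error: false };
    spans.push(span);
    try {
      const result = attemptFn(attempt);
      return { ok: true, result, spans };
    } catch (err) {
      // The caller handles the failure by retrying,
      // but the failed attempt is still recorded as an error.
      span.error = true;
      lastError = err;
    }
  }
  return { ok: false, error: lastError, spans };
}
```

For a request that fails twice and then succeeds, `spans` holds two entries flagged as errors and one that is not, so the failures remain visible even though the caller recovered.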
…On Sun, 30 Sep 2018 at 18:21, Roch Devost ***@***.***> wrote:
Right now you are correct that only network errors are considered as
errors. I'm actually not sure what the best behavior would be.
For example, from an HTTP client standpoint:
- The source of network errors is usually unknown from the client
perspective, so it probably should be an error.
- 5xx errors are actually errors upstream, not errors from the client.
It may or may not cause the caller to decide to fail. These can also be
retried.
- 4xx are definitely errors from the client. Retrying makes no sense
since the same invalid request would simply be resent.
I'd say maybe it makes sense to make all of these errors as you've
proposed. At the same time, I'd like to keep the behavior consistent across
all our tracers so we'll have to discuss this internally first.
In any case, there should be a configuration for this, similar to
validateStatus on HTTP server integrations, which could be added in the
meantime while we reconsider the defaults.
|
I'm happy to write the code for this. How would you want the errors to be recorded? There seem to be three properties: |
In this case, since there is no actual error, I would probably omit the error object and simply set the I wouldn't generate an actual error whenever it's possible not to do it because error generation is pretty heavy in JavaScript. |
I have an implementation for this locally. I’ll push it up tomorrow. |
The [OpenTracing spec][1] states that a span should be tagged as an error "if and only if the application considers the operation represented by the Span to have failed".

The `dd-trace-js` HTTP plugin allows you to automatically instrument outgoing HTTP requests. Currently, it only tags a span as an error when the request emits an `error` event, which only happens in the case of "hard failures" like a timeout or connection problem.

https://github.com/DataDog/dd-trace-js/blob/34c168555f44a961a61ede95a39341d7c527701a/src/plugins/http.js#L59-L67

This means that requests that return responses are never considered errors, even if a 4XX or 5XX response is received. Having spans tagged as errors is useful because they are then flagged as such in the Datadog APM interface, which helps with reporting.

This commit adds support for a new `validateStatus` option for the HTTP plugin. The user supplies a `validateStatus` function which, when a response is received, is passed the response status and is expected to return `false` if the response should be considered an error, or `true` if not. If it returns `false`, the span will be marked as an error.

By default, we will continue not to consider any requests that receive a response to be errors, but this configuration option allows the user to "opt in" if they want to see these spans as errors.

Fixes DataDog#297.

[1]: https://github.com/opentracing/specification/blob/master/semantic_conventions.md
The [OpenTracing spec][1] states that a span should be tagged as an error "if and only if the application considers the operation represented by the Span to have failed".

The `dd-trace-js` HTTP plugin allows you to automatically instrument outgoing HTTP requests. Currently, it only tags a span as an error when the request emits an `error` event, which only happens in the case of "hard failures" like a timeout or connection problem.

https://github.com/DataDog/dd-trace-js/blob/34c168555f44a961a61ede95a39341d7c527701a/src/plugins/http.js#L59-L67

This means that requests that return responses are never considered errors, even if a 4XX or 5XX response is received. Having spans tagged as errors is useful because they are then flagged as such in the Datadog APM interface, which helps with reporting.

This commit adds support for a new `validateStatus` option for the HTTP plugin. The user supplies a `validateStatus` function which, when a response is received, is passed the response status and is expected to return `false` if the response should be considered an error, or `true` if not. If it returns `false`, the span will be marked as an error.

By default, we will consider requests that return `4XX` responses to be errors, since a `4XX` indicates a failure which is within the user's control.

Fixes DataDog#297.

[1]: https://github.com/opentracing/specification/blob/master/semantic_conventions.md
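The `validateStatus` contract described in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the plugin's actual code: `onResponse` and the plain-object span are hypothetical stand-ins, and the default predicate is inferred from the statement that `4XX` responses are treated as errors by default.

```javascript
// Assumed default, per the commit message: only 4XX responses count as errors.
const defaultValidateStatus = (code) => code < 400 || code >= 500;

// A stricter predicate a user might supply to flag 5XX responses as well.
const strictValidateStatus = (code) => code < 400;

// Hypothetical stand-in for the plugin's response handling.
// `span` is a plain object here rather than a real tracer span.
function onResponse(span, statusCode, validateStatus = defaultValidateStatus) {
  span.statusCode = statusCode;
  if (!validateStatus(statusCode)) {
    // Returning false means "treat this response as an error".
    // Only a flag is set; no Error object is constructed.
    span.error = true;
  }
  return span;
}

// With the real library the predicate would presumably be passed as plugin
// configuration, e.g. tracer.use('http', { validateStatus: strictValidateStatus }).
// That call shape is an assumption and is not stated in this thread.
```

Because the predicate only receives the status code, it applies to every instrumented request alike; it cannot by itself distinguish one upstream API from another.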
Hey @timrogers, thanks for submitting this change. We were racking our brains trying to figure out why our 4xx errors weren't being reported in APM as errors. I have pointed our npm package to the latest in master as your branch was merged, but not yet published. I was curious if you could provide an example of how you implement this? Does it require any changes or does it work by default now that 4xx errors are treated as errors, since by definition they are errors? Thanks! |
It should just work. I haven’t updated yet 🙈 What’s missing here is the
ability to customise what is considered an error based on how the specific
API you’re calling works, but if you’re not worried about that and can
configure globally, this should work for you.
…On Wed, 24 Oct 2018 at 08:55, Johnny ***@***.***> wrote:
Hey @timrogers <https://github.com/timrogers>, thanks for submitting this
change. We were racking our brains trying to figure out why our 4xx errors
weren't being reported in APM as errors.
I have pointed our npm package to the latest in master as your branch was
merged, but not yet published.
I was curious if you could provide an example of how you implement this?
Does it require any changes or does it work by default now that 4xx errors
are treated as errors, since by definition they are errors?
Thanks!
|
@johnnydimas Please note that the feature was merged in the |
Thanks for the responses! |
@rochdev Is there a timeline on when this will be available in a release? |
@johnnydimas I don't have a specific timeline but it should be fairly soon. In the meantime, you can use one of the intermediary beta releases, such as |
I’m going to close this as my change was merged, if not released. In an ideal world, you’d be able to customise on a per-request basis but it’s hard to imagine how you’d do that in a nice automatic way! |
What kinds of errors are we trying to report as "errors" from the HTTP(S) plugin?
dd-trace-js/src/plugins/http.js
Lines 59 to 67 in 34c1685
I've observed that 4XX and 5XX HTTP responses don't seem to lead to errors being recorded in Datadog APM. Perhaps this is the expected behaviour, but it strikes me as a little counterintuitive and worth documenting if it is how it is meant to work.
I realise that this approach may align with the JavaScript ecosystem more generally (for example, `fetch` doesn't natively treat HTTP error responses as errors).
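To make the `fetch` comparison concrete: `fetch` resolves its promise for 4XX and 5XX responses and only rejects on network-level failures, so callers who want HTTP errors to surface have to check `response.ok` (or the status) themselves. A minimal sketch, where `assertOk` is a hypothetical helper name and the response is any object exposing `ok` and `status`:

```javascript
// Hypothetical helper: convert an HTTP error response into a thrown Error.
// Works with a fetch Response or any object that has `ok` and `status`.
function assertOk(response) {
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}`);
  }
  return response;
}

// Typical use with fetch (not executed here):
//   fetch(url).then(assertOk).then((res) => res.json());
```

Without a check like this, a 404 or 500 flows through the success path, which mirrors the tracer behaviour the issue describes.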