-
Notifications
You must be signed in to change notification settings - Fork 309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Separate OT libraries #251
Comments
I had similar thoughts on how to split this. For
The idea is that each component can be used on its own, and the first 2 are not related to OpenTracing at all. For example they could be used in OpenCensus as well or for any context propagation use case in general. For the various integrations that you mentioned (i.e. mysql-opentracing) I think this is a great idea. However we would have to be very careful how this is implemented, as each APM vendor has their own sets of tags that need to be included. For example in our case we have a specific requirement for |
One thing to check out is Zone.js which has a lot of those and supports a variety of environments. It does not, however, support native syntax async/await. (But AFAIK there is no way to support async/await in some runtimes, like browsers.) There's even a TC39 proposal for it, but it's stalled. https://github.com/domenic/zones From process.addEventListener, AsyncWrap, async_hooks, Zones, ... I don't know what the right answer is. This is something that ideally would be in the ES standard.
With callbacks, CLS is super easy. I'll leave that to opentracing/specification#126 discussion.
This is in fact the intent of the OpenTracing design. So that each APM vendor doesn't need to instrument every library from scratch. There are a lot of libraries out there and it doesn't make sense for users to hope their APM supports the one they want.
A first draft of http client and server instrumentation: https://gist.github.com/pauldraper/5f7055b2d24d042c58625bfebfd2331f
Ideally, the standard OT tags cover a lot of ground. For example, does Datadog's And then tracer can add more data onto all traces (e.g. name or IP address of the current machine).
Agreed. It doesn't have to be a standardized way. But each library instrumentation should give you some options. Especially for the ability to add tags. If there isn't a way to write vendor-agnostic instrumentation for a library, OpenTracing is a bad idea and its design has fundamentally failed. |
Does Zone.js support Node properly? I have yet to see a single project that uses it on the server.
I think this is an acceptable limitation in the browser since in most cases it's transpiled anyway. In Node however async/await is very common and must be supported.
From my understanding, async/await is implemented in most major browsers: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/await However, as mentioned above, I don't think it's widely used yet.
We had cases where we had to manually enter/exit the current continuation in a different asynchronous context, which cannot be done with callbacks. This could have been a limitation with This is also a major departure from the current ScopeManager specification that is already implemented in other languages. I think it's worth trying to stick to the current specification if possible, and then iterate later on with an easier API on top. Otherwise the ScopeManager specification needs to be deprecated which could take a lot of time. One thing to note also is that the main reason for the complexity of our current implementation is trying to automatically close the scope at the end of the entire asynchronous context tree. The same complexity would exist if trying to have this functionality with a CPS. Other than that the implementation is actually very simple.
Definitely agree. I wonder if it would be useful to have also some specification for integrations. For example which tags are useful for |
Yes. It supports domains, fs, EventEmitters, etc. opentracing/opentracing-javascript#113 has the CPS-style ScopeManager and rivethealth/opentracing-javascript@13fc56c has the imperative ScopeManager. Both have functioning Node.js tests. Since its core doesn't rely runtime-specific features and therefore cannot work with native async/await, it's not surprising that Node.js projects stick to the Node.js-specific implementations which can support native async/await. There has been work to drive it with async_hooks but there are edge cases which have not been buttoned-up.
Do you remember what those are? I've found that patching EventEmitter gets me 95% of the way there in Node.js. For what it's worth, Node.js is choosing CPS for async_hooks/AsyncResource.runInAsyncScope. (They have emitBefore/emitAfter but have chosen to deprecate.) Whether that means hooks are easy is another question, but AsyncResource uses CPS.
True. Honestly I don't understand why this is even a thing. Every span I've seen has well-defined start and stop. It starts with input, it ends with output. If I add code to fire-and-forget some analytics/diagnostics logic, why would I now want to include that as part of my HTTP request/response span? A trace already has the property that it "starts" at the first component and "stops" at the last component, including ancillary non-critical path items. A span...I just don't see the point of finishOnClose (particularly if closes are automatic). Glad to hear this is the source of all the complexity, because that makes a lot of sense. |
That would mean that the first implementation couldn't use Zone.js however, since all the recent projects I've seen use async/await.
I don't remember exactly, but it was a case where the way the library was built, it was not possible to handle a specific case. I think it was something along these lines: we needed to activate the span in the current context so we can access it later, but there were some events or promises triggered that would result in the user's code being called, but at that point the span should no longer be available since at this point the span is already finished. But again, I think this was mostly because of the way
I'm not saying I'm against CPS or that it's not a superior solution, but my worry is that having an entirely new specification approved that basically does the same thing as the existing scope manager could take a very long time. Any work on a CPS implementation until then is based on the assumption that it will be approved, which it might not.
This was mostly done to support Would you be opened to use a scope manager implementation while this is discussed with the OTSC? This way it will be possible to work on something that is already approved and thus OpenTracing compliant with the current specification. Switching later on shouldn't be very difficult either. For reference, when I switched all our integrations from CLS to the scope manager, it only took a few hours. |
Give that imperative ScopeManager is the current standard, it does make sense to keep that; sorry, I didn't mean for that discussion to leak in here. I think finishOnClose is extraordinary complexity for a use case I cannot think of, but again, that has to do with RFC itself and is discussed better elsewhere. Main point of this issue here is splitting up the libraries.
You have some opinions about splitting up (1) even further; I'm not sure which way is up on that. As long as something works on Node.js, I'm happy. (2) should stay here and depend on (1) and (3). (3) should be separate projects, if OT has any value (and I think it does). They should be customizable, and it sounds like there are some Datadog specific bits. |
Totally agree.
This can be iterated on. The most important part is move the entire functionality out. Afterwards it can be split further if necessary. I just think in the long run it would be interesting to have something that can be used outside of OpenTracing. This is definitely not a short-term concern.
The most important ones for us that are common across all integrations are the following:
As long as the integrations are flexible enough to be able to add this metadata, it would handle most of our use cases.
Should this simply go directly in |
That can easily be added at the Tracer level.
Personally, I think this is common enough be added to the OT standard, as
There is
Excellent. And the tracer can always do the conversion.
Yes. opentracing/opentracing-javascript#113 should be closed in favor of it. It may be helpful for reference, for tests, etc. It will have to be in TypeScript though. I'm happy to port it, or you can. |
It already is, but we prefix the remote service basically. I think it's possible to achieve that using the existing tags, I would have to do some testing.
The only issue is that different vendors expect different formats in most cases. While it's not ideal, it would probably take a lot of effort to achieve a common format for every integration.
That could work. At worst a conversion would be needed before sending the span. So, in summary, this is the steps I think that should be taken here:
What do you think? |
Sounds good. Are you thinking the RFC will or will not include tracking descendents for finishOnClose? Java tracks it. I might be missing something about cases when that is helpful. |
When you say Java tracks it, this means that the current thread plus every single thread that is created from that point on need to be terminated before In any case, the RFC will include all options. The benefit of tracking descendants is to support a I think the most important part is how |
Yes. When finishOnClose is true, it has an atmoic ref counter that means
async operations (normally done via a thread pool rather than a new thread,
but could be anything with instrumentation) can affect the lifetime of the
context. java-concurrent has the instrumentation, I think.
I don't know what Python does.
…On Tue, Aug 21, 2018, 08:26 Roch Devost ***@***.***> wrote:
When you say Java tracks it, this means that the current thread plus every
single thread that is created from that point on need to be terminated
before close() is automatically called? This would be the equivalent
behavior of our JavaScript scope manager.
In any case, the RFC will include all options. The benefit of tracking
descendants is to support a CHILD_OF relationship when no callback is
available so that the parent will include the duration of the entire
asynchronous context tree. This is to support providers that don't have
FOLLOWS_FROM. It does come however with an increase in complexity. If
other tracers are consistent in how this is handled, we should aim to make
JavaScript consistent as well. I doubt this is the case however since the
spec is very vague.
I think the most important part is how close() behaves when called
manually which you mentioned in opentracing/specification#126
<opentracing/specification#126>. This is mostly
important because the outcome will either make a CPS wrapper possible or
not.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#251 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABVph2YB5hqxvp-Elftcvz9Tu-6Ug9yjks5uTBilgaJpZM4WEWBW>
.
|
Do you see any issue with doing that in JS too other than the added complexity? (which is also already there) |
I have bad experiences with tracking complex leaking trace implementations, so I prefer to make it as simple as possible. (But obviously no simpler, if there is a needed use.) |
Sorry for not posting the RFC yet. I'm currently experimenting with a few different approaches first to get a better idea of what to recommend. It's proving very difficult to keep track of when to automatically close the scope efficiently. |
FWIW if you're discussing shared lib instrumentation (+1), I stumbled across Google's strackdriver, which looks like it is a ~fairly mature?/complete? implementation of these (I've not used it, just found it in github yesterday while googling for async hook misc things): https://github.com/GoogleCloudPlatform/cloud-trace-nodejs/tree/master/src/plugins. Disclaimer it's very likely you've already seen it/etc. but just thought it wouldn't hurt to mention. |
@rochdev any more thoughts on this? As awesome as DataDog is, the point of OpenTracing was to decouple the instrumentation from the APM provider. Most of the work in the repo is theoretically just generic OpenTracing instrumentation. |
@pauldraper Most of the work mentioned in the OP has either already happened or should happen soon for I would highly recommend participating to the OpenTelemetry discussions as the project is still young and a lot of decisions are still being taken, so you could have a big impact on the design. It's currently very Java centric, so the more JS engineers onboard the better to try and change that. |
@rochdev thanks, will do ! |
Closing this as we're moving away from OpenTracing, focusing instead on reworking dd-trace into individual components and eventually individual modules. As far as open standards go, we're adding support for OpenTelemetry in dd-opentelemetry-exporter-js for now, and will likely also support the OTel API at some point to interact with dd-trace directly, as well as W3C propagation. |
This is awesome work!
Before I stumbled on this, I was working on
It looks like (2), (5), and (6) have already been worked on by this project.
I'd like to continue with the great work being done for OT JS.
I think this library should be split up into:
Thoughts?
The text was updated successfully, but these errors were encountered: