Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of request context scope #22

Open
fed135 opened this issue Jan 13, 2020 · 16 comments
Open

Handling of request context scope #22

fed135 opened this issue Jan 13, 2020 · 16 comments
Labels
discuss Topics for open discussion

Comments

@fed135
Copy link
Member

fed135 commented Jan 13, 2020

It's a common use case to need access to the request data, (ex: headers) further down the user code, or to want to store stateful information that is scoped to a specific request. - this is a concept present in other languages (like PHP), or is a use of thread caching in Java, for example.

There are multiple ways to do this right now, either via the domains API, continuation-local-storage or cls-hooked. Unfortunately, these are either deprecated or introduce weird behaviors, memory leaks and a significant performance overhead.

I'm opening this to debate that this is an obstacle that API developers simply have to deal with, or if there's room for improvement in Node/ Web frameworks.

@mcollina
Copy link
Member

I think it's worthwhile to ping @nodejs/diagnostics on this one.

Some related work: nodejs/node#31016 nodejs/node#30959 nodejs/node#27172 nodejs/node#26540

@wesleytodd
Copy link
Member

Great resources @mcollina, thanks!

Is there something this team can do to help move this forward? It looks like it is on track, but maybe we could provide some documentation on usage within a web app? Or maybe an experimental framework implemented on top of nodejs/node#26540 to show it in action?

This is where I think we need to clear up our team's scope so that we know where we should pitch in. I am not sure what will be the most effective way to participate in these conversations, but I do think we should be aware and help where we can.

@jkrems
Copy link

jkrems commented Jan 14, 2020

One interesting angle here may be how CLS & friends relate to something like Trace Context. IIRC the authors of that spec were hoping that web frameworks (or even node itself) would actively forward the context.

One note of caution for "retrieve arbitrary data attached to the request context": While it can be very convenient, especially in very small apps, it can have all the baggage of globals / action at a distance. Maybe a react-context like mechanism where namespacing is part of the API would be neat to establish (write and read of a given key only possible from the same place)..?

@wesleytodd
Copy link
Member

IIRC the authors of that spec were hoping that web frameworks (or even node itself) would actively forward the context.

Seems like a framework concern to me, but I agree tracing is something which I would like to see more clear support for in the popular frameworks (express included).

One note of caution for "retrieve arbitrary data attached to the request context": While it can be very convenient, especially in very small apps, it can have all the baggage of globals / action at a distance.

This is for sure the largest downside of the approach IMO. And I think even react's implementation suffers under misuse. I think that exposing this to users is hard to get right, but I think if wrapped inside a framework it can be really powerful.

@wesleytodd wesleytodd added the discuss Topics for open discussion label Jan 14, 2020
@wesleytodd wesleytodd changed the title agenda: Handling of request context scope Handling of request context scope Jan 14, 2020
@Flarna
Copy link
Member

Flarna commented Jan 14, 2020

Regaring TraceContext:
It's not that easy and straight forward to put this into node core or some web framework. TraceContext (and/or some propietary variant/extension) is used by tools like OpenTelemetry and APM vendors. They monitor more then WebRequests, also database requests, message queues, RPC calls,... accross technologies.
Therefore these tools have to tinker already with request/response headers (and similar stuff in other protocols).
If node core and some web frameworks starts also to create/modifiy TraceState and TraceParent header it may result in conflicts. For example a trace-id has to be unique and created exactly once for a trace. Therefore all involved components have to agree on when/how this is done.

Independent who does the TraceContext handling this component needs some scope handling/CLS. OTel and APMs have implementations of this (similar to e.g. cls-hooked mentioned above) and all of them have their limitations.
As there is no formal context specification in Javascript adding a CLS API in node core can be only the first step. UserLand modules using connection pooling or similar have to use AsyncResource to tell node core about context propagation. An API in core would be for sure a good step in convincing these modules to actually implement this.

@jkrems
Copy link

jkrems commented Jan 14, 2020

One reason why node having a native notion of trace state ("request context") is that right now, it's one of the few things that - no matter what - requires APMs to patch node for a "vanilla node service". Most other things could be additive APIs but this is something where just providing hooks from node wouldn't be enough. Nothing but node can make sure that server.on('response', fn) calls fn within a well-known context (assuming a world with stable APIs for APMs instead of today's monkey-patch dance).

Also, having node and other application platforms being able to forward HTTP-level trace context means that tracing could be enabled in services that don't have active support libraries hooking into them. Such a "true service mesh-level" solution isn't possible without it.

@Flarna
Copy link
Member

Flarna commented Jan 14, 2020

Yes, pure forwarding of trace context from incoming to outgoing http as default fallback sounds reasonable. But tools need hooks to modify this behavior to allow to add spans for operations happen between entry and exit of a process.
if entry is e.g. GRPC instead HTTP the GRPC module could create this well known context.

In general it's not just a simple one server request triggers zero or one client request setup, Often a single request results in several client requests - the result is a tree. Which means there should be the ability to branch a context.
This branching could happen in sync events if they include the needed info to pimp the new created async resources with a new context linked to the parent.
With more tools in the same app this could be problematic if all of them try to refine the global well known context.

@drawm
Copy link

drawm commented Jan 15, 2020

While it can be very convenient, especially in very small apps, it can have all the baggage of globals / action at a distance.

This is exactly my experience with these kind of context. Its convenient not having to hold a reference to a key/namespace (or the context itself) but you quickly end up with a glorified globals state and all the anti-pattern that goes with it.

Like @wesleytodd said, its a framework concern. I think its out of scope for this team.

Although it would be great to formalize a common context schema all web-framework could rely on.
I believe Symfony did something similar with their request object. Other framework picked it up and the ecosystem gained a bit of interoperability.

@wesleytodd
Copy link
Member

wesleytodd commented Jan 21, 2020

Like @wesleytodd said, its a framework concern. I think its out of scope for this team.

I think my statement was miss-understood. I was saying that how frameworks handle tracing might be out of scope (but we have yet to finalize the scope, so it should be open for discussion). How CLS and the core api looks seems very in scope to me because it will be heavily used in framework and user code on web servers.

glorified globals state and all the anti-pattern that goes with it.

This is really the state most of us are in today (req.user is functionally equivalent to a request scoped context and not the same as a global). So swapping out req as your context object for one which more deeply integrates into the current patterns is still a win IMO.

@jkrems
Copy link

jkrems commented Jan 21, 2020

So swapping out req as your context object for one which more deeply integrates into the current patterns is still a win IMO.

The problem with true contextual state is where it differs from putting values on req (which, yes, is also kind of bad): There's no narrowing down of where state may be written or read. req is generally passed into functions. If they call into other code and don't forward req, that part of the code doesn't have to be considered. Globals (and a globally available request context) are different: Any line in the whole program could be depending on or modifying the state. Without any trace of access/visibility.

That's definitely different with native scoping ([setValue, getValue] = createRequestContextKey()) which restricts where a given context key can be written or read. That's a clear improvement over req.user. But getRequestContext().user would be a step backwards imo.

@wesleytodd
Copy link
Member

There's no narrowing down of where state may be written or read

This is exactly the problem, and it exists with all the common implementations I have seen in practice. The reason it is so common is because it is so easy to understand at face value, [setValue, getValue] = createRequestContextKey() introduces a ton of other issues. Do you think there is an approach where we get the control without completely sacrificing the simplicity for end users?

@vdeturckheim
Copy link
Member

vdeturckheim commented Feb 27, 2020

The AsyncLocalStorage class has been merged (nodejs/node@9c70292) last week. We still have another change to merge(nodejs/node#31930) before this lands in any release.

This clas provides CLS-like API directly in core. The API should be memory safe. We have a benchmark for it too and, even if nothing beats real life testing on that front, performances seem pretty decent so far.

This class can be used to provide anything you would expect from a CLS:

  • logging
  • services tracing
  • not passing the request object to every functions
  • error management (as a breaking-change replacement for domain that requires a bit of extra work).

I have a few ideas on how this can be leveraged by a web framework so far (probably in the form of a set of recommendations for using this class) but I don't know yet what is the right way of formalizing it so far (and where would this doc live).

However, the API should be available in the nightly builds (I build docker images of them everyday) if anyone want to test it before it lands in a release.

While I still write down ideas and recommendations about this class, feel free to ping me to provide feedback or ask questions. This is still experimental and real life feedback is in the path to success for this in core.

@ghermeto
Copy link
Contributor

@vdeturckheim I would love to hear your ideas on how you see this being leveraged by a web-framework. In Q2 I can run some canaries with a fair amount fo traffic and report back the results.

@wesleytodd
Copy link
Member

@ghermeto
Copy link
Contributor

👍

@vdeturckheim
Copy link
Member

@wesleytodd thanks for pointing to it! Indeed this is what a framework implementation can look like.

In term of framework integration, it might also be interesting to not expose the AsyncLocalStorage instance directly but only a proxy to the getStore() method. This will prevent the users from tempering with the state of the instance by mistake.

@ghermeto if you have time, we can schedule a quick call together to take a look at this together and define how we would consider the performance perspective!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Topics for open discussion
Projects
None yet
Development

No branches or pull requests

8 participants