apollo-server-core: unified Studio reporting #4142
lmk when this is ready for eyes again
@@ -4,6 +4,12 @@ package mdg.engine.proto;

import "google/protobuf/timestamp.proto";

import "google/protobuf/descriptor.proto";
I think my PR description that implemented this explained that you need to leave this stuff out of the version of the file you use in protobufjs somehow, because protobufjs doesn't need the option to be declared but adding the import will massively inflate the size of generated code. Bah.
this is still happening
Yep. I need to fix this.
This is still happening
}

public addTrace(trace: Trace) {
  const queryLatencyStats = this.queryLatencyStats;
I haven't done an in-depth review of this function for accuracy (I think you were mostly looking for overall structural review so far?)
ok i got a bit distracted and i want to see your fixes from the thing we talked about on slack re the interfaces anyway. will get back to this in the next round after you push!
first, just looking at the stuff that i had commented on last time
);
reportData.size +=
  encodedTrace.length + Buffer.byteLength(statsReportKey);
reportData.traceCache.set(traceCacheKey, true);
Is traceCache going to grow indefinitely? That's not going to be good. Could do some sort of LRU thing. Or alternatively, instead of making trace.endTime.seconds be part of each entry's key, we can just use the time that addTrace is called (presumably vaguely monotonic) and throw away the whole map every time we tick over to the next minute.
If we want to match the Apollo-cloud-service-side calculation, which is based on the minute from the trace, and behave better when the clock turns back by a few seconds around a minute break, perhaps keep 2 or 3 maps of the most recent minutes? This can just be a tiny map where, on every call, we scan through it and drop sub-maps that are too old (it's O(n) but n=3 so whatever).
Also it's not really a "cache", it's more of a "seen" map
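For concreteness, here is a rough sketch of the "2 or 3 maps of the most recent minutes" idea. The class name RecentMinutesSeenSet, its methods, and the choice to bucket by the trace's end time in seconds are all assumptions made for illustration; this is not the code in the PR.

```typescript
// Hypothetical sketch: a bounded "seen" set, bucketed by minute.
class RecentMinutesSeenSet {
  // Minute number (floor(seconds / 60)) -> keys seen during that minute.
  private readonly byMinute = new Map<number, Set<string>>();

  constructor(private readonly maxMinutes = 3) {}

  // Records `key` under the minute containing `endTimeSeconds` and returns
  // whether it had already been recorded for that minute.
  checkAndAdd(key: string, endTimeSeconds: number): boolean {
    const minute = Math.floor(endTimeSeconds / 60);
    let bucket = this.byMinute.get(minute);
    if (!bucket) {
      bucket = new Set<string>();
      this.byMinute.set(minute, bucket);
    }
    const alreadySeen = bucket.has(key);
    bucket.add(key);
    this.prune();
    return alreadySeen;
  }

  // Drops every bucket that is `maxMinutes` or more behind the newest one.
  // This is O(n) in the number of buckets, but n stays at most `maxMinutes`.
  private prune(): void {
    const minutes = Array.from(this.byMinute.keys());
    const newest = Math.max(...minutes);
    for (const m of minutes) {
      if (m <= newest - this.maxMinutes) {
        this.byMinute.delete(m);
      }
    }
  }

  get bucketCount(): number {
    return this.byMinute.size;
  }
}
```

Because buckets are keyed by the trace's own minute rather than by wall-clock time, a trace that arrives a few seconds "late" for the previous minute still lands in a bucket that exists, while anything older than the window is forgotten right away.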
- apollo-cache-control@0.12.1-unified.0 - apollo-datasource-rest@0.12.1-unified.0 - apollo-datasource@0.8.1-unified.0 - apollo-reporting-protobuf@0.6.3-unified.0 - apollo-server-azure-functions@2.23.1-unified.0 - apollo-server-cache-memcached@0.7.1-unified.0 - apollo-server-cache-redis@1.4.1-unified.0 - apollo-server-caching@0.6.1-unified.0 - apollo-server-cloud-functions@2.23.1-unified.0 - apollo-server-cloudflare@2.23.1-unified.0 - apollo-server-core@2.23.1-unified.0 - apollo-server-env@3.0.1-unified.0 - apollo-server-express@2.23.1-unified.0 - apollo-server-fastify@2.23.1-unified.0 - apollo-server-hapi@2.23.1-unified.0 - apollo-server-integration-testsuite@2.23.1-unified.0 - apollo-server-koa@2.23.1-unified.0 - apollo-server-lambda@2.23.1-unified.0 - apollo-server-micro@2.23.1-unified.0 - apollo-server-plugin-base@0.11.1-unified.0 - apollo-server-plugin-operation-registry@0.9.1-unified.0 - apollo-server-plugin-response-cache@0.7.1-unified.0 - apollo-server-testing@2.23.1-unified.0 - apollo-server-types@0.7.1-unified.0 - apollo-server@2.23.1-unified.0 - apollo-tracing@0.13.1-unified.0 - graphql-extensions@0.13.1-unified.0
- apollo-cache-control@0.12.1-unified.1 - apollo-datasource-rest@0.12.1-unified.1 - apollo-reporting-protobuf@0.6.3-unified.1 - apollo-server-azure-functions@2.23.1-unified.1 - apollo-server-cloud-functions@2.23.1-unified.1 - apollo-server-cloudflare@2.23.1-unified.1 - apollo-server-core@2.23.1-unified.1 - apollo-server-express@2.23.1-unified.1 - apollo-server-fastify@2.23.1-unified.1 - apollo-server-hapi@2.23.1-unified.1 - apollo-server-integration-testsuite@2.23.1-unified.1 - apollo-server-koa@2.23.1-unified.1 - apollo-server-lambda@2.23.1-unified.1 - apollo-server-micro@2.23.1-unified.1 - apollo-server-plugin-base@0.11.1-unified.1 - apollo-server-plugin-operation-registry@0.9.1-unified.1 - apollo-server-plugin-response-cache@0.7.1-unified.1 - apollo-server-testing@2.23.1-unified.1 - apollo-server-types@0.7.1-unified.1 - apollo-server@2.23.1-unified.1 - apollo-tracing@0.13.1-unified.1 - graphql-extensions@0.13.1-unified.1
- apollo-server-azure-functions@2.23.1-unified.2 - apollo-server-cloud-functions@2.23.1-unified.2 - apollo-server-cloudflare@2.23.1-unified.2 - apollo-server-core@2.23.1-unified.2 - apollo-server-express@2.23.1-unified.2 - apollo-server-fastify@2.23.1-unified.2 - apollo-server-hapi@2.23.1-unified.2 - apollo-server-integration-testsuite@2.23.1-unified.2 - apollo-server-koa@2.23.1-unified.2 - apollo-server-lambda@2.23.1-unified.2 - apollo-server-micro@2.23.1-unified.2 - apollo-server-testing@2.23.1-unified.2 - apollo-server@2.23.1-unified.2
- apollo-cache-control@0.12.1-unified2.0 - apollo-datasource-rest@0.12.1-unified2.0 - apollo-datasource@0.8.1-unified2.0 - apollo-reporting-protobuf@0.6.3-unified2.0 - apollo-server-azure-functions@2.23.1-unified2.0 - apollo-server-cache-memcached@0.7.1-unified2.0 - apollo-server-cache-redis@1.4.1-unified2.0 - apollo-server-caching@0.6.1-unified2.0 - apollo-server-cloud-functions@2.23.1-unified2.0 - apollo-server-cloudflare@2.23.1-unified2.0 - apollo-server-core@2.23.1-unified2.0 - apollo-server-env@3.0.1-unified2.0 - apollo-server-express@2.23.1-unified2.0 - apollo-server-fastify@2.23.1-unified2.0 - apollo-server-hapi@2.23.1-unified2.0 - apollo-server-integration-testsuite@2.23.1-unified2.0 - apollo-server-koa@2.23.1-unified2.0 - apollo-server-lambda@2.23.1-unified2.0 - apollo-server-micro@2.23.1-unified2.0 - apollo-server-plugin-base@0.11.1-unified2.0 - apollo-server-plugin-operation-registry@0.9.1-unified2.0 - apollo-server-plugin-response-cache@0.7.1-unified2.0 - apollo-server-testing@2.23.1-unified2.0 - apollo-server-types@0.7.1-unified2.0 - apollo-server@2.23.1-unified2.0 - apollo-tracing@0.13.1-unified2.0 - graphql-extensions@0.13.1-unified2.0
- apollo-server-azure-functions@2.23.1-unified2.1 - apollo-server-cloud-functions@2.23.1-unified2.1 - apollo-server-cloudflare@2.23.1-unified2.1 - apollo-server-core@2.23.1-unified2.1 - apollo-server-express@2.23.1-unified2.1 - apollo-server-fastify@2.23.1-unified2.1 - apollo-server-hapi@2.23.1-unified2.1 - apollo-server-integration-testsuite@2.23.1-unified2.1 - apollo-server-koa@2.23.1-unified2.1 - apollo-server-lambda@2.23.1-unified2.1 - apollo-server-micro@2.23.1-unified2.1 - apollo-server-testing@2.23.1-unified2.1 - apollo-server@2.23.1-unified2.1
- apollo-cache-control@0.12.1-unified2.1 - apollo-datasource-rest@0.12.1-unified2.1 - apollo-reporting-protobuf@0.6.3-unified2.1 - apollo-server-azure-functions@2.23.1-unified2.2 - apollo-server-cloud-functions@2.23.1-unified2.2 - apollo-server-cloudflare@2.23.1-unified2.2 - apollo-server-core@2.23.1-unified2.2 - apollo-server-express@2.23.1-unified2.2 - apollo-server-fastify@2.23.1-unified2.2 - apollo-server-hapi@2.23.1-unified2.2 - apollo-server-integration-testsuite@2.23.1-unified2.2 - apollo-server-koa@2.23.1-unified2.2 - apollo-server-lambda@2.23.1-unified2.2 - apollo-server-micro@2.23.1-unified2.2 - apollo-server-plugin-base@0.11.1-unified2.1 - apollo-server-plugin-operation-registry@0.9.1-unified2.1 - apollo-server-plugin-response-cache@0.7.1-unified2.1 - apollo-server-testing@2.23.1-unified2.2 - apollo-server-types@0.7.1-unified2.1 - apollo-server@2.23.1-unified2.2 - apollo-tracing@0.13.1-unified2.1 - graphql-extensions@0.13.1-unified2.1
- apollo-server-azure-functions@2.23.1-unified2.3 - apollo-server-cloud-functions@2.23.1-unified2.3 - apollo-server-cloudflare@2.23.1-unified2.3 - apollo-server-core@2.23.1-unified2.3 - apollo-server-express@2.23.1-unified2.3 - apollo-server-fastify@2.23.1-unified2.3 - apollo-server-hapi@2.23.1-unified2.3 - apollo-server-integration-testsuite@2.23.1-unified2.3 - apollo-server-koa@2.23.1-unified2.3 - apollo-server-lambda@2.23.1-unified2.3 - apollo-server-micro@2.23.1-unified2.3 - apollo-server-testing@2.23.1-unified2.3 - apollo-server@2.23.1-unified2.3
This is released in AS 2.24.0! We expect this to be a no-op for most users (other than a potential performance benefit). If it causes problems, please file an issue and/or contact Apollo Support; you can get behavior like the previous version by passing |
The usage reporting plugin in apollo-server-core is not the first tool Apollo
built to report usage to Studio. Previous iterations such as optics-agent and
engineproxy reported a combination of detailed per-field single-operation
performance traces and summarized stats of operations to Apollo's
servers. When we built this TypeScript usage reporting plugin in 2018, for the
sake of expediency we did something different: it only sent traces to Apollo's
servers. This meant that the performance of every single user operation
was described in detail to Apollo's servers. Studio is not an exhaustive trace
warehouse: we have always sampled the traces received, making only some of
them available via Studio's Traces UI. The other traces were converted to stats
inside Studio's servers.
While this meant that the reporting agent was simpler than the previous
implementations (no need to be able to describe performance statistics), it also
meant that the protocol used to talk to Studio consumed a lot more bandwidth (as
well as CPU time for encoding traces).
This PR returns us to the world where Studio usage is reported as a combination
of stats and traces. It takes a slightly different approach than the previous
implementations: instead of reporting stats and traces in parallel, usage
reports contain both stats and traces. Each GraphQL operation is described
either as a trace or as stats, not both.
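As a rough illustration of "either as a trace or as stats, not both", the sketch below keeps one entry per stats report key that holds both a list of encoded traces and a running stats aggregate. All names here (OperationSummary, PerKeyEntry, UsageReportSketch) are hypothetical and the aggregate is deliberately simplistic; the real report layout is defined by the reporting protobuf schema.

```typescript
// Hypothetical shapes for illustration only.
interface OperationSummary {
  statsReportKey: string;
  durationNs: number;
  encodedTrace: Uint8Array;
}

interface PerKeyEntry {
  traces: Uint8Array[]; // operations reported in full detail
  requestCount: number; // operations folded into stats
  totalDurationNs: number;
}

class UsageReportSketch {
  readonly perKey = new Map<string, PerKeyEntry>();

  // Each operation goes to exactly one of the two representations.
  add(op: OperationSummary, asTrace: boolean): void {
    let entry = this.perKey.get(op.statsReportKey);
    if (!entry) {
      entry = { traces: [], requestCount: 0, totalDurationNs: 0 };
      this.perKey.set(op.statsReportKey, entry);
    }
    if (asTrace) {
      entry.traces.push(op.encodedTrace);
    } else {
      entry.requestCount += 1;
      entry.totalDurationNs += op.durationNs;
    }
  }
}
```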
We expect this to significantly reduce the network and CPU requirements of
sending usage reports to Studio. It should not significantly affect the
experience of using Studio: we have always heavily sampled traces in Studio
before saving them to the trace warehouse, and the default heuristic for which
operations to send as traces works similarly to the heuristic used in Studio's
servers.
This PR introduces an option experimental_sendOperationAsTrace to allow you to
control whether a given operation is sent as trace or stats. This is truly an
experimental option that may change at any time. For example, you should not
rely on the fact that this will be called on all operations after the operation
is done with a full trace, or on its signature, or even that it exists. It is likely
that future improvements to the usage reporting plugin will change how
operations are observed so that we don't have to collect a full trace before
deciding how to represent the operation.
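To give a feel for the kind of decision such an option could make, here is a self-contained sketch of a trace-or-stats predicate. It deliberately does not use the plugin's real types: TraceLike, makeSendAsTracePredicate, and the thresholds are assumptions for illustration, and the policy shown is just one possibility, not the plugin's default heuristic. Given the caveats above, the actual signature may differ or change.

```typescript
// Hypothetical minimal view of a finished operation's trace.
interface TraceLike {
  durationNs: number;
  hasErrors: boolean;
}

// Builds a predicate: true means "send this operation as a full trace",
// false means "fold it into stats".
function makeSendAsTracePredicate(
  slowThresholdNs: number,
  sampleEvery: number,
): (trace: TraceLike) => boolean {
  let seen = 0;
  return (trace) => {
    seen += 1;
    // Always keep detail for operations that failed or were slow.
    if (trace.hasErrors) return true;
    if (trace.durationNs >= slowThresholdNs) return true;
    // Otherwise keep detail for every Nth operation observed.
    return seen % sampleEvery === 0;
  };
}
```

A deterministic counter is used instead of random sampling only so that the behavior is easy to reason about and test.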
Some other notes:
- @apollo/protobufjs with a few improvements:
  - js_use_toArray option which lets you encode repeated fields from
    objects that aren't stored in memory as arrays but expose toArray
    methods. We use this so that we can build up DurationHistograms and
    map-like objects in a non-array fashion and only convert to array at
    encoding time.
  - js_preEncoded option which allows you to encode messages in repeated
    fields as buffers (Uint8Arrays). This helps amortize encoding cost of a
    large message over time instead of freezing the event loop to encode the
    whole message at once. This replaces an old hack we used for one field with
    something built in to the protobuf compiler (including correct TypeScript
    typings).
  - --no-from-object flag which we use to reduce the size of generated
    code (as we don't use the fromObject protobuf.js API).
- similar code in Studio's servers, the flag
  internal_includeTracesContributingToStats sends the traces that contribute
  to stats in a special field. This is something we only use as part of our own
  validation in our servers; for your graphs it will have no effect other than
  increasing message size.
- endpoint now tells the plugin whether traces are supported on your graph's
  plan; if not supported, the plugin will switch to sending all operations as
  stats (regardless of the value of experimental_sendOperationAsTrace) after
  the first report.
- a rough estimate about how big the leaf nodes of the stats messages will be
  rather than carefully counting how much space is used by each number and
  histogram. We do take the lengths of all strings into account.
- visualizing cache-specific stats in Studio did not work. This is now fixed.
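The toArray idea from the notes above can be illustrated without protobuf at all. In this sketch, a histogram is stored sparsely while it is being built and only materialized as a dense array when something (here, a hypothetical encoder) asks for it. SparseHistogram and its methods are made up for illustration and are not the actual DurationHistogram implementation.

```typescript
// Hypothetical sketch of a structure that is cheap to update and only
// becomes an array at encoding time.
class SparseHistogram {
  // Bucket index -> count; only touched buckets take up memory.
  private readonly counts = new Map<number, number>();

  constructor(private readonly bucketCount: number) {}

  increment(bucket: number, by = 1): void {
    if (!Number.isInteger(bucket) || bucket < 0 || bucket >= this.bucketCount) {
      throw new RangeError(`bucket ${bucket} is out of range`);
    }
    this.counts.set(bucket, (this.counts.get(bucket) || 0) + by);
  }

  // An encoder that understands toArray would call this once, when the
  // repeated field is finally serialized.
  toArray(): number[] {
    const out = new Array<number>(this.bucketCount).fill(0);
    this.counts.forEach((count, bucket) => {
      out[bucket] = count;
    });
    return out;
  }
}
```

The design choice is the same trade-off the note describes: updates stay cheap on the hot path, and the cost of building the array is paid once per report rather than once per operation.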
This project was begun by @jsegaran and completed by @glasser.