-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
apollo-server-core: Update schema reporting plugin to support gateways #5187
Conversation
I haven't reviewed the code yet, because your excellent PR description made it easy to reason about the design before doing so. I think this is a bit of a non-starter:
While we can make ApolloServer pass gateway automatically when it creates the schema reporting plugin, I think requiring users to do it themselves when they are calling ApolloServerPluginSchemaReporting is too error-prone. If you leave it out then it just silently doesn't show updates? Maybe there's some way to log a warning or throw in that case but I'm not sure how. I think something like this would be better:
However instead of it being "provide gateway to plugin", I'd rather just introduce a new (I'd suggest that it takes an object with keys as an argument rather than two positional arguments, to be consistent with the rest of the plugin API. So it will be shaped differently from onSchemaChange, but that's OK?) (I think it's fine to "miss" changes that happen before serverWillStart — the plugin will get whatever the schema is at the time of serverWillStart, and any afterwards, and it's OK if it misses some schemas that predate it running. As long as no change after serverWillStart is missed!) What do you think? Let me know if you'd like me to review the code now or wait for an update based on this (or if you think the current implementation is better of course). |
@@ -117,11 +117,13 @@ export class SchemaReporter { | |||
const result = await this.reportServerInfo(sendNextWithExecutableSchema); | |||
switch (result.kind) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you'd like to factor this one out into its own PR we could get it merged fast!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, moved this change to #5222 🙂
b44416b
to
e415e50
Compare
Before this PR, the code in
I'd be down for introducing a new
Yep, that sounds fine to me.
Agreed here, should be fine to miss updates prior to
I've updated this PR in the last seven commits to introduce a new The gist of the changes is that:
Let me know if this looks fine, if so I'll modify the PR description accordingly for the new approach 🙂 |
@@ -84,6 +84,11 @@ export interface ApolloConfig { | |||
}; | |||
} | |||
|
|||
export interface GraphQLSchemaContext { | |||
apiSchema: GraphQLSchema, | |||
coreSchemaSdl?: string, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you leave a comment noting that this will only be left out with old versions of Gateway and that we should consider making this required for AS3 (which can definitely have a minimum version of Gateway)? (Though my long comment suggests we can probably make this required now.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This argument may still be undefined
I think, specifically if schemaDidLoadOrUpdate
is used by a plugin when a gateway isn't provided (which is the case for non-federated schema reporting).
@@ -65,6 +67,9 @@ export interface ApolloServerPlugin< | |||
} | |||
|
|||
export interface GraphQLServerListener { | |||
schemaDidChange?( | |||
schemaContext: GraphQLSchemaContext, | |||
): void; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make the correctness a lot harder to make this method async (ideally just plain Promise async as opposed to the weirder ValueOrPromise)? Making all plugin methods async is in the air for AS3 (#4999). I can imagine it might make the logic harder to implement correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Paraphrasing from a Slack convo we had with Trevor) We've decided to keep this sync, as there's concerns about listeners making expensive async calls stalling execution. If users want to do async things in response to a schema load/update, they can queue the changes and manage async queue consumption themselves.
// Note that it's important for there to be no await between this schemaDidChange() | ||
// invocation and the registration of the new schema change callback, otherwise plugins | ||
// could miss schema changes. | ||
schemaDidChange({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I continue to not like the idea of calling schemaDidChange redundantly with the serverDidStart with the same schema. This ties in to what we are discussing with @trevor-scheer over at apollographql/federation#738 (comment)
I'd really prefer "did change" to actually mean "did change" and not "here is the same information we already gave you again".
If we're concerned about the serverWillStart overlapping with update thing, that makes sense: but I think we should make the change I want in 738 and then instead of asserting that latestApiSchema
must be set, use whether or not it is set to call schemaDidChange only in the case of this overlap.
Here's a thought: what if in addition to gateway.onSchemaChange
, we also allowed you to pass an onSchemaChange handler to gateway.load
, with the semantics that this handler would be installed immediately after the initial load and it's gateway's job to make sure no changes are missed? Would that let us simplify things?
Or hmm.
Hmm.
OK, having looked at the whole PR, I do see that there is something simplifying about having a callback that is called every time the schema is updated, including the first time. Maybe my objection is just to the "changed" terminology. Can we come up with a more precise term that doesn't imply a change from a previous value? schemaDidLoad
probably implies only the first time.
If we're doing that, we could consider changing #738 so that instead of adding a second argument to the callback, we just add a new onSchemaWhatever
whose callback takes an object with keys instead of separate arguments.
And ooh! I think this will let us make the ApolloServer schemaDidWhatever take coreSupergraphSdl as a non-?
argument. The idea is that we can move the runtime check into the schema loading code from the schema reporter. We can basically do:
- If any plugin registered schemaDidWhatever, and there is a gateway, then
gateway.onSchemaWhatever
must exist or it's an immediate startup error. (ie, it's an error to combine schema reporting with an old gateway, which shouldn't break any existing users since that's already an error.) - However, if no plugin registered schemaDidWhatever, then we can fall back to
gateway.onSchemaChange
if onSchemaWhatever doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the name to schemaDidLoadOrUpdate
and changed apollographql/federation#738 to have a new onSchemaLoadOrUpdate
method that takes an object with keys/properties.
And ooh! I think this will let us make the ApolloServer schemaDidWhatever take coreSupergraphSdl as a non-
?
argument.
I think that argument still ends up needing to be ?
to be for the non-federated schema reporting case.
The idea is that we can move the runtime check into the schema loading code from the schema reporter. We can basically do:
- If any plugin registered schemaDidWhatever, and there is a gateway, then
gateway.onSchemaWhatever
must exist or it's an immediate startup error. (ie, it's an error to combine schema reporting with an old gateway, which shouldn't break any existing users since that's already an error.)
Sounds good, I've added a check for this near the end of _start()
. Note that I'm beginning to think no error is currently thrown when trying to use the schema reporting plugin with a gateway (or at least, I don't see any code that throws when that's the case). However, our autoreg pipeline silently throws away reports from gateways.
- However, if no plugin registered schemaDidWhatever, then we can fall back to
gateway.onSchemaChange
if onSchemaWhatever doesn't exist.
We have to register gateway listeners early in _start()
before we ask plugins whether they provide schemaDidLoadOrUpdate
, so the new code just uses schemaDidLoadOrUpdate
if available or falls back to onSchemaChange
(regardless of plugin registration).
if (overrideReportedSchema !== undefined) { | ||
if (isFederatedSchemaReporting) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure schemaIsFederated is looking for the case that this is a subgraph schema, so neither of the places you use this bool seem right.
Are we planning to do anything about schema reporting from subgraphs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, good catch! I've reverted this part of the code, so that we use schemaIsFederated
only to throw for schemas that look like implementing service schemas.
Are we planning to do anything about schema reporting from subgraphs?
I'm not in the loop enough about future plans there. For now I think we intend to keep the status quo, but we might start having reporting there in the future.
}); | ||
unsubscribe = maybeGateway.onSchemaChange( | ||
(apiSchema, coreSchemaSdl) => { | ||
schemaDidChange({ apiSchema, coreSchemaSdl }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this needs to be better synchronized with the logic in the existing onSchemaChange. eg, we shouldn't call it if we don't actually update the schema because we're stopping or something, and we shouldn't call it until after the schema is actually updated (ie, schemaDerivedData is successfully changed). Probably this means we need to save the handlers somewhere and call them from the existing onSchemaChange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree here, I've changed _start()
to add listeners to a set and the gateway listener created in startGatewayAndLoadSchema()
to notify listeners in the set after the schema derived data is updated.
49e83ff
to
ec7f307
Compare
@glasser @trevor-scheer
Once the code's approved, I'll update the PR description and the |
bfe2e88
to
ea1092e
Compare
Update here on testing/dogfooding. Gateway schema reporting is currently running in Studio staging, and appears to be working fine for both the old GCS pipeline and the new configdelivery pipeline. I've locally tested gateways with a service list (to test the |
loadedSchema = true; | ||
this.state = { | ||
phase: 'invoking serverWillStart', | ||
barrier, | ||
schemaDerivedData, | ||
// We can't just use the variable schemaDerivedData here, as any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, this complexity is exactly why I wanted to differentiate between initial load and later updates, and why my instinct was to move away from "use the same callback for initial load and later update". I feel like the old logic of "it's the job of gateway.load to ensure that at the moment it resolves its promise, that's the current schema, and as long as you don't yield between await load()
and updating the state and starting to listen for changes, it works out.
I don't want to jerk you back and forth here... but I thought the point of doing onSchemaLoadOrUpdate was to make this stuff less complex, not more complex. This is really hard to follow right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, this complexity is exactly why I wanted to differentiate between initial load and later updates, and why my instinct was to move away from "use the same callback for initial load and later update". I feel like the old logic of "it's the job of gateway.load to ensure that at the moment it resolves its promise, that's the current schema, and as long as you don't yield between
await load()
and updating the state and starting to listen for changes, it works out.
I'm not sure separating initial load and later updates would help for this specific problem. Specifically, the following can happen:
- While executing
_start()
,await gateway.load()
runs, which executesgateway.load()
. - At some point during the execution of
gateway.load()
, it returns a promise that resolves to the initial schema, and execution suspends while waiting on that promise. Let's suppose the promise resolves at time A. - At some point later (time B),
_start()
resumes execution (now that the promise has resolved, execution is unblocked). - Between A and B, it's possible for other code to execute, including the code that updates the schema.
Basically it's hard to keep the initial schema non-stale when it's dynamic.
I don't want to jerk you back and forth here... but I thought the point of doing onSchemaLoadOrUpdate was to make this stuff less complex, not more complex. This is really hard to follow right now.
FWIW, I think once AS3 lands and we can require gateways to define onSchemaLoadOrUpdate
, this changes to a:
this.state = {
// ...
schemaDerivedData: this.state.schemaDerivedData,
}
like the other changes to this.state
, and this.state.schemaDerivedData
effectively becomes the source of truth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could do a (noninvasive) "polyfill"? Like we can define a wrapper "gateway" which, if gateway.onSchemaLoadOrUpdate doesn't exist, we wrap around the gateway. It passes through methods other than onSchemaLoadOrUpdate and load, and handles onSchemaLoadOrUpdate by delegating to onSchemaChange and also keeping it around to call after load? So we could put all the back compat logic in this one wrapper class, and keep the main startup logic clean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glasser I ended up going with something a bit more general than a polyfill here, it's at packages/apollo-server-core/src/utils/schemaManager.ts
(it wraps gateway and provides onSchemaLoadOrUpdate
, but it also wraps schemas in the non-gateway case and is responsible for updating schema-derived data).
| { | ||
phase: 'starting'; | ||
barrier: Resolvable<void>; | ||
schemaDerivedData?: SchemaDerivedData; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, the difference between the "starting" and "invoking serverWillStart" states is that the latter has the (initial) schema set up and derived, and the former doesn't. I'm not sure what the point is of having them as separate states otherwise. I guess you could merge them. But really I think it shouldn't be rocket science for us to get it working with distinct states that represent distinct moments in moving forward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up eliding out the starting
state since it was never distinguished from invoking serverWillStart
in the updated PR.
// It is crucial there not be any await statements between this | ||
// read and the addition of a listener to the listener set, or else | ||
// schema updates could be missed by listeners. | ||
const latestSchemaDerivedData = this.state.schemaDerivedData; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all far too complicated for me to understand or have any hope of successfully maintaining. Are there other approaches we can use to deal with the possibility of receiving schema change updates during startup?
For example, could we just buffer them? If a schema change comes in during 'invoking serverWillStart' we save it to some field (and I think it's fine if we just have one field rather than an array — if two schema changes come in we can just ignore the non-last one), and after serverWillStart is done we can fire off any buffered schema?
ie, that would let us just consistently use const schemaDerivedData
for all relevant calls during the course of _start
. And instead of changing the hook from "ignore in non-started state" to "ignore in all but three states", it would be "buffer in the starting/serverWillStart states, call in started, ignore otherwise".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all far too complicated for me to understand or have any hope of successfully maintaining. Are there other approaches we can use to deal with the possibility of receiving schema change updates during startup?
In the refactor, I've tried to address this by confining most of this code to non-async functions, which I feel makes it easier to reason about since such functions can't be suspended during execution.
For example, could we just buffer them? If a schema change comes in during 'invoking serverWillStart' we save it to some field (and I think it's fine if we just have one field rather than an array — if two schema changes come in we can just ignore the non-last one), and after serverWillStart is done we can fire off any buffered schema?
I added some buffering in the SchemaManager
class, there's more details about it in the SchemaManager.onSchemaLoadOrUpdate
docstring.
25b2fa2
to
6df9516
Compare
fe9bf93
to
6594f16
Compare
@glasser Finally got some time to come back to this. I've wrapped all modes of operation (old gateway, new gateway, and non-gateway) into a class
After moving Let me know if this looks mostly fine, and if so I'll update the PR description. |
Can you rebase onto |
6594f16
to
5eabab2
Compare
@glasser Sounds good, I've rebased this branch on |
…bug in error message (the hook's name was wrong)
…Manager.onSchemaLoadOrUpdate() so that we can less-fragily try/catch
… core schema instead of throwing
…nce if the schema is overridden
… for it eventually being removed
…is called after stop using a workaround
…maManager.start()
…te() to include needed gateway version
…maLoadOrUpdateEvent()
… gateway won't result in failure
… called concurrently with start() (or before start()).
…e for gateways yet, because Apollo Studio does not support it.
9fe87b2
to
f5241eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Let's make apollographql/federation#738 mergeable before we merge this and merge them both when we're both free together.
To summarize, this PR:
schemaDidLoadOrUpdate
event hook.Note that this PR is dependent on apollographql/federation#738 :
This PR specifically:
GatewayInterface
to have a new methodonSchemaLoadOrUpdate
, for registering listeners that are notified on both schema load and schema update, and for providing the core schema in addition to the API schema.onSchemaChange
method has been marked deprecated and changed to optional, to support its eventual removal from the gateway API.SchemaManager
for consolidating schema-tracking logic and abstracting away the gateway. This class specifically:async
-safe, i.e. it ensures linearizability for the TS class's public methods (providedstart()
is called and completes beforestop()
).schemaDidLoadOrUpdate
event hook so that plugins can listen to schema load/update events.ApolloServerBase._start()
registers such event hooks as schema load/update event listeners for theSchemaManager
.ApolloServerPluginSchemaReporting
to use the new plugin event hook.SchemaReporter
instance and starts it after stopping the oldSchemaReporter
instance (if it exists).ServerState
that are no longer necessary due toSchemaManager
.initialized
states have been consolidated, and theinvoking serverWillStart
state has been removed.schemaDerivedData
has effectively been replaced byschemaManager
, whileloadedSchema
has been removed entirely.start()
andstop()
to resolve in a particular way to instead manually wait onstart()
before callingstop()
.