Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breadth-first-traversal for PR #367 (allows for thunk-based dataloader batching) #388

Merged
merged 5 commits into from
Sep 10, 2018

Conversation

edpark11
Copy link

@edpark11 edpark11 commented Sep 2, 2018

This is mostly a copy of the excellent PR#367 with one important difference: an implementation of a true breadth-first traversal. As with #367 , it makes no changes to the public API.

PR #367 actually executes (for the most part) as a depth-first execution of thunks because dethunkMap goes depth-first after the first layer. In this PR, I implemented a classic queue-based breadth-first traversal for resolving thunks. This is important because a breadth-first traversal is the way that 99% of query batching would happen in n+1 query scenarios-- i.e., e.g. choose all customers, then all affiliations of those customers to stores, then all names of stores. In this case, we want to batch up the affiliations, then all names of stores in a breadth-first traversal. As per #367, no new go funcs are introduced.

Many thanks to @jjeffery @ccbrown and @chris-ramon for their excellent work and comments (this is just building on those).

FWIW, I tested this on my server using Nick Randall's dataloader for batching of thunks and it works just as efficiently as #213 with the advantage that goroutines are not required.

On the way to building this, I did a ton of research on what other graphql implementations were doing. For example, the problem that I was trying to solve from #367 is documented here from the dotnet implementation: graphql-dotnet/graphql-dotnet#537

They ended up solving it like this: graphql-dotnet/graphql-dotnet#539

Looking at the reference implementation of graphql-js (https://github.com/graphql/graphql-js/blob/master/src/execution/execute.js), read queries in fact have an implicit breadth-first implementation strategy. Pasted the important piece of the reference graphql-js implementation below. The important piece is to understand is what is being done differently with executeFields and executeFieldsSerially. The big difference is that executeFields returns a promiseForObject (https://github.com/graphql/graphql-js/blob/master/src/jsutils/promiseForObject.js), which runs a Promise.all. Promise.all does not run serially-- it fires off execution for each subobject in parallel, which is an implicit breadth-first traversal (see https://stackoverflow.com/questions/30823653/is-node-js-native-promise-all-processing-in-parallel-or-sequentially). executeFieldsSerially does not call Promise.all, which means all promises are executed serially (depth-first). This is necessary to conform to spec for mutations (TODO: just realizing this means I need to fix this implementation to allow both strategies). Fixed so executeSerially does a depth-first descent.

So long story short: I think the closest we can get to the graphql-js reference implementation would be to do a depth-first traversal of thunks for mutations and a breadth-first traversal of thunks for gets. Per @jjeffery 's original notes, async and execution order are two different things, and folks can fire off go funcs in resolvers if they want. But my guess from reading a ton of other implementations is that this will get us most of what we want in a safe way.

graphql-js reference implementation:

/**
 * Implements the "Evaluating operations" section of the spec.
 */
function executeOperation(
  exeContext: ExecutionContext,
  operation: OperationDefinitionNode,
  rootValue: mixed,
): MaybePromise<ObjMap<mixed> | null> {
  const type = getOperationRootType(exeContext.schema, operation);
  const fields = collectFields(
    exeContext,
    type,
    operation.selectionSet,
    Object.create(null),
    Object.create(null),
  );

  const path = undefined;

  // Errors from sub-fields of a NonNull type may propagate to the top level,
  // at which point we still log the error and null the parent field, which
  // in this case is the entire response.
  //
  // Similar to completeValueCatchingError.
  try {
    const result =
      operation.operation === 'mutation'
        ? executeFieldsSerially(exeContext, type, rootValue, path, fields)
        : executeFields(exeContext, type, rootValue, path, fields);
    if (isPromise(result)) {
      return result.then(undefined, error => {
        exeContext.errors.push(error);
        return Promise.resolve(null);
      });
    }
    return result;
  } catch (error) {
    exeContext.errors.push(error);
    return null;
  }
}

/**
 * Implements the "Evaluating selection sets" section of the spec
 * for "write" mode.
 */
function executeFieldsSerially(
  exeContext: ExecutionContext,
  parentType: GraphQLObjectType,
  sourceValue: mixed,
  path: ResponsePath | void,
  fields: ObjMap<Array<FieldNode>>,
): MaybePromise<ObjMap<mixed>> {
  return promiseReduce(
    Object.keys(fields),
    (results, responseName) => {
      const fieldNodes = fields[responseName];
      const fieldPath = addPath(path, responseName);
      const result = resolveField(
        exeContext,
        parentType,
        sourceValue,
        fieldNodes,
        fieldPath,
      );
      if (result === undefined) {
        return results;
      }
      if (isPromise(result)) {
        return result.then(resolvedResult => {
          results[responseName] = resolvedResult;
          return results;
        });
      }
      results[responseName] = result;
      return results;
    },
    Object.create(null),
  );
}

/**
 * Implements the "Evaluating selection sets" section of the spec
 * for "read" mode.
 */
function executeFields(
  exeContext: ExecutionContext,
  parentType: GraphQLObjectType,
  sourceValue: mixed,
  path: ResponsePath | void,
  fields: ObjMap<Array<FieldNode>>,
): MaybePromise<ObjMap<mixed>> {
  const results = Object.create(null);
  let containsPromise = false;

  for (let i = 0, keys = Object.keys(fields); i < keys.length; ++i) {
    const responseName = keys[i];
    const fieldNodes = fields[responseName];
    const fieldPath = addPath(path, responseName);
    const result = resolveField(
      exeContext,
      parentType,
      sourceValue,
      fieldNodes,
      fieldPath,
    );

    if (result !== undefined) {
      results[responseName] = result;
      if (!containsPromise && isPromise(result)) {
        containsPromise = true;
      }
    }
  }

  // If there are no promises, we can just return the object
  if (!containsPromise) {
    return results;
  }

  // Otherwise, results is a map from field name to the result of resolving that
  // field, which is possibly a promise. Return a promise that will return this
  // same map, but with any promises replaced with the values they resolved to.
  return promiseForObject(results);
}

@coveralls
Copy link

coveralls commented Sep 2, 2018

Coverage Status

Coverage decreased (-0.2%) to 91.622% when pulling 75ee0d1 on edpark11:367-with-bfs into ef7caf8 on graphql-go:master.

@edpark11 edpark11 changed the title Implemented true breadth-first-traversal for PR #367. Implemented true breadth-first-traversal for PR #367 (thunk-based batching) Sep 2, 2018
@edpark11 edpark11 changed the title Implemented true breadth-first-traversal for PR #367 (thunk-based batching) Implemented true breadth-first-traversal for PR #367 (allows for thunk-based dataloader batching) Sep 2, 2018
@edpark11 edpark11 changed the title Implemented true breadth-first-traversal for PR #367 (allows for thunk-based dataloader batching) Breadth-first-traversal for PR #367 (allows for thunk-based dataloader batching) Sep 2, 2018
@chris-ramon
Copy link
Member

This very awesome! 👍 — thanks a lot for working on this @edpark11, I def. agree that this is the right path for us to take.

After describing the possible solutions on #389, and receiving tons of incredible feedback in the related PR's and Issues, unanimous shows that we agree on extending Thunks to support concurrent resolvers, which enables such a great features such as batching via a great lib like dataloader.

I have put together a working example that leverages a real-use case described in detail by @edpark11, which I personally used to do lots of testing against this PR.

Merging this one, looking forward what we will accomplish together as graphql-go/graphql lib users! — thanks a lot to all the awesome guys that made this possible.

@chris-ramon chris-ramon merged commit b3d68b1 into graphql-go:master Sep 10, 2018
@nicksrandall
Copy link

FWIW, I think this is awesome! Let me know if you run into any problems using https://github.com/graph-gophers/dataloader

@edpark11
Copy link
Author

Thanks, @nicksrandall @chris-ramon @ccbrown @jjeffery ! Happy for the team effort getting this one over the line!

chris-ramon added a commit to chris-ramon/gqlgen that referenced this pull request Sep 12, 2018
@ccbrown
Copy link
Contributor

ccbrown commented Sep 15, 2018

Just want to add my appreciation for this. I've been leveraging it pretty heavily since it was merged, and the batching this facilitates really makes an absurd difference in performance in many scenarios. 😄

@jjeffery
Copy link
Contributor

Thanks @ccbrown , that means a lot to me. Regards John.

@edpark11
Copy link
Author

Just adding a note... we've been using this is production for a month now with no issues. Massively increases the speed and simplicity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants