Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(gatsby-source-contentful): dont re-create nodes #28642

Merged
merged 6 commits into from
Dec 22, 2020
Merged

Conversation

pvdz
Copy link
Contributor

@pvdz pvdz commented Dec 15, 2020

A warm build will currently still attempt to create the node even if it already exists. This PR will do an early check, compare the digest (entry.sys.updatedAt), and return early if the node already exists for that digest.

There might be a simpler solution but for now, this cuts the warm "no changes" build for a 2 million node site down from to about 894s (120s for node createion step) down to 512s (35s for the node creation step), saving about 5 minutes.

Need verification from Contentful that this approach is sound. But I think it is. (Tests will fail too, will fix them before we merge)

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Dec 15, 2020
@wardpeet
Copy link
Contributor

This makes code harder to reason with. The real issue is inside gatsby-node where we merge existing nodes together with new nodes. That part needs to refactor so I'm inclined to close that PR for that reason

@wardpeet
Copy link
Contributor

I was missing a lot of context on the why here. I'm fine with this patch as it does improve performance. If we could run some tests to make sure references and all are still being updated 👍

@LekoArts LekoArts added topic: source-contentful Related to Gatsby's integration with Contentful topic: performance Related to runtime & build performance and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Dec 17, 2020
@axe312ger axe312ger self-requested a review December 17, 2020 10:04
@zcei
Copy link

zcei commented Dec 17, 2020

Heja there 👋

got hinted to this PR by @axe312ger, I'm one of Contentful's backend engineers and want to increase your confidence into the solution. 🙂

For a single given Contentful entity, a write towards it will update its .sys.updatedAt property. This is even the case for no-change writes (i.e. a PUT to an entity with the same payload as its current shape).

That means using the .sys.updatedAt property for cache invalidation even gives users the ability to easily invalidate single nodes without changing content (why ever this might be useful 🤷‍♂️ ).

Phrased differently:

  • same updatedAt the content of two versions of one entity is the same.
  • different updatedAt the content of two versions of one entity might be the same

Hope this helps!

@axe312ger
Copy link
Collaborator

Rerunning CI

Copy link
Collaborator

@axe312ger axe312ger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot: merge on green Gatsbot will merge these PRs automatically when all tests passes topic: performance Related to runtime & build performance topic: source-contentful Related to Gatsby's integration with Contentful
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants