Build stuck at running jobs (image transformation) #34051
Comments
I have exactly the same thing going on. I am on the Standard/Standard plan, and for the past 2 days fresh builds have been taking forever. Sometimes a build will finish after 25-30 minutes.
I wanted to open an issue, but I guess we have something similar. I am also using createRemoteFileNode to download and optimise remote images. I will run a build without that to see what happens. |
@NickBarreto @DennisKraaijeveld can you both please post the URL to a failed build where you see this? Then we can on our side check it out. |
On my side it is not failing 100% of the time, but build times have gone from 5 to 24-30 minutes. This build is running now: https://www.gatsbyjs.com/dashboard/c9ba2b9c-76c6-4e2e-94b6-047574fb963f/sites/d18769be-381e-458e-b4fb-d59bbc168935/builds/88ec953a-2dac-41da-8696-f07b589001cf/details This one did finish after 25 minutes with the same problem: https://www.gatsbyjs.com/dashboard/c9ba2b9c-76c6-4e2e-94b6-047574fb963f/sites/d18769be-381e-458e-b4fb-d59bbc168935/builds/fd5afb99-0d11-43c8-af8a-0624b7a748e3/details |
Sure thing. Here's a Gatsby Cloud build that failed with this error: https://www.gatsbyjs.com/dashboard/e1ae5b97-e312-4fe3-88f2-5f4f81ac0d9d/sites/1703f5eb-05d0-46ad-8fdf-e1f7129e83b7/builds/74b00a67-fbe3-4048-9f78-8150895fc298/details This is the exact same build, triggered manually, not clearing cache, immediately after, which built successfully in 6 minutes: https://www.gatsbyjs.com/dashboard/e1ae5b97-e312-4fe3-88f2-5f4f81ac0d9d/sites/1703f5eb-05d0-46ad-8fdf-e1f7129e83b7/builds/beb61dcb-be64-4cc5-844a-d6cd3f1a2326/details There were no changes in the codebase between these two builds, but one failed and the other did not. It's also not failing every time for me on Gatsby Cloud, although as I said on Netlify it is nearly every time. I suspect that may be to do with resources in the build machine. |
@LekoArts I scanned through my builds, and the build without the remote images (onCreateNode, createSchemaCustomization) ran for 4 minutes. That might be helpful information. EDIT: Never mind. I found a build from yesterday without remote images that was also building forever, with exactly the same issues: |
Thanks for providing the URLs. We've looked at the builds from @NickBarreto @DennisKraaijeveld and in summary these are the findings:
|
Hi @LekoArts, thanks so much for the information. What would you advise as a next step? Watch this PR until it is merged into a release, then upgrade to that release and do a few further builds to gather more diagnostic details? Is there any other way in which I could contribute? |
Following because I get this problem a lot. |
@NickBarreto (and other folks following this issue)
I did publish "canary" ( So in short, we don't need more information from you folks (at least about being stuck on image generation in Gatsby Cloud). We can already reproduce the problem and are in the process of tracking it down. We will post an update here once we find the cause, implement a fix, and have a reasonably high level of confidence that the fix is correct (we can never be 100% sure due to the intermittent nature of the problem). |
Oh, and one more thing: we also found that the diagnostic message listing "activities" in progress is not always fully correct.
We do see messages like this mentioning only |
Yesterday we published a new version of the Gatsby Cloud build runner image with fixes, migrated our test site to use it, and monitored its behaviour overnight. We didn't see problems anymore on our test site: it handled over 300000 jobs successfully in that time (before the fix, it would get stuck at around 60000 jobs at most, and usually much sooner than that). We are rolling out this update to all sites now. Please note that the migration won't happen while a site is busy (for example, constantly rebuilding), so a good way to give it a chance to migrate is to temporarily disable builds in Site Settings -> Builds (for ~5 minutes) and re-enable them after that. |
Thanks! @everyone :) |
@pieh Thanks for the update. And how about local builds? I have the same problem running this Gatsby Cloud is still failing for me as well: |
I have read the thread, but I'm not using Gatsby Cloud. I'm currently migrating from v3 to v4 and testing everything locally, and this now happens quite often on gatsby build.
I never had issues before; it's a moderate site, definitely nothing large. Is there a way to get more debug info about what's going on? Edit: played around with |
I got more debug information when I upgraded
|
@askibinski if you are hitting this issue locally, could you manually edit a file in your node_modules - And instead of just throw new Error(`Assertion failed: all worker queries are not dirty (worker #${workerId})`); let's add information about the state that the assertion fails on: throw new Error(`Assertion failed: all worker queries are not dirty (worker #${workerId})\n${require(`util`).inspect(queryStateChunk, { depth: Infinity })}`); This should additionally print information like the one below alongside the assertion error:
I currently have no idea how we end up in a situation like that. Possibly something fails earlier and we swallow/ignore the error? Or maybe we have some stale state? |
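The patch suggested above comes down to appending `util.inspect(..., { depth: Infinity })` output to the assertion message, so that nested state is printed in full instead of being collapsed to `[Object]`. Below is a minimal standalone sketch of that idea. `buildAssertionMessage` is a hypothetical helper name and the sample state is made up for illustration (the real shape of `queryStateChunk` is not shown in this thread); only the message text and the `util.inspect` call are taken from the comment.

```javascript
const util = require("node:util");

// Hypothetical helper mirroring the patched `throw` from the comment above.
function buildAssertionMessage(workerId, queryStateChunk) {
  const base = `Assertion failed: all worker queries are not dirty (worker #${workerId})`;
  // depth: Infinity prints every nesting level instead of collapsing to [Object].
  return `${base}\n${util.inspect(queryStateChunk, { depth: Infinity })}`;
}

// Made-up state (an assumption, not Gatsby's real structure), nested deeper
// than util.inspect's default depth of 2.
const sampleState = {
  trackedQueries: { byId: { page: { "/about/": { dirty: 1 } } } },
};

console.log(buildAssertionMessage(3, sampleState));
```

With the default depth, the innermost levels of `sampleState` would not be printed at all, which is why the original one-line error gives no hint about which query is involved.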
@pieh So apparently I still had an old Image (image.js) component lying around from an earlier version/iteration, which was used in one place, and that debug info showed me:
Adding that debug info by default might help a lot of people migrating and running into an issue like this. |
We're trying to upgrade to 4.4 from 3 as well and are running into this exact issue - both in Gatsby Cloud https://www.gatsbyjs.com/dashboard/e156da66-cda0-4df5-b3c0-a7fdca6bf65e/sites/43774e74-f15a-4923-b6f7-d215d0ba104b/builds/e82328f4-29ff-44bc-945e-81b886afd8f8/details#rawLogs and locally: success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 211.318s - 2175/2175 10.29/s ERROR Assertion failed: all worker queries are not dirty (worker #3) |
@pieh any thoughts on #34051 (comment)? Right now I'm using a setting with |
@bkaldor and others finding this issue: I guess there might be different reasons why the build stops/stalls, and this issue can get messy. I have summarized them below:
|
@askibinski As I said in my two last comments, the image processing on Gatsby Cloud is not fixed. I'm facing timeouts every time I use WEBP with some fallback (NO_CHANGE, AUTO, PNG or JPG). And I'm also facing the same issue on local Gatsby build (GitHub Codespaces). I'll try to play with these environment variables you suggested in order to fix the local issue, but the Gatsby Cloud issue still persists. |
I resolved the problem I was having with Gatsby Cloud build fails during image processing. As I read through the comments here, it appears that my situation could be different than most, but I figured my situation might be helpful to someone with a similar problem who landed on this issue discussion. Gatsby Cloud did not specifically say why the build was stopping, other than to give an obscure message: "Failed to validate error Error [ValidationError]: "name" is not allowed." Eventually, I discovered three image files were being referenced in my Drupal backend's database but were missing from the files directory. When I removed the database references, I stopped getting stuck build attempts. Oddly, I started using Gatsby a year ago and the problem didn't appear until a few weeks ago, even though the files had always been missing from my backend. |
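For anyone who suspects the same cause (files referenced by the backend but missing on disk), a pre-build check along these lines could surface them before the image jobs start. This is only a sketch under assumptions: `findMissingFiles` is a hypothetical helper, and it presumes you can export the list of referenced relative paths from your CMS and have the files directory available locally.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical pre-build check: given the relative file paths that the CMS
// database references, return the ones that do not exist under `filesDir`.
function findMissingFiles(filesDir, referencedPaths) {
  return referencedPaths.filter(
    (relativePath) => !fs.existsSync(path.join(filesDir, relativePath))
  );
}
```

Calling `findMissingFiles("/path/to/files", listOfPaths)` returns an array of dangling references; an empty array means every referenced file is present.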
Currently migrating from v3 to v4 on local and I also get this error on gatsby build. I added the snippet that @pieh suggested but the files that were flagged as
and
|
Just had another build fail in Gatsby Cloud because gatsby-plugin-sharp never finished its jobs. Not sure if it is at all helpful in diagnosing further, but the issue still persists. We usually build once a day or so to incorporate any recent changes, and I've not had builds outright fail for a while until today. |
My Gatsby version is v4.13.1. It works well on macOS v12.3.1 but crashed in GitHub Actions. Here are the logs:
success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 115.850s - 168/168 1.45/s
error Assertion failed: all worker queries are not dirty (worker #1)
Error:Assertion failed: all worker queries are not dirty (worker #1)
- queries.ts:395 assertCorrectWorkerState
[zhanglun.github.io]/[gatsby]/src/redux/reducers/queries.ts:395:13
- queries.ts:228 queriesReducer
[zhanglun.github.io]/[gatsby]/src/redux/reducers/queries.ts:228:7
- redux.js:536 combination
[zhanglun.github.io]/[redux]/lib/redux.js:536:29
- redux.js:296 next
[zhanglun.github.io]/[redux]/lib/redux.js:296:22
- index.ts:72
[zhanglun.github.io]/[gatsby]/src/redux/index.ts:72:68
- index.js:27 Object.dispatch
[zhanglun.github.io]/[redux-thunk]/lib/index.js:27:16
- pool.ts:117 mergeWorkerState
[zhanglun.github.io]/[gatsby]/src/utils/worker/pool.ts:117:13
- build.ts:305 build
[zhanglun.github.io]/[gatsby]/src/commands/build.ts:305:11
not finished Merge worker state - 0.097s |
A probable workaround is to add an env var: |
@NickBarreto we fixed the issue you were seeing on the Cloud side. |
I'm using Gatsby v3.14.6 and I get an error when trying to use the debug env variable. This is the error:
Any development on solving this issue? EDIT: My mistake. I put the variable in the wrong place: in the .env file instead of the package.json script. |
We just started experiencing this issue yesterday. We're on Gatsby 3.14.0 using a WordPress backend.
|
My build takes about 4h as we have 8k images and somehow the incremental build does not work. So, every build takes 4h. Is there a way to not process images at all? |
I am getting this issue as well. Here are the build logs from Gatsby Cloud => https://www.gatsbyjs.com/dashboard/3968ecf3-ded8-4641-ad25-c4801a6f0d9c/sites/730a6654-269f-44cf-9d59-d0a78b2c1906/builds/67e06291-2707-4baa-8885-d63ae8c99aed/details?returnTo=%2Fdashboard%2F3968ecf3-ded8-4641-ad25-c4801a6f0d9c%2Fsites%2F730a6654-269f-44cf-9d59-d0a78b2c1906%2FcmsPreview Thanks! The build gets to here, then hangs, and then times out:
Gatsby is in "IN_PROGRESS" state without any updates for 300.000 seconds. Activities preventing Gatsby from transitioning to idle state:
Activity "build" of type "hidden" is currently in state "IN_PROGRESS"
Activity "Building static HTML for pages" of type "progress" is currently in state "IN_PROGRESS"
Process will be terminated in 1500.000 seconds if nothing will change. |
I just ran into this issue with my blog, hosted on Vercel. I found a workaround for Vercel and spotted a couple of things that might be of interest to anyone working on this issue. Background: I'm writing a plugin that sources my Instagram posts so they can be included in my blog alongside regular markdown posts. A markdown node is created for each Instagram post; the images are downloaded by I develop on macOS (M1) and didn't encounter this issue until I tried to deploy on Vercel. The error message is similar to those reported above:
Scouring the comments here, I was able to reproduce locally by setting the environment variable Surprisingly, the same trick works on Vercel: simply override the default build command (find the Project Settings page then jump down to Build & Development Settings) then my site deploys just fine:
I can't find much documentation on Vercel's build environment but I presume it's a tiny VM running somewhere where A couple of observations that may help debug this issue:
|
Same problem on 4.22.0, in my case the error "Assertion failed: all worker queries are not dirty" happens because of an existing cache. |
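If a stale cache is the trigger, as this comment suggests, one thing to try is a build from a clean slate. The `gatsby clean` command does this by deleting the `.cache` and `public` directories. The sketch below does the same thing by hand, which may be handy in CI scripts; `clearGatsbyCache` is a hypothetical helper name, and whether a clean build resolves the assertion for a given site is not something this thread confirms in general.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Remove the two directories that `gatsby clean` deletes (.cache and public).
// Returns the names of the directories that were actually present and removed.
function clearGatsbyCache(siteDir) {
  const removed = [];
  for (const dirName of [".cache", "public"]) {
    const target = path.join(siteDir, dirName);
    if (fs.existsSync(target)) {
      fs.rmSync(target, { recursive: true, force: true });
      removed.push(dirName);
    }
  }
  return removed;
}
```

Run it against the site root (for example `clearGatsbyCache(process.cwd())`) before calling `gatsby build`. Note that this discards incremental build data, so the next build will be a full one.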
Stuck with the same issue, and the build has been running for the last 4+ hours. |
@engineergit I ran into the same issue as @stephzero1 did 👆🏻 up there using Gatsby: 4.19.2, Node: 16.14.2, npm: 8.5.0, macOS: 12.4. Turns out it's related to What resolved this for me was to simply run |
I was troubled by this problem for a long time, until today, when I upgraded Gatsby from 4.x to 5.x and upgraded every gatsby-xxx-xxx plugin used in gatsby-config.js to the latest version. That solved the problem 😆 |
Ugh, I'm not yet in a position to upgrade to v5 and still seeing this error when running a build inside of a Docker container:
I've tried upping the number of CPUs (both at the container and Gatsby level), but never seem able to get past it. Is there any hope at all? |
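When debugging a containerised build like this one, it can help to print what the Node process actually sees before the build starts. The sketch below is a hypothetical diagnostic helper, not part of Gatsby. It assumes that the "Gatsby level" CPU setting mentioned above refers to the documented `GATSBY_CPU_COUNT` environment variable.

```javascript
const os = require("node:os");

// Hypothetical diagnostic helper: collect the numbers that matter when a build
// runs inside a container with fewer resources than the host.
function describeBuildResources(env = process.env) {
  return {
    // Logical cores as Node reports them. Inside Docker this can be the host's
    // count rather than the container's CPU quota.
    logicalCores: os.cpus().length,
    totalMemoryMiB: Math.round(os.totalmem() / (1024 * 1024)),
    // Assumed to be the Gatsby-level CPU setting discussed in the comment.
    gatsbyCpuCount: env.GATSBY_CPU_COUNT ?? "(not set)",
  };
}

console.log(describeBuildResources());
```

If `logicalCores` is much larger than the CPU limit given to the container, the build may be starting more workers than the container can serve, which is one plausible (unconfirmed) reason for stalls in constrained environments.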
Sharing our similar problem and how we solved it. In our case we had some page URLs generated with special symbols like We had to go into |
I am getting this error as well on gatsby v5.
I realized this file is causing the issue:
but I can't seem to pinpoint why that might be the case. It was working on the previous version of Gatsby (3). |
Also seeing this 👀 |
I am currently in the process of upgrading my old Gatsby v2 site to v5. I'm essentially rewriting everything from scratch, addressing deprecations and making necessary changes. I've made good progress so far, but I've encountered an issue while running "gatsby build". During the "gatsby build" process, it gets stuck at this point: [====================== ] 1.742 s 9/11 82% Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs Interestingly, "gatsby develop" is working fine, but the build process is getting stuck. I have tried all of the solutions above and still have no luck. |
Updating the gatsby-plugin-image version to "next" solved the issue for me.
Using Gatsby v5 |
If you're coming new to this issue, please see this first: #34051 (comment)
Preliminary Checks
Description
Gatsby's build process is hanging and not completing. I suspect the issue is with Sharp, as my site has quite a few images, and I saw this brought up in a previous issue, #33557.
When I upgraded to v4 I initially had no issues. However, the next day my builds all started exceeding Netlify's maximum build time of 30 minutes.
I mentioned this problem in the thread to the other issue, as others apparently had the same problem where run queries in workers seems to take longer than expected.
This issue is difficult to reproduce because I think in part it is to do with the scale of my site, which is moderately large and has ~1600 images. There must be something that isn't quite right in the worker process, because my builds on Netlify went from taking roughly 13 or 14 minutes to exceeding the build limit every time.
To try and diagnose the issue I tried a local build, which, while it took a long-ish time, did actually complete.
Since @LekoArts suggested that Gatsby Cloud's build process is better optimised for processing images, I thought I'd give that a go.
After trying out a build in Gatsby Cloud, I had no build problems at all and the whole site built with a clear cache in 7 minutes. OK, I thought, seems like the problem isn't so much with Gatsby, but in how Netlify is interacting with v4's worker process.
However, the next push I ran into the problem once again, this time in Gatsby Cloud. The bottom end of Gatsby Cloud's logs are useful, because they give me a little more information than Netlify:
The fact that a full, uncached build on Gatsby Cloud can run in 7 minutes suggests to me that the issue isn't actually one of scale, but that the worker process is hanging, though only sometimes.
Is it to do with incremental builds? Maybe. I am using the preserved download cache, because as I said my site has quite a few images which are coming from a custom source plugin (which is relatively simple, and contains all the image links from AWS that are passed over to createRemoteFileNode).
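For readers unfamiliar with the setup described here, the following is a rough sketch of how a node carrying a remote image URL is typically handed to `createRemoteFileNode` from `gatsby-source-filesystem`. It is not the author's plugin: the node type `RemoteImage`, the field `imageUrl`, and the helper `attachRemoteImage` are assumptions made for illustration. The downloader is passed in as a parameter so the sketch runs without Gatsby installed.

```javascript
// Hypothetical helper: downloads the image for a node and links the resulting
// File node via a `localFile` field. `createRemoteFileNode` is injected.
async function attachRemoteImage(node, gatsbyApi, createRemoteFileNode) {
  // Only handle our (assumed) node type with a usable URL.
  if (node.internal?.type !== "RemoteImage" || typeof node.imageUrl !== "string") {
    return null;
  }
  const { actions, createNodeId, getCache } = gatsbyApi;
  const fileNode = await createRemoteFileNode({
    url: node.imageUrl,
    parentNodeId: node.id,
    createNode: actions.createNode,
    createNodeId,
    getCache,
  });
  if (fileNode) {
    actions.createNodeField({ node, name: "localFile", value: fileNode.id });
  }
  return fileNode;
}

// Wiring in gatsby-node.js (requires gatsby-source-filesystem to be installed):
// const { createRemoteFileNode } = require("gatsby-source-filesystem");
// exports.onCreateNode = ({ node, ...api }) =>
//   attachRemoteImage(node, api, createRemoteFileNode);
```

Every file node created this way can later produce gatsby-plugin-sharp image jobs when queried, so a site with ~1600 remote images schedules a large number of jobs in the stage where the build is reported to hang.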
To test things out once I had the first timeout on Gatsby Cloud, I tested a manual deploy without clearing the cache. I was hoping the process would hang again so I'd know the issue was with the cache and incremental builds, but alas, it did not. The whole build was completed in 6 minutes. Strangely, the issue does appear to occur on Netlify more frequently than not, and happens more occasionally in Gatsby Cloud. It may be to do with build process resources, because I just signed up to Gatsby Cloud, and so am in the free preview of performance builds.
Are there other diagnostic tools I can use to more closely inspect the build process? How would I be able to see which process is failing or never finishing?
Reproduction Link
I can't seem to reproduce this error as it is intermittent
Steps to Reproduce
gatsby build in either Netlify or Gatsby Cloud
Expected Result
gatsby build should eventually finish and build the site
Actual Result
The state run queries in workers never finishes/moves on to merge worker state, the build eventually times out and fails.
Environment
Config Flags
PRESERVE_FILE_DOWNLOAD_CACHE: true
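For reference, a flag like this is normally enabled through the `flags` key of `gatsby-config.js`. The fragment below is a minimal sketch of that file and not the reporter's actual configuration, which is not shown in the issue.

```javascript
// gatsby-config.js (minimal sketch; the real site's plugins are omitted)
module.exports = {
  flags: {
    // Keep downloaded remote files between builds so they are not fetched again.
    PRESERVE_FILE_DOWNLOAD_CACHE: true,
  },
};
```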