
next/og cause memory leak in production standalone build #65451

Closed
Innei opened this issue May 7, 2024 · 26 comments · Fixed by #70214
Assignees: huozhi
Labels: bug (Issue was opened via the bug report template.) · Image (next/image) (Related to Next.js Image Optimization.) · locked

Comments

@Innei

Innei commented May 7, 2024

Link to the code that reproduces this issue

https://github.com/Innei/next-og-oom-repro

To Reproduce

  1. build project
  2. run as standalone build node server.js
  3. open /og
  4. open devtools and force refresh the page(ignore cache) more times.
  5. watch the system monitor, and the next app memory usage is increasing.

The initial memory usage is about 50 MB; after refreshing /og about 10 times, it reaches roughly 300 MB.
[Screenshot: CleanShot 2024-05-07 at 8.26.37 — initial memory usage]

[Screenshot: CleanShot 2024-05-07 at 8.30.54 — memory usage after repeated refreshes of /og]
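
For reference, the steps above boil down to roughly the following commands (a sketch only; the port and the npm run build script are assumptions from a default Next.js setup, and <pid> is the standalone server's process id):

# 1-2. build with output: 'standalone' and start the standalone server
npm run build
node .next/standalone/server.js

# 3-4. hit /og repeatedly (curl never caches, so every request regenerates the image)
for i in $(seq 1 10); do curl -s -o /dev/null http://localhost:3000/og; done

# 5. watch the resident memory of the node process grow
ps -o rss= -p <pid>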

Here are some steps I used to troubleshoot the memory issue.

Add heap-snapshot code to .next/standalone/server.js:

process.title = 'next-og-oom'

const v8 = require('v8')
const fs = require('fs')

// Write a V8 heap snapshot to /tmp so it can be opened in Chrome DevTools (Memory tab).
function createHeapSnapshot() {
  const snapshotStream = v8.getHeapSnapshot()
  const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
  const fileName = `/tmp/${timestamp}.heapsnapshot`
  const fileStream = fs.createWriteStream(fileName)
  snapshotStream.pipe(fileStream).on('finish', () => {
    console.log('Heap snapshot saved to', fileName)
  })
}

// Trigger a snapshot on demand with `kill -SIGUSR2 <pid>`.
process.on('SIGUSR2', () => {
  console.log('SIGUSR2 received, creating heap snapshot...')
  createHeapSnapshot()
})

Then refresh the page several times and observe the app's memory usage. Once memory has grown and is not being freed, capture a heap snapshot at that point with kill -SIGUSR2 <pid>.

As you can see from the following dump, it's the ImageResponse-related modules that are retaining memory. The retainers include ImageResponse and FigmaImageResponse, so does that mean @vercel/og is causing the memory leak?

Link to #44685 (comment).

Current vs. Expected behavior

Current: memory is not freed and keeps increasing, eventually leading to OOM.

Expected: memory is freed after the OG image response is served.

Provide environment information

Operating System:
  Platform: darwin/linux
  Arch: arm64/amd64
  Version: Darwin Kernel Version 23.4.0: Fri Mar 15 00:12:49 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6020/ or Linux
  Available memory (MB): 32768
  Available CPU cores: 12
Binaries:
  Node: 18.18.0/20.x
  npm: 10.2.4
  Yarn: 1.22.21
  pnpm: 9.1.0
Relevant Packages:
  next: 14.2.3 // Latest available version is detected (14.2.3).
  eslint-config-next: N/A
  react: 18.3.1
  react-dom: 18.3.1
  typescript: 5.4.5
Next.js Config:
  output: standalone

Which area(s) are affected? (Select all that apply)

Image (next/image)

Which stage(s) are affected? (Select all that apply)

Other (Deployed)

Additional context

No response

@Innei Innei added the bug Issue was opened via the bug report template. label May 7, 2024
@github-actions github-actions bot added the Image (next/image) Related to Next.js Image Optimization. label May 7, 2024
@Innei Innei changed the title Maybe next/og cause memory leak in production standalone build next/og cause memory leak in production standalone build May 7, 2024
@Innei
Author

Innei commented May 7, 2024

Attachment: 2024-05-07T12-30-03-488Z.heapsnapshot.zip

@chipcop106

chipcop106 commented Jul 1, 2024

Here is our memory chart while using Next 14.2.3. I confirmed a memory leak with the Open Graph and Twitter image generation. On 27/6 we disabled this feature, and the chart returned to normal.

[Image: memory usage chart, dropping back to normal after 27/6]

Solution

Don't use dynamically generated metadata images (opengraph-image / twitter-image files written in .tsx, .ts, or .js) until this is fixed; the pattern in question is sketched below.
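
To make that concrete, this is the kind of route being referred to (a minimal sketch, assuming a hypothetical app/opengraph-image.tsx; a static opengraph-image.png in the same folder sidesteps the leak):

// app/opengraph-image.tsx: the dynamic pattern to avoid while the leak exists
import { ImageResponse } from 'next/og'

export const size = { width: 1200, height: 630 }
export const contentType = 'image/png'

export default function OpengraphImage() {
  // Each request renders this JSX to SVG (satori) and rasterizes it to PNG (resvg-wasm),
  // which is where the allocations in the heap snapshots above come from.
  return new ImageResponse(
    <div style={{ width: '100%', height: '100%', display: 'flex', fontSize: 64 }}>
      Hello OG
    </div>,
    size
  )
}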

@zhyd1997

zhyd1997 commented Jul 1, 2024

Hi @chipcop106

What's the metrics tracing platform?
Are you using Sentry or Datadog? 👀

@chipcop106

Hi @chipcop106

What's the metrics tracing platform? Are you using Sentry or Datadog? 👀

I'm using AWS ECS; the chart shows the Docker container's metrics.

@zhyd1997

zhyd1997 commented Jul 1, 2024

@chipcop106

Thanks!

@matthewmorek

Possible fix already sitting on your doorstep for free since March 13: https://github.com/orgs/vercel/discussions/6117#discussioncomment-8776252

@khuezy
Contributor

khuezy commented Aug 5, 2024

[Image: memory usage graph]

This leak is still not fixed in the latest 15-canary.102. Memory is stable around 200 MB, then a single request to generate a dynamic OG image causes memory to spike to 230 MB, and it is never freed.
Since the discussion linked above isn't open for comments, can we get some Vercel devs' 👀 on this? CC: @shuding

@shuding shuding self-assigned this Aug 5, 2024
@shuding
Member

shuding commented Aug 5, 2024

Self assigned - will take a look!

@frankharkins

Thanks @shuding! This approach worked well for us https://github.com/orgs/vercel/discussions/6117#discussioncomment-8776252. After applying the patch, our memory stopped growing as aggressively.

[Screenshot: memory usage after applying the patch]

@khuezy
Contributor

khuezy commented Aug 5, 2024

@frankharkins
Might be a little off-topic here, but related to memory usage and leaks. Is it normal for the memory to grow like yours after the patch? It looks like it grows a decent amount after 08/03. Is this a general memory issue with JavaScript, React, or Next.js?
I've noticed gradual memory growth in my app too over a period of several days.

My other apps, which run on Go and Rust, have flat memory usage over periods of months or more.

@Eric-Arellano

Is it normal for the memory to grow like yours after the patch?

(Coworker of Frank). No, I don't think it's normal. Our usage stats suggest it is not due to more usage over time, but rather we suspect a memory leak. We're still trying to figure out what causes it, with two leading hypotheses:

@khuezy
Contributor

khuezy commented Aug 5, 2024

Hey @Eric-Arellano, I've seen a few issues about next/image optimization leaking memory too. But my app continues to grow in memory over time: when it does garbage collect, memory drops, but it always settles a little higher than after the previous GC, so it keeps growing.

I have a rewrite and a custom image loader that routes all /_next/image requests to a non-JS server that handles the image optimization, and there still seems to be a leak (roughly the setup sketched below). So there might be multiple leaks somewhere in the JavaScript, React, or Next.js stack.
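
For context, that setup looks roughly like this (a sketch only; images.example.com and the file name are placeholders, and the rewrite for the optimization endpoint itself is app-specific):

// my-image-loader.ts, wired up in next.config.js via
//   images: { loader: 'custom', loaderFile: './my-image-loader.ts' }
// Every next/image URL is built against an external (non-Node) optimizer,
// so the built-in /_next/image endpoint is bypassed entirely.
export default function imageLoader({
  src,
  width,
  quality,
}: {
  src: string
  width: number
  quality?: number
}) {
  return `https://images.example.com/opt?url=${encodeURIComponent(src)}&w=${width}&q=${quality ?? 75}`
}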

@matthewmorek

@khuezy If you dig a little deeper, there is an open issue in the resvg-wasm package that has identified a memory leak: thx/resvg-js#216 (comment)

We have applied the above patch to @vercel/og manually in our project, yet we keep experiencing a heavy memory leak. The patch is still better than no patch, but this package doesn't seem to be the underlying cause of the leak.
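
As I understand the linked issue, the wasm-bindgen objects created by @resvg/resvg-wasm keep their memory inside the wasm heap until they are freed explicitly, and the patch boils down to something like the following (a sketch of the idea, not the literal diff; it assumes initWasm() has already run):

import { Resvg } from '@resvg/resvg-wasm'

function svgToPng(svg: string): Uint8Array {
  const resvg = new Resvg(svg, { fitTo: { mode: 'width', value: 1200 } })
  const rendered = resvg.render()
  try {
    // asPng() copies the bytes out into a plain Uint8Array
    return rendered.asPng()
  } finally {
    // without these calls, the wasm-side allocations accumulate on every request
    rendered.free()
    resvg.free()
  }
}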

@fredi1993

Sorry if this is the wrong thread, but just to share the experience: I'm having the same memory increase issue. Memory keeps increasing without being cleared. I've removed all usage of next/image and I'm not using resvg-wasm, but the issue still persists. I suspect it might be related to the cache being stored in memory rather than in files, but I have to look into my Docker image in more detail.
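
For what it's worth, if the in-memory cache is the suspect, Next.js (14.1+) can be pointed at a custom cache handler and have its default in-memory cache disabled; a sketch, assuming a hypothetical ./cache-handler.js:

// next.config.js
module.exports = {
  output: 'standalone',
  // write ISR/data cache entries through a custom handler instead of keeping them in memory
  cacheHandler: require.resolve('./cache-handler.js'),
  // disable the default in-memory caching (size in bytes)
  cacheMaxMemorySize: 0,
}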

@QuentinScDS

Sorry if this is the wrong thread, but just to share the experience: I'm having the same memory increase issue. Memory keeps increasing without being cleared. I've removed all usage of next/image and I'm not using resvg-wasm, but the issue still persists. I suspect it might be related to the cache being stored in memory rather than in files, but I have to look into my Docker image in more detail.

Which version of Node are you using?

@fredi1993

Version: node:20.12.0-alpine

@QuentinScDS

Version: node:20.12.0-alpine

Try 18-alpine. Starting from version 19 I see the same memory leak issue, but everything is fine with version 18. 🤷‍♂️

@khuezy
Contributor

khuezy commented Aug 6, 2024

@QuentinScDS does the node team know about the leakage? I thought they fixed the fetch memory leak in 18? Did they introduce another one in 19+?

@QuentinScDS

@QuentinScDS does the node team know about the leakage? I thought they fixed the fetch memory leak in 18? Did they introduce another one in 19+?

I discovered this yesterday after a lot of testing... Indeed, I have seen mentions of memory leaks in version 18, but I haven't found anything about more recent versions. I haven't had time to create a reproducible case. My impression is that it only affects Docker images.

@khuezy
Contributor

khuezy commented Aug 6, 2024

I don't think it's related to Docker images. I'm deploying on Fly.io, and they only use the Dockerfile to pull down the dependencies, then run the stack directly on Firecracker. I have a TypeScript Temporal worker stack running on Node 22.5.1 and its memory has been stable... so I think this is a React or Next.js leak (probably the latter).

@jnm733

jnm733 commented Sep 18, 2024

Thanks @shuding! This approach worked well for us https://github.com/orgs/vercel/discussions/6117#discussioncomment-8776252. After applying the patch, our memory stopped growing as aggressively.

[Screenshot: memory usage after applying the patch]

That fixed the memory leak for me too. Thanks!

@huozhi huozhi assigned huozhi and unassigned shuding Sep 18, 2024
@shuding shuding closed this as completed Sep 18, 2024
@stephancill

I'm not sure the suggested patch completely fixes the issue. Still getting the same error, but less frequently.

@huozhi
Member

huozhi commented Sep 21, 2024

The fix is included in v14.2.13, please upgrade to the new version 🙏

@ethos-seth

@huozhi I'm not sure if I should comment here or open a new issue. Please redirect me if needed!

We bumped up to v14.2.13 but continue to see the memory leak when running Next.js in a long-lived Docker container. The upgrade slowed the leak a bit, but it still appears to be present. This app only has a few server-side routes, all of which use the @vercel/og package.

[Image: memory usage chart after upgrading to v14.2.13]

@huozhi
Member

huozhi commented Sep 30, 2024

@ethos-seth please import from next/og instead of @vercel/og. If you're still having the issue, please file a new issue with a repro showing where you observed the memory leak.
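
(For anyone landing here, the change is just the import source; a minimal sketch, assuming a hypothetical app/og/route.tsx route handler:)

// app/og/route.tsx: use the ImageResponse bundled with Next.js
import { ImageResponse } from 'next/og' // was: import { ImageResponse } from '@vercel/og'

export async function GET() {
  return new ImageResponse(
    <div style={{ display: 'flex', width: '100%', height: '100%', fontSize: 64 }}>OG</div>,
    { width: 1200, height: 630 }
  )
}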


This closed issue has been automatically locked because it had no new activity for 2 weeks. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 15, 2024