Very long NodeJS event loop lag #1659

Closed
shappir opened this issue Mar 1, 2023 · 2 comments

Comments

shappir commented Mar 1, 2023

We are using this project to report data to Prometheus from our NodeJS / NestJS project. We have several custom metrics, in particular histograms (created directly using the @siimon/prom-client APIs). As a result, the overall metrics size grows to as much as 8 MB, which in turn causes the NodeJS event loop lag to reach as much as 1 second!

What we see:

  • Reducing the amount of data collected, e.g. by removing some metrics, improves the situation (however we would like to keep the metrics we have)
  • Explicitly invoking register.resetMetrics() causes the lag to drop almost to 0, after which it starts increasing again
  • CPU usage is constant at around 15%
  • CPU profile shows long time spent in Node internal network functions, such as getPeerCertificate and destroySSL

Has anyone encountered this sort of behavior? Any suggestions?
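
For anyone who wants to reproduce the measurement, here is a minimal sketch of sampling event loop lag with Node's built-in `perf_hooks` module, independent of prom-client. The helper names `measureEventLoopLag` and `blockFor` are hypothetical, and this is not necessarily how the lag figures above were collected.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// Hypothetical helper: samples the event loop delay for `durationMs`
// and resolves with the mean and max readings in milliseconds.
function measureEventLoopLag(durationMs, resolutionMs = 10) {
  return new Promise((resolve) => {
    const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
    histogram.enable();
    setTimeout(() => {
      histogram.disable();
      // The histogram reports nanoseconds; convert to milliseconds.
      resolve({
        meanMs: histogram.mean / 1e6,
        maxMs: histogram.max / 1e6,
      });
    }, durationMs);
  });
}

// Hypothetical helper: blocks the event loop with a busy-wait,
// standing in for any long synchronous task.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // intentionally empty
  }
}
```

Calling `measureEventLoopLag(300)` and then scheduling `blockFor(100)` a few tens of milliseconds later should produce a max reading well above the idle baseline, which makes it easy to compare readings before and after register.resetMetrics().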

Owner

willsoto commented Mar 1, 2023

Probably better to report this to https://github.com/siimon/prom-client

@willsoto willsoto closed this as completed Mar 1, 2023
Author

shappir commented Mar 2, 2023

I did: siimon/prom-client#543
But I'm not at all sure the problem is there, or at least not wholly there.

First, as I explained, the problem seems to be network related, and all the network operations happen in this project; prom-client just generates the metrics as a large string. Second, when I optimized the prom-client code and reduced the metrics generation by ~50%, it had no impact on this problem.

It may be that the problem is related to PrometheusController returning the entire metrics as a single huge string rather than streaming it. (I know that prom-client doesn't provide a streaming interface).
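
To illustrate the streaming idea, here is a rough sketch that writes an already-generated string to a writable stream in slices instead of in one call. The names `chunkString` and `sendInChunks` and the 64 KiB chunk size are assumptions for illustration only; this is not code from this project or from prom-client, and whether it would actually reduce the lag is untested.

```javascript
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: yields a large string in fixed-size slices.
// Note: slicing by UTF-16 code units could split a surrogate pair;
// that is assumed not to matter for mostly-ASCII metrics text.
function* chunkString(text, chunkSize = 64 * 1024) {
  for (let i = 0; i < text.length; i += chunkSize) {
    yield text.slice(i, i + chunkSize);
  }
}

// Hypothetical helper: pipes the slices into any Writable
// (for example an http.ServerResponse), honoring backpressure.
async function sendInChunks(text, destination, chunkSize) {
  await pipeline(Readable.from(chunkString(text, chunkSize)), destination);
}
```

In a controller this would mean taking the string produced by prom-client and passing it to `sendInChunks(metrics, res)` rather than returning it directly, so the response is written as several smaller chunks with backpressure applied between them.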
