We use this project to report metrics to Prometheus from our Node.js / NestJS application. We have several custom metrics, in particular histograms, created directly with the @siimon/prom-client APIs. As a result, the overall metrics payload grows to as much as 8 MB, and the Node.js event loop lag climbs as high as 1 second!
What we see:
Reducing the amount of data collected, e.g. by removing some metrics, improves the situation (however, we would like to keep the metrics we have).
Explicitly invoking register.resetMetrics() makes the lag drop almost to 0. It then starts increasing again.
CPU usage is constant at around 15%.
The CPU profile shows a lot of time spent in Node internal network functions, such as getPeerCertificate and destroySSL.
Has anyone encountered this sort of behavior? Any suggestions?
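For anyone trying to reproduce the numbers above, here is a minimal sketch of one common way to estimate event loop lag: schedule a repeating timer and measure how late each tick fires. This is not necessarily how the lag in this report was measured; `tickLags` and `startLagSampler` are hypothetical names made up for this illustration.

```typescript
// Pure helper: given the times (in ms) at which a repeating timer actually
// fired, return how late each tick was relative to the requested interval.
function tickLags(firedAtMs: number[], intervalMs: number): number[] {
  const lags: number[] = [];
  for (let i = 1; i < firedAtMs.length; i++) {
    const elapsed = firedAtMs[i] - firedAtMs[i - 1];
    // A tick that fires on time or early counts as zero lag.
    lags.push(Math.max(0, elapsed - intervalMs));
  }
  return lags;
}

// Hypothetical sampler: records when the timer fires and reports the worst
// lag seen in each window of `reportEvery` ticks. Returns a stop function.
// Assumption: Date.now() is precise enough here; it is not monotonic, so a
// system clock adjustment would distort a window.
function startLagSampler(
  intervalMs: number,
  onReport: (maxLagMs: number) => void,
  reportEvery = 10,
): () => void {
  let firedAt: number[] = [Date.now()];
  const timer = setInterval(() => {
    firedAt.push(Date.now());
    if (firedAt.length > reportEvery) {
      onReport(Math.max(...tickLags(firedAt, intervalMs)));
      // Keep the last timestamp as the baseline for the next window.
      firedAt = [firedAt[firedAt.length - 1]];
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Something like `const stop = startLagSampler(100, (lag) => console.log("max lag ms:", lag));` would log the worst lag roughly once per second. Node.js also ships `monitorEventLoopDelay` in `perf_hooks`, which is likely the better choice for production measurements.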
I did siimon/prom-client#543
But I'm not at all sure the problem is there. Or at least not wholly there.
First, as I explained, the problem seems to be network related, and all the network operations happen in this project; prom-client just generates the metrics as one large string. Second, when I optimized the prom-client code and reduced the metrics generation by ~50%, it had no impact on this problem.
It may be that the problem is related to PrometheusController returning the entire metrics payload as a single huge string rather than streaming it. (I know that prom-client doesn't provide a streaming interface.)
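To make the streaming idea concrete, here is a rough sketch of writing an already-generated metrics string in smaller pieces, yielding to the event loop between writes. It is only an illustration under stated assumptions: `chunkString`, `writeInChunks` and the `ChunkSink` type are hypothetical names, not part of this project or of prom-client, and I have not verified that this removes the lag described above.

```typescript
// Hypothetical helper: split a large payload into fixed-size pieces.
// Caveat: slicing by UTF-16 code units can split a surrogate pair, so this
// assumes the payload is effectively ASCII (typical for metric names and
// numbers, not guaranteed for label values).
function chunkString(body: string, chunkSize: number): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  const chunks: string[] = [];
  for (let i = 0; i < body.length; i += chunkSize) {
    chunks.push(body.slice(i, i + chunkSize));
  }
  return chunks;
}

// Assumed sink shape: anything that accepts a chunk and resolves once it is
// ready for more, e.g. a thin wrapper around an HTTP response's write().
type ChunkSink = (chunk: string) => Promise<void>;

// Writes the payload chunk by chunk and gives other pending work a chance
// to run between writes. Returns the number of chunks written.
async function writeInChunks(
  body: string,
  sink: ChunkSink,
  chunkSize: number = 64 * 1024,
): Promise<number> {
  let written = 0;
  for (const chunk of chunkString(body, chunkSize)) {
    await sink(chunk);
    written += 1;
    // Yield to the event loop before sending the next chunk.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
  }
  return written;
}
```

In a controller, the sink would wrap the response object's write call and resolve once the stream signals it can accept more data. The 64 KiB chunk size is an arbitrary starting point, not a measured optimum.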