jmx exporter javaagent stop collecting custom metrics after a while #321

Closed
finkr opened this issue Sep 18, 2018 · 5 comments

Comments

@finkr
Contributor

finkr commented Sep 18, 2018

Hello,

We have deployed JMX exporter 0.3.1 as a javaagent on many application servers (JBoss 6 with Oracle JVM 1.7). We have a configuration file with some custom metrics.

On some of our JVMs, the JMX exporter stopped collecting custom metrics after a while (a few hours). The JMX exporter "built-in" metrics are still reported.

Using the built-in metrics (up, process_start_time_seconds, jmx_config_reload_success_total), I confirmed that nothing else had changed.

Any ideas on how to diagnose or solve this issue?

Here is one of our custom metrics:

# HELP web_ajp_processing_time Processing time used by the connector. Im milli-seconds. (jboss.as<subsystem=web, connector=ajp><>processingTime)
# TYPE web_ajp_processing_time untyped
web_ajp_processing_time{connector="ajp",} 1894.0
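
To track exactly when a custom metric disappears from the scrape output, the response body can be checked programmatically. The following is only a minimal sketch: the class and method names (`ExpositionCheck`, `parse`, `hasMetric`) are hypothetical, and it assumes sample lines carry no trailing timestamp, which matches the output shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExpositionCheck {
    /**
     * Maps each sample (metric name plus labels) to its raw value string.
     * Comment lines such as "# HELP" and "# TYPE" are skipped.
     * Assumption: no timestamp follows the value.
     */
    public static Map<String, String> parse(String body) {
        Map<String, String> samples = new LinkedHashMap<>();
        for (String raw : body.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // The value is the text after the last space; label values may contain spaces.
            int split = line.lastIndexOf(' ');
            if (split < 0) {
                continue;
            }
            samples.put(line.substring(0, split), line.substring(split + 1));
        }
        return samples;
    }

    /** True if at least one sample belongs to the given metric name. */
    public static boolean hasMetric(Map<String, String> samples, String name) {
        for (String key : samples.keySet()) {
            if (key.equals(name) || key.startsWith(name + "{")) {
                return true;
            }
        }
        return false;
    }
}
```

Calling `hasMetric` for both `web_ajp_processing_time` and a built-in metric on each scrape would show whether only the custom metrics vanish.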
@brian-brazil
Contributor

Have you tried taking a thread dump? To mitigate DoS attacks, there is a limit on the number of workers.
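
For reference, a thread dump of a running JVM is usually taken from outside with `jstack <pid>` or `kill -3 <pid>`. The same information is also available from inside a JVM through the standard `java.lang.management` API. The sketch below is illustrative only (the class name `ThreadDumpSketch` is made up) and dumps the threads of the JVM it runs in, not those of a remote application server.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {
    /** Returns a textual dump of all live threads with their states and stack frames. */
    public static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked monitor and synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In the dump, worker threads that all sit in the same blocked frame point to the call that is holding them.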

@ghost

ghost commented Nov 5, 2018

We keep running into a similar problem, mostly on our dockerized Solr instances, but we don't even get an "up" metric anymore.
The Prometheus server in that environment does not seem to have any obvious problems and is collecting hundreds of additional metrics from other services just fine, yet the jmx_exporter threads seem to get stuck after a while.

threaddump_solr_jmxexporter.txt

@brian-brazil
Contributor

Looks like it's blocking on writing back to the socket. I'd suggest an strace and tcpdump to see what's going on.

@ghost

ghost commented Nov 5, 2018

Yikes, we had a corrupt document in the Solr index, and some of the metrics took forever to render, causing the jmx_exporter threads to clog up. Sorry for the noise.
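
The clogging effect can be reproduced in isolation. The sketch below is not jmx_exporter code: it assumes a fixed-size worker pool (the class `WorkerPoolSketch`, its method, and the pool sizes are all hypothetical) and shows that once every worker is held by a render that never finishes, even a trivial request times out. It uses Java 8 lambdas.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class WorkerPoolSketch {
    /**
     * Starts `stuck` tasks that block indefinitely on a pool of `workers` threads,
     * then reports whether a trivial task still completes within the timeout.
     */
    public static boolean fastScrapeCompletes(int workers, int stuck, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < stuck; i++) {
                // Stands in for a metric that takes forever to render.
                pool.submit(() -> { release.await(); return null; });
            }
            Future<String> fast = pool.submit(() -> "ok");
            try {
                fast.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException | ExecutionException e) {
                return false;
            }
        } finally {
            release.countDown();  // unblock the stuck tasks
            pool.shutdownNow();   // and tear the pool down
        }
    }
}
```

With one worker left free the fast request succeeds; with all workers stuck it times out, which matches a scrape that returns nothing at all.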

@brian-brazil
Contributor

Okay, odd that it's presenting in this way but it is what it is.
