
Metric queueSize twice fails with opentelemetry collector with a prometheus metrics exporter #5066

Closed
cmunger opened this issue Dec 28, 2022 · 5 comments
Labels
Bug Something isn't working

Comments

@cmunger

cmunger commented Dec 28, 2022

Describe the bug

When setting up an opentelemetry collector with a prometheus metrics exporter, the exporter fails when scraped
if the monitored application is a Java application using the opentelemetry-log4j-appender-2.17 instrumentation,
because the application emits two metrics named queueSize: one for the log record processor and one for the span processor:

2022-12-28T10:39:28.664Z error prometheusexporter@v0.68.0/log.go:34 error gathering metrics: collected metric default_queueSize label:<name:"job" value:"poc/monitoring-poc" > label:<name:"spanProcessorType" value:"BatchSpanProcessor" > gauge:<value:0 > has help "The number of spans queued" but should have "The number of logs queued"
{"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.68.0/log.go:34
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/http.go:139
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2109
net/http.(*ServeMux).ServeHTTP
net/http/server.go:2487
go.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1
go.opentelemetry.io/collector@v0.68.0/config/confighttp/compression.go:162
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2109
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.37.0/handler.go:210
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
go.opentelemetry.io/collector@v0.68.0/config/confighttp/clientinfohandler.go:39
net/http.serverHandler.ServeHTTP
net/http/server.go:2947
net/http.(*conn).serve
net/http/server.go:1991
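
For context, the collision does not require the agent internals to reproduce: two instruments that share the name queueSize but live under different instrumentation scopes and carry different descriptions are enough. Below is a minimal standalone sketch; the class name is hypothetical, the scope names and descriptions mirror the ones in the error above, and it assumes the java agent is attached so that GlobalOpenTelemetry exports over OTLP.

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.metrics.Meter;

public class QueueSizeCollision {
	public static void main(String[] args) throws InterruptedException {
		// Same instrument name under two different instrumentation scopes,
		// with different descriptions -- valid in the OpenTelemetry data model.
		Meter traceMeter = GlobalOpenTelemetry.getMeter("io.opentelemetry.sdk.trace");
		traceMeter.gaugeBuilder("queueSize")
				.ofLongs()
				.setDescription("The number of spans queued")
				.buildWithCallback(m -> m.record(0));

		Meter logMeter = GlobalOpenTelemetry.getMeter("io.opentelemetry.sdk.logs");
		logMeter.gaugeBuilder("queueSize")
				.ofLongs()
				.setDescription("The number of logs queued")
				.buildWithCallback(m -> m.record(0));

		// Keep the process alive so the periodic OTLP metric export keeps running.
		Thread.sleep(Long.MAX_VALUE);
	}
}

Scraping the collector's prometheus endpoint while this runs should trigger the same "has help ... but should have ..." error.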

Steps to reproduce
A Java application using the opentelemetry java agent with log4j2 instrumentation:

<dependency>
  <groupId>io.opentelemetry.instrumentation</groupId>
  <artifactId>opentelemetry-log4j-appender-2.17</artifactId>
  <version>1.21.0-alpha</version>
  <scope>runtime</scope>
</dependency>

The Java application must emit a log statement in a loop every second:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import org.slf4j.LoggerFactory;

public static void main(String[] args) throws InterruptedException {
	while (true) {
		LoggerFactory.getLogger("root").info("TestLog");
		LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(1));
	}
}

An otel collector configuration (otel-collector.yaml) with a prometheus exporter:


receivers:
  otlp/2:
    protocols:
      grpc:

processors:
  batch:

exporters:
  otlphttp:
    endpoint: http://localhost:8080
  prometheus:
    endpoint: "0.0.0.0:9464"
    namespace: "default"

service:
  pipelines:
    traces:
      receivers: [ otlp/2 ]
      processors: [ batch ]
      exporters: [ otlphttp ]
    metrics:
      receivers: [ otlp/2 ]
      processors: [ batch ]
      exporters: [ prometheus ]
    logs:
      receivers: [ otlp/2 ]
      processors: [ batch ]
      exporters: [ otlphttp ]

Start the collector:
docker run -p 4317:4317 -p 9464:9464 -v $(pwd)/otel-collector.yaml:/etc/otelcol/config.yaml otel/opentelemetry-collector
Start the Java application.
Calling http://localhost:9464/metrics triggers the bug in the opentelemetry collector and produces a half-populated prometheus output.

What did you expect to see?
Prometheus output containing both queueSize series:

# TYPE queueSize gauge
# HELP queueSize The number of spans queued,The number of logs queued
queueSize{spanProcessorType="BatchSpanProcessor"} 0
queueSize{logRecordProcessorType="BatchLogRecordProcessor"} 0

What did you see instead?
A badly generated prometheus output with the queueSize{spanProcessorType="BatchSpanProcessor"} 0 series missing:

# HELP default_queueSize The number of logs queued
# TYPE default_queueSize gauge
default_queueSize{logRecordProcessorType="BatchLogRecordProcessor"} 0

What version and what artifacts are you using?
opentelemetry-collector 0.68
opentelemetry-java-agent 1.21.0

Environment
OS: Ubuntu 20.04

Additional context
This issue is similar to #4382.

@cmunger cmunger added the Bug Something isn't working label Dec 28, 2022
@jack-berg
Member

I believe that #5039 fixes this since it now includes the otel_scope_name attribute on each metric.

@cmunger
Author

cmunger commented Dec 29, 2022

Hello,

Well, is the patch you're showing me part of collector release 0.68? I guess not? And the code of the patch is part of opentelemetry-java, while the collector is written in Go, so I'm kind of lost :)

@cmunger
Author

cmunger commented Dec 29, 2022

If it was not clear: the architecture I was testing uses an instrumented Java application with an OTLP metrics exporter:

otel.sdk.disabled=false
otel.exporter.otlp.endpoint=http://localhost:4317
otel.exporter.otlp.protocol=grpc
otel.traces.exporter=otlp
otel.logs.exporter=otlp
otel.metric.export.interval=1000
otel.metrics.exporter=otlp
otel.resource.attributes=service.name=monitoring-poc,service.version=1.1,service.namespace=poc,deployment.environment=pre
otel.traces.sampler=traceidratio
otel.traces.sampler.arg=0.5

The collector then exposes the metrics it receives over OTLP, this time through a prometheus exporter (see the provided otel-collector.yaml).

@jack-berg
Member

I see. Sorry, it wasn't clear that the Java app was using the OTLP exporter. This is actually a problem with the collector. The Java app is producing multiple metrics named queueSize under different scopes (io.opentelemetry.sdk.logs and io.opentelemetry.sdk.trace). This is allowed and perfectly valid. It's up to the collector's prometheus exporter to handle this data, probably in a manner very similar to the Java prometheus exporter's solution in #5039.
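
For illustration only, here is a sketch of what a scope-aware export on the collector side could look like; the label names and the reconciliation of the conflicting help strings are up to the exporter implementation, so treat this as an assumption rather than the actual output of #5039:

default_queueSize{otel_scope_name="io.opentelemetry.sdk.trace",spanProcessorType="BatchSpanProcessor"} 0
default_queueSize{otel_scope_name="io.opentelemetry.sdk.logs",logRecordProcessorType="BatchLogRecordProcessor"} 0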

I suggest opening an issue in opentelemetry-collector-contrib, or I can transfer this issue over there.

@jack-berg
Member

Closing this issue because it's not a bug in opentelemetry-java. Can reopen and transfer to opentelemetry-collector-contrib if needed.
