-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[extension/ecsobserver] Write discovered targets as Prometheus file sd #3785
Conversation
Load test failed because some UDP message is lost https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/3785/checks?check_run_id=2822779561
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
windows test failed https://app.circleci.com/pipelines/github/open-telemetry/opentelemetry-collector-contrib/15444/workflows/90c5815a-0b99-4ab5-ab15-d93e2aeb0575/jobs/134087
|
Did a rebase, unit test passed but load test failed https://app.circleci.com/pipelines/github/open-telemetry/opentelemetry-collector-contrib/15652/workflows/b630b1c3-7505-47a1-aade-878eb52a8f2a/jobs/136105
btw: it seems github action is still running, so we are testing on both github action and circle CI? |
@alolita would be good to have a small design about this, not sure why we decided to write this in prometheus format. |
@bogdandrutu it's mentioned in README https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/extension/observer/ecsobserver/README.md#notify-prometheus-receiver-of-discovered-targets Main reason is if we use receiver creator to create new receiver (i.e. pass discovered targets in process and create a new one for each target) the performance is not good #1395 (comment) . Eventually it can switch away from writing file in file sd format and support other receivers (e.g. a batch mode to reconfigure receiver in receiver creator framework). |
/cc @jrcamp I think the performance problem was discuss, is this something we should fix? |
Hi @bogdandrutu , @jrcamp , We had discussed this performance issue before and @kohrapha has explored a few options including implemented a simple This PR is the last PR we need to merge to enable ECS Prom metrics auto-discover for OTel that we'll launch this feature as Preview in AWS in a short time. Could you please help to merge this PR so we can be unblocked? We'll continue to work on the possible better solution and fixes. cc/ @alolita |
Yeah I'm going to look into the performance issues of many scrapers with receivercreator. I think this is a decent workaround in the meantime. Will look into it in #3977. |
@bogdandrutu i think it would be useful to do a design on alternatives w pros and cons esp re: perf issues. @mxiamxia can you and the team put together a short design proposal outlining the top alternatives. |
Hi @alolita, The current implementation is a good alternative for us from the previous discussion in #1395. And we have the high level design in README. I think Jay will take a first stab on perf issue. We need to get this PR merged ASAP. |
I've marked this PR ready-to-merge. @bogdandrutu please take a look and flag any other concerns you may have. |
- Stop the collector process from extension using `host.ReportFatalError` otherwiese the failure of extension just log.
prom receiver is expecting metric's job name same as the one in config. In order to keep similar behaviour as cloudwatch agent's discovery impl, we support getting job name from docker label, but it will break metric type. For long term solution, see open-telemetry/opentelemetry-collector#575 (comment)
Was using context.
Circle CI unit test failed https://app.circleci.com/pipelines/github/open-telemetry/opentelemetry-collector-contrib/16013/workflows/41d3192a-5bfb-4867-b721-7d46f891d1da/jobs/139341
|
Description:
serviceDiscovery
and support real AWS clientLink to tracking Issue:
Previous PRs
Testing:
Documentation:
No new doc. See existing doc