Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flaky MetricExporterConfigurationTest #5877

Merged
merged 1 commit into from
Oct 5, 2023

Conversation

laurit
Copy link
Contributor

@laurit laurit commented Oct 4, 2023

Resolves #5866
My guess is that configureMetricReader_KnownSpiExportersOnClasspath fails with java.net.BindException: Address already in use because configureReader_PrometheusOnClasspath, which is the only test that also starts PrometheusHttpServer in MetricExporterConfigurationTest had not yet shut down PrometheusHttpServer. Shutting down PrometheusHttpServer

waits for 10s for the server to shut down, running time of configureReader_PrometheusOnClasspath is 10.104s. So it think it is plausible that configureMetricReader_KnownSpiExportersOnClasspath started before PrometheusHttpServer from previous test was shut down. Also there is a 10s wait in shutting down of PrometheusHttpServer
public CompletableResultCode shutdown() {
CompletableResultCode result = new CompletableResultCode();
Thread thread =
THREAD_FACTORY.newThread(
() -> {
try {
server.stop(10);
executor.shutdownNow();
} catch (Throwable t) {
result.fail();
return;
}
result.succeed();
});
thread.start();
return result;
}
Alternatively reducing that could make test more stable.

@laurit laurit requested a review from a team October 4, 2023 13:47
@codecov
Copy link

codecov bot commented Oct 4, 2023

Codecov Report

All modified lines are covered by tests ✅

📢 Thoughts on this report? Let us know!.

@breedx-splk
Copy link
Contributor

How does the config/port=0 change address the timing? Or does that just change how the binding happens to bypass the conflict entirely?

@laurit
Copy link
Contributor Author

laurit commented Oct 4, 2023

How does the config/port=0 change address the timing? Or does that just change how the binding happens to bypass the conflict entirely?

port 0 means bind to any free port

@breedx-splk
Copy link
Contributor

How does the config/port=0 change address the timing? Or does that just change how the binding happens to bypass the conflict entirely?

port 0 means bind to any free port

Got it. Was just surprised that your discussion didn't actually mention the approach of the fix. 🙃

@@ -48,7 +49,7 @@ void configureReader_PrometheusOnClasspath() {

MetricReader reader =
MetricExporterConfiguration.configureReader(
"prometheus", EMPTY, spiHelper, (a, b) -> a, closeables);
"prometheus", CONFIG_PROPERTIES, spiHelper, (a, b) -> a, closeables);
cleanup.addCloseables(closeables);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these tests cleanup their exporters / readers. This change will help make this test more resilient against other tests starting prometheus on the default port and not cleaning up, but best case we identify / fix the test failing to clean up.

jack-berg
jack-berg approved these changes Oct 5, 2023
@jack-berg jack-berg merged commit 533c30a into open-telemetry:main Oct 5, 2023
17 of 18 checks passed
@laurit laurit deleted the flaky-prometheus-test branch October 6, 2023 05:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix test flake in MetricExporterConfigurationTest
3 participants