Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure pprof endpoint is listening on startup #17028

Merged
merged 6 commits into from
Mar 18, 2020

Conversation

jsoriano
Copy link
Member

pprof HTTP endpoint was being initialized asynchronously. In tests we
have found that sometimes after Metricbeat has started, it fails to
connect to the pprof endpoint. This can happen if the server is not
listening before the rest of the beat is initialized.

This change ensures that the listener for this endpoint is started
before continuing with the rest of the initialization. This should reduce
flakiness of tests that depend on these endpoint (test_stats tests).

Some comments and error messages have been polished.

pprof HTTP endpoint was being initialized asynchronously. In tests we
have found that sometimes after Metricbeat has started, it fails to
connect to the pprof endpoint. This can happen if the server is not
listening before the rest of the beat is initialized.

This change ensures that the listener for this endpoint is started
before continuing with the rest of the initialization.

Some comments and error messages have been polished.
@jsoriano jsoriano added review refactoring flaky-test Unstable or unreliable test cases. [zube]: In Review Team:Integrations Label for the Integrations team labels Mar 16, 2020
@jsoriano jsoriano self-assigned this Mar 16, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@jsoriano jsoriano added needs_backport PR is waiting to be backported to other branches. v7.7.0 labels Mar 16, 2020
Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@jsoriano
Copy link
Member Author

@blakerouse sorry, I have added a couple more changes, could you please take another look?

Doing some tests I found that fatal errors in this code were not being logged, I think that because of the use of log instead of libbeats logp. I have changed the calls to log.Fatal by calls to logp.Error + os.Exit and they are logged now.

I have also added a changelog entry because before this change if httpprof couldn't be started it was being mostly ignored (it only logged something like this at the info level):

2020-03-16T16:37:54.167+0100	INFO	service/service.go:112	finished pprof endpoint: listen tcp 127.0.0.1:22: bind: permission denied

Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the goal is to move away from the global logger, shouldn't log.NewLogger be used instead of the global logp logger?

@jsoriano
Copy link
Member Author

jsoriano commented Mar 16, 2020

I think the goal is to move away from the global logger, shouldn't log.NewLogger be used instead of the global logp logger?

Yes, you are right. I have tried to do it, but this package is formed of global methods called from beat initialization code, that is still using the global logger.
I have tried to define a global logger in this package but it didn't log anything. At the end I am instantiating a new logger on every method that needs it. I am not very happy with this change but it is some progress, at least it is using the per-instance methods.
Maybe it'd be better to move this change to its own PR, I am not sure what are the expectations regarding logging for this initialization code.
Let me know what you think.

Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Agree that creating a logger for each function is rather weird, but no real other alternative.

@jsoriano jsoriano merged commit efdab6f into elastic:master Mar 18, 2020
@jsoriano jsoriano deleted the pprof-endpoint-initialization branch March 18, 2020 14:00
jsoriano added a commit to jsoriano/beats that referenced this pull request Mar 18, 2020
pprof HTTP endpoint was being initialized asynchronously. In tests we
have found that sometimes after Metricbeat has started, it fails to
connect to the pprof endpoint. This can happen if the server is not
listening before the rest of the beat is initialized.

This change ensures that the listener for this endpoint is started
before continuing with the rest of the initialization.

Some comments and error messages have been polished.

(cherry picked from commit efdab6f)
@jsoriano jsoriano removed the needs_backport PR is waiting to be backported to other branches. label Mar 18, 2020
jsoriano added a commit that referenced this pull request Mar 19, 2020
…tup (#17081)

pprof HTTP endpoint was being initialized asynchronously. In tests we
have found that sometimes after Metricbeat has started, it fails to
connect to the pprof endpoint. This can happen if the server is not
listening before the rest of the beat is initialized.

This change ensures that the listener for this endpoint is started
before continuing with the rest of the initialization.

Some comments and error messages have been polished.

(cherry picked from commit efdab6f)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Unstable or unreliable test cases. refactoring review Team:Integrations Label for the Integrations team v7.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants