feat(uwsgi): Allow plugin to continue when stats server(s) are unavailable #6817

Closed
hkraal wants to merge 7 commits

@hkraal hkraal commented Dec 20, 2019

Fixes #6795

  • Added: flag skip_errors to allow for skipping any errors
  • Changed: s.source in case of unix sockets to use the socket path instead of the hostname to keep the statistics identifiable

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

@danielnelson
Contributor

@hkraal Thanks for sharing your solution, but I don't think we would incorporate this patch due to the new option. My thinking currently is that we need to design a system where the logging level can be set on a per plugin basis. #6584 (comment)

@hkraal
Author

hkraal commented Dec 28, 2019

@danielnelson I'm willing to modify the PR to suit both our needs. Would removing the option and skipping unreachable stats servers have a better chance of being accepted?

My main problem was that the plugin stops when it encounters an inactive stats socket. Depending on the uWSGI deployment, this behaviour is to be expected and should not stop the plugin (and Telegraf) dead in its tracks.

I figured the main issue was the fact that an error was logged. Just ignoring (and logging if desirable) unreachable sockets crossed my mind as well, but it is less explicit than adding skip_errors: true to the configuration, so I chose to add the option.

@danielnelson
Contributor

I'm willing to modify the PR to suit both our needs. Would removing the option and skipping unreachable stats servers have a better chance of being accepted?

My issue with this is that when the socket cannot be reached it is usually an error, and without the error the plugin will be harder to debug.

Almost any plugin could produce an error that isn't considered an error by the operator. For example, maybe the operator doesn't consider it an error if a site connected to by the http server is down. To avoid adding options like this piecemeal to Telegraf, I think we need a more generic solution.

Development

Successfully merging this pull request may close these issues.

Telegraf is unable start when using uwsgi in on-demand mode