CheckHealth RPC now logs error when failing. #3860

Merged Jul 11, 2024 (1 commit)

Conversation

blkt (Contributor) commented Jul 11, 2024

Summary

This change makes the CheckHealth RPC log the error when it fails, giving more insight into why a health check might fail. It is meant to help in scenarios such as Minder going into CrashLoopBackOff (CLBO) because of liveness check failures. Currently, it is hard to tell whether such failures are caused by a lack of resources, most likely CPU, or by a failure to connect to the database.
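
For illustration, the idea can be sketched in Go roughly as follows. This is a minimal sketch under stated assumptions, not the code from this PR: `pinger`, `healthServer`, `stubDB`, and `simulate` are hypothetical names, the standard library's `log/slog` stands in for whatever logger Minder uses, and the real handler's request and response types are omitted.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"log/slog"
)

// pinger is a hypothetical stand-in for the dependency the health check
// probes, for example a database handle.
type pinger interface {
	Ping(ctx context.Context) error
}

// healthServer is an illustrative type, not Minder's actual server.
type healthServer struct {
	db     pinger
	logger *slog.Logger
}

// CheckHealth probes the dependency and logs the underlying error before
// returning, so a failed probe leaves a trace of its cause.
func (s *healthServer) CheckHealth(ctx context.Context) (string, error) {
	if err := s.db.Ping(ctx); err != nil {
		s.logger.ErrorContext(ctx, "health check failed", "error", err)
		return "", fmt.Errorf("health check failed: %w", err)
	}
	return "OK", nil
}

// stubDB is a test double: Ping returns err (nil means healthy).
type stubDB struct{ err error }

func (d stubDB) Ping(context.Context) error { return d.err }

// simulate runs one health check against a stub and returns the status,
// everything written to the log, and the returned error.
// An empty failure string simulates a healthy dependency.
func simulate(failure string) (status, logged string, err error) {
	var buf bytes.Buffer
	var dbErr error
	if failure != "" {
		dbErr = errors.New(failure)
	}
	srv := &healthServer{
		db:     stubDB{err: dbErr},
		logger: slog.New(slog.NewTextHandler(&buf, nil)),
	}
	status, err = srv.CheckHealth(context.Background())
	return status, buf.String(), err
}

func main() {
	_, logged, err := simulate("connection refused")
	fmt.Println("returned:", err)
	fmt.Print("logged:   ", logged)
}
```

With the error in the logs, a database failure (for example a refused connection) shows up explicitly, whereas probes that fail without any such entry point toward other causes such as resource starvation.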

Change Type

  • Bug fix (resolves an issue without affecting existing features)
  • Feature (adds new functionality without breaking changes)
  • Breaking change (may impact existing functionalities or require documentation updates)
  • Documentation (updates or additions to documentation)
  • Refactoring or test improvements (no bug fixes or new functionality)

Review Checklist:

  • Reviewed my own code for quality and clarity.
  • Added comments to complex or tricky code sections.
  • Updated any affected documentation.
  • Included tests that validate the fix or feature.
  • Checked that related changes are merged.

@blkt blkt self-assigned this Jul 11, 2024
@blkt blkt requested a review from a team as a code owner July 11, 2024 09:48
@coveralls

Coverage Status

coverage: 53.047% (-0.003%) from 53.05%
when pulling 90caa50 on enh/health-check-logs-error
into 4106f31 on main.

@blkt blkt merged commit a441c32 into main Jul 11, 2024
22 of 23 checks passed
@blkt blkt deleted the enh/health-check-logs-error branch July 11, 2024 10:49
dmjb pushed a commit that referenced this pull request Jul 12, 2024
3 participants