Hi 😄
I'm creating this PR after facing related issues in our environment.
I think that it's a good idea to add a Readiness endpoint to Burrow.
On startup, Burrow configures and starts its coordinators one by one in a predefined order (ZK, Storage, Evaluator, HTTP, and so on).
This makes a lot of sense, but the `healthcheck` itself is initialized in the HTTP subsystem and returns `HTTP 200` no matter what the real status of Burrow is, as mentioned on the Wiki page.

This causes an issue when something is wrong with one of the Kafka / Zookeeper clusters and the consumer subsystem fails: our orchestration system has already received the `HTTP 200` (which is served before Burrow even connects to any client) and marked the deployment of Burrow as a success, which results in a silent failure.

To avoid this situation, I added a readiness probe that is switched on only after the last subsystem (the consumer) has finished initializing successfully, to mark that Burrow is ready to serve requests.
Thank you!