Wait on readiness service on start #102325
Node.start() currently does not return until the node is serving HTTP. However, other parts of the system may not be quite ready. For example, file settings may not have been applied, and we may not yet have a master node. The readiness service's purpose is to identify when the node is actually ready to serve requests.
This commit adjusts the end of Node.start() to wait for the readiness service to report ready. The ramifications are that the main Elasticsearch thread will not exit until the node is actually ready to serve requests, and the CLI will not exit (when in daemon mode) until the node is ready.
Pinging @elastic/es-core-infra (Team:Core/Infra)
Looks good, only a minor Q
CountDownLatch ready = new CountDownLatch(1);
readinessService.addBoundAddressListener(address -> ready.countDown());
try {
    ready.await();
We are not introducing a timeout here. I suppose there are plenty of other places where we already time out, so there is no need for another one here, correct?
Yep, exactly. There are other timeouts on startup, e.g. systemd has a 75 second default timeout. Right now we actually report to systemd that the node is ready before the readiness probe responds true, but this change will align them.
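For readers who want to see the latch pattern from the diff in isolation, here is a minimal, self-contained sketch. It is not the Elasticsearch implementation: `FakeReadinessService`, `markReady`, and `startAndAwaitReady` are hypothetical names invented for illustration, and the fake's behavior of replaying the address to late listeners is an assumption of this sketch only. As in the diff, the wait has no timeout of its own.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class ReadinessWaitSketch {

    // Hypothetical stand-in for a readiness service; NOT the real Elasticsearch class.
    static final class FakeReadinessService {
        private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
        private volatile String boundAddress;

        // Registers a listener. If the service is already ready, the listener is
        // called immediately (an assumption of this sketch). A listener may be
        // notified twice if registration races with markReady.
        void addBoundAddressListener(Consumer<String> listener) {
            listeners.add(listener);
            String address = boundAddress;
            if (address != null) {
                listener.accept(address);
            }
        }

        // Hypothetical trigger: marks the service ready and notifies listeners.
        void markReady(String address) {
            boundAddress = address;
            for (Consumer<String> listener : listeners) {
                listener.accept(address);
            }
        }
    }

    // Blocks the calling thread until the service reports ready, with no timeout,
    // and returns the address the service reported.
    static String startAndAwaitReady(FakeReadinessService service) throws InterruptedException {
        CountDownLatch ready = new CountDownLatch(1);
        AtomicReference<String> bound = new AtomicReference<>();
        service.addBoundAddressListener(address -> {
            bound.set(address);
            ready.countDown(); // extra calls after the count reaches zero are harmless
        });
        ready.await();
        return bound.get();
    }

    public static void main(String[] args) throws InterruptedException {
        FakeReadinessService service = new FakeReadinessService();
        // Simulate the rest of startup finishing on another thread.
        Thread startup = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            service.markReady("127.0.0.1:9399");
        });
        startup.start();
        System.out.println("ready on " + startAndAwaitReady(service));
        startup.join();
    }
}
```

Because `await()` is called without a timeout, a caller that never becomes ready blocks indefinitely; in this sketch, as in the discussion above, any deadline is left to whatever supervises the process.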