Hook into continual dqlite role probes #301
Thank you for putting this together! I like the idea of letting consumers of go-dqlite take advantage of the work that we're already doing, but I want to make sure I understand the concrete use-case in detail. Also going to comment on #303 here since they're related, and for the sake of keeping the conversation in one place.
Would the callback be heartbeating all nodes in the list? And for getting an up-to-date view of what dqlite sees, would just consulting the node store be enough?
I don't see how this callback can be used to look at node response times, since it receives only the list of cluster members and a connection to the leader, so it would have to ping every node in the list itself, right? Maybe the node store should be upgraded to remember how long it took to contact each node on the most recent attempt? My impression is that there are two use-cases here: A. Find a leader. Here there is a tradeoff between latency and the number of connections we make. Is that right? I think these probably shouldn't be completely independent to avoid wasting work, but I'm not entirely sure how best to make them cooperate.
So to give some context, microcluster sets up a dqlite cluster with To do this, each microcluster node periodically checks if it's the dqlite leader. If it is not, then it aborts until the next attempt. If it is the dqlite leader then it initiates a heartbeat that syncs dqlite's set of nodes (returned from This means that periodically, each node is querying all other nodes to find the leader, and then the leader is querying for the most up-to-date list of cluster members. Meanwhile, in parallel My thought was that since both pieces of information (an active connection to the current leader, and the recently updated list of nodes) are compiled by But really the use case is more abstract than that. Any time a project is using
This is likely similar to what lxd does, as it too does a leader-initiated heartbeat request to each member and, as part of that, does role rebalancing without having to do additional requests.
Thanks @masnax, that makes sense. The things I'm still not clear about:
That's fine.
It will be up to microcluster to probe the health of each of those nodes. The hook simply lets microcluster forgo fetching the above information that Each time
I haven't yet thought too much about the actual algorithm for incrementing/decrementing the maximum concurrent connections, but the gist of it is going to be something like this:
The new max conn value will be used until the next heartbeat, when it is recomputed.
Ah, got it, sorry for the confusion! For some reason I thought that this was going to be taken over by the pings from dqlite during leader search, but now everything makes sense.
Signed-off-by: Max Asnaashari <max.asnaashari@canonical.com>
Approved pending CI passing (I hit retry since there were some spurious issues with apt), thanks!
After starting the go-dqlite App, it continually checks for role adjustments and updates the local store of nodes. The frequency of this check can be managed by WithRolesAdjustmentFrequency.

From the perspective of a project using go-dqlite it would be very useful to hook into this continual check to piggy-back heartbeats off of, or generally keep an up-to-date record of what dqlite sees.

To that end, this PR introduces the WithRolesAdjustmentHook option to app.App which is run after attempting a role adjustment. It will provide a client to the current dqlite leader, as well as a list of the most recently updated dqlite nodes so that the project using go-dqlite can update its own state without having to send further requests over the network to determine this information.
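
To illustrate the intent, here is a minimal, self-contained sketch of how a consuming project might keep its own view of the cluster from such a hook. It does not import go-dqlite. NodeInfo, LeaderClient, RolesAdjustmentHook, ClusterView and staticLeader are all hypothetical stand-ins invented for this example, and the callback signature is an assumption based on the description above, not the signature added by this PR.

```go
package main

import (
	"errors"
	"sort"
	"sync"
)

// NodeInfo is a hypothetical stand-in for a dqlite node record.
// It is NOT the go-dqlite type.
type NodeInfo struct {
	ID      uint64
	Address string
	Role    string
}

// LeaderClient is a hypothetical stand-in for a client connected to the
// current dqlite leader.
type LeaderClient interface {
	Address() string
}

// RolesAdjustmentHook is an ASSUMED callback shape: it receives a leader
// client and the most recently updated list of nodes.
type RolesAdjustmentHook func(leader LeaderClient, nodes []NodeInfo) error

// staticLeader is a trivial LeaderClient used only for demonstration.
type staticLeader string

func (s staticLeader) Address() string { return string(s) }

// ClusterView is the consumer's own record of what dqlite last reported.
type ClusterView struct {
	mu     sync.Mutex
	leader string
	nodes  map[string]NodeInfo
}

// Hook returns a callback that replaces the stored view with the data it
// is handed, so no extra network requests are needed to learn it.
func (v *ClusterView) Hook() RolesAdjustmentHook {
	return func(leader LeaderClient, nodes []NodeInfo) error {
		if leader == nil {
			return errors.New("no leader connection provided")
		}
		updated := make(map[string]NodeInfo, len(nodes))
		for _, n := range nodes {
			updated[n.Address] = n
		}
		v.mu.Lock()
		defer v.mu.Unlock()
		v.leader = leader.Address()
		v.nodes = updated
		return nil
	}
}

// Snapshot returns the last known leader address and the nodes sorted by
// address, safe to call concurrently with the hook.
func (v *ClusterView) Snapshot() (string, []NodeInfo) {
	v.mu.Lock()
	defer v.mu.Unlock()
	out := make([]NodeInfo, 0, len(v.nodes))
	for _, n := range v.nodes {
		out = append(out, n)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Address < out[j].Address })
	return v.leader, out
}
```

In a real project the callback would be registered through the option this PR adds, and the heartbeat logic (for example, the leader contacting each member in the snapshot) would read from the stored view rather than querying dqlite again.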