Hook into continual dqlite role probes #301

Merged Jul 9, 2024 (3 commits)

Conversation

@masnax (Contributor) commented on Jul 5, 2024:

After it starts, the go-dqlite App continually checks for role adjustments and updates the local store of nodes. The frequency of this check can be configured with WithRolesAdjustmentFrequency.

From the perspective of a project using go-dqlite, it would be very useful to hook into this continual check, either to piggy-back heartbeats on it or more generally to keep an up-to-date record of what dqlite sees.

To that end, this PR introduces the WithRolesAdjustmentHook option to app.App, which runs after each attempted role adjustment. It provides a client connected to the current dqlite leader, as well as the most recently updated list of dqlite nodes, so that the project using go-dqlite can update its own state without sending further requests over the network to gather this information.
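
A minimal sketch of how a consumer might wire this up. The option name and the general shape (a leader client plus the refreshed node list) come from the description above, but the exact Go signature of WithRolesAdjustmentHook is an assumption and may differ from the merged code; the data directory is hypothetical.

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/canonical/go-dqlite/app"
	"github.com/canonical/go-dqlite/client"
)

func main() {
	// Called by go-dqlite after each role-adjustment attempt (assumed
	// signature: a client to the current leader plus the refreshed node list).
	hook := func(leader *client.Client, nodes []client.NodeInfo) error {
		log.Printf("dqlite reports %d nodes", len(nodes))
		return nil
	}

	dqlite, err := app.New("/var/lib/myapp/dqlite", // hypothetical data directory
		app.WithRolesAdjustmentFrequency(10*time.Second),
		app.WithRolesAdjustmentHook(hook),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer dqlite.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := dqlite.Ready(ctx); err != nil {
		log.Fatal(err)
	}
	// ... run the application ...
}
```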

@cole-miller (Contributor) commented:

Thank you for putting this together! I like the idea of letting consumers of go-dqlite take advantage of the work we're already doing, but I want to make sure I understand the concrete use case in detail. I'm also going to comment on #303 here, since the two are related, to keep the conversation in one place.

> From the perspective of a project using go-dqlite, it would be very useful to hook into this continual check, either to piggy-back heartbeats on it or more generally to keep an up-to-date record of what dqlite sees.

Would the callback be heartbeating all nodes in the list? And for getting an up-to-date view of what dqlite sees, would just consulting the node store be enough?

> [From the #303 description] The idea here is that the hook from #301 can be used to check the deviation in node response times after each role adjustment, and then increment or decrement the maximum connections that the cluster can tolerate.

I don't see how this callback can be used to look at node response times, since it receives only the list of cluster members and a connection to the leader---it would have to ping every node in the list itself, right? Maybe the node store should be upgraded to remember how long it took to contact each node on the most recent attempt?

My impression is that there are two use-cases here:

A. Find a leader. Here there is a tradeoff between latency and the number of connections we make.
B. Understand the health of the cluster by attempting to contact every node.

Is that right? I think these probably shouldn't be completely independent to avoid wasting work, but I'm not entirely sure how best to make them cooperate.

@masnax (Contributor, Author) commented on Jul 5, 2024:

> Would the callback be heartbeating all nodes in the list? And for getting an up-to-date view of what dqlite sees, would just consulting the node store be enough?

> I don't see how this callback can be used to look at node response times, since it receives only the list of cluster members and a connection to the leader---it would have to ping every node in the list itself, right? Maybe the node store should be upgraded to remember how long it took to contact each node on the most recent attempt?

So to give some context, microcluster sets up a dqlite cluster with go-dqlite and then exposes a database with a table of cluster members for other projects to consume in their REST APIs. To provide some information on cluster health, the dqlite leader periodically attempts to send a heartbeat. The time since the last successful heartbeat is recorded for each cluster member. The variance in this value between nodes and over time would be what I'd define as "cluster health" for the purposes of this PR and the other one.

To do this, each microcluster node periodically checks if it's the dqlite leader. If it is not, then it aborts until the next attempt. If it is the dqlite leader then it initiates a heartbeat that syncs dqlite's set of nodes (returned from client.Cluster) with microcluster's table.

This means that periodically, each node is querying all other nodes to find the leader, and then the leader is querying for the most up-to-date list of cluster members. Meanwhile, in parallel go-dqlite is doing the exact same thing plus a role adjustment.

My thought was that since both of these pieces of information (an active connection to the current leader, and the recently updated list of nodes) are already compiled by go-dqlite, microcluster could hook into go-dqlite's own "heartbeat" and sync its own table of cluster members with what dqlite reports. And since there is an active connection to the dqlite leader, we can use it to check whether we are the leader, without having to make those concurrent requests to probe for the leader again.

But really the use case is more abstract than that. Any time a project is using go-dqlite to set up a cluster, it's likely going to have some internal state that is highly coupled with the current list of dqlite cluster members. Whenever go-dqlite refreshes its current set of cluster members, the downstream project should be "notified" about this so that it can make any necessary adjustments to its services. So for that purpose this is just an arbitrary hook for such projects to consume the most current state of dqlite. That is: who is the leader, and what nodes do we have.
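
To make the flow above concrete, here is a hypothetical hook body in the assumed shape: use the supplied leader client to check whether this node is the leader, and if so sync a local members table with dqlite's node list. localAddress and syncMemberTable are stand-ins for project-specific state, not go-dqlite or microcluster APIs.

```go
package members

import (
	"context"
	"time"

	"github.com/canonical/go-dqlite/client"
)

// rolesHook returns a hook of the shape assumed in this PR: it receives a
// client connected to the current dqlite leader and the refreshed node list.
func rolesHook(localAddress string, syncMemberTable func([]client.NodeInfo) error) func(*client.Client, []client.NodeInfo) error {
	return func(leader *client.Client, nodes []client.NodeInfo) error {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()

		// Ask the leader connection who the leader is, instead of
		// probing every member concurrently.
		info, err := leader.Leader(ctx)
		if err != nil {
			return err
		}
		if info == nil || info.Address != localAddress {
			// Not the leader: wait for the next role-adjustment probe.
			return nil
		}

		// We are the leader: sync our own table of cluster members with
		// what dqlite just reported, with no extra network round trips.
		return syncMemberTable(nodes)
	}
}
```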

@tomponline (Member) commented:

This is likely similar to what LXD does: it too sends a leader-initiated heartbeat request to each member and, as part of that, does role rebalancing without having to make additional requests.

@cole-miller (Contributor) commented on Jul 5, 2024:

Thanks @masnax, that makes sense. The things I'm still not clear about:

  • Is this callback sufficient to replace microcluster's own heartbeats as-is, or does go-dqlite also need to measure and pass in data about how long each node took to respond in order to implement the "health check" part?
  • If we implement #303 (Configure maximum concurrent leader probes) and the proposal for dynamically adjusting the concurrency limit for leader search, then when the cluster is in relatively good shape the roles adjustment won't end up contacting all nodes, so it can't provide comprehensive information about cluster health to the callback here. Is that acceptable?

@masnax (Contributor, Author) commented on Jul 5, 2024:

> Is this callback sufficient to replace microcluster's own heartbeats as-is, or does go-dqlite also need to measure and pass in data about how long each node took to respond in order to implement the "health check" part?

go-dqlite doesn't have to do anything else with the data. microcluster has its own definition of what constitutes a healthy node/cluster, and it will make those measurements across the nodes supplied by this hook.

> If we implement #303 (Configure maximum concurrent leader probes) and the proposal for dynamically adjusting the concurrency limit for leader search, then when the cluster is in relatively good shape the roles adjustment won't end up contacting all nodes, so it can't provide comprehensive information about cluster health to the callback here. Is that acceptable?

That's fine. go-dqlite really only needs to provide two bits of information to microcluster:

  • who is the dqlite leader
  • what are the current nodes and their roles

It will be up to microcluster to probe the health of each of those nodes. The hook simply lets microcluster forgo fetching the above information that go-dqlite already fetched, and ensures that when microcluster receives that information, it is up-to-date because dqlite isn't concurrently adjusting roles.

Each time go-dqlite fires the hook on a particular node, that node will use the active leader connection in microcluster to check whether it is itself the leader. If it is, it will send a REST API request to each of the other nodes and measure how long each takes to reply.
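
A rough sketch of that leader-side measurement: hit an endpoint on every other member and record the elapsed time for each reply. The "/core/heartbeat" path and HTTPS scheme are illustrative placeholders, not microcluster's actual API.

```go
package members

import (
	"context"
	"net/http"
	"time"

	"github.com/canonical/go-dqlite/client"
)

// measureMembers sends a request to each of the other nodes and records how
// long each one takes to reply.
func measureMembers(ctx context.Context, self string, nodes []client.NodeInfo) map[string]time.Duration {
	times := make(map[string]time.Duration, len(nodes))
	for _, node := range nodes {
		if node.Address == self {
			continue
		}
		start := time.Now()
		req, err := http.NewRequestWithContext(ctx, http.MethodGet,
			"https://"+node.Address+"/core/heartbeat", nil)
		if err != nil {
			continue
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			// A failed heartbeat can be recorded separately; skip it here.
			continue
		}
		resp.Body.Close()
		times[node.Address] = time.Since(start)
	}
	return times
}
```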

@masnax (Contributor, Author) commented on Jul 5, 2024:

I haven't yet thought too much about the actual algorithm for incrementing/decrementing the maximum concurrent connections, but the gist of it is going to be something like this:

  • Default the max concurrent connections to 1.
  • If any node's REST API response in microcluster deviates from the mean by more than some threshold, increment the max conns.
  • If the mean of all responses is larger than some threshold, also increment the max conns.

The new max conns value will be used until the next heartbeat, when it is recomputed.
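
A sketch of that adjustment rule, with made-up threshold values; as noted above, the real algorithm had not been settled at this point, so treat this as illustrative only.

```go
package members

import "time"

// Illustrative thresholds; the real values were not settled in this discussion.
const (
	deviationThreshold = 500 * time.Millisecond
	meanThreshold      = time.Second
)

// nextMaxConns recomputes the maximum number of concurrent leader probes
// from the latest heartbeat response times, starting from a default of 1.
func nextMaxConns(current int, responses []time.Duration) int {
	if current < 1 {
		current = 1 // default the max concurrent connections to 1
	}
	if len(responses) == 0 {
		return current
	}

	var total time.Duration
	for _, d := range responses {
		total += d
	}
	mean := total / time.Duration(len(responses))

	// Increment if any response deviates too far from the mean, or if the
	// mean itself is too large.
	bump := mean > meanThreshold
	for _, d := range responses {
		if d > mean+deviationThreshold || d+deviationThreshold < mean {
			bump = true
			break
		}
	}
	if bump {
		return current + 1
	}
	return current
}
```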

@cole-miller (Contributor) commented on Jul 5, 2024:

> microcluster has its own definition of what constitutes a healthy node/cluster, and it will make those measurements across the nodes supplied by this hook.

Ah, got it, sorry for the confusion! For some reason I thought that this was going to be taken over by the pings from dqlite during leader search, but now everything makes sense.

Review thread on app/options.go (outdated, resolved).
Commits (3), each: Signed-off-by: Max Asnaashari <max.asnaashari@canonical.com>
@cole-miller self-requested a review on Jul 9, 2024 at 20:30.
@cole-miller (Contributor) left a review:


Approved pending CI passing (I hit retry since there were some spurious issues with apt), thanks!

@cole-miller merged commit 7523be9 into canonical:master on Jul 9, 2024.
37 checks passed