add High Availability research #685
Conversation
Nice research. I very much like that you put some thought into what kind of failure scenarios are even within our scope and that there is a comprehensive overview of the various possible configurations with the components at hand.
Here are some comments I thought of while reading; I hope you find some useful information in there.
A high-level comment would be to touch base with the LH team on their work on the NFS share manager and their HA plans/ideas. While they can't take advantage of an ingress, they share some of the same concerns - node/pod failure detection, recovery, etc. Perhaps there's overlap and tech we can leverage jointly.
One item we (Longhorn) would like to know about the S3GW HA is whether it will assume the volume beneath the object store must be RWX, or whether RWO would suffice. If the gateway is active/active enough that both sides need simultaneous write access in order to transfer the work fast enough, that will make a difference. From what I gather in the discussion here, it is unacceptable to have to start a pod on the new owner as part of the failover, but it should be acceptable to defer attaching the backing volume until then. If so, then a simple RWO volume would suffice. If not, the RWX volume would itself be layered on NFS, and any HA transfer would be gated by the NFS HA transfer, which currently requires significant time to clear locks, wait for grace periods, and all that. (I have the ticket to try to improve its performance, if possible.)
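For illustration, a minimal sketch of the RWO-backed claim this implies, assuming Longhorn as the storage class; the name and size below are purely hypothetical:

```yaml
# Hypothetical PVC for the s3gw backing store, assuming RWO suffices because
# only one gateway instance mounts the volume at any given time; on failover
# the volume is detached and re-attached on the new owner node.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: s3gw-store            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce           # single-node attach, no simultaneous writers
  volumeMode: Filesystem
  storageClassName: longhorn  # assumption: Longhorn provides the backing volume
  resources:
    requests:
      storage: 10Gi           # placeholder size
```

The open question above is whether failover latency stays acceptable when the detach/attach of such a volume sits on the critical path.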
Our current idea is to propose the HA model: "active/standby".
We need features from XFS that NFS would no longer expose, and we've got no plans to support multiple active instances. (At that point, s3gw would be slowly implementing a distributed K/V object store as a backend, and ... that'd be called RADOS/Ceph :-D )
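As a rough sketch of what active/standby on top of a single RWO volume could look like (assuming Kubernetes reschedules the pod on failure; the resource names, image, and mount path below are hypothetical):

```yaml
# Hypothetical single-active s3gw Deployment; the "standby" is effectively the
# replacement pod Kubernetes schedules after a node/pod failure. The Recreate
# strategy ensures the old pod is gone before the new one attaches the volume.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: s3gw
spec:
  replicas: 1                 # exactly one active gateway, no concurrent writers
  strategy:
    type: Recreate            # terminate before re-create, so the RWO volume can move
  selector:
    matchLabels:
      app: s3gw
  template:
    metadata:
      labels:
        app: s3gw
    spec:
      containers:
        - name: s3gw
          image: quay.io/s3gw/s3gw:latest   # assumption: published s3gw image
          volumeMounts:
            - name: store
              mountPath: /data              # hypothetical mount path
      volumes:
        - name: store
          persistentVolumeClaim:
            claimName: s3gw-store           # hypothetical PVC, see sketch above
```

With replicas: 1, failover time is dominated by failure detection plus the volume detach/re-attach and pod restart.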
Force-pushed from ad8d872 to 9fbea68
Force-pushed from c3e5fdb to 4af657c
@l-mb are we good to merge this?
@giubacc there are conflicts with this PR, mind addressing them?
- add research/ha/RATIONALE.md
  Related to: https://github.com/aquarist-labs/s3gw/issues/361
  Signed-off-by: Giuseppe Baccini <giuseppe.baccini@suse.com>

Related to: https://github.com/aquarist-labs/s3gw/issues/361
Signed-off-by: Giuseppe Baccini <giuseppe.baccini@suse.com>

- regular-localhost-incremental-fill-5k
- regular_localhost_load_fio_64_write
- regular_localhost_zeroload_400_800Kdb
- regular_localhost_zeroload_emptydb
  Related to: https://github.com/aquarist-labs/s3gw/issues/361
  Signed-off-by: Giuseppe Baccini <giuseppe.baccini@suse.com>

- scale_deployment_0_1-k3s3nodes-zeroload-emptydb
- s3wl-putobj-100ms-clusterip
- s3wl-putobj-100ms-ingress
  Related to: https://github.com/aquarist-labs/s3gw/issues/361
  Signed-off-by: Giuseppe Baccini <giuseppe.baccini@suse.com>

Related to: https://github.com/aquarist-labs/s3gw/issues/361
Signed-off-by: Giuseppe Baccini <giuseppe.baccini@suse.com>
rebased on latest main
@l-mb @jecluis @vmoutoussamy
High Availability research
This is a first attempt to define the direction we want to take for the HA topic with s3gw.
Feedback, comments, requests, considerations, etc.; everything is welcome at this time.
Related to: https://github.com/aquarist-labs/s3gw/issues/361
Signed-off-by: Giuseppe Baccini <giuseppe.baccini@suse.com>
Checklist before requesting a review