[improve][broker]PIP-340 Optimization of Probe Implementation for Automatic Failover #22133

yyj8 · 2024-02-27T11:31:22Z

For detailed improvement instructions, please refer to issues：
#22134

Motivation

The current Java client implementation has certain flaws in automatic fault switching.

org.apache.pulsar.client.impl.AutoClusterFailover.java
boolean probeAvailable(String url) {
        try {
            resolver.updateServiceUrl(url);
            InetSocketAddress endpoint = resolver.resolveHost();
            Socket socket = new Socket();
            socket.connect(new InetSocketAddress(endpoint.getHostName(), endpoint.getPort()), TIMEOUT);
            socket.close();

            return true
        } catch (Exception e) {
            log.warn("Failed to probe available, url: {}", url, e);
            return false;
        }
    }

The client only establishes a TCP connection with the exposed connection address of the cluster to determine whether the cluster is available, which cannot adapt to scenarios where the cluster is partially unavailable (half dead). In this scenario, we hope to make corresponding fault switching judgments by initiating cluster health status requests to the cluster. Then within the cluster, we provide an admin management command to update the cluster's health status. To avoid this scenario, all businesses that need to connect to this cluster need to manually switch cluster connection addresses and restart applications, resulting in inconsistent link data among multiple business team due to inconsistent operation steps.

Modifications

Add a new cluster health status request and response request;

case HEALTH_CHECK:
	checkArgument(cmd.hasHealthCheck());
	handleHealthCheck(cmd.getHealthCheck());
	break;

case HEALTH_CHECK_RESPONSE:
	checkArgument(cmd.hasHealthCheckResponse());
	handleHealthCheckResponse(cmd.getHealthCheckResponse());
	break;        

//PulsarApi.proto define two new request types.
message CommandHealthCheck {
    repeated KeyValue metadata = 1;
}

message CommandHealthCheckResponse {
    optional bool available = 1 [default = true];
}

Add a new admin management command to manually update the cluster health status;

//Update cluster health status, available or unavailable. default available
bin/pulsar-admin clusters update-health-status --status unavailable

For other detailed information, please refer to the PR code.

Verifying this change

Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

Added integration tests for end-to-end deployment with large payloads (10MB)
Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

Documentation

doc
doc-required
doc-not-needed
doc-complete

Matching PR in forked repository

PR in forked repository:
yyj8#8

Mail list: https://lists.apache.org/thread/kk8lbc92mtgt0hw3tl7dfw7fmpl4jwyq

…omatic Failover

lhotari · 2024-02-27T12:20:11Z

@yyj8 thanks for the contribution. The intention of the PIP process is that you'd first start the discussion on the mailing list before deciding the solution. There might be alternative ways to solve the problem. For example, In Pulsar we already have the Ping/Pong messages in the protocol.

lhotari · 2024-02-27T12:25:43Z

Your proposal seems to make sense a lot of sense, but the naming is perhaps not optimal. When looking at the changes, this looks like an active/passive status for a cluster and having that could make it more flexible. There might be other reasons than "health" to mark a cluster passive. Let's say when there's a need to do maintenance where it is desirable that clients move to the other cluster.

There has been some changes in Pulsar PIP process. We have a PIP template https://github.com/apache/pulsar/blob/master/pip/TEMPLATE.md for capturing the proposal in a markdown file. However, that could also happen later in the process, after the initial mailing list discussion.

lhotari · 2024-02-27T12:28:57Z

@yyj8 One of the challenges is making this proposal consistent with the Blue-Green deployment feature. PIP-188 #16551 (please note that this PIP was filed before we switched to use markdown files for capturing PIPs).
Would there be a way to make the requirements part of the Blue-Green deployment feature?

yyj8 · 2024-02-29T02:31:29Z

@yyj8 thanks for the contribution. The intention of the PIP process is that you'd first start the discussion on the mailing list before deciding the solution. There might be alternative ways to solve the problem. For example, In Pulsar we already have the Ping/Pong messages in the protocol.

@lhotari Thank you for your help.
Ping/Pong can only detect whether the process is alive, but cannot determine whether the process can provide produce and consume services normally.

yyj8 · 2024-02-29T06:14:27Z

@yyj8 One of the challenges is making this proposal consistent with the Blue-Green deployment feature. PIP-188 #16551 (please note that this PIP was filed before we switched to use markdown files for capturing PIPs). Would there be a way to make the requirements part of the Blue-Green deployment feature?

PIP-188 appears to be a very good solution for handling internal forwarding of traffic. It can forward the traffic written by clients to the Blue cluster in the Blue cluster to the Green cluster. However, what we hope for is that when the Blue cluster is abnormal, our client can directly complete the connection address switch to the Green cluster, instead of needing to occupy additional bandwidth and do another forwarding to the Blue cluster.

yyj8 · 2024-02-29T06:17:20Z

Your proposal seems to make sense a lot of sense, but the naming is perhaps not optimal. When looking at the changes, this looks like an active/passive status for a cluster and having that could make it more flexible. There might be other reasons than "health" to mark a cluster passive. Let's say when there's a need to do maintenance where it is desirable that clients move to the other cluster.

There has been some changes in Pulsar PIP process. We have a PIP template https://github.com/apache/pulsar/blob/master/pip/TEMPLATE.md for capturing the proposal in a markdown file. However, that could also happen later in the process, after the initial mailing list discussion.

This is a very good suggestion. After the discussion on the mailing list is completed, I will make modifications according to the PIP template and the suggested modifications discussed by everyone.

merlimat · 2024-02-29T07:32:21Z

@yyj8 Wouldn't this already be covered by the ControlledClusterFailover mechanism? https://pulsar.apache.org/api/client/3.2.x/org/apache/pulsar/client/api/ControlledClusterFailoverBuilder.html

merlimat · 2024-02-29T07:34:09Z

The idea for ControlledClusterFailover is to move the control of whether a cluster is "healthy" outside of the Pulsar scope. You just need to implement a HTTP service that will tell the clients where to connect.

yyj8 · 2024-02-29T12:10:02Z

The idea for ControlledClusterFailover is to move the control of whether a cluster is "healthy" outside of the Pulsar scope. You just need to implement a HTTP service that will tell the clients where to connect.

The client class AutoClusterFailover.java has implemented a timed detection service to check if the IP and port are in a normal state, and based on the detection results, determine whether it is necessary to switch cluster connection addresses. So, we only need to add a request type (CommandHealthCheck) in the detection logic to obtain the cluster health status.

Check and update the logic as follows:

  private void probeAndUpdateServiceUrl(List<String> targetServiceUrls,
                                          Map<String, Authentication> authentications,
                                          Map<String, String> tlsTrustCertsFilePaths,
                                          Map<String, String> tlsTrustStorePaths,
                                          Map<String, String> tlsTrustStorePasswords) {
        if (probeAvailable(currentPulsarServiceUrl)) {
            failedTimestamp = -1;
            return;
        }

        long currentTimestamp = System.nanoTime();
        if (failedTimestamp == -1) {
            failedTimestamp = currentTimestamp;
        } else if (currentTimestamp - failedTimestamp >= failoverDelayNs) {
            for (String targetServiceUrl : targetServiceUrls) {
                if (probeAvailable(targetServiceUrl)) {
                    log.info("Current Pulsar service is {}, it has been down for {} ms, "
                                    + "switch to the service {}. The current service down at {}",
                            currentPulsarServiceUrl, nanosToMillis(currentTimestamp - failedTimestamp),
                            targetServiceUrl, failedTimestamp);
                    updateServiceUrl(targetServiceUrl,
                            authentications != null ? authentications.get(targetServiceUrl) : null,
                            tlsTrustCertsFilePaths != null ? tlsTrustCertsFilePaths.get(targetServiceUrl) : null,
                            tlsTrustStorePaths != null ? tlsTrustStorePaths.get(targetServiceUrl) : null,
                            tlsTrustStorePasswords != null ? tlsTrustStorePasswords.get(targetServiceUrl) : null);
                    failedTimestamp = -1;
                    break;
                } else {
                    log.warn("Current Pulsar service is {}, it has been down for {} ms. "
                                    + "Failed to switch to service {}, "
                                    + "because it is not available, continue to probe next pulsar service.",
                        currentPulsarServiceUrl, nanosToMillis(currentTimestamp - failedTimestamp), targetServiceUrl);
                }
            }
        }
    }

Then, in the probe method, add a newly defined request logic for obtaining cluster health status.

boolean probeAvailable(String url) {
        try {
            resolver.updateServiceUrl(url);
            InetSocketAddress endpoint = resolver.resolveHost();
            Socket socket = new Socket();
            socket.connect(new InetSocketAddress(endpoint.getHostName(), endpoint.getPort()), TIMEOUT);
            socket.close();

            //Below is the newly added code
            ClientCnx clientCnx = pulsarClient.getCnxPool()
                    .getConnection(new InetSocketAddress(endpoint.getHostName(), endpoint.getPort()))
                    .get();
            clientCnx.ctx().writeAndFlush(Commands.newHealthCheck()).sync();

            return clientCnx.isClusterAvailable();
        } catch (Exception e) {
            log.warn("Failed to probe available, url: {}", url, e);
            return false;
        }
    }

lhotari · 2024-02-29T12:37:29Z

@yyj8 Wouldn't this already be covered by the ControlledClusterFailover mechanism? https://pulsar.apache.org/api/client/3.2.x/org/apache/pulsar/client/api/ControlledClusterFailoverBuilder.html

+1

@yyj8 Please check "PIP-121: Pulsar cluster level auto failover on client side", doc: #13315 and impl: #13316

Instead of adding yet another solution, could you simply use PIP-121?
You should be able to provide your own custom logic to customize it.

yyj8 · 2024-03-01T02:50:47Z

@yyj8 Wouldn't this already be covered by the ControlledClusterFailover mechanism? https://pulsar.apache.org/api/client/3.2.x/org/apache/pulsar/client/api/ControlledClusterFailoverBuilder.html

+1

@yyj8 Please check "PIP-121: Pulsar cluster level auto failover on client side", doc: #13315 and impl: #13316

Instead of adding yet another solution, could you simply use PIP-121? You should be able to provide your own custom logic to customize it.

@lhotari I have read "PIP-121: Pulsar cluster level auto failover on client side", only the detection of whether the IP and port can be connected has been implemented.

It is a very complex matter to determine whether a cluster can truly provide production and consumption services normally, and whether it has affected business processes. So, the goal of my PIP is to manually intervene in cluster state switching when the cluster part is unavailable and affects business processes, in order to achieve client to cluster connection switching.

Because it is difficult to accurately determine whether this partially abnormal scenario has affected the business program through program automation, it is likely to cause misjudgment and ultimately affect the business program.

Since initiating a request to obtain cluster health status is part of cluster probing, I placed the client's probing logic in the probeAvailable method of class AutoClusterFailover.java

We can consider providing an automated and accurate health detection implementation in future optimizations that can determine whether a cluster can provide services normally. Replace the current manual update of health status with automation.

[improve][broker]PIP-340 Optimization of Probe Implementation for Aut…

d333e48

…omatic Failover

github-actions bot added the doc-not-needed Your PR changes do not impact docs label Feb 27, 2024

This was referenced Feb 27, 2024

[improve][broker]PIP-340 Optimization of Probe Implementation for Automatic Failover #22134

Open

[improve][broker]PIP-340 Optimization of Probe Implementation for Automatic Failover yyj8/pulsar#8

Open

lhotari requested a review from heesung-sn February 27, 2024 12:25

yyj8 closed this Mar 1, 2024

yyj8 reopened this Mar 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[improve][broker]PIP-340 Optimization of Probe Implementation for Automatic Failover #22133

[improve][broker]PIP-340 Optimization of Probe Implementation for Automatic Failover #22133

yyj8 commented Feb 27, 2024 •

edited by lhotari

Loading

lhotari commented Feb 27, 2024

lhotari commented Feb 27, 2024

lhotari commented Feb 27, 2024

yyj8 commented Feb 29, 2024

yyj8 commented Feb 29, 2024

yyj8 commented Feb 29, 2024 •

edited

Loading

merlimat commented Feb 29, 2024

merlimat commented Feb 29, 2024

yyj8 commented Feb 29, 2024

lhotari commented Feb 29, 2024 •

edited

Loading

yyj8 commented Mar 1, 2024 •

edited

Loading

[improve][broker]PIP-340 Optimization of Probe Implementation for Automatic Failover #22133

Are you sure you want to change the base?

[improve][broker]PIP-340 Optimization of Probe Implementation for Automatic Failover #22133

Conversation

yyj8 commented Feb 27, 2024 • edited by lhotari Loading

Motivation

Modifications

Verifying this change

Does this pull request potentially affect one of the following parts:

Documentation

Matching PR in forked repository

lhotari commented Feb 27, 2024

lhotari commented Feb 27, 2024

lhotari commented Feb 27, 2024

yyj8 commented Feb 29, 2024

yyj8 commented Feb 29, 2024

yyj8 commented Feb 29, 2024 • edited Loading

merlimat commented Feb 29, 2024

merlimat commented Feb 29, 2024

yyj8 commented Feb 29, 2024

lhotari commented Feb 29, 2024 • edited Loading

yyj8 commented Mar 1, 2024 • edited Loading

yyj8 commented Feb 27, 2024 •

edited by lhotari

Loading

yyj8 commented Feb 29, 2024 •

edited

Loading

lhotari commented Feb 29, 2024 •

edited

Loading

yyj8 commented Mar 1, 2024 •

edited

Loading