Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

retryableClientQuarantineRefreshPercentage parameter should increase the verification #1012

Closed
holy12345 opened this issue Nov 14, 2017 · 0 comments

Comments

@holy12345
Copy link
Contributor

I was studying eureka client registered to eureka server, that there is a place in the source code can be improved。
An eureka client looks for an eureka server by setting the eureka.client.service-url.defaultZone band,In my case, my properties file reads as follows:

server.port=8081
eureka.client.service-url.defaultZone=http://localhost:8082,http://localhost:8083,http://localhost:8084,http://localhost:8085
eureka.client.transport.retryableClientQuarantineRefreshPercentage=10

I know by learning the source, when eureka client starts, it will register its own information to eureka server,Specific source code is as follows RetryableEurekaHttpClient.java

@Override
protected <R> EurekaHttpResponse<R> execute(RequestExecutor<R> requestExecutor) {
    List<EurekaEndpoint> candidateHosts = null;
    int endpointIdx = 0;
    for (int retry = 0; retry < numberOfRetries; retry++) {
        EurekaHttpClient currentHttpClient = delegate.get();
        EurekaEndpoint currentEndpoint = null;
        if (currentHttpClient == null) {
            if (candidateHosts == null) {
                candidateHosts = getHostCandidates();
                if (candidateHosts.isEmpty()) {
                    throw new TransportException("There is no known eureka server; cluster server list is empty");
                }
            }
            if (endpointIdx >= candidateHosts.size()) {
                throw new TransportException("Cannot execute request on any known server");
            }

            currentEndpoint = candidateHosts.get(endpointIdx++);
            currentHttpClient = clientFactory.newClient(currentEndpoint);
        }

        try {
            EurekaHttpResponse<R> response = requestExecutor.execute(currentHttpClient);
            if (serverStatusEvaluator.accept(response.getStatusCode(), requestExecutor.getRequestType())) {
                delegate.set(currentHttpClient);
                if (retry > 0) {
                    logger.info("Request execution succeeded on retry #{}", retry);
                }
                return response;
            }
            logger.warn("Request execution failure with status code {}; retrying on another server if available", response.getStatusCode());
        } catch (Exception e) {
            logger.warn("Request execution failed with message: {}", e.getMessage());  // just log message as the underlying client should log the stacktrace
        }

        // Connection error or 5xx from the server that must be retried on another server
        delegate.compareAndSet(currentHttpClient, null);
        if (currentEndpoint != null) {
            quarantineSet.add(currentEndpoint);
        }
    }
    throw new TransportException("Retry limit reached; giving up on completing the request");
}

The getHostCandidates () method in RetryableEurekaHttpClient.java is getting the eureka server address to connect to this time
transportConfig.getRetryableClientQuarantineRefreshPercentage()This parameter is configurable, as I did with the properties above (configured in the example above is 10)

Then the problem is coming:

The first candidateHosts size is 3, the content is

//localhost:8082,http://localhost:8083,http://localhost:8084

Assuming all three eureka server services are not available.

The second time through the background tasks to re-enter the methodquarantineSet size is 3 and candidateHosts size is 1, the content is//localhost:8085.

Assuming // localhost: 8085 is not available at this time.

Then the third time according to the original logic should be (quarantineSet.size ()> = threshold) logic, but as eureka.client.transport.retryableClientQuarantineRefreshPercentage = 10 this parameter is set to a large, so this logic is not executed, this time candidateHosts The value of 0, since then eureka server is good, the eureka client will not be registered.

My solution is to modify thegetHostCandidates() method:

private List<EurekaEndpoint> getHostCandidates() {
    List<EurekaEndpoint> candidateHosts = clusterResolver.getClusterEndpoints();
    quarantineSet.retainAll(candidateHosts);

    // If enough hosts are bad, we have no choice but start over again
    int threshold = (int) (candidateHosts.size() * transportConfig.getRetryableClientQuarantineRefreshPercentage());
    
    
    /*********Where I improve start************/
    if (threshold > candidateHosts.size()) {
  	threshold = candidateHosts.size();
    }
    /**********Where I improve end**********/

    if (quarantineSet.isEmpty()) {
        // no-op
    } else if (quarantineSet.size() >= threshold) {
        logger.debug("Clearing quarantined list of size {}", quarantineSet.size());
        quarantineSet.clear();
    } else {
        List<EurekaEndpoint> remainingHosts = new ArrayList<>(candidateHosts.size());
        for (EurekaEndpoint endpoint : candidateHosts) {
            if (!quarantineSet.contains(endpoint)) {
                remainingHosts.add(endpoint);
            }
        }
        candidateHosts = remainingHosts;
    }

    return candidateHosts;
}

The above is my confusion and given the solution, do you think so. If you also agree, I would like to submit a pr for improvement, finally thank you for your answer : )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant