-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xds: Implement circuit breaking support. #4050
xds: Implement circuit breaking support. #4050
Conversation
@@ -285,6 +285,15 @@ func (b *cdsBalancer) handleSecurityConfig(config *xdsclient.SecurityConfig) err | |||
return nil | |||
} | |||
|
|||
func getCircuitBreaking(update xdsclient.ClusterUpdate) (circuitBreaking bool, maxRequests uint32) { | |||
for _, threshold := range update.Thresholds { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can move this for loop into xds client where we read the CDS response. So in ClusterUpdate, we only need the MaxRequests int.
I don't think we have plans to support the other priorities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented this change. This logic was moved into the client package, when it creates the ClusterUpdate
from the CDS response.
@@ -42,6 +42,10 @@ type EDSConfig struct { | |||
// will be disabled. If set to the empty string, load reporting will | |||
// be sent to the same server that we obtained CDS data from. | |||
LrsLoadReportingServerName *string | |||
// If circuit breaking is enabled. | |||
CircuitBreaking bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need this bool.
You can make MaxRequests
a pointer, and nil
would mean circuit breaking is unset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got rid of this bool, MaxRequests
is a pointer now.
@@ -389,6 +391,18 @@ func (edsImpl *edsBalancerImpl) handleSubConnStateChange(sc balancer.SubConn, s | |||
} | |||
} | |||
|
|||
// updateConfig handles changes to the circuit breaking configuration. | |||
func (edsImpl *edsBalancerImpl) updateConfig(edsConfig *EDSConfig) { | |||
if edsImpl.counter == nil || edsImpl.counter.ServiceName != edsConfig.EDSServiceName { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's protect this with an env variable GRPC_XDS_EXPERIMENTAL_CIRCUIT_BREAKING
. So if this is set to false, edsImpl.counter
is never set (is always nil).
Add the env variable like this: https://github.com/grpc/grpc-go/blob/master/xds/internal/env/env.go#L41
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created the environment variable.
if edsImpl.counter == nil || edsImpl.counter.ServiceName != edsConfig.EDSServiceName { | ||
edsImpl.counter = &client.ServiceRequestsCounter{ServiceName: edsConfig.EDSServiceName} | ||
} | ||
edsImpl.counter.UpdateService(edsConfig.CircuitBreaking, edsConfig.MaxRequests) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't update the service of this count. You should update the counter.
There should be a map[string]counter
in the client package. And you set edsImpl.counter
to the corresponding value. When service name changes, you read the new value, and set edsImpl.counter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
// If circuit breaking is enabled. | ||
CircuitBreaking bool | ||
// Max requests when circuit breaking. | ||
MaxRequests uint32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a second thought, we might not need this field.
The reason we need this is to update the upper bound in the counter.
There are two approaches:
- keep this field, and store it inside the eds.dropPicker. So the global counter doesn't keep the upper bound, but each picker keeps it. And we update the picker when this value changes.
- remove this field (so there's no change to the EDSConfig). Inside the CDS balancer, when it receives the ClusterUpdate, it updates the upper of the corresponding counter.
Now I'm a bit in favor of 2. Maybe give it a try. (Keep them as separate commits though, in case we regret later.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented 1 in this branch and pushed it. I'll implement 2 in a separate branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/GarrettGutierrez1/grpc-go/tree/xds-circuit-breaking-no-edsconfig-change
This branch implements 2. I do think it is simpler (less files changed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched it up, this branch now implements 2, and the following link implements 1.
|
||
type servicesRequestsCounter struct { | ||
mu sync.Mutex | ||
services map[string]serviceInfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A global map[string]counter
would work. Each dropPicker would only need the counter
, not the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change, now the dropPicker gets the counter.
} | ||
|
||
type serviceInfo struct { | ||
circuitBreaking bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this bool.
If circuit breaking is not enabled, this struct wouldn't exist, the drop picker will see a nil
counter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this bool.
if !ok { | ||
return fmt.Errorf("service name %v not identified", c.ServiceName) | ||
} | ||
sInfo.numRequests++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to not lock and ++
. Use atomic instead.
Lock adds too much contention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a check of numRequests
(to make sure it isn't > maxRequests
) and then after that it is updated using ++
, can I still use an atomic if I am doing 2 dependent operations on it?
xds/internal/client/client_xds.go
Outdated
// circuitBreakersFromCluster extracts the circuit breakers configuration from | ||
// the received cluster resource. Returns nil if no CircuitBreakers or no | ||
// Thresholds in CircuitBreakers. | ||
func circuitBreakersFromCluster(cluster *v3clusterpb.Cluster) ([]CircuitBreakerThreshold, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't need to return error, right. It's always nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
@@ -113,6 +113,8 @@ type edsBalancerImplInterface interface { | |||
handleSubConnStateChange(sc balancer.SubConn, state connectivity.State) | |||
// updateState handle a balancer state update from the priority. | |||
updateState(priority priorityType, s balancer.State) | |||
// updateServiceRequestsCounter handles an update to the service name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updates the service requests counter to the one for the given service name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
@@ -131,6 +132,10 @@ func (f *fakeEDSBalancer) handleEDSResponse(edsResp xdsclient.EndpointsUpdate) { | |||
|
|||
func (f *fakeEDSBalancer) updateState(priority priorityType, s balancer.State) {} | |||
|
|||
func (f *fakeEDSBalancer) updateServiceRequestsCounter(serviceName string) { | |||
f.serviceName.Send(serviceName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where should that be verified? I'm guessing added to one of the waitFor functions below?
Yes. And call it to verify in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change. Wasn't 100% sure what test you are talking about, added the check to TestConfigChildPolicyUpdate
because it is the only test that invokes the policy update and has a fakeEDSBalancer
.
defer src.mu.Unlock() | ||
c, ok := src.services[serviceName] | ||
if !ok { | ||
c = &ServiceRequestsCounter{ServiceName: serviceName, maxRequests: 1024} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define a const
for 1024
, and use the const here and following.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
name: "does-not-exceed-max-requests", | ||
maxRequests: 1024, | ||
numRequests: 1024, | ||
errorExpected: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace errorExpected
with expectedSuccesses
and expectedErrors
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
}, | ||
} | ||
|
||
func counterTestInit() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this? Is this to reset the counters? If so, call it resetServiceRequestsCounter
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
} | ||
|
||
// SetMaxRequests updates the max requests for a service's counter. | ||
func SetMaxRequests(serviceName string, maxRequests *uint32) *ServiceRequestsCounter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function doesn't need the return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 2 nits.
t.Error("no error when error expected") | ||
} | ||
if loadedError != nil { | ||
fmt.Println(loadedError.(error)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change.
xds: Implement circuit breaking support.