Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Half Open State Handling #254

Closed
pvmraghunandan opened this issue Jun 5, 2017 · 4 comments
Closed

Half Open State Handling #254

pvmraghunandan opened this issue Jun 5, 2017 · 4 comments

Comments

@pvmraghunandan
Copy link

pvmraghunandan commented Jun 5, 2017

During Half Open State, as per general recommendation, below is guideline

A limited number of requests from the application are allowed to pass through and invoke the operation. If these requests are successful, it's assumed that the fault that was previously causing the failure has been fixed and the circuit breaker switches to the Closed state (the failure counter is reset). If any request fails, the circuit breaker assumes that the fault is still present so it reverts back to the Open state and restarts the timeout timer to give the system a further period of time to recover from the failure.

(Source : https://docs.microsoft.com/en-us/azure/architecture/patterns/circuit-breaker)

But, currently seeing the way Polly is implemented, it sends all requests (if we get 1000 requests after half-open) and acts upon first result of request and change state to Closed/Open. Shouldn't we apply same metric calculation (in case of advanced) and limit requests probably via different configuration? Pasting switch/case below

 public override void OnActionFailure(DelegateResult<TResult> outcome, Context context)
 {
     using (TimedLock.Lock(_lock))
     {
         _lastOutcome = outcome;

         switch (_circuitState)
         {
             case CircuitState.HalfOpen:
                 Break_NeedsLock(context);
                 return;
				
             default:
                 throw new InvalidOperationException("Unhandled CircuitState.");
         }
     }
 }
@reisenberger
Copy link
Member

@pvmraghunandan I am currently on travel and will respond to this in two or three days' time.

@reisenberger
Copy link
Member

reisenberger commented Jun 8, 2017

Thanks @pvmraghunandan for this!

The possibility of a request stampede during half-open state was identified recently and we believe fixed in v5.0.5. If you think this is still an issue (after upgrading to >=v5.0.5 if necessary), please let us know. This is the code that enforces only one new trial call per break duration. Multiple requests that are already in flight when the circuit transitions to open state can of course not be prevented from still returning during the half-open state (discussion of Polly's perspective on this: #217, #218).

We've had discussion lately of a more-refined metric for transitioning out of half-open state under #239 : latest thinking here. For a consecutive-count-based breaker (the case discussed in the patterns/practices article), it seems fairly straightforward. The AdvancedCircuitBreaker case (the Hystrix circuit-breaker model) is less transparent, though solvable. Further comments welcome under #239!


EDIT: Community PRs on this welcome! (I can provide guidance). Other Polly development streams mean I would be unlikely to look at this one further for some months.

@pvmraghunandan
Copy link
Author

@reisenberger I have been quite a busy and didn't get chance to check status on this issue. I will check it out today and update accordingly. Thanks for quick response.

@reisenberger
Copy link
Member

Closing as duplicate of / covered by #239

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants