-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix close in use certificate providers after double Close()
method call on wrapper object
#7128
Conversation
…call on wrapper object Fix errors like: ``` [core][Channel grpc#39 SubChannel grpc#941] grpc: addrConn.createTransport failed to connect to ... Err: connection error: desc = "transport: authentication handshake failed: xds: fetching trusted roots from CertificateProvider failed: provider instance is closed" ```
@bozaro -- Calling Close() feels more like a User error to me and wouldnt categorize this as a bug. I agree that the API could have been better by not providing a Close() method which can be called multiple times, which would cause incorrect accounting wrt That being said, inorder to fix this, I would rather make Provider.Close a grpcsync.OnceFunc. That way we can cleanly assert that a reference can only decrement their reference when calling Close(). |
I think that even if the double The situation when all neighboring uses must be written correctly for your code to work is very fragile. I have found at least two places through which double
I can also add a wrapper in |
@bozaro : Thank you for figuring this out and sending us a PR. There are a few options available to fix this issue:
Instead the approach that we like (which is quite similar to your approach) is as follows:
type singleCloseWrappedProvider struct {
mu sync.Mutex
provider Provider
}
func (s *singleCloseWrappedProvider) KeyMaterial(ctx context.Context) (*KeyMaterial, error) {
s.mu.Lock()
p := s.provider
s.mu.Unlock()
return p.KeyMaterial(ctx)
}
func (s *singleCloseWrappedProvider) Close() {
s.mu.Lock()
s.provider.Close()
s.provider = panickingProvider{}
s.mu.Unlock()
}
type panickingProvider struct{}
func (panickingProvider) Close() {
panic("Close called more than once on certificate provider")
}
func (panickingProvider) KeyMaterial(context.Context) (*KeyMaterial, error) {
return nil, errProviderClosed
}
Please let me know your thoughts on this approach. |
385dd0d
to
fd7af42
Compare
…method call on wrapper object
fd7af42
to
64fdac1
Compare
This approach has the opposite problem - you can use Provider after closing if it is used by someone else.
I don't think double closure is a problem that needs to be dealt with:
func foo() error {
x, err := os.Open(fileName)
if err != nil {
return err
}
defer x.Close()
// ...
if err := x.Close(); err != nil {
return err
}
// ...
return nil
}
I have corrected the PR, except that I do not panic on double |
@easwars, as a result, I'm trying to make the |
I agree that there are things like files and conns, where Go supports multiple calls to close. But there are things like channels, which panic when closed more than once. But making the double close lead to I agree that both approaches (whether we panic or not on double close) result in |
@easwars, I tried adding panic when closing twice: https://github.com/bozaro/grpc-go/pull/2/files (all unit tests green). But:
|
@dfawley for second set of eyes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using the server in production with xds:/// + mTLS, there is guaranteed (or very often) to be a double Close() call at the shutdown stage.
This is concerning, as it indicates there may be a problem with either the basic usage of the provider itself, or the ownership model surrounding the provider. Can you provide any more information about how this happens?
} | ||
|
||
// singleCloseWrappedProvider wraps a provider instance with a reference count to avoid double | ||
// close still in use provider. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
... with a reference count to properly handle multiple calls to Close.
w.mu.Lock() | ||
defer w.mu.Unlock() | ||
|
||
w.provider.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this call without the lock held?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have replaced the use of sync.RWMutex
with atomic.Pointer
…method call on wrapper object
…method call on wrapper object
eccee0e
to
527266f
Compare
…method call on wrapper object
The most common example of double First
Second
|
@bozaro : Thank you very much for the stack traces. I think this is what is happening:
Also, I see that we have defined this error sentinel grpc-go/credentials/credentials.go Line 125 in e22436a
I feel that the xDS credentials implementation should return this error in the case where the server handshake failed, and it ended up calling @dfawley |
Interesting sequence of events. It seems if we use TLS to wrap a connection, we need to allow for double-close (either in
|
Do I need to do anything with this PR (e.g. rebase)? |
@dfawley : Based on our offline discussion, the changes here reflect mostly what we want. We can track the double close in the TLS code path separately (if required, and if we can do anything actionable there). This is waiting for your second review, apart from the minor nit that I have above. |
wrappedProvider
add refcounter tocertprovider.Provider
, but there no guards to prevent multipleClose()
call after singleBuild()
call.The situation is complicated by:
certprovider.Provider
implementation don't panic/error on multipleClose()
calls.Fix errors like:
RELEASE NOTES:
xds: fetching trusted roots from CertificateProvider failed: provider instance is closed
error