Share grpc connections across sdk clients #3239
common/sdk/factory.go
f.lock.Lock()
defer f.lock.Unlock()
can we get away without this lock?
For example, use once.Do(func() { firstClient = initFirstClient() })
Maybe? What should the body of the Once do if creating the client fails? There's no way to "reset" the Once and let it try again next time. The Once in GetSystemClient can logger.Fatal in that case but I don't think we can do that here. Or can we?
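To make the trade-off in this thread concrete, here is a minimal sketch (not the code from this PR) contrasting the two approaches. The types `conn`, `lazyConn`, and `onceConn` and the helper `flakyDial` are hypothetical stand-ins; only `sync.Once` and `sync.Mutex` are real standard-library APIs.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for an SDK client.
type conn struct{ id int }

// lazyConn guards creation with a mutex. A failed attempt caches nothing,
// so the next call to get tries again.
type lazyConn struct {
	mu   sync.Mutex
	c    *conn
	dial func() (*conn, error)
}

func (l *lazyConn) get() (*conn, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.c != nil {
		return l.c, nil
	}
	c, err := l.dial()
	if err != nil {
		return nil, err // nothing stored; a later call retries
	}
	l.c = c
	return c, nil
}

// onceConn uses sync.Once. The function passed to Do runs exactly once,
// so a failure on the first attempt is remembered forever.
type onceConn struct {
	once sync.Once
	c    *conn
	err  error
	dial func() (*conn, error)
}

func (o *onceConn) get() (*conn, error) {
	o.once.Do(func() { o.c, o.err = o.dial() })
	return o.c, o.err
}

// flakyDial returns a dial function that fails on its first call only.
func flakyDial() func() (*conn, error) {
	calls := 0
	return func() (*conn, error) {
		calls++
		if calls == 1 {
			return nil, errors.New("dial failed")
		}
		return &conn{id: calls}, nil
	}
}

func main() {
	l := &lazyConn{dial: flakyDial()}
	_, err1 := l.get()
	c2, err2 := l.get()
	fmt.Println("mutex: first failed:", err1 != nil, "second ok:", err2 == nil && c2 != nil)

	o := &onceConn{dial: flakyDial()}
	_, e1 := o.get()
	_, e2 := o.get()
	fmt.Println("once: first failed:", e1 != nil, "second still failed:", e2 != nil)
}
```

With the mutex, the second call recovers. With the Once, the only options on failure are to keep returning the stored error or to abort the process.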
oh, no, i didn't realize the creation could fail. ok forget about this.
actually it might be reasonable to do that: we can enforce that the client for the system namespace is always created first, and we create others from that one.
as far as I can tell, sdkclient.Dial only needs to make one rpc to GetSystemInfo to succeed. it doesn't even need the requested namespace to exist yet.
and we already call GetSystemClient on startup. so we can just enforce that. let me give it another try...
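The idea in this comment could look roughly like the following sketch. It assumes nothing about the real factory: `sharedConn`, `nsClient`, `factory`, `countingDial`, and the `systemNamespace` value are all hypothetical placeholders, and `panic` stands in for the logger.Fatal mentioned above.

```go
package main

import (
	"fmt"
	"sync"
)

// systemNamespace is a placeholder name for the system namespace.
const systemNamespace = "system-namespace"

// sharedConn is a hypothetical stand-in for the underlying gRPC connection.
type sharedConn struct{ target string }

// nsClient is a hypothetical stand-in for an SDK client bound to a namespace.
type nsClient struct {
	namespace string
	conn      *sharedConn
}

// factory creates the system client first and derives all others from it.
type factory struct {
	once   sync.Once
	system *nsClient
	target string
	dial   func(target string) (*sharedConn, error)
}

// getSystemClient dials at most once. Since a Once cannot be reset,
// failure is treated as fatal (panic here stands in for logger.Fatal).
func (f *factory) getSystemClient() *nsClient {
	f.once.Do(func() {
		c, err := f.dial(f.target)
		if err != nil {
			panic(fmt.Sprintf("cannot create system client: %v", err))
		}
		f.system = &nsClient{namespace: systemNamespace, conn: c}
	})
	return f.system
}

// newClient never dials; it reuses the system client's connection.
func (f *factory) newClient(namespace string) *nsClient {
	return &nsClient{namespace: namespace, conn: f.getSystemClient().conn}
}

// countingDial returns a dial function that records how often it is called.
func countingDial(calls *int) func(string) (*sharedConn, error) {
	return func(target string) (*sharedConn, error) {
		*calls++
		return &sharedConn{target: target}, nil
	}
}

func main() {
	calls := 0
	f := &factory{target: "localhost:7233", dial: countingDial(&calls)}
	a := f.newClient("namespace-a")
	b := f.newClient("namespace-b")
	fmt.Println("same connection:", a.conn == b.conn, "dials:", calls)
}
```

Because only the system client can fail to be created, no lock is needed around the per-namespace path in this sketch.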
Force-pushed from 4118b53 to f3d721d
I changed it all so it just
This should wait for temporalio/sdk-go#886 to merge (or else we should remove all Close calls in pernamespaceworker). Moving to draft until that's done.
err := backoff.ThrottleRetry(func() error {
	sdkClient, err := sdkclient.Dial(f.options(primitives.SystemLocalNamespace))
	if err != nil {
		return err
Add a warning log here.
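A rough sketch of what logging a warning on each failed attempt might look like. This does not use the real backoff.ThrottleRetry or the server's logger; `retryWithWarning`, its `warn` callback, and `flakyOp` are hypothetical helpers built on the standard library only.

```go
package main

import (
	"errors"
	"log"
	"time"
)

// retryWithWarning calls op up to attempts times (attempts should be >= 1),
// sleeping delay between tries. Every failure is reported through warn
// before the error is retried or returned.
func retryWithWarning(
	attempts int,
	delay time.Duration,
	warn func(msg string, err error),
	op func() error,
) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		warn("error creating sdk client", err)
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

// flakyOp returns an operation that fails the given number of times
// before succeeding, plus a pointer to its call counter.
func flakyOp(failures int) (op func() error, calls *int) {
	n := 0
	return func() error {
		n++
		if n <= failures {
			return errors.New("connection refused")
		}
		return nil
	}, &n
}

func main() {
	op, calls := flakyOp(2)
	err := retryWithWarning(5, 10*time.Millisecond,
		func(msg string, err error) { log.Printf("WARN %s: %v", msg, err) },
		op,
	)
	log.Printf("finished after %d calls, err=%v", *calls, err)
}
```

Without the warning, a frontend that is slow to come up would make the retry loop look like a silent hang.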
What changed?
Use a new feature in temporalio/sdk-go#881 to share a single grpc connection across all sdk clients in one service.
The GetSystemClient/NewClient methods no longer take a logger; they get it from fx. Passing one was confusing anyway, since there are several calls to GetSystemClient with differently-tagged loggers and the first one will "win".
Why?
Reduce resource usage on worker and frontend when using many sdk clients.
How did you test it?
integration tests. will do another scale test later after some more related changes.
Potential risks
Is hotfix candidate?