Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Share grpc connections across sdk clients #3239

Merged
merged 9 commits into from
Sep 2, 2022
Merged

Conversation

dnr
Copy link
Member

@dnr dnr commented Aug 17, 2022

What changed?
Use a new feature in temporalio/sdk-go#881 to share a single grpc connection across all sdk clients in one service.

The GetSystemClient/NewClient methods now don't take a logger, they get it from fx. This was confusing anyway since there are several calls to GetSystemClient with differently-tagged loggers and the first one will "win".

Why?
Reduce resource usage on worker and frontend when using many sdk clients.

How did you test it?
integration tests. will do another scale test later after some more related changes.

Potential risks

Is hotfix candidate?

@dnr dnr requested a review from a team as a code owner August 17, 2022 02:38
Comment on lines 82 to 83
f.lock.Lock()
defer f.lock.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we get away without this lock?
For example, use once.Do(() => firstClient = init_first_client)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? What should the body of the Once do if creating the client fails? There's no way to "reset" the Once and let it try again next time. The Once in GetSystemClient can logger.Fatal in that case but I don't think we can do that here. Or can we?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, no, i didn't realize the creation could fail. ok forget about this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it might be reasonable to do that: we can enforce that the client for the system namespace is always created first, and we create others from that one.

as far as I can tell, sdkclient.Dial only needs to make one rpc to GetSystemInfo to succeed. it doesn't even need the requested namespace to exist yet.

and we already call GetSystemClient on startup. so we can just enforce that. let me give it another try...

Comment on lines 82 to 83
f.lock.Lock()
defer f.lock.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, no, i didn't realize the creation could fail. ok forget about this.

@dnr dnr force-pushed the resolver3 branch 2 times, most recently from 4118b53 to f3d721d Compare August 20, 2022 01:12
@dnr
Copy link
Member Author

dnr commented Aug 20, 2022

I changed it all so it just Fatals for any client, and also get the logger from fx. This seems reasonable since Dial or NewClientFromExisting don't fail if the namespace doesn't exist. (If they do, we can revert that part for NewClient.)

This should wait for temporalio/sdk-go#886 to merge (or else we should remove all Close calls in pernamespaceworker). Moving to draft until that's done.

@dnr dnr marked this pull request as draft August 20, 2022 01:16
err := backoff.ThrottleRetry(func() error {
sdkClient, err := sdkclient.Dial(f.options(primitives.SystemLocalNamespace))
if err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a warning log here.

@dnr dnr marked this pull request as ready for review September 1, 2022 20:22
@dnr dnr merged commit da4784f into temporalio:master Sep 2, 2022
@dnr dnr deleted the resolver3 branch September 2, 2022 01:20
dnr added a commit to dnr/temporal that referenced this pull request Sep 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants