Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Replace deprecated grpc.Dial with grpc.NewClient #745

Merged
merged 3 commits into from
Apr 17, 2024
Merged

Conversation

ctlong
Copy link
Member

@ctlong ctlong commented Apr 17, 2024

Description

Replaces grpc.Dial usage with grpc.NewClient. gRPC deprecated (and then briefly un-deprecated, but plan to re-deprecate) Dial and DialContext in favor of NewClient. See grpc/grpc-go#7010 for more info about NewClient.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Testing performed?

  • Unit tests
  • Integration tests
  • Acceptance tests

Checklist:

  • This PR is being made against the main branch, or relevant version branch
  • I have made corresponding changes to the documentation
  • I have added testing for my changes

If you have any questions, or want to get attention for a PR or issue please reach out on the #logging-and-metrics channel in the cloudfoundry slack

acrmp and others added 2 commits April 17, 2024 17:50
Propogates errors received from the rlp via gRPC when sending requests.
Now that we don't block on dialing the server, we need to handle errors
at the sending requests level.

Signed-off-by: Andrew Crump <andrew.crump@broadcom.com>
@ctlong ctlong requested a review from a team as a code owner April 17, 2024 20:28
Fixes the failing pool tests in `plumbing` and `rlp/internal/ingress`
following the change to use `grpc.NewClient` instead of `grpc.Dial`.
This is because they were testing that connections were initiated when a
new doppler was added. However, with the change to NewClient,
connections won't be made until the resulting grpc client connections
are used for RPC. This also means that dopplers may be added to the pool
map when in the past they would not be added due to the connection on
Dial failing.

One approach to getting the existing tests to pass would have been to
call `Connect` on the grpc client connections to force them to leave
idle mode. However, `Connect` is experimental, and really gRPC is
encouraging us not to care about the connection state when creating
client connections.

To fix the tests we ended up just asserting on the pool size. One
downside of this was that we couldn't see a nice way to assert that
`Close` was called on the gRPC client connection when `Close` was called
for a doppler address. We could have replaced the connection creation
with an interface and mocked that but it didn't seem worth it.

Signed-off-by: Carson Long <12767276+ctlong@users.noreply.github.com>
Signed-off-by: Rebecca Roberts <rebecca.roberts@broadcom.com>
@ctlong
Copy link
Member Author

ctlong commented Apr 17, 2024

@rroberts2222 and I tested these changes within CF-D and saw that they resulted in RLP and TrafficController not initiating any TCP connections to Dopplers until a client attempted to connect to them. This is in contrast to the old behaviour where the Doppler connection pools were pre-filled to the max. We think this change is fine.

@acrmp and tested the RLP Gateway changes and saw that clients will consistently receive 503 errors now when the RLP is not available, e.g. from the log-stream plugin:

unexpected status code 503: {"message":"streaming is temporarily unavailable","error":"streaming_unavailable"}

The RLP Gateway logs will not emit stack traces in those cases, but will emit an error line for every failed request:

2024/04/16 21:41:25 failed to open stream from logs provider: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:8082: connect: connection refused"

@ctlong ctlong merged commit a5b858b into main Apr 17, 2024
6 checks passed
@ctlong ctlong deleted the fix/grpc branch April 17, 2024 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants