fix(bigquery/storage/managedwriter): correct reconnection logic #8164
Merged
product-auto-label bot added the "size: m" (pull request size is medium) and "api: bigquery" (issues related to the BigQuery API) labels on Jun 21, 2023
alvarowolfx approved these changes on Jun 23, 2023
LGTM
gcf-merge-on-green bot pushed a commit that referenced this pull request on Jun 26, 2023
🤖 I have created a release *beep* *boop*

---

## [1.52.0](https://github.com/googleapis/google-cloud-go/compare/bigquery/v1.51.2...bigquery/v1.52.0) (2023-06-23)

### Features

* **bigquery/storage:** Add estimated physical file sizes to ReadAPI v1 ([94ea341](https://github.com/googleapis/google-cloud-go/commit/94ea3410e233db6040a7cb0a931948f1e3bb4c9a))
* **bigquery/storage:** Add table sampling to ReadAPI v1 ([ca94e27](https://github.com/googleapis/google-cloud-go/commit/ca94e2724f9e2610b46aefd0a3b5ddc06102e91b))
* **bigquery:** Support for tables primary and foreign keys ([#8055](https://github.com/googleapis/google-cloud-go/issues/8055)) ([93d6a1a](https://github.com/googleapis/google-cloud-go/commit/93d6a1a1a3bde8d3519acc2b7e77bf8b7ba1678a))
* **bigquery:** Update all direct dependencies ([b340d03](https://github.com/googleapis/google-cloud-go/commit/b340d030f2b52a4ce48846ce63984b28583abde6))

### Bug Fixes

* **bigquery/storage/managedwriter:** Correct reconnection logic ([#8164](https://github.com/googleapis/google-cloud-go/issues/8164)) ([a67d53d](https://github.com/googleapis/google-cloud-go/commit/a67d53ddf13b7d382d4c7856cafb068919021912))
* **bigquery:** REST query UpdateMask bug ([df52820](https://github.com/googleapis/google-cloud-go/commit/df52820b0e7721954809a8aa8700b93c5662dc9b))
* **bigquery:** RowIterator.Schema not filled when using Storage Read API ([#7671](https://github.com/googleapis/google-cloud-go/issues/7671)) ([31040e8](https://github.com/googleapis/google-cloud-go/commit/31040e8a7989b143c0c3c3f3e31c4a9dfbba8094))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
This was referenced Jul 6, 2023
How a schema change is signalled on an AppendRows stream depends on how the backend views the connection. For a simplex (non-multiplexed) connection, the client is expected to close and reconnect to signal that the schema has changed.
For a connection in multiplex mode, no reconnection is necessary; the backend inspects the schema for changes.
In managedwriter, a user can request multiplexing at the outset, but the backend does not recognize a connection as multiplexed until writes for more than one stream ID have been sent on it. Essentially, client state is out of sync with server state until a second stream writes on the same connection.
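To make the mismatch concrete, here is a minimal sketch of the rule as described above. The function `effectiveMode` and its parameters are hypothetical names chosen for illustration; they are not taken from managedwriter, and the real backend behavior may involve more state than this.

```go
package main

import "fmt"

// effectiveMode reports how the backend is assumed to treat a connection.
// configuredMultiplex is the client-side setting; distinctStreams is the
// number of distinct stream IDs that have sent writes on the connection.
// Both names are hypothetical and exist only for this sketch.
func effectiveMode(configuredMultiplex bool, distinctStreams int) string {
	if configuredMultiplex && distinctStreams > 1 {
		return "multiplex"
	}
	// Either multiplexing was never requested, or only one stream has
	// written so far and the backend still sees a simplex connection.
	return "simplex"
}

func main() {
	for _, n := range []int{0, 1, 2} {
		fmt.Printf("client=multiplex streams=%d backend=%s\n", n, effectiveMode(true, n))
	}
}
```

With multiplexing requested, the backend view in this sketch only flips to "multiplex" once a second stream ID has written, which is the window in which client and server disagree.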
This PR expands the send optimizer interface so that an optimizer can signal whether it has sent writes for more than one stream, and uses that signal when deciding whether a schema change requires a reconnect. It also extends the schema evolution test to validate multiple combinations of writer and client options.
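The shape of such a change might look roughly like the following sketch. It is not the PR's diff: the interface here is pared down, and `recordSend`, `newMultiplexOptimizer`, and `shouldReconnect` are hypothetical names assumed for illustration.

```go
package main

import "fmt"

// sendOptimizer is a simplified, hypothetical stand-in for the optimizer
// interface. isMultiplexing reports whether writes for more than one
// stream ID have actually been sent.
type sendOptimizer interface {
	recordSend(streamID string)
	isMultiplexing() bool
}

// simplexOptimizer never multiplexes.
type simplexOptimizer struct{}

func (simplexOptimizer) recordSend(string)    {}
func (simplexOptimizer) isMultiplexing() bool { return false }

// multiplexOptimizer tracks the distinct stream IDs it has sent writes for.
type multiplexOptimizer struct {
	seen map[string]struct{}
}

func newMultiplexOptimizer() *multiplexOptimizer {
	return &multiplexOptimizer{seen: make(map[string]struct{})}
}

func (m *multiplexOptimizer) recordSend(streamID string) { m.seen[streamID] = struct{}{} }
func (m *multiplexOptimizer) isMultiplexing() bool       { return len(m.seen) > 1 }

// shouldReconnect applies the rule from the description: reconnect on a
// schema change unless the connection is actually multiplexing.
func shouldReconnect(schemaChanged bool, opt sendOptimizer) bool {
	return schemaChanged && !opt.isMultiplexing()
}

func main() {
	opt := newMultiplexOptimizer()
	opt.recordSend("stream-a")
	fmt.Println(shouldReconnect(true, opt)) // true: only one stream has written
	opt.recordSend("stream-b")
	fmt.Println(shouldReconnect(true, opt)) // false: two streams have written
}
```

Basing the decision on what the optimizer has actually sent, rather than on the configured option alone, keeps the client's reconnect behavior aligned with what the backend is described as recognizing.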