(#1626) fix for RPC error with BQ nested fields #1638
Merged
+86
−2
First cut here - wanted to open this up for review.
In #1626, some issues with passing a list of dicts to `agate.Table.from_object` are outlined. This PR changes the BigQuery plugin to bypass `from_object`, instead creating the table object explicitly from a specified set of rows and columns. This has the very negative effect of requiring us to serialize iterables as JSON (if we didn't do this manually, Agate would just use their `repr`s, which is pretty unhelpful) -- see the included integration tests for an example.

I think it may be appropriate to convert these nested/repeated records from dicts/lists (respectively) into JSON strings. Previously, dbt would fail with an error if any columns returned by `adapter.execute` were nested or repeated, so just returning without error is a marked improvement over the existing behavior.

I'm not sure if there's a good way to reconcile Agate's behavior with BigQuery's nested/repeated records. Depending on how we choose to proceed, this issue either represents another nail in Agate's coffin for dbt, or a tacit decision to not leverage nested/repeated records on BigQuery. My personal stance is that we should ditch Agate as soon as we're able.
I believe BQ is the only place where a call to `execute` can return data that isn't strictly a list of dicts that map strings onto scalars. We can use `table_from_data_flat` in other adapters to bypass the type inference that Agate does -- that might be a good idea as well, though possibly one for another PR.
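
For reviewers who want the gist without reading the diff, here is a rough sketch of the flattening idea described above: take a list of dicts plus an explicit column list, and JSON-encode any nested or repeated values before building rows. This is not the code in this PR; `flatten_rows` is a hypothetical name, and the `default=str` fallback for non-JSON types is my own assumption.

```python
import json


def flatten_rows(data, column_names):
    """Hypothetical helper (not dbt's actual function): build a list of row
    lists from a list of dicts, JSON-encoding nested/repeated values."""
    rows = []
    for record in data:
        row = []
        for name in column_names:
            value = record.get(name)
            # Nested records arrive as dicts, repeated records as lists.
            # Serialize them so the table holds a string, not a Python repr.
            if isinstance(value, (dict, list, tuple)):
                # default=str is an assumption: it stringifies values such as
                # dates or decimals that json cannot encode natively.
                value = json.dumps(value, default=str)
            row.append(value)
        rows.append(row)
    return rows


data = [
    {"id": 1, "tags": ["a", "b"], "owner": {"name": "x"}},
    {"id": 2, "tags": [], "owner": None},
]
columns = ["id", "tags", "owner"]
flat = flatten_rows(data, columns)
print(flat)
```

The resulting rows and column names could then be handed to `agate.Table(flat, columns)` directly, which skips `from_object` entirely. Agate would still infer column types unless `column_types` is passed as well.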