fix(bigquery): Improve roundtrip of typed STRUCTs #4684
I think I'll actually go ahead and close this for now. The issue is that we canonicalize typed STRUCTs into casts, but during generation we can't tell whether a given cast was written by the user or produced by canonicalization. It's not always safe to transform these casts back into the typed form, e.g.:
bq> SELECT CAST(STRUCT(1 AS old_name) AS STRUCT<new_name INT64>) AS strct;
strct.new_name
1
bq> SELECT STRUCT<new_name INT64>(1 AS old_name) AS strct;
Error: STRUCT constructors cannot specify both an explicit type and field names with AS at [1:36]
This is the reason I went ahead and added the |
PS: For future reference, it seems that typed structs will error only if the top-level fields are named; otherwise it's fine:
bq> SELECT STRUCT<test INT64, bar STRUCT<foo INT64>>(1, STRUCT(2 AS baz));
strct
{
  "strct": {
    "test": "1",
    "bar": {
      "foo": "2"
    }
  }
}
This means that we can reduce the Generator check to a linear scan over the top-level fields, but that still incurs a performance hit for wide STRUCTs:
+++ b/sqlglot/dialects/bigquery.py
@@ -1236,7 +1236,7 @@ class BigQuery(Dialect):
if isinstance(this, exp.Array):
return f"{self.sql(expression, 'to')}{self.sql(this)}"
- if isinstance(this, exp.Struct) and not this.find(exp.PropertyEQ):
+ if isinstance(this, exp.Struct) and not any(isinstance(expr, exp.PropertyEQ) for expr in this.expressions):
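To illustrate why the two checks in the diff differ, here is a minimal, self-contained sketch. It does not use sqlglot: `Node`, `Literal`, `Struct`, `PropertyEQ` and both helper functions are hypothetical stand-ins that only mimic the shape of an expression tree. A deep search also sees named fields inside nested structs, while a top-level scan only looks at the direct children.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    # Hypothetical stand-in for an expression node; not sqlglot's class.
    expressions: List["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        # Depth-first traversal over this node and all of its descendants.
        yield self
        for child in self.expressions:
            yield from child.walk()


class Literal(Node):
    pass


class Struct(Node):
    pass


class PropertyEQ(Node):
    # Models a named field such as `1 AS old_name`.
    pass


def has_named_field_anywhere(struct: Struct) -> bool:
    # Deep check: visits every descendant, including nested structs.
    return any(isinstance(node, PropertyEQ) for node in struct.walk())


def has_named_top_level_field(struct: Struct) -> bool:
    # Shallow check: one linear pass over the direct children only.
    return any(isinstance(expr, PropertyEQ) for expr in struct.expressions)


# STRUCT(1, STRUCT(2 AS baz)): only the nested struct names a field.
nested_only = Struct([Literal(), Struct([PropertyEQ([Literal()])])])

# STRUCT(1 AS old_name): the top-level field itself is named.
top_level = Struct([PropertyEQ([Literal()])])
```

Under these assumptions, `nested_only` is flagged by the deep check but not by the shallow one, which matches the observation that only named top-level fields make the typed constructor fail. The shallow scan is still linear in the number of fields, hence the cost for wide STRUCTs.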
The canonicalization change made in #3751 loses information, which is problematic since, as you've demonstrated, the two forms of struct definition aren't always equivalent in BigQuery. Could we perhaps change BigQuery parsing to add some sort of annotation to such canonicalized |
@sean-rose can you point out how converting the |
It's losing the information about whether the original form was a literal |
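As a rough sketch of the annotation idea raised above, the toy model below records whether a cast came from a typed STRUCT literal so that generation can restore the original form. Nothing here is sqlglot's API: the `Cast` dataclass, the `from_typed_literal` flag, and the `parse_typed_struct` and `generate` helpers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cast:
    # Toy model of a cast node; not sqlglot's exp.Cast.
    struct_sql: str  # e.g. "STRUCT(1)"
    to_type: str  # e.g. "STRUCT<new_name INT64>"
    # Hypothetical annotation: True when the parser canonicalized a typed literal.
    from_typed_literal: bool = False


def parse_typed_struct(type_sql: str, args_sql: str) -> Cast:
    # Canonicalize STRUCT<...>(args) into a cast, but remember where it came from.
    return Cast(
        struct_sql=f"STRUCT({args_sql})",
        to_type=type_sql,
        from_typed_literal=True,
    )


def generate(cast: Cast) -> str:
    if cast.from_typed_literal:
        # Restore the typed-literal form: the type followed by the "(...)" arguments.
        args = cast.struct_sql[len("STRUCT"):]
        return f"{cast.to_type}{args}"
    # A cast written by the user stays an explicit CAST.
    return f"CAST({cast.struct_sql} AS {cast.to_type})"


typed = parse_typed_struct("STRUCT<new_name INT64>", "1")
user_cast = Cast("STRUCT(1 AS old_name)", "STRUCT<new_name INT64>")
```

With such a flag, `generate(typed)` would give back the typed literal, while `generate(user_cast)` would keep the explicit `CAST`, so the invalid typed constructor with named fields is never produced from a user-written cast.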
I hear you; however, I think for this particular case we'll leave it as is, since BigQuery works with either form. I'd suggest overriding the relevant parts of the dialect if this is unacceptable, e.g. if implementing a formatting tool like you mentioned.
Context #4671
cc: @sean-rose
Docs: BQ STRUCT