-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-3698][SQL] Correctly check case sensitivity in GetField #2543
Conversation
Can one of the admins verify this patch? |
resolveNesting(nestedFields, a, resolver), | ||
nestedFields.last)() // Preserve the case of the user's field access. | ||
Some(aliased) | ||
Some(Alias(nestedFields.foldLeft(a: Expression)(UnresolvedGetField), nestedFields.last)()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For something like a.b[0].c.d
, the origin logic here only works for a
and b
. but not c
and d
. So I just simplified the logic here and let the ResolveGetField
rule to do its job.
Would you mind to file a JIRA ticket for this PR? |
QA tests have started for PR 2543 at commit
|
QA tests have finished for PR 2543 at commit
|
override def toString = s"$child.${field.name}" | ||
} | ||
|
||
object GetField { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, I think it might be clearer to keep the resolver logic in the Analyzer rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to put this logic into Analyzer rule, but found some tests depend on GetField(child, fieldName)
, so I have to create this constructor of GetField
. And these two are so similar, so I combine them together. Maybe I should fix those tests instead?
Thanks for working on this! A few minor comments. |
Hi @marmbrus , I have updated my PR according to your comments. Do you mind review it again? |
QA tests have started for PR 2543 at commit
|
QA tests have finished for PR 2543 at commit
|
@@ -73,7 +73,9 @@ case class GetItem(child: Expression, ordinal: Expression) extends Expression { | |||
/** | |||
* Returns the value of fields in the Struct `child`. | |||
*/ | |||
case class GetField(child: Expression, fieldName: String) extends UnaryExpression { | |||
case class GetField(child: Expression, fieldName: String, findField: StructField => Boolean = null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need findField
anymore do we?
@marmbrus As commented before, I think we should handle |
Ping @marmbrus @liancheng I have finished the code locally, if you vote for |
Hi @marmbrus @liancheng, I think it's better to calculate the |
Sorry for the delay merging this, but I have been concerned that we are adding unnecessary complexity to analysis by adding more types of expressions. I've built a simpler solution in #3724 based off the test case that you provided. If that looks reasonable to you I suggest we close this issue. |
This PR is a follow up to #2382
It fix a bug when resolve something like
a.b[0].c.d
, #2382 only do case sensitive check when resolveUnresolved("a.b")
toGetField(Attribute("a"), "b")
, but notc
andd
.