[SPARK-26191][SQL] Control truncation of Spark plans via maxFields parameter #23159
Conversation
ping @hvanhovell
Test build #99342 has finished for PR 23159 at commit
@gatorsmile @cloud-fan Could you look at the changes? They were extracted from another PR: #22429
Test build #99513 has finished for PR 23159 at commit
Test build #99520 has finished for PR 23159 at commit
jenkins, retest this, please
Test build #99527 has finished for PR 23159 at commit
@HyukjinKwon @dongjoon-hyun @srowen @zsxwing Do you have any objections to this PR?
Rather than change every single call to this method, if this should generally be the value of the argument, then why not make it the default value or something?
Test build #99678 has finished for PR 23159 at commit
The new parameter aims to solve the problem of multiple callers each needing a different maximum number of fields. So, a feasible approach is to propagate `maxFields` explicitly.
Ah OK, so not every call would just pass the value `SQLConf.get.maxToStringFields`? It looked like it from the code here, but I didn't examine the whole diff. If it were really always `SQLConf.get.maxToStringFields` and configured that way, this could be simpler, but I suppose it isn't.
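The alternative being discussed above could be sketched roughly as follows. This is a hypothetical illustration, not Spark's actual API: `confMaxToStringFields` stands in for `SQLConf.get.maxToStringFields`, and the default argument lets most callers omit the parameter while special cases still override it.

```scala
// Hypothetical sketch of the "default argument" design: maxFields
// defaults to the session config value, so ordinary callers can omit
// it, while callers with special needs can pass an explicit limit.
// Names here are illustrative stand-ins, not Spark's actual API.
object DefaultArgSketch {
  // Stand-in for SQLConf.get.maxToStringFields.
  def confMaxToStringFields: Int = 25

  def simpleString(planName: String,
                   maxFields: Int = confMaxToStringFields): String =
    s"$planName (showing at most $maxFields fields)"

  def main(args: Array[String]): Unit = {
    println(simpleString("Project"))               // uses the config default
    println(simpleString("Project", Int.MaxValue)) // explicit override
  }
}
```

The trade-off is that a config-reading default hides a dependency on session state inside every call site, whereas explicit propagation (the approach taken in this PR) makes each caller's limit visible.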
```diff
@@ -1777,7 +1777,7 @@ class Analyzer(
       case p if p.expressions.exists(hasGenerator) =>
         throw new AnalysisException("Generators are not supported outside the SELECT clause, but " +
-          "got: " + p.simpleString)
+          "got: " + p.simpleString((SQLConf.get.maxToStringFields)))
```
Nit: are there extra parens here?
Fixed here and in other places.
Test build #99715 has finished for PR 23159 at commit
jenkins, retest this, please
Test build #99735 has finished for PR 23159 at commit
cc @cloud-fan and @gatorsmile.
Looks okay, but let me leave it to @cloud-fan, @gatorsmile and @hvanhovell.
@hvanhovell We are all waiting for your decision ;-). Please review the PR.
Test build #100349 has finished for PR 23159 at commit
I am going to close the PR since nobody is interested in it :-(
@MaxGekk can you resolve the merge conflicts?
LGTM |
Test build #100425 has finished for PR 23159 at commit
merging to master.
[SPARK-26191][SQL] Control truncation of Spark plans via maxFields parameter

## What changes were proposed in this pull request?

In the PR, I propose to add a `maxFields` parameter to all functions involved in the creation of textual representations of Spark plans, such as `simpleString` and `verboseString`. The new parameter restricts the number of fields converted to truncated strings. Any elements beyond the limit are dropped and replaced by a `"... N more fields"` placeholder. The threshold is bumped up to `Int.MaxValue` for `toFile()`.

## How was this patch tested?

Added a test to `QueryExecutionSuite` which checks the impact of `maxFields` on the number of truncated fields in `LocalRelation`.

Closes apache#23159 from MaxGekk/to-file-max-fields.

Lead-authored-by: Maxim Gekk <max.gekk@gmail.com>
Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com>
Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>
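The truncation behavior described in the PR can be illustrated with a self-contained sketch. The `truncatedString` helper below is a simplified stand-in for Spark's internal utility, not its actual implementation; it only shows the "drop the tail and append a `... N more fields` placeholder" idea.

```scala
// Simplified stand-in for the truncation utility that this PR threads
// a maxFields parameter through. Elements beyond the limit are dropped
// and replaced by a "... N more fields" placeholder, as described above.
object TruncatedStringSketch {
  def truncatedString(seq: Seq[String], sep: String, maxFields: Int): String = {
    if (seq.length > maxFields) {
      val more = seq.length - maxFields
      (seq.take(maxFields) :+ s"... $more more fields").mkString(sep)
    } else {
      seq.mkString(sep)
    }
  }

  def main(args: Array[String]): Unit = {
    val attrs = Seq("a#1", "b#2", "c#3", "d#4")
    // With a small limit, the tail collapses into a placeholder.
    println(truncatedString(attrs, ", ", 2))
    // With Int.MaxValue (as toFile() now uses), nothing is dropped.
    println(truncatedString(attrs, ", ", Int.MaxValue))
  }
}
```

A test in the spirit of the one added to `QueryExecutionSuite` would render the same plan with different limits and count how many fields survive in the output.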