
[SPARK-38772][SQL] Formatting the log plan in AdaptiveSparkPlanExec #36045

Closed. wangyum wanted to merge 1 commit from the SPARK-38772 branch into master.

Conversation

wangyum (Member) commented Apr 2, 2022

What changes were proposed in this pull request?

Use sideBySide to format the plan-change log in AdaptiveSparkPlanExec, so the old and new plans are printed in aligned columns with differing lines flagged; a minimal sketch of the helper appears after the After example below.
Before:

12:08:36.876 ERROR org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec: Plan changed from SortMergeJoin [key#13], [a#23], Inner
:- Sort [key#13 ASC NULLS FIRST], false, 0
:  +- ShuffleQueryStage 0
:     +- Exchange hashpartitioning(key#13, 5), ENSURE_REQUIREMENTS, [id=#110]
:        +- *(1) Filter (isnotnull(value#14) AND (value#14 = 1))
:           +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).value, true, false, true) AS value#14]
:              +- Scan[obj#12]
+- Sort [a#23 ASC NULLS FIRST], false, 0
   +- ShuffleQueryStage 1
      +- Exchange hashpartitioning(a#23, 5), ENSURE_REQUIREMENTS, [id=#129]
         +- *(2) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).a AS a#23, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).b AS b#24]
            +- Scan[obj#22]
 to BroadcastHashJoin [key#13], [a#23], Inner, BuildLeft, false
:- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#145]
:  +- ShuffleQueryStage 0
:     +- Exchange hashpartitioning(key#13, 5), ENSURE_REQUIREMENTS, [id=#110]
:        +- *(1) Filter (isnotnull(value#14) AND (value#14 = 1))
:           +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).value, true, false, true) AS value#14]
:              +- Scan[obj#12]
+- ShuffleQueryStage 1
   +- Exchange hashpartitioning(a#23, 5), ENSURE_REQUIREMENTS, [id=#129]
      +- *(2) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).a AS a#23, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).b AS b#24]
         +- Scan[obj#22]

After:

15:57:59.481 ERROR org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec: Plan changed:
!SortMergeJoin [key#13], [a#23], Inner                                                                                                                                                                                                                                                                                                                                         BroadcastHashJoin [key#13], [a#23], Inner, BuildLeft, false
!:- Sort [key#13 ASC NULLS FIRST], false, 0                                                                                                                                                                                                                                                                                                                                    :- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#145]
 :  +- ShuffleQueryStage 0                                                                                                                                                                                                                                                                                                                                                     :  +- ShuffleQueryStage 0
 :     +- Exchange hashpartitioning(key#13, 5), ENSURE_REQUIREMENTS, [id=#110]                                                                                                                                                                                                                                                                                                 :     +- Exchange hashpartitioning(key#13, 5), ENSURE_REQUIREMENTS, [id=#110]
 :        +- *(1) Filter (isnotnull(value#14) AND (value#14 = 1))                                                                                                                                                                                                                                                                                                              :        +- *(1) Filter (isnotnull(value#14) AND (value#14 = 1))
 :           +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).value, true, false, true) AS value#14]   :           +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).value, true, false, true) AS value#14]
 :              +- Scan[obj#12]                                                                                                                                                                                                                                                                                                                                                :              +- Scan[obj#12]
!+- Sort [a#23 ASC NULLS FIRST], false, 0                                                                                                                                                                                                                                                                                                                                      +- ShuffleQueryStage 1
!   +- ShuffleQueryStage 1                                                                                                                                                                                                                                                                                                                                                        +- Exchange hashpartitioning(a#23, 5), ENSURE_REQUIREMENTS, [id=#129]
!      +- Exchange hashpartitioning(a#23, 5), ENSURE_REQUIREMENTS, [id=#129]                                                                                                                                                                                                                                                                                                         +- *(2) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).a AS a#23, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).b AS b#24]
!         +- *(2) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).a AS a#23, knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true])).b AS b#24]                                                                                                                                  +- Scan[obj#22]
!            +- Scan[obj#22] 
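The "!" prefix in the output above marks lines that differ between the old and new plan. As a minimal, runnable sketch of how sideBySide (from org.apache.spark.sql.catalyst.util) produces this layout, assuming spark-catalyst is on the classpath and using shortened stand-in plan strings:

```scala
// Minimal, runnable sketch of the sideBySide helper this PR adopts.
// Assumes spark-catalyst is on the classpath; the plan strings below are
// shortened stand-ins for the real treeString output.
import org.apache.spark.sql.catalyst.util.sideBySide

object SideBySideDemo extends App {
  val before =
    """SortMergeJoin [key#13], [a#23], Inner
      |:- Sort [key#13 ASC NULLS FIRST], false, 0
      |:  +- ShuffleQueryStage 0
      |+- Sort [a#23 ASC NULLS FIRST], false, 0""".stripMargin

  val after =
    """BroadcastHashJoin [key#13], [a#23], Inner, BuildLeft, false
      |:- BroadcastExchange HashedRelationBroadcastMode(...)
      |:  +- ShuffleQueryStage 0
      |+- ShuffleQueryStage 1""".stripMargin

  // Pads every left-hand line to a common width and prefixes lines that
  // differ between the two sides with '!', yielding a two-column diff.
  println(sideBySide(before, after).mkString("\n"))
}
```

Each left-hand line is padded to a common width, so the new plan lines up as a second column next to the old one.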

Why are the changes needed?

Enhances readability: the old and new plans are aligned side by side, so the lines that changed stand out at a glance.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manual testing.

@github-actions github-actions bot added the SQL label Apr 2, 2022
wangyum (Member, Author) commented Apr 6, 2022

cc @cloud-fan

wangyum closed this in commit b57c93b on Apr 7, 2022
wangyum (Member, Author) commented Apr 7, 2022

Merged to master.

wangyum deleted the SPARK-38772 branch on April 7, 2022 at 00:55