Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-46124][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray #44041

Conversation

LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Nov 27, 2023

What changes were proposed in this pull request?

There is a behavioral difference between Scala 2.13 and 2.12 for explicit ArrayOps.toSeq calls, similar to the implicit conversion from Array to Seq.

In Scala 2.12, ArrayOps.toSeq will return thisCollection , and use implicit conversion rules to wrap the Array as mutable.WrappedArray, this process does not involve any collection copy:

Welcome to Scala 2.12.18 (OpenJDK 64-Bit Server VM, Java 17.0.9).
Type in expressions for evaluation. Or try :help.

scala> Array(1,2,3).toSeq
res0: Seq[Int] = WrappedArray(1, 2, 3)

However, in Scala 2.13, it returns an immutable.ArraySeq that with collection copy.

Since we have always used the non-collection copy behavior for this explicit conversion in the era of Scala 2.12, it is safe to assume that no collection copy is needed for Scala 2.13.

Therefore, this pr replaces explicit ArrayOps.toSeq in the Spark code with s.c.immutable.ArraySeq.unsafeWrapArray to avoid a collection copy, and this pr only involves changes to the production code, and does not involve changes to the test code.

Why are the changes needed?

Replace ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray to save a collection copy, which has potential benefits for performance."

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass GitHub Action

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang LuciferYang marked this pull request as draft November 27, 2023 13:46
@LuciferYang LuciferYang changed the title Replace ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray [WIP] Replace ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray Nov 27, 2023
@LuciferYang LuciferYang changed the title [WIP] Replace ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray [WIP][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI][EXAMPLES] Replace ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray Nov 27, 2023
@LuciferYang LuciferYang changed the title [WIP][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI][EXAMPLES] Replace ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray [WIP][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI][EXAMPLES] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray Nov 27, 2023
@LuciferYang LuciferYang changed the title [WIP][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI][EXAMPLES] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray [WIP][SPARK-46124][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI][EXAMPLES] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray Nov 27, 2023
@github-actions github-actions bot removed the EXAMPLES label Nov 28, 2023
@LuciferYang LuciferYang changed the title [WIP][SPARK-46124][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI][EXAMPLES] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray [WIP][SPARK-46124][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray Nov 28, 2023
@LuciferYang LuciferYang changed the title [WIP][SPARK-46124][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray [SPARK-46124][CORE][SQL][SS][CONNECT][DSTREAM][MLLIB][ML][PYTHON][R][AVRO][K8S][YARN][UI] Replace explicit ArrayOps#toSeq with s.c.immutable.ArraySeq.unsafeWrapArray Nov 28, 2023
@LuciferYang
Copy link
Contributor Author

cc @srowen, could you help to review this pr if you have time. Thanks

Similar to #43666, the purpose of this pr is also to save a collection copy by using immutable.ArraySeq.unsafeWrapArray . This pr does not modify similar cases in the test code, and I do not plan to fix test code parts for the time being.

@srowen
Copy link
Member

srowen commented Nov 28, 2023

Seems reasonable to me, continuing the same refactoring. Is it ready?

@LuciferYang LuciferYang marked this pull request as ready for review November 28, 2023 17:16
@LuciferYang
Copy link
Contributor Author

Yes, it's ready

@LuciferYang
Copy link
Contributor Author

@srowen this PR Is ready, should we push it forward?

@srowen srowen closed this in 8a4890d Nov 29, 2023
@srowen
Copy link
Member

srowen commented Nov 29, 2023

Merged to master

@LuciferYang
Copy link
Contributor Author

Thanks @srowen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants