Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SPARK-48021][ML][BUILD] Add
--add-modules=jdk.incubator.vector
to …
…`JavaModuleOptions` ### What changes were proposed in this pull request? The pr aims to: - add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` - remove `jdk.incubator.foreign` and `-Dforeign.restricted=warn` from `SparkBuild.scala` ### Why are the changes needed? 1.`jdk.incubator.vector` First introduction: apache#30810 https://github.com/apache/spark/pull/30810/files#diff-6f545c33f2fcc975200bf208c900a600a593ce6b170180f81e2f93b3efb6cb3e <img width="1045" alt="image" src="https://github.com/apache/spark/assets/15246973/6ac7919a-5d82-475c-b8a2-7d9de71acacc"> Why should we add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions`, Because when we only add `--add-modules=jdk.incubator.vector` to `SparkBuild.scala`, it will only take effect when compiling, as follows: ``` build/sbt "mllib-local/Test/runMain org.apache.spark.ml.linalg.BLASBenchmark" ... ``` <img width="619" alt="image" src="https://github.com/apache/spark/assets/15246973/54d5f55f-cefe-4126-b255-69488f8699a6"> However, when we use `spark-submit`, it is as follows: ``` ./bin/spark-submit --class org.apache.spark.ml.linalg.BLASBenchmark /Users/panbingkun/Developer/spark/spark-community/mllib-local/target/scala-2.13/spark-mllib-local_2.13-4.0.0-SNAPSHOT-tests.jar ``` <img width="1399" alt="image" src="https://github.com/apache/spark/assets/15246973/8e02fa93-fef4-4cdc-96bd-908b3e9baea1"> Obviously, `--add-modules=jdk.incubator.vector` does not take effect in the `Spark runtime`, so I propose adding `--add-modules=jdk.incubator.vector` to the `JavaModuleOptions`(`Spark runtime options`) so that we can improve `performance` by using `hardware-accelerated BLAS operations` by default. After this patch(add `--add-modules=jdk.incubator.vector` to the `JavaModuleOptions`), as follows: <img width="1399" alt="image" src="https://github.com/apache/spark/assets/15246973/da7aa494-0d3c-4c60-9991-e7cd29a1cec5"> 2.`jdk.incubator.foreign` and `-Dforeign.restricted=warn` A.First introduction: apache#32253 https://github.com/apache/spark/pull/32253/files#diff-6f545c33f2fcc975200bf208c900a600a593ce6b170180f81e2f93b3efb6cb3e <img width="1041" alt="image" src="https://github.com/apache/spark/assets/15246973/3f526019-c389-4e60-ab2a-7777f8e99cfb"> Use `dev.ludovic.netlib:blas:1.3.2`, the class `ForeignLinkerBLAS` uses `jdk.incubator.foreign.*` in this version, so we need to add `jdk.incubator.foreign` and `-Dforeign.restricted=warn` to `SparkBuild.scala` https://github.com/apache/spark/pull/32253/files#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8 <img width="497" alt="image" src="https://github.com/apache/spark/assets/15246973/4fd35e96-0da2-4456-a3f6-6b57ad2e9b64"> https://github.com/luhenry/netlib/blob/v1.3.2/blas/src/main/java/dev/ludovic/netlib/blas/ForeignLinkerBLAS.java#L36 <img width="743" alt="image" src="https://github.com/apache/spark/assets/15246973/4b7e3bd1-4650-4c7d-bdb4-c1761d48d478"> However, with the iterative development of `dev.ludovic.netlib`, `ForeignLinkerBLAS` has experienced one `major` change, as following: luhenry/netlib@48e923c <img width="452" alt="image" src="https://github.com/apache/spark/assets/15246973/7ba30b19-00c7-4cc4-bea7-a6ab4b326ad8"> From now on (V3.0.0), `jdk.incubator.foreign.*` will not be used in `dev.ludovic.netlib` Currently, Spark has used the `dev.ludovic.netlib` of version `v3.0.3`. In this version, `ForeignLinkerBLAS` has be removed. https://github.com/apache/spark/blob/master/pom.xml#L191 Double check (`jdk.incubator.foreign` cannot be found in the `netlib` source code): <img width="674" alt="image" src="https://github.com/apache/spark/assets/15246973/5c6c6d73-6a5d-427a-9fb4-f626f02335ca"> So we can completely remove options `jdk.incubator.foreign` and `-Dforeign.restricted=warn`. B.For JDK 21 (PS: This is to explain the historical reasons for the differences between the current code logic and the initial ones) (Just because `Spark` made changes to support `JDK 21`) https://issues.apache.org/jira/browse/SPARK-44088 <img width="1350" alt="image" src="https://github.com/apache/spark/assets/15246973/34e7e7e8-4e72-470e-abc0-d79406ad25e5"> ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? - Manually test - Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#46246 from panbingkun/test_spark_build. Authored-by: panbingkun <panbingkun@baidu.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
- Loading branch information