Replies: 6 comments
-
After adding the spark.executor.resource.gpu.amount config option, I successfully ran a Spark LR application on the GPU. It seems there is no need for an ML plugin, right?
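For reference, the GPU resource scheduling settings mentioned above are passed at submit time. A minimal sketch, assuming a standalone or YARN cluster with one GPU per node (the application jar name and the exact amounts are placeholders for your own setup; `getGpusResources.sh` is the discovery script shipped in the Apache Spark examples):

```shell
# Request one GPU per executor and let up to four concurrent tasks share it.
spark-submit \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=0.25 \
  --conf spark.executor.resource.gpu.discoveryScript=./getGpusResources.sh \
  your-lr-app.jar
```

Fractional `spark.task.resource.gpu.amount` values control how many tasks may run on the same GPU at once.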
-
The RAPIDS Accelerator for Apache Spark currently accelerates DataFrame operations that are often associated with ETL processing (i.e. the data cleanup/preparation steps prior to ML training). So yes, it can accelerate ML applications as-is, but it does not accelerate Spark's MLlib operations directly, only the standard DataFrame operations. See the list of supported operations for details on which Spark DataFrame operations the RAPIDS Accelerator places on the GPU. We have discussed the possibility of targeting Spark's MLlib more directly than just the ETL operations covered by the existing plugin, but it is not something we are actively working on at this time.
-
Hello, may I ask a question: when you submit the LR application to run on Spark with GPUs, can you also see the GPU being used through nvidia-smi?
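In case it helps, GPU activity during a run can be checked from the executor node with nvidia-smi; a simple sketch of two common ways to watch it while the job is running:

```shell
# Poll overall GPU utilization and memory once per second in the terminal.
watch -n 1 nvidia-smi

# Or log utilization over time in CSV form for later inspection.
nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used --format=csv -l 1
```

If the job is actually running on the GPU you should see the Spark executor JVM listed in the nvidia-smi process table and nonzero utilization during GPU stages.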
-
Hi, I want to know whether you plan to accelerate Spark MLlib on the GPU in the future, or whether that work is already in progress. I know that the ND4J library in DL4J (https://github.com/eclipse/deeplearning4j) implements many mathematical operations that can be computed on the GPU. Would it be possible to use Java to implement some ML algorithms through this library to achieve acceleration?
-
It is totally possible to develop Java bindings for cuML and expose them through Apache Spark compatible APIs. The real question is what benefit that would give an end user. Spark already integrates with a number of DL/ML libraries, and most of them are Python based; in fact, we have seen a real shift towards Python as the language of choice for DL/ML workloads. As was stated in your other question #1230 (comment), there are already ways to access these algorithms through Spark. It would be good if you could try that out and then comment on #1230 about whether it works for you and what pain points remain for your particular use case. We can then look at those pain points and decide whether to clean up that integration or go the route of producing Java APIs for these algorithms.
-
In fact, I want to implement a logistic regression in Java and then submit it to Spark to run on the GPU.
-
Hi,
It seems that a user needs plugins so that NVIDIA RAPIDS can be combined with GPU resources to accelerate Spark applications.
I found that this project currently provides SQLPlugin and SPARK_CUDF_JAR. Does the project plan to provide an MLPlugin to accelerate Spark ML applications? Thanks.
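For context, the existing SQL plugin is enabled through the standard `spark.plugins` mechanism together with the plugin jars. A minimal sketch, with the jar versions left as placeholders to fill in from your own download (older releases ship the cudf jar separately, which is what SPARK_CUDF_JAR refers to):

```shell
# Enable the RAPIDS Accelerator SQL plugin for a GPU-scheduled application.
spark-submit \
  --jars rapids-4-spark_2.12-<version>.jar,cudf-<version>.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.executor.resource.gpu.amount=1 \
  your-app.jar
```

With this in place the plugin rewrites supported DataFrame/SQL operations to run on the GPU, without any change to the application code itself.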