[BUG] Clean up unused and duplicated 'org/roaringbitmap' folder in the spark3xx shims #11175

Closed
NvTimLiu opened this issue Jul 12, 2024 · 2 comments · Fixed by #11185
Labels
bug Something isn't working

Comments

@NvTimLiu
Collaborator

Describe the bug
We should clean up the unused, duplicated 'spark3xx/META-INF/versions/11/org/roaringbitmap/' folders in the spark3xx shims.
The same class already exists under ./spark-shared/com/nvidia/shaded/spark/org/roaringbitmap/ in the dist jar:

./spark324/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark321/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark332db/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark330db/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark340/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark321cdh/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark320/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark332cdh/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark332/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark333/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark330cdh/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark342/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark330/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark334/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark341db/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark331/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark323/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark341/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark343/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark322/META-INF/versions/11/org/roaringbitmap/ArraysShim.class
./spark-shared/com/nvidia/shaded/spark/org/roaringbitmap/ArraysShim.class
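
A listing like the one above can be checked for byte-identical copies with a short script. The following is a minimal sketch, assuming the dist jar has already been unpacked into a local directory. The function name `find_versioned_classes` is hypothetical and is not part of the project's build scripts.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_versioned_classes(root):
    """Group class files under <top-level dir>/META-INF/versions/<N>/.

    Returns {path_inside_top_level_dir: {sha256_hex: [top_level_dir, ...]}}.
    `root` is assumed to be the directory the dist jar was extracted into.
    """
    groups = defaultdict(lambda: defaultdict(list))
    root = Path(root)
    for path in sorted(root.glob("*/META-INF/versions/*/**/*.class")):
        rel = path.relative_to(root)
        top_level = rel.parts[0]            # e.g. "spark320"
        inside = "/".join(rel.parts[1:])    # e.g. "META-INF/versions/11/org/..."
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[inside][digest].append(top_level)
    return groups


if __name__ == "__main__":
    import sys

    for inside, by_hash in find_versioned_classes(sys.argv[1]).items():
        for digest, shims in by_hash.items():
            print(f"{inside} {digest[:12]} {len(shims)} copies: {', '.join(shims)}")
```

If a path maps to a single hash shared by every shim directory, the copies are byte-identical and only one needs to be kept. More than one hash for the same path means the copies differ and need a closer look before removal.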

In addition, the duplicated files in roaringbitmap/ cause JaCoCo errors such as "Can't add different class with same name":
https://github.com/NVIDIA/spark-rapids/blob/branch-24.08/jenkins/Jenkinsfile-blossom.premerge#L196-L200

13:05:04  java.lang.IllegalStateException: Can't add different class with same name: com/nvidia/shaded/spark/org/roaringbitmap/ArraysShim
13:05:04  	at org.jacoco.core.analysis.CoverageBuilder.visitCoverage(CoverageBuilder.java:106)
13:05:04  	at org.jacoco.core.analysis.Analyzer$1.visitEnd(Analyzer.java:99)
13:05:04  	at org.objectweb.asm.ClassVisitor.visitEnd(ClassVisitor.java:377)
13:05:04  	at org.jacoco.core.internal.flow.ClassProbesAdapter.visitEnd(ClassProbesAdapter.java:100)
13:05:04  	at org.objectweb.asm.ClassReader.accept(ClassReader.java:725)
13:05:04  	at org.objectweb.asm.ClassReader.accept(ClassReader.java:401)
13:05:04  	at org.jacoco.core.analysis.Analyzer.analyzeClass(Analyzer.java:116)
13:05:04  	at org.jacoco.core.analysis.Analyzer.analyzeClass(Analyzer.java:132)
13:05:04  Caused: java.io.IOException: Error while analyzing /var/jenkins/jobs/rapids_premerge-github/builds/9737/jacoco/classes/org/roaringbitmap/ArraysShim.class.
13:05:04  	at org.jacoco.core.analysis.Analyzer.analyzerError(Analyzer.java:162)
NvTimLiu added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels on Jul 12, 2024
NvTimLiu added a commit to NvTimLiu/spark-rapids that referenced this issue Jul 12, 2024
To fix:  NVIDIA#11175

Clean up unused and duplicated 'org/roaringbitmap' folders in the spark3xx shims

Signed-off-by: Tim Liu <timl@nvidia.com>
NvTimLiu added a commit to NvTimLiu/spark-rapids that referenced this issue Jul 12, 2024
To fix: NVIDIA#11175

Clean up unused and duplicated 'org/roaringbitmap' in the spark320 shim folder to work around the JACOCO error 'different class with same name', after we drop 31x shims and change the default shim to spark320

Signed-off-by: Tim Liu <timl@nvidia.com>
NvTimLiu added a commit to NvTimLiu/spark-rapids that referenced this issue Jul 12, 2024
To fix: NVIDIA#11175

Clean up unused and duplicated 'org/roaringbitmap' in the spark320 shim folder to work around the JACOCO error 'different class with same name', after we drop 31x shims and change the default shim to spark320

Signed-off-by: Tim Liu <timl@nvidia.com>
@pxLi pxLi reopened this Jul 15, 2024
@pxLi
Collaborator

pxLi commented Jul 15, 2024

cc @liurenjie1024 to help, thanks.

@gerashegalov
Collaborator

This looks like the first occurrence of a multi-release jar among our dependencies. binary-dedupe.sh does not handle it yet.
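
For context, a multi-release jar declares `Multi-Release: true` in its manifest and stores Java-version-specific classes under `META-INF/versions/<N>/`, which is where the extra `ArraysShim.class` copies come from. Below is a minimal sketch of how such a dependency jar could be spotted ahead of time. It is not what binary-dedupe.sh does, and the function name `multi_release_entries` is hypothetical.

```python
import zipfile


def multi_release_entries(jar_path):
    """Return (declared, entries) for a jar file.

    declared: True if the manifest contains 'Multi-Release: true'.
    entries:  names of files stored under META-INF/versions/.
    """
    prefix = "META-INF/versions/"
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
        declared = False
        if "META-INF/MANIFEST.MF" in names:
            manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8", "replace")
            for line in manifest.splitlines():
                key, _, value = line.partition(":")
                # Manifest attribute names are case-insensitive.
                if key.strip().lower() == "multi-release":
                    declared = value.strip().lower() == "true"
        # Skip directory entries, which end with "/".
        entries = [n for n in names if n.startswith(prefix) and not n.endswith("/")]
    return declared, entries


if __name__ == "__main__":
    import sys

    declared, entries = multi_release_entries(sys.argv[1])
    print("Multi-Release declared:", declared)
    for name in entries:
        print(name)
```

Running this over each dependency jar would show which ones carry versioned entries that a dedupe step has to account for.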

mattahrens removed the "? - Needs Triage" (Need team to review and classify) label on Jul 16, 2024