Could not initialize class org.apache.pdfbox.pdmodel.PDDocument in quarkus-quickstart/tika-quickstart #198

Closed
leonas5555 opened this issue Aug 15, 2024 · 16 comments
Labels
question Further information is requested

Comments

@leonas5555

This error occurs in the current quarkus-quickstart/tika-quickstart (tested by running a local container built from Dockerfile.native), with this line included in the Dockerfile:

RUN microdnf update && microdnf install freetype fontconfig && microdnf clean all

Compiled with:

./mvnw package -Dnative -Dquarkus.native.container-build=true

Quarkus 3.13.2
Tika extension 2.0.3

2024-08-15 09:36:03,369 ERROR [io.qua.ver.htt.run.QuarkusErrorHandler] (executor-thread-1) HTTP Request to /parse/text failed, error id: 059e29a7-5e9c-4df4-9f9d-4f903303d912-2: java.lang.NoClassDefFoundError: Could not initialize class org.apache.pdfbox.pdmodel.PDDocument
2024-08-15T09:36:03.370365335Z 	at org.apache.tika.parser.pdf.PDFParser.getPDDocument(PDFParser.java:498)
2024-08-15T09:36:03.370374339Z 	at org.apache.tika.parser.pdf.PDFParser.getPDDocument(PDFParser.java:477)
2024-08-15T09:36:03.370377891Z 	at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:191)
2024-08-15T09:36:03.370380921Z 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
2024-08-15T09:36:03.370383588Z 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
2024-08-15T09:36:03.370386195Z 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:203)
2024-08-15T09:36:03.370388985Z 	at io.quarkus.tika.TikaParser.parseStream(TikaParser.java:85)
2024-08-15T09:36:03.370391582Z 	at io.quarkus.tika.TikaParser.parse(TikaParser.java:44)
2024-08-15T09:36:03.370394260Z 	at io.quarkus.tika.TikaParser.parse(TikaParser.java:40)
2024-08-15T09:36:03.370396910Z 	at io.quarkus.tika.TikaParser.parse(TikaParser.java:32)
2024-08-15T09:36:03.370399716Z 	at io.quarkus.tika.TikaParser.getText(TikaParser.java:48)
2024-08-15T09:36:03.370416728Z 	at org.acme.tika.TikaParserResource.extractText(TikaParserResource.java:32)
2024-08-15T09:36:03.370419197Z 	at org.acme.tika.TikaParserResource$quarkusrestinvoker$extractText_dea0ba98cd1e2079a3e6e6ba3cb8339aa2d36b03.invoke(Unknown Source)
2024-08-15T09:36:03.370421691Z 	at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
2024-08-15T09:36:03.370423911Z 	at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:141)
2024-08-15T09:36:03.370425977Z 	at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
2024-08-15T09:36:03.370428053Z 	at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:635)
2024-08-15T09:36:03.370430012Z 	at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2516)
2024-08-15T09:36:03.370433559Z 	at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2495)
2024-08-15T09:36:03.370435655Z 	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1521)
2024-08-15T09:36:03.370437622Z 	at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
2024-08-15T09:36:03.370439534Z 	at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
2024-08-15T09:36:03.370441478Z 	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
2024-08-15T09:36:03.370443398Z 	at java.base@21.0.4/java.lang.Thread.runWith(Thread.java:1596)
2024-08-15T09:36:03.370445329Z 	at java.base@21.0.4/java.lang.Thread.run(Thread.java:1583)
2024-08-15T09:36:03.370447245Z 	at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:896)
2024-08-15T09:36:03.370449818Z 	at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:872)
@melloware
Contributor

Although ours is still running Quarkus 3.8 LTS, not 3.13.2, so I wonder if that is the issue. Let me try it.

@melloware
Contributor

I put up a Draft PR to watch the Native build run: https://github.com/quarkiverse/quarkus-tika/actions/runs/10403201862

@melloware melloware added the question Further information is requested label Aug 15, 2024
@melloware
Contributor

Looks like it passed, @leonas5555?

@leonas5555
Author

Yeah, I see. I haven't yet found any reason why my workflow:

./mvnw package -Dnative -Dquarkus.native.container-build=true 
docker build -f src/main/docker/Dockerfile.native -t quarkus/tika-quickstart .
docker run -i --rm -p 8080:8080 quarkus/tika-quickstart
curl -v -X POST -H "Content-type: application/pdf" --data-binary @target/classes/quarkus.pdf http://localhost:8080/parse/text

produced the error. I'll keep investigating.

Thank you for your effort!

@leonas5555
Author

leonas5555 commented Aug 15, 2024

The only difference I see is that the native app for the integration tests runs directly on the builder image rather than on, e.g., registry.access.redhat.com/ubi8/ubi-minimal:8.9 (the base image used by Dockerfile.native in src/main/docker):

docker run --env LANG=C --rm --user 1001:127 -v /home/runner/work/quarkus-tika/quarkus-tika/integration-tests/target/quarkus-tika-integration-tests-999-SNAPSHOT-native-image-source-jar:/project:z --entrypoint /bin/bash quay.io/quarkus/ubi-quarkus-mandrel-builder-image:jdk-21 -c objcopy --strip-debug quarkus-tika-integration-tests-999-SNAPSHOT-runner

@leonas5555
Author

There is one more error in the log:
2024-08-15 13:33:47,396 ERROR [io.qua.ver.htt.run.QuarkusErrorHandler] (executor-thread-1) HTTP Request to /parse/text failed, error id: f6626477-349b-4516-ac8b-b680baeea31e-1: java.lang.UnsatisfiedLinkError: No awt in java.library.path
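
Note on the two errors: a NoClassDefFoundError saying "Could not initialize class" is what gets reported on later attempts once a class's static initializer has already failed, so the first ERROR entry after startup usually carries the root cause (here, the UnsatisfiedLinkError). A minimal sketch for pulling that first entry out of a saved log; first_error and app.log are hypothetical names, not part of the quickstart:

```shell
# Hypothetical helper: print the first ERROR line of a saved log file.
# Assumes the Quarkus default log format, where the level is surrounded by spaces.
first_error() {
  grep -m 1 ' ERROR ' "$1"
}

# Possible usage (container name is a placeholder):
#   docker logs my-container > app.log 2>&1
#   first_error app.log
```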

@melloware
Contributor

Ahh yeah, your image must not be able to load AWT, and the ubi-minimal one can. See: https://github.com/quarkiverse/quarkus-tika/pull/15/files

I believe AWT is required for this to work.

@leonas5555
Author

Yeah, awt is listed under "Installed features" when the ubi-minimal:8.9 container starts:

2024-08-15 13:33:41,844 INFO  [io.quarkus] (main) tika-quickstart 1.0.0-SNAPSHOT native (powered by Quarkus 3.13.2) started in 0.026s. Listening on: http://0.0.0.0:8080
2024-08-15 13:33:41,845 INFO  [io.quarkus] (main) Profile prod activated. 
2024-08-15 13:33:41,845 INFO  [io.quarkus] (main) Installed features: [awt, cdi, poi, rest, smallrye-context-propagation, tika, vertx]
2024-08-15 13:33:47,396 ERROR [io.qua.ver.htt.run.QuarkusErrorHandler] (executor-thread-1) HTTP Request to /parse/text failed, error id: f6626477-349b-4516-ac8b-b680baeea31e-1: java.lang.UnsatisfiedLinkError: No awt in java.library.path

@melloware
Contributor

Yes, but I think the OS itself needs something to make AWT work, not just the AWT library being installed. The error "No awt in java.library.path" basically means your native container cannot load the AWT shared libraries.
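
A quick way to see whether the build produced any shared objects at all, sketched under the assumption that the native build writes them next to the runner in target/ (the helper name check_so_files is made up for illustration):

```shell
# Hypothetical helper: count the .so files directly inside a directory
# (defaults to target/) and fail if there are none.
check_so_files() {
  dir="${1:-target}"
  # tr strips the padding some wc implementations add around the number
  count=$(find "$dir" -maxdepth 1 -type f -name '*.so' | wc -l | tr -d ' ')
  if [ "$count" -eq 0 ]; then
    echo "no .so files in $dir"
    return 1
  fi
  echo "found $count .so file(s) in $dir"
}

# Possible usage after ./mvnw package -Dnative ... :
#   check_so_files target
```

If this reports files in target/ but the container still fails, the files are probably not making it into the image.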

@melloware
Contributor

you can read all about it here: quarkusio/quarkus#35256

@melloware
Contributor

also see this note for POI: https://github.com/quarkiverse/quarkus-poi?tab=readme-ov-file#docker

Typically .so files get stripped from the native container, but AWT requires them, so make sure your .dockerignore does not exclude them.

@melloware
Contributor

@leonas5555 is this safe to close? Did you get it resolved?

@leonas5555
Author

@melloware I will take a look this week and report back here. Thank you!

@leonas5555
Author

I confirmed that the tika-quickstart project works well with the settings from awt-graphics-rest-quickstart applied. The issue can be closed. Thank you!

Dockerfile.native:

FROM registry.access.redhat.com/ubi8/ubi-minimal:8.7
RUN microdnf install freetype fontconfig \
    && microdnf clean all
WORKDIR /work/
RUN chown 1001 /work \
    && chmod "g+rwX" /work \
    && chown 1001:root /work
# Shared objects to be dynamically loaded at runtime as needed
COPY --chown=1001:root target/*.properties target/*.so /work/
COPY --chown=1001:root target/*-runner /work/application
# Permissions fix for Windows
RUN chmod "ugo+x" /work/application
EXPOSE 8080
USER 1001

CMD ["./application", "-Dquarkus.http.host=0.0.0.0"]

.dockerignore file:

*
!target/*-runner
!target/*.so
!target/*.properties
!target/*-runner.jar
!target/lib/*
!target/quarkus-app/
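
To double-check that both files let the shared objects through, something like the following could be run from the project root. This is only a sketch: check_so_config is a hypothetical helper, and the grep patterns assume the exact line shapes shown above.

```shell
# Hypothetical check: the Dockerfile must COPY target/*.so and the
# .dockerignore must re-include it with a "!" exception.
check_so_config() {
  dockerfile="$1"
  ignorefile="$2"
  status=0
  # Basic regex: \* and \. match a literal "*" and "."
  if ! grep -q '^COPY .*target/\*\.so' "$dockerfile"; then
    echo "Dockerfile does not COPY target/*.so"
    status=1
  fi
  # -x: whole-line match, -F: fixed string (no regex)
  if ! grep -qxF '!target/*.so' "$ignorefile"; then
    echo ".dockerignore does not re-include target/*.so"
    status=1
  fi
  return "$status"
}

# Possible usage:
#   check_so_config src/main/docker/Dockerfile.native .dockerignore
```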

@melloware
Contributor

Awesome!
