drip causes bash script to hang #95

Open
ghost opened this issue Apr 28, 2016 · 1 comment

ghost commented Apr 28, 2016

I'm using Tabula (more specifically the command-line version, tabula-java) to extract data from PDFs. I have a bash script which calls tabula-java a total of four times per PDF. It's a slow process (10 sec per PDF). I have almost 200K PDFs to process, so I was hoping to see some speed-up by using drip.
Unfortunately, my script doesn't like drip. When I pipe tabula's output to tr (translate), the script hangs inside tr. Here's one of the tabula calls where the piped tr hangs:
export id_value=$(drip -cp tabula-0.8.0-jar-with-dependencies.jar technology.tabula.CommandLineApp -a 240.593,124.695,264.308,227.97 -p 1 $filename | tr -d '\r\n')
When I say this "hangs" I mean that it enters but never exits tr. Control-C will get me back to the prompt.
The script works just fine when I avoid drip and call tabula through java:
export id_value=$(java -cp tabula-0.8.0-jar-with-dependencies.jar technology.tabula.CommandLineApp -a 240.593,124.695,264.308,227.97 -p 1 $filename | tr -d '\r\n')
Details: OS X 10.8.5, tabula-java 0.8.0
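
A hang of this shape is consistent with tr waiting for end-of-file: a pipe reader only sees EOF once every process holding the write end has closed it. The sketch below reproduces that behaviour without drip or Java. The `sleep 2` stand-in for a lingering background process is an assumption made for illustration, not a confirmed diagnosis of what drip does.

```shell
# Illustration only: a background child that inherits stdout keeps the pipe open.
# `sleep 2` is a hypothetical stand-in for any lingering process; it is not drip.

# Case 1: the background child inherits the pipe, so tr blocks until it exits.
start=$(date +%s)
held=$( { sleep 2 & printf 'value\r\n'; } | tr -d '\r\n' )
held_elapsed=$(( $(date +%s) - start ))

# Case 2: the child's stdout is redirected away, so tr sees EOF once printf is done.
start=$(date +%s)
freed=$( { sleep 2 >/dev/null & printf 'value\r\n'; } | tr -d '\r\n' )
freed_elapsed=$(( $(date +%s) - start ))

echo "inherited pipe: '$held' after ${held_elapsed}s"
echo "redirected:     '$freed' after ${freed_elapsed}s"
```

Both cases capture the same text; only the time spent waiting in tr differs. If something similar is going on here, the hang would last for as long as the other process keeps the pipe open.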

Collaborator

headius commented May 2, 2016

Can you get thread dumps of the processes involved? That would at least let us see where the Java processes are stuck. My guess is that some stdio buffering is preventing this from working nicely.
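
One way to collect those dumps, as a rough sketch: it assumes the JDK tools `jps` and `jstack` are on the PATH and that the stuck JVMs are visible to `jps` for the current user. The function name `dump_java_threads` and the output layout are hypothetical choices for this example.

```shell
# Sketch: save a thread dump for every JVM visible to the current user.
# Assumes the JDK tools `jps` and `jstack` are on PATH; the function name and
# output layout are hypothetical choices made for this example.
dump_java_threads() {
    out_dir=${1:-thread-dumps}
    mkdir -p "$out_dir" || return 1
    # `jps -l` prints one "<pid> <main class or jar>" line per running JVM.
    jps -l | while read -r pid name; do
        case $name in
            *Jps*) continue ;;   # skip the jps process itself
        esac
        # `jstack <pid>` prints that JVM's thread stacks to stdout.
        jstack "$pid" > "$out_dir/$pid.txt" 2>&1
        echo "saved $out_dir/$pid.txt ($name)"
    done
}
```

Run it from a second terminal while the script is stuck, for example `dump_java_threads /tmp/drip-dumps`, and attach the resulting files here.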
