pdf_text cutting off portion of page printed in landscape #7
Comments
Can you include a reproducible example please? |
Unfortunately, they all have confidential info so I can't share them. The last page of each PDF is blank and in portrait, while the rest of the PDF is in landscape. Could that cause the issue? I'm out of the office until Monday; when I'm back in, I'll see if I can get a PDF that's suitable for sharing. |
Here's a PDF where the two columns on the far right are not being pulled into R via pdf_text. |
Any word on this issue? |
I don't think so, haven't heard back. You are free to subscribe to the libpoppler mailing list and post a reminder for: https://lists.freedesktop.org/archives/poppler/2016-March/011755.html |
FWIW, I have successfully converted that PDF using Xpdf's pdftotext, which can be found here: http://www.foolabs.com/xpdf/download.html |
Looks like the poppler folks are not in a hurry to fix this, so I added a workaround that doubles the width of the target rectangle for landscape pages. It's not perfect, but I think this will avoid the problem in most cases. |
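To make the idea of the workaround concrete, here is a rough sketch in R of the geometry it describes. This is not the actual pdftools code: the helper name `text_target_rect` is hypothetical, and so is the assumption that a page counts as landscape when its width exceeds its height.

```r
# Hypothetical helper, not part of pdftools: returns the rectangle
# (x, y, width, height, in points) from which text would be requested.
text_target_rect <- function(page_width, page_height) {
  # Assumption: "landscape" simply means the page is wider than it is tall.
  landscape <- page_width > page_height
  # Per the workaround described above, double the width for landscape pages
  # so that text beyond the reported page width is not clipped.
  target_width <- if (landscape) 2 * page_width else page_width
  c(x = 0, y = 0, width = target_width, height = page_height)
}

text_target_rect(612, 792)  # portrait US Letter: width stays at 612
text_target_rect(792, 612)  # landscape US Letter: width is doubled
```

Asking for a rectangle wider than the page should be harmless when there is nothing to the right of the page edge, which may be why the workaround is described as avoiding the problem in most cases rather than all of them.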
Just encountered this again: voting_equipment_by_municipality_2_pdf_15114.pdf The whole ACCESSIBLE EQUIPMENT column is clipped on all pages. |
This should be fixed in version 1.0. Can you try updating to the latest version? |
Indeed, fixed on my example! |
That fixed my issue too! |
pdf_text is not scanning in the part of the page that is past 8.5" wide. I created an Excel file and a Word doc, saved them in landscape, and for those it scans the entire page into R. So maybe it is something specific to my PDF.