Fix unknown fontname in TrueType(Arial, TimesNewRoman) (#767) #790

Merged: 5 commits merged on Aug 14, 2022


@sobuen sobuen commented Aug 10, 2022

Pull request

Fixes #767.
Adds support for the TrueType fonts Arial, TimesNewRoman, and CourierNew.
Reference: Section 5.5.1 of the PDF Reference.
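To illustrate the idea, here is a minimal sketch of how the TrueType names could be resolved to their standard 14 counterparts. The table follows the alternative font names that the PDF Reference lists for the standard fonts, as far as I recall them. The names `STANDARD_FONT_ALIASES` and `resolve_standard_font` are hypothetical and are not taken from this PR's diff or from pdfminer.six.

```python
from typing import Optional

# Alternative TrueType names and the standard 14 fonts they stand in for.
# Assumption: this mirrors the alias table in the PDF Reference; it is an
# illustration, not the code merged in this PR.
STANDARD_FONT_ALIASES = {
    "Arial": "Helvetica",
    "Arial,Bold": "Helvetica-Bold",
    "Arial,Italic": "Helvetica-Oblique",
    "Arial,BoldItalic": "Helvetica-BoldOblique",
    "CourierNew": "Courier",
    "CourierNew,Bold": "Courier-Bold",
    "CourierNew,Italic": "Courier-Oblique",
    "CourierNew,BoldItalic": "Courier-BoldOblique",
    "TimesNewRoman": "Times-Roman",
    "TimesNewRoman,Bold": "Times-Bold",
    "TimesNewRoman,Italic": "Times-Italic",
    "TimesNewRoman,BoldItalic": "Times-BoldItalic",
}


def resolve_standard_font(basefont: str) -> Optional[str]:
    """Return the standard 14 font name for a TrueType alias, or None."""
    return STANDARD_FONT_ALIASES.get(basefont)
```

With a lookup like this, metrics for "Arial,Bold" can be taken from "Helvetica-Bold" instead of the font being reported as unknown.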

How Has This Been Tested?

Test pdf file

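# Part of confirmation code (input_file is the path to the test PDF)
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
from pdfminer.pdfpage import PDFPage

resource_manager = PDFResourceManager()
device = PDFPageAggregator(resource_manager, laparams=LAParams(detect_vertical=True))
with open(input_file, 'rb') as fp:
    interpreter = PDFPageInterpreter(resource_manager, device)

    for page in PDFPage.get_pages(fp, check_extractable=False):
        interpreter.process_page(page)
        # Layout tree of the page that was just processed
        layout = device.get_result()

        for lt in layout:
            ...

Tested with Python 3.8.13.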

Confirmed to work with "Arial", "Arial,Bold", "TimesNewRoman", "TimesNewRoman,Italic", and "TimesNewRoman,Bold".
"Arial,Italic", "Arial,BoldItalic", "TimesNewRoman,BoldItalic", and "CourierNew*" are not yet confirmed, because no test data is available for them.
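One way to check which font names a page reports is to walk the layout tree and collect the `fontname` attribute that pdfminer.six sets on `LTChar` objects. The sketch below is an assumption about how such a check could look, not the code used for this PR. The helper name `collect_font_names` is hypothetical, and it uses duck typing so that it does not depend on pdfminer.six itself.

```python
from typing import Any, Set


def collect_font_names(item: Any) -> Set[str]:
    """Recursively gather every `fontname` found in a layout tree."""
    names: Set[str] = set()
    fontname = getattr(item, "fontname", None)
    if isinstance(fontname, str):
        # Leaf such as LTChar: record its font name.
        names.add(fontname)
    elif not isinstance(item, (str, bytes)):
        try:
            children = iter(item)
        except TypeError:
            # Not a container (e.g. an annotation without a font).
            return names
        for child in children:
            names |= collect_font_names(child)
    return names
```

Inside the page loop above, `collect_font_names(layout)` would return the set of font names seen on that page, which can then be compared against the expected names.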

Checklist

  • I have read CONTRIBUTING.md.
  • I have added a concise human-readable description of the change to CHANGELOG.md.
  • I have tested that this fix is effective or that this feature works.
  • I have added docstrings to newly created methods and classes.
  • I have updated the README.md and the readthedocs documentation. Or verified that this is not necessary.

@pietermarsman pietermarsman merged commit ca9f75a into pdfminer:master Aug 14, 2022
Development

Successfully merging this pull request may close these issues.

fontname of TrueType(Arial, TimesNewRoman) is unknown