Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore invalid /Encoding-entries when parsing fonts (issue 14821) #14823

Merged
merged 1 commit into from
Apr 23, 2022

Conversation

Snuffleupagus
Copy link
Collaborator

@Snuffleupagus Snuffleupagus commented Apr 22, 2022

In the referenced PDF document the fonts have /Encoding-entries that are Streams (containing completely bogus data), which are thus obviously not valid here.
Hence, only when ignoreErrors is set, we'll now ignore these corrupt /Encoding-entries and fallback to the existing code to try and infer a usable encoding.

Given that this is clearly a case of corrupt PDF documents, there's no guarantee that this will "fix" all such cases, however it's the best that we do here and shouldn't really be worse than ignoring an entire font.

In the referenced PDF document the fonts have /Encoding-entries that are Streams (containing completely bogus data), which are thus obviously not valid here.
Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /Encoding-entries and fallback to the existing code to try and infer a usable encoding.

Given that this is *clearly* a case of corrupt PDF documents, there's no guarantee that this will "fix" all such cases, however it's the best that we do here and shouldn't really be worse than ignoring an entire font.
@Snuffleupagus Snuffleupagus linked an issue Apr 22, 2022 that may be closed by this pull request
@Snuffleupagus
Copy link
Collaborator Author

/botio-linux preview

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 1

Live output at: http://54.241.84.105:8877/e802f96af694074/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/e802f96af694074/output.txt

Total script time: 2.63 mins

Published

@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/89fbec71ee9084b/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/f758c1c597ca07e/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/89fbec71ee9084b/output.txt

Total script time: 23.30 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 10
  different first/second rendering: 1

Image differences available at: http://54.241.84.105:8877/89fbec71ee9084b/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/f758c1c597ca07e/output.txt

Total script time: 24.74 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 1

Image differences available at: http://54.193.163.58:8877/f758c1c597ca07e/reftest-analyzer.html#web=eq.log

@timvandermeij timvandermeij merged commit f9e54d9 into mozilla:master Apr 23, 2022
@timvandermeij
Copy link
Contributor

Thank you for doing this!

@Snuffleupagus Snuffleupagus deleted the issue-14821 branch April 23, 2022 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Broken PDFs
3 participants