Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fail to identify fonts (name and size) #629 #630

Merged
merged 3 commits into from
Aug 11, 2023

Conversation

mbideau-atreal
Copy link
Contributor

In order to solve #629 I identified that BT commands were resetting the font name and size previously defined by a Tf command... all the time.
So I just removed the font (over)initialization from the BT code, and all the font names and size were showing up.
The (first) initialization of $fontId and $fontSize is already done before the loop starts, BTW.

PS: I don't know anything about PDF standard, so please take my fix with caution.

Thanks for this great piece of code.
Best regards.

@k00ni
Copy link
Collaborator

k00ni commented Aug 7, 2023

CC @GreyWyvern Just a hint, because this is related to fonts, which was the topic of one of recent your pull requests.

@GreyWyvern
Copy link
Contributor

My own PR that I'm working on removes these same lines from Page.php as unnecessary as well. If font can be set outside BT ... ET blocks then there is no need to reset it each time we enter a new one.

@k00ni
Copy link
Collaborator

k00ni commented Aug 9, 2023

@mbideau-atreal I need a simple unit-test to prove the change is working and to avoid a regression in the future. Can you do that? If not, I can suggest one.

@GreyWyvern
Copy link
Contributor

@mbideau-atreal May we use your sample PDF 20230803-160138-lettretype-arrete.pdf in the PdfParser test suite? Then we could add to the bottom of PageTest::testGetDataTm():

        // Check that BT and ET do not reset the font
        $config = new Config();
        $config->setDataTmFontInfoHasToBeIncluded(true);

        $filename = $this->rootDir.'/samples/bugs/Issue629.pdf';
        $parser = $this->getParserInstance($config);
        $document = $parser->parseFile($filename);
        $pages = $document->getPages();
        $page = end($pages);
        $dataTm = $page->getDataTm();

        $this->assertCount(4, $dataTm[0]);
        $this->assertEquals('F2', $dataTm[0][2]);

The first test should always be true if setDataTmFontInfoHasToBeIncluded is true. The second test is only true if $fontId and $fontSize lines are removed from Page.php.

I would also recommend removing the same lines from further down under case 'ET': as well.

Copy link
Collaborator

@k00ni k00ni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @GreyWyvern. I took your suggestion and added two tests. PR is ready to merge. @mbideau-atreal is there anything left to do?

@k00ni k00ni self-assigned this Aug 10, 2023
@mbideau-atreal
Copy link
Contributor Author

Sorry for my delay.
Great work 👍
And yes, you can use my PDF in your test case... thank you very much for asking.

@mbideau-atreal
Copy link
Contributor Author

Nothing left to do here, on my part 😉

After the merge, if you can, I would really need some feedback on the issue #570 regarding the same PDF sample document 🙏

Thanks again.

@k00ni k00ni linked an issue Aug 11, 2023 that may be closed by this pull request
@k00ni k00ni merged commit f3a5a3e into smalot:master Aug 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fail to identify fonts (name and size)
3 participants