ENH: Support qualified names in update_page_form_field_values #1695

Merged
merged 5 commits into from
Mar 14, 2023

Conversation

xi
Contributor

@xi xi commented Mar 8, 2023

PDF forms often use names like "A.1", "A.2", "B.1", "B.2", … for the fields. However, the "." has a special meaning (it separates the levels of a field name), so this creates a hierarchy instead.

It was impossible to fill those individual fields with update_page_form_field_values():

update_page_form_field_values({"A": "foo"})  # fills all "A.*" fields
update_page_form_field_values({"1": "foo"})  # fills all "*.1" fields
update_page_form_field_values({"A.1": "foo"})  # fills none of the fields

This change makes update_page_form_field_values() also check for qualified field names.
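
The matching described above can be sketched without pypdf. This is only an illustration under stated assumptions: fields are modelled as plain dicts with "/T" (partial name), "/Parent" and "/V" keys, and the helper names qualified_name and update_values are hypothetical, not the code from this PR.

```python
from typing import Any, Dict, List

Field = Dict[str, Any]


def qualified_name(field: Field) -> str:
    """Join the partial names (/T) from the topmost ancestor down to this field."""
    parts: List[str] = []
    node = field
    while node is not None:
        if "/T" in node:
            parts.append(node["/T"])
        node = node.get("/Parent")
    return ".".join(reversed(parts))


def update_values(widgets: List[Field], values: Dict[str, str]) -> None:
    """Set /V on a widget (or its parent) whose partial or qualified name matches."""
    for widget in widgets:
        parent = widget.get("/Parent") or {}
        for name, value in values.items():
            if name in (widget.get("/T"), qualified_name(widget)):
                widget["/V"] = value
            elif parent and name in (parent.get("/T"), qualified_name(parent)):
                parent["/V"] = value


# Hypothetical hierarchy: A.1, A.2 and B.1
a, b = {"/T": "A"}, {"/T": "B"}
a1 = {"/T": "1", "/Parent": a}
a2 = {"/T": "2", "/Parent": a}
b1 = {"/T": "1", "/Parent": b}

update_values([a1, a2, b1], {"A.1": "foo"})
```

With the qualified-name check in place, only the "A.1" dict receives a value; "A.2" and "B.1" are left untouched.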

See also #545


@hchillon took a different approach in #545. Unfortunately I don't know enough about PDF to fully understand what they did there. This simple approach worked well for my use case.

I copied _get_qualified_field_name() from the PdfReader class. Should this be moved to a utility module so it can be shared in both classes? Where would that be?

@pubpub-zz
Collaborator

pubpub-zz commented Mar 8, 2023

As you have noticed, a period is not accepted within a field name
(excerpt from the PDF standard 1.7, App. H, page 1117):
[image: excerpt from the PDF specification]

Your idea to extend (modify?) the method to take qualified names into account sounds good, and I like your idea to use _get_qualified_field_name() 👍. I don't think we need an extra parameter to force checking standard names only.

use the same name _get_qualified_field_name in writer (later we may be able to merge some code)
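
To make the naming rule concrete: a qualified name is the partial names joined by periods, which is why a partial name itself must not contain one. The following is a minimal sketch; join_partial_names is a hypothetical helper for illustration and is not part of pypdf.

```python
from typing import Iterable


def join_partial_names(partials: Iterable[str]) -> str:
    """Build a fully qualified field name, rejecting partial names that contain '.'."""
    parts = list(partials)
    for part in parts:
        if "." in part:
            raise ValueError(f"partial field name must not contain a period: {part!r}")
    return ".".join(parts)


name = join_partial_names(["top", "foo"])
```

Passing "A.1" as a single partial name raises ValueError in this sketch, because the result could not be told apart from a field "1" nested under a parent "A".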

@pubpub-zz
Collaborator

@xi,
can you fix the mypy issues?

@codecov

codecov bot commented Mar 10, 2023

Codecov Report

Patch coverage: 46.15% and project coverage change: -0.04 ⚠️

Comparison is base (19944fe) 92.54% compared to head (bc0458d) 92.50%.

❗ Current head bc0458d differs from pull request most recent head 0dc5aa1. Consider uploading reports for the commit 0dc5aa1 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1695      +/-   ##
==========================================
- Coverage   92.54%   92.50%   -0.04%     
==========================================
  Files          34       34              
  Lines        6502     6512      +10     
  Branches     1282     1286       +4     
==========================================
+ Hits         6017     6024       +7     
- Misses        315      316       +1     
- Partials      170      172       +2     
Impacted Files Coverage Δ
pypdf/_writer.py 86.17% <46.15%> (-0.15%) ⬇️


@MartinThoma
Member

Thank you for your contribution @xi 🙏

To be honest, I'm also always struggling with the form-stuff. It might take a while until I decide how to continue with this PR. Any input (e.g. @pubpub-zz / @MasterOdin / others who know more about PDF) is very welcome :-)

@MartinThoma MartinThoma added the workflow-forms label (From a users perspective, forms is the affected feature/workflow) Mar 11, 2023
@pubpub-zz
Collaborator

I added a comment about this earlier, and for me this is good.

@pubpub-zz
Collaborator

@xi
Just another point: can you add tests for test coverage?

Co-authored-by: Martin Thoma <info@martin-thoma.de>
@xi
Contributor Author

xi commented Mar 11, 2023

Just another point : can you add tests for test coverage

I don't think I know enough to create a minimal PDF that demonstrates this issue. Can someone please help out?

@pubpub-zz
Collaborator

@xi
I've started to copy/upgrade test_fill_form() in test_writer.py:

def test_fill_form_with_qualified():
    reader = PdfReader(RESOURCE_ROOT / "form.pdf")
    reader.add_form_topname("top")  # group all form fields under a "top" parent
    writer = PdfWriter()

    page = reader.pages[0]

    writer.add_page(page)
    writer.update_page_form_field_values(
        writer.pages[0], {"top.foo": "filling"}, flags=1
    )

    b = BytesIO()
    writer.write(b)  # write into the buffer so the result can be read back

    reader2 = PdfReader(b)
    reader2.get_fields()["top.foo"]  # to be completed....

My dev environment is not available at the moment, so I can't finalize the code. Can you do it, please?

@xi
Contributor Author

xi commented Mar 13, 2023

@pubpub-zz thanks for the head start. I tried to fill in the gaps in bc0458d.

@MartinThoma MartinThoma changed the title from "support qualified names in update_page_form_field_values" to "ENH: Support qualified names in update_page_form_field_values" Mar 14, 2023
@MartinThoma MartinThoma added the is-feature label (A feature request) Mar 14, 2023
@MartinThoma MartinThoma merged commit 9878034 into py-pdf:main Mar 14, 2023
@MartinThoma
Member

Thank you for your contribution @xi 🙏 It will be in pypdf > 3.5.2, which I will likely release on Sunday.

If you want, I can add you to the contributors: https://pypdf.readthedocs.io/en/latest/meta/CONTRIBUTORS.html

Just let me know which name I should use / what I should link to :-)

MartinThoma added a commit that referenced this pull request Mar 18, 2023
New Features (ENH)
-  Extend PdfWriter.append() to PageObjects (#1704)
-  Support qualified names in update_page_form_field_values (#1695)

Robustness (ROB)
-  Tolerate streams without length field (#1717)
-  Accept DictionaryObject in /D of NamedDestination (#1720)
-  Widths def in cmap calls IndirectObject (#1719)