PDFJS is not able to show prefilled Acroform PDF with data in xfa:dataset #14685

Closed
keyhan opened this issue Mar 17, 2022 · 5 comments · Fixed by #14735 or #14738

@keyhan

keyhan commented Mar 17, 2022

When a PDF with embedded Type 1 fonts has been prefilled with a tool such as iText or PDFBox, PDF.js cannot handle it: the form fields are shown as empty.

If, however, the PDF was filled in with PDF.js itself, the text is shown.

Update: After the discussion further down, we now know that the PDF is actually a hybrid of AcroForm and XFA. PDF.js is not able to prefill AcroForm fields with data from the xfa:dataset, although when saving the PDF, PDF.js does write the values to the xfa:dataset as well.

Attach (recommended) or Link to PDF file here:
1647183160545.pdf
Configuration:

  • Web browser and its version: Chrome (99)
  • Operating system and its version:
  • PDF.js version:
  • Is a browser extension:

Steps to reproduce the problem:

  1. Open the attached file in Acrobat Reader.
  2. Open the attached file in PDF.js.

What is the expected behavior?
image

What went wrong? (add screenshot)
image

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
1647183160545.pdf

@keyhan
Author

keyhan commented Mar 30, 2022

After further investigation, it seems that the Type 1 font is not the issue, but we cannot find the specific reason.

@calixteman
Contributor

This file is a mix of AcroForm and XFA, but mainly AcroForm.
Some AcroForm fields have their value saved in the dataset element in the XFA data.
In pdf.js, we save filled values in the AcroForm fields and in the XFA dataset, if any, but we don't read them back from the XFA dataset.
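
To make the missing read path concrete, here is a minimal sketch of falling back to the dataset when an AcroForm field has no value of its own. It is not the pdf.js implementation: the node shape (`name`, `children`, `text`), the function names `findDatasetValue` and `resolveFieldValue`, the sample data, and the assumption that a dotted field name maps directly onto nested dataset elements are all hypothetical.

```javascript
// Hypothetical, simplified stand-in for an already parsed datasets tree.
// Each node is { name, children, text }; this is not pdf.js's internal structure.
function findDatasetValue(root, path) {
  // `path` is an array of element names, e.g. ["form1", "FirstName"].
  let current = root;
  for (const name of path) {
    const next = (current.children || []).find(child => child.name === name);
    if (!next) {
      return undefined; // no matching element in the dataset
    }
    current = next;
  }
  return current.text;
}

// Assumption: the value stored in the AcroForm field wins, and the dataset
// is consulted only when the field itself carries no value.
function resolveFieldValue(acroFormValue, datasetsRoot, qualifiedName) {
  if (acroFormValue !== undefined && acroFormValue !== null && acroFormValue !== "") {
    return acroFormValue;
  }
  return findDatasetValue(datasetsRoot, qualifiedName.split("."));
}

// Illustrative data only; the element and field names are made up.
const datasets = {
  name: "xfa:data",
  children: [
    {
      name: "form1",
      children: [{ name: "FirstName", children: [], text: "Ada" }],
    },
  ],
};

console.log(resolveFieldValue(undefined, datasets, "form1.FirstName"));
console.log(resolveFieldValue("Grace", datasets, "form1.FirstName"));
```

Real files may not map field names onto dataset elements this directly, so the path resolution here should be read as a placeholder.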

@calixteman calixteman self-assigned this Mar 30, 2022
@keyhan
Author

keyhan commented Mar 30, 2022

Thanks for your analysis. Is there any workaround for this problem, so that PDF.js can show the filled values? We use XFA for other PDFs as well, and for those PDF.js has no problem showing the prefilled data. Only some specific PDFs, like this one, cause issues.

@calixteman
Contributor

I'll fix the bug in a couple of days.

@keyhan
Author

keyhan commented Mar 30, 2022

Great to hear. Thanks again for your quick help, I appreciate it.

@keyhan keyhan changed the title from "PDFJS is not able to show prefilled XFA PDF with embedded TYPE 1 fonts filled with ITEXT or PDFBOX" to "PDFJS is not able to show prefilled Acroform PDF with data in xfa:dataset" Mar 31, 2022
calixteman added a commit to calixteman/pdf.js that referenced this issue Mar 31, 2022
…a:datasets

- it aims to fix mozilla#14685;
- add a basic object to get values from the parsed datasets;
- these annotations don't have an appearance so we must create one when printing or saving.
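
The last point in the commit message, creating an appearance for a field that has none, can be illustrated with a rough sketch that builds a content stream for a single-line text field. This is not the code from the commit: the helper names `escapePdfString` and `buildTextAppearance`, the `Helv` font name, the font size, and the offsets are assumptions for illustration, and the sketch ignores non-ASCII text, alignment, and clipping.

```javascript
// Escape the characters that are special inside a PDF literal string:
// backslash and the two parentheses.
function escapePdfString(text) {
  return text.replace(/[\\()]/g, ch => "\\" + ch);
}

// Build a minimal appearance content stream for a single-line text field.
// Font name, size, and offsets are illustrative defaults, not values
// read from any real document.
function buildTextAppearance(value, { fontName = "Helv", fontSize = 12, x = 2, y = 4 } = {}) {
  return [
    "/Tx BMC", // begin the marked content used for form field text
    "q", // save graphics state
    "BT", // begin text object
    `/${fontName} ${fontSize} Tf`, // select font and size
    `${x} ${y} Td`, // move to the text origin
    `(${escapePdfString(value)}) Tj`, // show the field value
    "ET", // end text object
    "Q", // restore graphics state
    "EMC", // end marked content
  ].join("\n");
}

console.log(buildTextAppearance("Doe (Jr.)"));
```

A viewer that prints or saves would still need to wrap such a stream in a form XObject with a bounding box and font resources, which this sketch leaves out.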
calixteman added a commit to calixteman/pdf.js that referenced this issue Apr 1, 2022
…a:datasets

- it aims to fix mozilla#14685;
- add a basic object to get values from the parsed datasets;
- these annotations don't have an appearance so we must create one when printing or saving.
@marco-c marco-c removed the form-xfa label Apr 1, 2022
conghoang added a commit to conghoang/pdf.js that referenced this issue Apr 7, 2022
* commit '27e738dff951160420575216c080e22027af0a86': (198 commits)
  Refactor some xfa*** getters in document.js - it's a follow-up of PR mozilla#14735.
  Convert `web/debugger.js` to a *basic* module
  Update translations to the most recent versions
  Update dependencies to the most recent versions
  Update GitHub Actions workflow steps to the most recent versions
  Replace most loops in `web/debugger.js` with `for...of` loops
  Decode non-ASCII values found in the xfa:datasets (PR 14735 follow-up)
  [Annotations] Some annotations can have their values stored in the xfa:datasets - it aims to fix mozilla#14685; - add a basic object to get values from the parsed datasets; - these annotations don't have an appearance so we must create one when printing or saving.
  [GENERIC viewer] Try to improve a11y, for search results, in the findbar (issue 14525)
  Don't manually convert `setAttribute` values to strings (PR 14554 follow-up)
  Use `String.prototype.repeat()` in a couple of spots
  [Annotations] Add support for printing/saving choice list with multiple selections - it aims to fix issue mozilla#12189.
  Add a `<dialog>` polyfill for the `generic-legacy` build
  Try to improve a11y for the `PasswordPrompt` and `PDFDocumentProperties` dialogs
  Re-factor the `OverlayManager` class to use a `WeakMap` internally
  Convert the existing overlays to use `<dialog>` elements (issue 14698)
  [text selection] Add the whitespaces present in the pdf in the text chunk - it aims to fix issue mozilla#14627; - the basic idea of the recent text refactoring was to only consider the rendered visible whitespaces.   But sometimes, the heuristics aren't correct and although some whitespaces are in the text stream   they weren't in the text chunks because they were too small. Hence we added some exceptions, for example,   we always add a whitespace when it is between two non-whitespace chars but only when in the same Tj.   So basically, this patch removes the constraint to have the chars in the same Tj   (in using a circular buffer to save the two last chars) but don't add a space when the visible space is really   too small (hence `NOT_A_SPACE_FACTOR`).
  Change the type of the `container` property, in the `TextLayerRenderParameters` typedef (issue 14716)
  Avoid the `textLayer` becoming visible in high contrast mode (issue 13230)
  Remove the remaining `dir`-dependent CSS rules
  ...

# Conflicts:
#	package-lock.json
#	src/pdf.js
bh213 pushed a commit to bh213/pdf.js that referenced this issue Jun 3, 2022
…a:datasets

- it aims to fix mozilla#14685;
- add a basic object to get values from the parsed datasets;
- these annotations don't have an appearance so we must create one when printing or saving.
rousek pushed a commit to signosoft/pdf.js that referenced this issue Aug 10, 2022
…a:datasets

- it aims to fix mozilla#14685;
- add a basic object to get values from the parsed datasets;
- these annotations don't have an appearance so we must create one when printing or saving.