ENH: Add Cloning #1371

Merged
merged 116 commits into from
Dec 11, 2022

pubpub-zz
Collaborator

@pubpub-zz pubpub-zz commented Sep 27, 2022

The method .clone(pdf_dest, [force_duplicate]) clones an object and all the objects it references.

  • If an object has already been cloned, the existing clone is returned (unless force_duplicate is set).
  • It is mainly for internal use, but it can also be called on a page.
  • For PageObject/DictionaryObject/[Encoded/Decoded/Content]Stream, an extra parameter ignore_fields provides the list of fields that should not be cloned.

When available, the pointer to an object is exposed in the indirect_obj attribute.
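
To illustrate the behaviour described above, here is a minimal sketch of a memoised clone in plain Python. It is not the code from this PR: ToyObject, ToyDest and the already_cloned map are hypothetical names, and only the parameter names pdf_dest, force_duplicate and ignore_fields come from the description. As a simplification, force_duplicate applies only to the top-level object in this sketch.

```python
from typing import Any, Dict, Iterable, List


class ToyObject:
    """Hypothetical stand-in for a PDF dictionary object (not a PyPDF2 class)."""

    def __init__(self, **fields: Any) -> None:
        self.fields: Dict[str, Any] = dict(fields)


class ToyDest:
    """Hypothetical stand-in for the destination writer (pdf_dest)."""

    def __init__(self) -> None:
        # Maps id(source object) -> latest clone stored in this destination.
        self.already_cloned: Dict[int, ToyObject] = {}
        self.objects: List[ToyObject] = []


def clone(
    obj: ToyObject,
    pdf_dest: ToyDest,
    force_duplicate: bool = False,
    ignore_fields: Iterable[str] = (),
) -> ToyObject:
    ignored = set(ignore_fields)
    known = pdf_dest.already_cloned.get(id(obj))
    if known is not None and not force_duplicate:
        # Already cloned into this destination: reuse the existing clone.
        return known
    copy = ToyObject()
    pdf_dest.objects.append(copy)
    # Register before recursing so that reference cycles terminate.
    pdf_dest.already_cloned[id(obj)] = copy
    for key, value in obj.fields.items():
        if key in ignored:
            continue  # fields listed in ignore_fields are not cloned
        if isinstance(value, ToyObject):
            # Simplification: referenced objects are never force-duplicated.
            copy.fields[key] = clone(value, pdf_dest, False, ignored)
        else:
            copy.fields[key] = value
    return copy
```

Registering the clone before walking its fields is what keeps a page that points back to its parent from recursing forever; a real implementation would key the map on indirect references rather than on id().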

New API for add_page/insert_page:

  • the cloned page object is returned
  • ignore_fields can be provided as a parameter
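
Why returning the cloned page matters can be shown with a small sketch. This is an illustration under stated assumptions, not the PyPDF2 implementation: ToyWriter is a hypothetical class and pages are plain dicts. The point is that later changes should be made on the returned object, since that is the copy held by the writer.

```python
import copy
from typing import Any, Dict, Iterable, List


class ToyWriter:
    """Hypothetical writer; pages are plain dicts rather than PageObject."""

    def __init__(self) -> None:
        self.pages: List[Dict[str, Any]] = []

    def add_page(
        self, page: Dict[str, Any], ignore_fields: Iterable[str] = ()
    ) -> Dict[str, Any]:
        ignored = set(ignore_fields)
        # Copy every field except the ignored ones, then store the copy.
        cloned = {
            key: copy.deepcopy(value)
            for key, value in page.items()
            if key not in ignored
        }
        self.pages.append(cloned)
        # Return the stored copy so the caller can keep editing it.
        return cloned


source_page = {"/Rotate": 0, "/Annots": ["link"], "/Contents": "q Q"}
writer = ToyWriter()
added = writer.add_page(source_page, ignore_fields=["/Annots"])
added["/Rotate"] = 90  # changes the page in the writer, not the source page
```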

Others

  • file is closed at the end of PdfWriter.write when a filename is provided
  • Breaking Change: add_outline_item now has a parameter before which is not the last parameter
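
The first point above (closing the file only when a filename was given) follows a common pattern, sketched below. write_payload is a hypothetical helper, not PdfWriter.write itself: a stream passed in by the caller is left open, while a file opened from a path is closed before returning.

```python
import io
import os
import tempfile
from typing import IO, Tuple, Union


def write_payload(
    target: Union[str, IO[bytes]], payload: bytes
) -> Tuple[bool, IO[bytes]]:
    """Write payload to a path or an open binary stream.

    Returns (opened_here, stream). The stream is closed only if this
    function opened it itself.
    """
    opened_here = isinstance(target, str)
    stream = open(target, "wb") if opened_here else target
    try:
        stream.write(payload)
    finally:
        if opened_here:
            stream.close()  # we own the handle, so we close it
    return opened_here, stream


# Caller-owned stream: still usable after the call.
buffer = io.BytesIO()
_, same = write_payload(buffer, b"%PDF-1.7")

# Path: the file is opened and closed inside the helper.
path = os.path.join(tempfile.mkdtemp(), "out.pdf")
closed_by_us, handle = write_payload(path, b"%PDF-1.7")
```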

Update

  • The public API of PdfMerger has been added to PdfWriter (ready to make PdfMerger an alias of it)
  • Outline merging is now processed properly
  • Named destinations are now processed properly

Deals with #1194, #1322, #471, #1337

add cloning capability
includes:
* add clone function
* new  API for add_page/insert_page that returns the cloned page object
* close file when a file name is provided to PdfWriter.write
@pubpub-zz pubpub-zz marked this pull request as draft September 27, 2022 18:34
w.merge and w.append
to be in accordance with the PDF spec

add page cleanup for destinations in NameObject that do not match a TextStringObject in Names/Dests
@codecov

codecov bot commented Oct 15, 2022

Codecov Report

Base: 94.14% // Head: 92.70% // Decreases project coverage by -1.43% ⚠️

Coverage data is based on head (4ccfbff) compared to base (7633477).
Patch coverage: 84.45% of modified lines in pull request are covered.

❗ Current head 4ccfbff differs from pull request most recent head afebcab. Consider uploading reports for the commit afebcab to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1371      +/-   ##
==========================================
- Coverage   94.14%   92.70%   -1.44%     
==========================================
  Files          31       29       -2     
  Lines        5480     5691     +211     
  Branches     1037     1112      +75     
==========================================
+ Hits         5159     5276     +117     
- Misses        193      267      +74     
- Partials      128      148      +20     
Impacted Files Coverage Δ
PyPDF2/_merger.py 97.60% <ø> (+4.42%) ⬆️
PyPDF2/generic/_data_structures.py 89.75% <79.08%> (-5.57%) ⬇️
PyPDF2/_protocols.py 81.25% <81.25%> (ø)
PyPDF2/_writer.py 86.12% <84.11%> (-3.43%) ⬇️
PyPDF2/generic/_base.py 99.64% <98.36%> (-0.36%) ⬇️
PyPDF2/_page.py 92.23% <100.00%> (+0.28%) ⬆️
PyPDF2/_reader.py 90.33% <100.00%> (+0.04%) ⬆️
PyPDF2/types.py 100.00% <100.00%> (ø)
... and 11 more


Member

@MartinThoma MartinThoma left a comment

mypy didn't complain when I checked. As you asked me to look at mypy, I checked all 'type: ignore' comments. Several were not necessary at all. In some cases mypy needed an 'assert variable is not None' as a hint. And in some cases I could at least narrow the ignore down to be a bit more specific.

Review threads (all outdated and resolved):
PyPDF2/_page.py: 1
PyPDF2/generic/_base.py: 4
PyPDF2/generic/_data_structures.py: 5
pubpub-zz and others added 4 commits October 16, 2022 10:34
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Review threads (all outdated and resolved):
PyPDF2/_writer.py: 12
@MartinThoma
Member

Finally! I'll have another quick look at the code and then merge today :-)

@MartinThoma MartinThoma merged commit 74b8a63 into py-pdf:main Dec 11, 2022
@MartinThoma
Member

@pubpub-zz Thank you so much for this moonshot extension 🙏 ❤️

@xilopaint
Contributor

@pubpub-zz thanks for all the effort you've put into this PR!

@MartinThoma MartinThoma removed the "soon" label (PRs that are almost ready to be merged, issues that get solved pretty soon) Dec 12, 2022
MartinThoma added a commit that referenced this pull request Dec 22, 2022
BREAKING CHANGES:
-  Deprecate features with PyPDF2==3.0.0 (#1489)
-  Refactor Fit / Zoom parameters (#1437)

New Features (ENH):
-  Add Cloning  (#1371)
-  Allow int for indirect_reference in PdfWriter.get_object (#1490)

Documentation (DOC):
-  How to read PDFs from S3 (#1509)
-  Make MyST parse all links as simple hyperlinks (#1506)
-  Changed 'latest' for 'stable' generated docs (#1495)
-  Adjust deprecation procedure (#1487)

Maintenance (MAINT):
-  Use typing.IO for file streams (#1498)

[Full Changelog](2.12.1...3.0.0)
@pubpub-zz pubpub-zz deleted the cloning branch June 24, 2023 08:41
6 participants