Releases · py-pdf/pypdf

26 Jan 11:48

github-actions

5.2.0

049f71e

Version 5.2.0, 2025-01-26 Latest


What's new

Deprecations (DEP)

Deprecate with replacement CCITParameters (#3019) by @j-t-1
Correct deprecation of interiour_color (#2947) by @j-t-1

New Features (ENH)

Support alternative (U)F names for embedded file retrieval (#3072) by @stefan6419846
Adding support for reading .metadata.keywords (#2939) by @Lucas-C

Bug Fixes (BUG)

Handle further Tf operators in text extraction layout mode (#3073) by @blushingpenguin
Ensure add_metadata can deal with _info = None (#3040) by @xmo-odoo
Handle IndirectObject in CCITTFaxDecode filter (#2965) by @stefan6419846
Handle chained colorspace for inline images when no filter is set (#3008) by @stefan6419846
Avoid extracting inline images twice and dropping other operators (#3002) by @stefan6419846
Fixed reference of value with str.__new__ in TextStringObject (#2952) by @thomas-forte
Handle indirect objects in font width calculations (#2967) by @nsw42
Title sometimes is bytes and not str (#2930) by @reformy
Fix undefined variable for text extraction (regression) (#2934) by @stefan6419846
Don't close stream passed to PdfWriter.write() (#2909) by @alexaryn

Robustness (ROB)

Handle zero height fonts when extracting text (#3075) by @blushingpenguin
Deal with content streams not containing streams (#3005) by @stefan6419846
Gracefully handle some text operators when the operands are missing (#3006) by @stefan6419846
Fall back to non-Adobe Ascii85 format for missing end markers (#3007) by @stefan6419846
Ignore odd-length strings when processing cmap lines (#3009) by @stefan6419846
Skip annotation destination being NullObject in PdfWriter (#2964) by @stefan6419846
Skip destination page being None in PdfWriter (#2963) by @dxsooo
Fix infinite loop case when reading null objects within an Array by @jakep-allenai
Fixing infinite loop in ArrayObject read_from_stream (#2928) by @jakep-allenai

Documentation (DOC)

Add note about default line colors (#3014) by @stefan6419846

Developer Experience (DEV)

Remove ignoring Ruff rule PGH004 (#3071) by @j-t-1
Tidy ignore array in tool.ruff.lint (#3069) by @j-t-1
Move Windows CI to Python 3.13 (#3003) by @stefan6419846
Move to Ubuntu 22.04 (#3004) by @stefan6419846

Maintenance (MAINT)

Fix formatting of warning message and include exception message (#3076) by @stefan6419846
Narrow return type for ContentStream.operations (#2941) by @kmurphy4

Testing (TST)

Fix image similarity for upcoming Ubuntu 24.04 (#3039) by @stefan6419846
Replace broken Apache Tika Corpora urls (#3041) by @stefan6419846

Code Style (STY)

Add form feed to WHITESPACES (#3054) by @j-t-1
Lots of small internal changes by @j-t-1

Full Changelog

Contributors

blushingpenguin, Lucas-C, and 10 other contributors


27 Oct 19:46

github-actions

5.1.0

9f647e6

Version 5.1.0, 2024-10-27

What's new

New Features (ENH)

Add layout_mode_font_height_weight argument to PageObject.extract_text() (#2920) by @hpierre001

Bug Fixes (BUG)

Fix font specifier for FreeText annotation (#2893) by @ssjkamei
Line breaks are not generated due to incorrect calculation of text leading (#2890) by @ssjkamei
Improve handling of spaces in text extraction (#2882) by @ssjkamei

Robustness (ROB)

Soft failure for flate encode image mode 1 with wrong LUT size (#2900) by @stefan6419846

Documentation (DOC)

Use latest package versions (#2907) by @stefan6419846
Correct example of reading FileAttachment annotation (#2906) by @j-t-1

Developer Experience (DEV)

Update pinned requirements (#2918) by @stefan6419846
Make make_release.py compatible with Windows environment (#2894) by @pubpub-zz

Maintenance (MAINT)

Remove references to outdated Python versions (#2919) by @stefan6419846
Generalize the method of obtaining space_code (#2891) by @ssjkamei
Unnecessary character mapping process (#2888) by @ssjkamei
New LZW decoding implementation (#2887) by @MartinThoma

Testing (TST)

Add LzwCodec for encoding (#2883) by @MartinThoma

Code Style (STY)

Capitalize error messages (#2903) by @j-t-1
Modify error messages in PdfWriter (#2902) by @j-t-1

Full Changelog

Contributors

MartinThoma, pubpub-zz, and 4 other contributors


29 Sep 09:55

pubpub-zz

5.0.1

ab21802

Version 5.0.1, 2024-09-29

New Features (ENH)

Add full parameter to PdfWriter constructor (#2865)

Bug Fixes (BUG)

Update pyproject.toml with minimum Python version of 3.8 (#2859)
Cope with unbalanced delimiters in dictionary object (#2878)
Cope with encoding with too many differences (#2873)
Missing spaces in extract_text() method (#1328) (#2868)
Tolerate truncated files and no warning when jumping startxref (#2855)

Robustness (ROB)

Repair PDF with invalid Root object (#2880)
Continue parsing dictionary object when error is detected (#2872)
Merge documents with invalid pages in named destinations (#2857)
Tolerate comments in arrays (#2856)

Developer Experience (DEV)

Use latest Python version for benchmarking (#2879)

Maintenance (MAINT)

Add tests to source distributions (#2874)
Refactor _update_field_annotation (#2862)

Full Changelog


17 Sep 17:29

pubpub-zz

5.0.0

637bc44

Version 5.0.0, 2024-09-17

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).

Deprecations (DEP)

Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (#2813)
Drop Python 3.7 support (#2793)

New Features (ENH)

Add capability to remove /Info from PDF (#2820)
Add incremental capability to PdfWriter (#2811)
Add UniGB-UTF16 encodings (#2819)
Accept utf strings for metadata (#2802)
Report PdfReadError instead of RecursionError (#2800)
Compress PDF files merging identical objects (#2795)

Bug Fixes (BUG)

Fix sheared image (#2801)

Robustness (ROB)

Robustify .set_data() (#2821)
Raise PdfReadError when missing /Root in trailer (#2808)
Fix extract_text() issues on damaged PDFs (#2760)
Handle images with empty data when processing an image from bytes (#2786)

Developer Experience (DEV)

Fix coverage uploads (#2832)
Test against Python 3.13 (#2776)

Full Changelog


21 Jul 19:35

github-actions

4.3.1

8f62120

Version 4.3.1, 2024-07-21

Bug Fixes (BUG)

Cope with Matrix entry in field annotations (#2736)

Robustness (ROB)

Cope with fields with upside down box/rectangle (#2729)

Maintenance (MAINT)

Add deprecate_with_replacement to StreamObject.initializeFromD… (#2728)
Deal with cryptography>=43 moving ARC4 (#2765)

Full Changelog


14 Jul 19:51

github-actions

4.3.0

d3ef5e5

Version 4.3.0, 2024-07-14

What's new

New Features (ENH)

Accept ETen-B5 and UniCNS-UTF16 encodings (#2721) by @pubpub-zz
Add decode_as_image() to ContentStreams (#2615) by @pubpub-zz
Add context manager for PdfReader (#2666) by @tibor-reiss
Add capability to set font and size in fields (#2636) by @pubpub-zz
Allow to pass input file without named argument (#2576) by @pubpub-zz

Bug Fixes (BUG)

Fix deprecation for Ressources when using old constants (#2705) by @stefan6419846
Fix images issue 4 bits encoding and LUT starting with UTF16_BOM (#2675) by @pubpub-zz
Reading large compressed images takes huge time to process (#2644) by @snanda85
Highlighted Text Cannot Be Printed (#2604) by @Nifury
Fix UnboundLocalError on malformed pdf (#2619) by @farjasju

Documentation (DOC)

Various improvements on docstrings and examples by @j-t-1

Robustness (ROB)

Cope with missing Standard 14 fonts in fields (#2677) by @pubpub-zz
Improve inline image extraction (#2622) by @pubpub-zz
Cope with loops in Fields tree (#2656) by @pubpub-zz
Discard /I in choice fields for compatibility with Acrobat (#2614) by @pubpub-zz
Cope with some issues in pillow (#2595) by @pubpub-zz
Cope with some image extraction issues (#2591) by @pubpub-zz

Maintenance (MAINT)

Deprecate interiour_color with replacement interior_color (#2706) by @j-t-1
Add deprecate_with_replacement to PdfWriter.find_bookmark (#2674) by @j-t-1

Code Style (STY)

Change Link to be a non-markup annotation (#2714) by @j-t-1

Full Changelog

Contributors

snanda85, pubpub-zz, and 5 other contributors


07 Apr 15:38

stefan6419846

4.2.0

2ac88e6

Version 4.2.0, 2024-04-07

What's new

New Features (ENH)

Allow multiple charsets for NameObject.read_from_stream (#2585) by @pubpub-zz
Add support for /Kids in page labels (#2562) by @stefan6419846
Allow to update fields on many pages (#2571) by @pubpub-zz
Tolerate PDF with invalid xref pointed objects (#2335) by @pubpub-zz
Add Enforce from PDF2.0 in viewer_preferences (#2511) by @pubpub-zz
Add += and -= operators to ArrayObject (#2510) by @pubpub-zz

Bug Fixes (BUG)

Fix merge_page sometimes generating unknown operator 'QQ' (#2588) by @rfotino
Fix fields update where annotations are kids of field (#2570) by @pubpub-zz
Process CMYK images without a filter correctly (#2557) by @pubpub-zz
Extract text in layout mode without finding resources (#2555) by @pubpub-zz
Prevent recursive loop in some PDF files (#2505) by @pubpub-zz

Robustness (ROB)

Tolerate "truncated" xref (#2580) by @pubpub-zz
Replace error by warning for EOD in RunLengthDecode/ASCIIHexDecode (#2334) by @pubpub-zz
Rebuild xref table if one entry is invalid (#2528) by @pubpub-zz
Robustify stream extraction (#2526) by @pubpub-zz

Documentation (DOC)

Update release process for latest changes (#2564) by @stefan6419846
Encryption/decryption: Clone document instead of copying all pages (#2546) by @redfast00
Minor improvements (#2542) by @j-t-1
Update annotation list (#2534) by @j-t-1
Update references and formatting (#2529) by @j-t-1
Correct threads reference, plus minor changes (#2521) by @j-t-1
Minor readability increases (#2515) by @j-t-1
Simplify PaperSize examples (#2504) by @j-t-1
Minor improvements (#2501) by @j-t-1

Developer Experience (DEV)

Remove unused dependencies (#2572) by @stefan6419846
Remove page labels PR link from message (#2561) by @stefan6419846
Fix changelog generator regarding whitespace and handling of "Other" group (#2492) by @stefan6419846
Add REL to known PR prefixes (#2554) by @stefan6419846
Release using the REL commit instead of git tag (#2500) by @MartinThoma
Unify code between PdfReader and PdfWriter (#2497) by @pubpub-zz
Bump softprops/action-gh-release from 1 to 2 (#2514) by @dependabot[bot]

Maintenance (MAINT)

Ressources → Resources (and internal name childs) (#2550) by @pubpub-zz
Fix typos found by codespell (#2549) by @stefan6419846
Update Read the Docs configuration (#2538) by @j-t-1
Add root_object, _info and _ID to PdfReader (#2495) by @pubpub-zz

Testing (TST)

Allow loading truncated images if required (#2586) by @stefan6419846
Fix download issues from #2562 (#2578) by @pubpub-zz
Improve test_get_contents_from_nullobject to show real use-case (#2524) by @stefan6419846
Add missing test annotations (#2507) by @stefan6419846

Full Changelog

Contributors

MartinThoma, pubpub-zz, and 5 other contributors


03 Mar 11:50

github-actions

4.1.0

6cf47c5

Version 4.1.0, 2024-03-03

What's new

Generating name objects (NameObject) without a leading slash is considered deprecated now. Previously, just a plain warning would be logged, leading to possibly invalid PDF files. According to our deprecation policy, this will log a DeprecationWarning for now.

New Features (ENH)

Add get_pages_from_field (#2494) by @pubpub-zz
Add reattach_fields function (#2480) by @pubpub-zz
Automatic access to pointed object for IndirectObject (#2464) by @pubpub-zz

Bug Fixes (BUG)

Missing error on name without leading / (#2387) by @Rak424
encode_pdfdocencoding() always returns bytes (#2440) by @sbourlon
BI in text content identified as image tag (#2459) by @pubpub-zz

Robustness (ROB)

Missing basefont entry in type 3 font (#2469) by @pubpub-zz

Documentation (DOC)

Amend robustness documentation (#2479) by @j-t-1

Developer Experience (DEV)

Fix changelog for UTF-8 characters (#2462) by @stefan6419846

Maintenance (MAINT)

Add _get_page_number_from_indirect in writer (#2493) by @pubpub-zz
Remove user assignment for feature requests (#2483) by @stefan6419846
Remove reference to old 2.0.0 branch (#2482) by @stefan6419846

Testing (TST)

Fix benchmark failures (#2481) by @stefan6419846
Resolve file naming conflict in test_iss1767 (#2445) by @sbourlon

Full Changelog

Contributors

pubpub-zz, sbourlon, and 3 other contributors


18 Feb 15:45

github-actions

4.0.2

cc306ad

Version 4.0.2, 2024-02-18

What's new

Bug Fixes (BUG)

Use NumberObject for /Border elements of annotations (#2451) by @rsinger417

Documentation (DOC)

Document easier way to update metadata (#2454) by @stefan6419846
Typo Polyline → PolyLine in adding-pdf-annotations.md (#2426) by @CWKSC

Developer Experience (DEV)

Bump codecov/codecov-action from 3 to 4 (#2430) by @dependabot[bot]

Testing (TST)

Avoid catching not emitted warnings (#2429) by @stefan6419846

Full Changelog

Contributors

dependabot, CWKSC, and 2 other contributors


28 Jan 15:08

github-actions

4.0.1

7579329

Version 4.0.1, 2024-01-28

What's new

Bug Fixes (BUG)

Layout mode text extraction ZeroDivisionError (#2417) by @shartzog

Testing (TST)

Skip tests using fpdf2 if it's not installed (#2419) by @MartinThoma

Full Changelog

Contributors

MartinThoma and shartzog

