BUG: Don't close stream passed to PdfWriter.write() #2909

Open
wants to merge 4 commits into
base: main

Conversation

alexaryn

Closes #2905

codecov bot commented Oct 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.44%. Comparing base (80c3939) to head (dafbafc).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2909   +/-   ##
=======================================
  Coverage   96.44%   96.44%           
=======================================
  Files          52       52           
  Lines        8728     8730    +2     
  Branches     1589     1589           
=======================================
+ Hits         8418     8420    +2     
  Misses        182      182           
  Partials      128      128           


pypdf/_writer.py Outdated
@@ -369,6 +367,9 @@ def __exit__(
        """Write data to the fileobj."""
        if self.fileobj:
            self.write(self.fileobj)
            close_attr = getattr(self.fileobj, "close", None)
            if callable(close_attr):
                self.fileobj.close()  # type: ignore[attr-defined]
Collaborator

Why are you not using close_attr() directly here? Then you might even be able to omit the ignore comment.

Author

Good idea to avoid the mypy issue. I do, however, think that the intent is more clearly communicated as written, but I will make the change.
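
For reference, the suggestion above can be sketched as follows. This is a minimal, standalone illustration rather than the code in the PR; `close_if_possible` is a hypothetical helper name. Calling the value returned by `getattr()` directly should remove the need for the `attr-defined` ignore, since the type checker no longer has to look up `close` on the object's declared type.

```python
import io


def close_if_possible(obj: object) -> bool:
    """Call obj.close() if it exists and is callable; return True if it was called.

    Hypothetical helper for illustration only; not part of pypdf.
    """
    close_attr = getattr(obj, "close", None)
    if callable(close_attr):
        # Call the looked-up bound method directly instead of obj.close().
        close_attr()
        return True
    return False


stream = io.BytesIO()
closed_stream = close_if_possible(stream)   # BytesIO has a callable close()
closed_string = close_if_possible("")       # str has no close attribute
```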

@@ -249,7 +249,6 @@ def _get_clone_from(
# to prevent overwriting
self.temp_fileobj = fileobj
self.fileobj = ""
self.with_as_usage = False
Collaborator

I'm not fond of removing the with_as_usage attribute. It may be useful to know that the object has been created for a context manager.

Author

It's never used. Keeping it around would mislead someone reading the code that it matters in some way. It's dead code but easy to revive if a need arises.

for i in range(4):
    writer.add_page(reader.pages[i])
writer.write(tmp)
assert not tmp.file.closed
Collaborator

It would be good to also have a test for the case where the write happens automatically when the context exits.

Author

I think I managed to add a test for this. It was a bit confusing because of the double-construct that happens in __enter__(). I spent a fair amount of time trying to understand the clone_from logic before I realized that everything from the first __init__() is thrown away except for temp_fileobj.

pypdf/_writer.py Outdated
@@ -369,6 +367,9 @@ def __exit__(
        """Write data to the fileobj."""
        if self.fileobj:
            self.write(self.fileobj)
            close_attr = getattr(self.fileobj, "close", None)
            if callable(close_attr):
                self.fileobj.close()  # type: ignore[attr-defined]
Collaborator

Looking at @MasterOdin's post, shouldn't the stream be closed by the caller rather than here?

Collaborator

IMHO it should be closed on exit to avoid leaking resources - unless I misunderstood the existing discussions.

Collaborator

I agree that if the stream is opened by the PdfWriter (e.g. if a path is provided), the PdfWriter should close it. But if the caller passes in a stream that was opened before the PdfWriter context, it should be closed outside the 'with' block. As written, it is always closed when the context exits.
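
The ownership rule described above can be sketched with a small stand-in class. This is an assumption-laden illustration, not pypdf's implementation: `SketchWriter`, its `target` and `payload` attributes, and the placeholder bytes are all hypothetical. The point is only that a file opened by the writer is closed by the writer, while a stream supplied by the caller is written to and left open.

```python
import io
import os
import tempfile
from typing import IO, Union


class SketchWriter:
    """Hypothetical stand-in for a writer; not pypdf's PdfWriter."""

    def __init__(self, target: Union[str, IO[bytes]]) -> None:
        self.target = target
        self.payload = b"%PDF-sketch\n"  # placeholder bytes, not a real PDF

    def __enter__(self) -> "SketchWriter":
        return self

    def __exit__(self, exc_type, exc, traceback) -> None:
        if isinstance(self.target, str):
            # The writer opens this file itself, so it also closes it.
            with open(self.target, "wb") as fh:
                fh.write(self.payload)
        else:
            # The caller opened this stream: write to it, but leave it open.
            self.target.write(self.payload)


# Case 1: the caller owns the stream and can keep using it after the block.
caller_stream = io.BytesIO()
with SketchWriter(caller_stream):
    pass

# Case 2: the writer is given a path and manages the file handle itself.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "out.pdf")
    with SketchWriter(out_path):
        pass
    with open(out_path, "rb") as fh:
        path_bytes = fh.read()
```

With this split, closing `caller_stream` remains the caller's job, which matches the behaviour requested in the linked issue.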

Author

alexaryn commented Oct 21, 2024

I'm thinking this code isn't needed at all. The only way PdfWriter stores a closable object in self.fileobj is via a with-statement invoking __init__() passing in the object. I note that sometimes self.fileobj is readable and used for cloning and sometimes it's writable and used for output, which is confusing. Also, I think there's a bug in this line: if isinstance(fileobj, (IO, BytesIO)): in that IO is from the typing module and no real objects will inherit from it. I suspect it was meant to be more like IOBase, but I'm not sure. Finally, the fileobj.seek(-1, 2) confuses me; why go one byte back from the end? If the file's empty it'll raise OSError.

In any case, the only file I see being created is in PdfWriter.write() where it's a local variable and it gets closed.
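
Two of the points above can be checked with the standard library alone. The following is a small sketch, independent of pypdf: it compares `typing.IO` with `io.IOBase` in an `isinstance()` check, and it tries `seek(-1, 2)` on an empty OS-backed temporary file. The helper name `seek_before_end_fails_on_empty_file` is made up for this example, and in-memory streams may behave differently from real files here.

```python
import io
import tempfile
import typing

stream = io.BytesIO(b"data")

# typing.IO exists for annotations; concrete stream classes do not inherit from it.
is_typing_io = isinstance(stream, typing.IO)

# io.IOBase is the runtime base class of the standard stream classes.
is_io_base = isinstance(stream, io.IOBase)


def seek_before_end_fails_on_empty_file() -> bool:
    """Return True if seeking one byte before the end of an empty file raises OSError."""
    with tempfile.TemporaryFile("w+b") as empty:
        try:
            empty.seek(-1, 2)  # whence=2 is io.SEEK_END
        except OSError:
            return True
    return False


empty_file_raises = seek_before_end_fails_on_empty_file()
```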

@pubpub-zz
Collaborator

I've created PR #2913, which also deals with the context manager. The two PRs should be checked for consistency.

Successfully merging this pull request may close these issues.

PdfWriter.write() in context manager closes stream when it should not
3 participants