Cache dict is overridden after write_pdf (breaking it) #2228
Comments
There’s definitely a problem, thanks for reporting.
While I can't say I completely understand the code, it appears to be related to how
The data written by
This also appears to be closely related to #1942, which afaict fails for the same underlying reason.
Yes, you’re right. I’ll fix the bug as soon as I can.
Many thanks for the quick action and for finding the actual bug. I tried my best to track it down, but I felt incapable of it (or at least it would have taken me a long time, and I had other work to do) because I don't know the project as a whole, so I shared all the information I was able to discover by myself.
Thanks a lot for the report and the investigation. Feedback is welcome!
I'm using WeasyPrint 61.2 in my web app to send emails in bulk to many users. These emails may have PDFs attached, which I generate with the library. For performance, I use the cache kwarg to avoid generating my pictures multiple times.
When I start sending emails in bulk, in the first iteration WeasyPrint generates the PNG file and stores it in my cache after calling Document.build_formatting_structure. After this call, the cache holds two entries for every picture: one keyed by a hash identifying it, and another keyed by the image URL.
html.write_pdf then continues executing normally, but once it finishes, the contents of my cache have changed.
The problem comes with the next iterations of html.write_pdf.
My cache has changed, and Document.build_formatting_structure does not return it to its previous state. When the code reaches the point of reading the data of my PNG in RasterImage.get_x_object, self.image.data no longer contains the PNG bytes stored during the first iteration but the bytes written after write_pdf finished, so it fails with UnidentifiedImageError('cannot identify image file <_io.BytesIO object at 0x11db6c5e0>').
I have tried with all my heart to solve this by myself, as you can see, but I can't find the place where the cache is changed, nor do I understand what is being written in its place.
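For anyone trying to narrow down where the cache changes, here is a minimal sketch of a snapshot-and-compare check. It does not use WeasyPrint at all: FakeImage and fake_render are hypothetical stand-ins that only imitate the symptom described above (cached image bytes rewritten in place, with the hash key and the URL key assumed to point at the same object), and the key names are made up for illustration.

```python
class FakeImage:
    """Hypothetical stand-in for a cached image object (not a WeasyPrint class)."""

    def __init__(self, data: bytes):
        self.data = data


def snapshot(cache: dict) -> dict:
    """Copy the bytes currently held by every cache entry."""
    return {key: bytes(image.data) for key, image in cache.items()}


def changed_keys(before: dict, cache: dict) -> list:
    """Return the keys that were added, removed, or whose bytes differ from the snapshot."""
    after = snapshot(cache)
    return sorted(
        key
        for key in before.keys() | after.keys()
        if before.get(key) != after.get(key)
    )


def fake_render(cache: dict) -> None:
    """Imitate a render step that rewrites cached image data in place (assumption)."""
    # De-duplicate by identity so an image shared by two keys is rewritten once.
    for image in {id(value): value for value in cache.values()}.values():
        image.data = b"not-a-png:" + image.data


# Two entries per picture, as described in the report: a hash key and a URL key.
png = b"\x89PNG\r\n\x1a\n"
image = FakeImage(png)
cache = {"hash-of-logo": image, "https://example.com/logo.png": image}

before = snapshot(cache)
fake_render(cache)
print(changed_keys(before, cache))
```

In a real session, the same snapshot could be taken before and after each suspect call to see which step alters the entries, provided the cached objects expose their bytes in some comparable form.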