Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add wicked_pdf_url_base64. #947

Merged
merged 1 commit into from
Oct 30, 2020

Conversation

joshuapinter
Copy link
Contributor

@joshuapinter joshuapinter commented Oct 21, 2020

We're using this in production to solve our problem. Looking to get some feedback from @mileszs before finalizing this properly and, hopefully, getting it merged to help others in similar situations.

Using URLs to reference images can cause a lot of problems with wkhtmltopdf, particularly when used in the header and footer. It produces errors such as "Too many open files" as well as buffer/stack overflows. These issues can be seen here, among other places:

As described here, one of the key solutions to avoiding this issue is providing the image encoded as base64. This helper method takes a URL of an image, opens the URL, reads the image data and encodes it to base64.

If the URL cannot be open (OpenURI::HTTPError is raised), it will log a warning and return nil.

TODOs:

  • Better placement (since this is not technically part of the asset pipeline)?
  • Better naming?
  • Tests.

@joshuapinter joshuapinter marked this pull request as ready for review October 26, 2020 02:21
@joshuapinter joshuapinter marked this pull request as draft October 26, 2020 02:21
# Using `image_tag` with URLs when generating PDFs (specifically large PDFs with lots of pages) can cause buffer/stack overflows.
#
def wicked_pdf_url_base64(url)
asset = URI.parse(url).open
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does read_from_uri work here instead? This project has been cautioned against using URI#open for security reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try it out. I original just had open() and Rubocop cautioned against it and in favour of URI.parse. Hang on...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, using read_from_uri won't work because it just returns a String and we need the proper response in order to get the content_type.

I was able to get it to work using:

response = Net::HTTP.get_response(URI(url))
base64 = Base64.encode64(response.body).gsub(/\s+/, '')
"data:#{response.content_type};base64,#{Rack::Utils.escape(base64)}"

Does that work?

P.s. We're still on Rails 4.2 but this made me open up an Issue with our own codebase to replace all uses of URI.parse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, updated the code and also had to change how failed responses were handled.

Note, I wanted to setup a guard like:

if !response.is_a?(Net::HTTPSuccess)
  Rails.logger.warn("[wicked_pdf] #{response.code} #{response.message}: #{url}")
  return nil
end

But your Rubocop config wanted me to change it to:

unless response.is_a?(Net::HTTPSuccess)
  Rails.logger.warn("[wicked_pdf] #{response.code} #{response.message}: #{url}")
  nil
end

And I despise that with a passion (we have that Cop changed in our Rubocop config) so went with the if/else instead as it's much more clear.

Using URLs to reference images can cause a lot of problems with wkhtmltopdf, particularly when used in the header and footer. It produces errors such as "Too many open files" as well as buffer/stack overflows.

A good solution to this is providing the image encoded as base64. This helper method takes a URL of an image, opens the URL, reads the image data and encodes it to base64.

If the URL does not have a successful response (`response.is_a?(Net::HTTPSuccess) == false`), it will log a warning and return `nil`. We could optionally make this raise but that should be a project-wide change.

TODO:

- [ ] Better placement (since this is not technically part of the asset pipeline)?
- [ ] Better naming?
- [ ] Tests.
@unixmonkey unixmonkey marked this pull request as ready for review October 30, 2020 16:32
Copy link
Collaborator

@unixmonkey unixmonkey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

@unixmonkey unixmonkey merged commit 60b6c40 into mileszs:master Oct 30, 2020
@joshuapinter
Copy link
Contributor Author

joshuapinter commented Oct 30, 2020

👍 Thanks for a great library.

@joshuapinter joshuapinter deleted the image_url_to_base64_helper branch October 30, 2020 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants