-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use HTML5 sanitizer to serve huge payloads through Speedscope remote viewer #102
Conversation
…viewer When we run the app profiler on Shopify core's boot, it is frequent that the generated profile JSON file exceeds 10MB. `Rails::Html::Sanitizer` is unable to process such huge payloads (because libxml2 limits text nodes to 10MB by default?). The `rails-html-sanitizer` introduced HTML5 parsing in its 1.6.0 version, which lifts the text node size limit. This change intends to use the new HTML5 parser when it's available to sanitize the Speedscope remote viewer's HTML, allowing bigger payloads to be served successfully.
c636cee
to
37e35d9
Compare
rack (2.2.6.3) | ||
rack-test (1.1.0) | ||
rack (>= 1.0, < 3) | ||
rails-dom-testing (2.0.3) | ||
activesupport (>= 4.2.0) | ||
nokogiri (>= 1.6) | ||
rails-html-sanitizer (1.5.0) | ||
loofah (~> 2.19, >= 2.19.1) | ||
rails-html-sanitizer (1.6.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was necessary to upgrade rails-html-sanitizer to 1.6.0 in Gemfile.lock in order to make it available when running tests.
Note that this is not a gem dependency anyway, and the gem is still meant to work with rails-html-sanitizer 1.5.0.
Ideally, I'd want a way to run the same test suite with different versions of rails-html-sanitizer (one before and one after 1.6.0, which is the version that originally defines Rails::HTML
– all caps – and implements the HTML5 sanitizer), but I had no idea how I could achieve that. (Also not sure this would be worth the effort.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it important to support RHS 1.5.x? I'd prefer if you just updated the dependency.
@@ -60,6 +60,21 @@ class MiddlewareTest < TestCase | |||
assert_match(%r(<script type="text/javascript">), html) | |||
end | |||
|
|||
test "#call show can serve huge payloads" do | |||
frames = { "1" => { name: "a" * 1e7 } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10MB, or about 10 million (1e7
) characters, is the size limit the old HTML4 sanitizer wasn't able to exceed.
Reverting rails-html-sanitizer
to 1.5.0 in Gemfile.lock, one can run this test and confirm that it fails.
html[-200..-1], | ||
message: "The generated HTML was incomplete" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I'm using a substring because html
is huge (10MB+), and this assertion would flood the terminal if it fails.
I also added an error message to clarify what is likely happening.
html = html.first | ||
|
||
assert_match( | ||
%r{'Flamegraph for .*'\);\n</script>}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the end of the "rendered" payload, as seen here:
app_profiler/lib/app_profiler/viewer/speedscope_remote_viewer/middleware.rb
Lines 50 to 51 in fbf4c3e
iframe.setAttribute('src', '/app_profiler/viewer/index.html#profileURL=' + objUrl + '&title=' + 'Flamegraph for #{name}'); | |
</script> |
If this was rendered, then there likely was no problem rendering the whole text node.
@@ -6,7 +6,13 @@ module AppProfiler | |||
module Viewer | |||
class SpeedscopeRemoteViewer < BaseViewer | |||
class BaseMiddleware | |||
class Sanitizer < Rails::Html::SafeListSanitizer | |||
sanitizer_superclass = if Rails::Html::Sanitizer::VERSION >= "1.6.0" && Rails::HTML::Sanitizer.html5_support? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rails::Html::Sanitizer::VERSION >= "1.6.0"
was necessary because HTML
(all caps) and #html5_support?
are not defined in older versions of the gem.
This line could also be:
sanitizer_superclass = if Rails::Html::Sanitizer::VERSION >= "1.6.0" && Rails::HTML::Sanitizer.html5_support? | |
sanitizer_superclass = if Rails::Html.respond_to?(:html5_support?) && Rails::Html::Sanitizer.html5_support? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can/should use Rails::HTML::Sanitizer.best_supported_vendor
for this if you pin to RHS >= 1.6.0
sanitizer_superclass = if Rails::Html::Sanitizer::VERSION >= "1.6.0" && Rails::HTML::Sanitizer.html5_support? | ||
Rails::HTML5::SafeListSanitizer | ||
else | ||
Rails::Html::SafeListSanitizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rails::Html::SafeListSanitizer
is the only sanitizer available in old versions of rails-html-sanitizer, and equivalent to Rails::HTML4::SafeListSanitizer
in 1.6.0 and later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comments on my recommendation to use 1.6.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to gently push on the assumption that we need to support Rails::HTML::Sanitizer v1.5.x. v1.6.0 should be backwards-compatible and supports Ruby 2.7 and later.
@@ -6,7 +6,13 @@ module AppProfiler | |||
module Viewer | |||
class SpeedscopeRemoteViewer < BaseViewer | |||
class BaseMiddleware | |||
class Sanitizer < Rails::Html::SafeListSanitizer | |||
sanitizer_superclass = if Rails::Html::Sanitizer::VERSION >= "1.6.0" && Rails::HTML::Sanitizer.html5_support? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can/should use Rails::HTML::Sanitizer.best_supported_vendor
for this if you pin to RHS >= 1.6.0
rack (2.2.6.3) | ||
rack-test (1.1.0) | ||
rack (>= 1.0, < 3) | ||
rails-dom-testing (2.0.3) | ||
activesupport (>= 4.2.0) | ||
nokogiri (>= 1.6) | ||
rails-html-sanitizer (1.5.0) | ||
loofah (~> 2.19, >= 2.19.1) | ||
rails-html-sanitizer (1.6.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it important to support RHS 1.5.x? I'd prefer if you just updated the dependency.
sanitizer_superclass = if Rails::Html::Sanitizer::VERSION >= "1.6.0" && Rails::HTML::Sanitizer.html5_support? | ||
Rails::HTML5::SafeListSanitizer | ||
else | ||
Rails::Html::SafeListSanitizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comments on my recommendation to use 1.6.0
@flavorjones Thank you for your comments! I also guess we might not want to declare If we're happy to stop supporting older versions of For reference, Rails 7.0.8 has a pretty loose version constraint on s.add_dependency "rails-html-sanitizer", "~> 1.0", ">= 1.2.0" |
@davidstosik I understand your points, and you're right about the historical expectation. But what I'm suggesting is that we don't care anymore about supporting old versions of RHS (because the new version is backwards compatible) and that the right thing to do is to declare a dependency on RHS >= 1.6.0 and simplify this code. |
@@ -1,12 +1,13 @@ | |||
# frozen_string_literal: true | |||
|
|||
gem "rails-html-sanitizer", ">= 1.6.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@flavorjones To enforce the version dependency, I had to add a peer dependency here. (There is no dependency to that gem in the gemspec.)
@shioyama: I paired with @gmcgibbon this morning and we confirmed that this change also fixes the double escaping issue you mentioned to me yesterday! 🎉
|
Fixes #92.
When we run the app profiler on Shopify core's boot, it is frequent that the generated profile JSON file exceeds 10MB.
Rails::Html::Sanitizer
is unable to process such huge payloads (because libxml2 limits text nodes to 10MB by default?).The
rails-html-sanitizer
gem introduced HTML5 parsing in its 1.6.0 version, which lifts the text node size limit.This change intends to use the new HTML5 parser when it's available to sanitize the Speedscope remote viewer's HTML, allowing bigger payloads to be served successfully.
I wasn't sure how to implement such variable superclass (depending on the version of rails-html-sanitizer used), and I was even less sure of how I could implement some tests to avoid regressions. All suggestions will be appreciated! 🙏🏻