Fix sanitizer making block level elements unreadable #10836
4f1ef47 to e3267af
I'd still prefer actual support for those tags, but this is an improvement on the current behavior.
Thinking about it, it's a good opportunity to translate inline
spec/lib/sanitize_config_spec.rb
Outdated
    expect(Sanitize.fragment('<h1>Foo</h1>', subject)).to eq '<p>Foo</p>'
  end

  it 'converts li to p' do
did i misunderstand the job of UNSUPPORTED_ELEMENTS_TRANSFORMER? to me it read like (h1 h2 h3 h4 h5 h6 blockquote pre li) would be preserved, and anything else would be converted to p
the reason i ask is because converting li to p would insert margins below it, and that causes excessive spacing. essentially, an li element has only 1 visual line break, whereas a p element has 2 visual line breaks.
Yes, you probably misunderstood. They all get converted to <p>.
I feel like single line breaks can be very hard to notice within longer text, so while for single-line list items it would be best to use <br />, <p> is more fit for anything that has its own wrapping.
ah, so it's like this then?
- keep (p br span a)
- convert (h1 h2 h3 h4 h5 h6 blockquote pre li) to p
- strip all other elements
the "unless" was throwing me off.
i still think semantically if you were trying to map ol/ul and li, then li would be most like a br element. most formatting on the rest of the internet (and indeed most browsers' default css) treats lists as single-break, with double-break around the list due to margin css.
i'm still not sure what the point is in explicitly stripping out the list elements, since they render unobtrusively anyway. my idea of converting to p was more for unsupported elements like marquee. as far as i can tell this current code just removes marquee entirely, right? that would cause multiple marquees to become a run-on sentence.
from a purely visual standpoint, everything in (h1 h2 h3 h4 h5 h6 blockquote pre li) except for pre is actually safe due to the basics.scss reset, without any other changes.
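The three-way behavior summarized in this comment (keep, convert to p, strip) can be sketched in plain Ruby. This is only an illustration of the policy as described in the thread: the constant names and the sanitizer_action helper are hypothetical and do not come from app/lib/sanitize_config.rb, and the kept list (p br span a) is the commenter's reading, not a confirmed copy of the real allowlist.

```ruby
# Hypothetical sketch: names below are illustrative, not from the PR.
# The kept list reflects the commenter's summary, not the real config.
KEPT_ELEMENTS      = %w(p br span a).freeze
CONVERTED_ELEMENTS = %w(h1 h2 h3 h4 h5 h6 blockquote pre li).freeze

# Returns :keep, :convert_to_p, or :strip for a given element name.
def sanitizer_action(node_name)
  return :keep if KEPT_ELEMENTS.include?(node_name)
  return :convert_to_p if CONVERTED_ELEMENTS.include?(node_name)

  # Anything else (for example marquee) falls through to stripping.
  :strip
end

%w(p li marquee).each do |name|
  puts "#{name}: #{sanitizer_action(name)}"
end
```

Under this reading, ul and ol also land in the strip bucket, which is the behavior questioned above.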
app/lib/sanitize_config.rb
Outdated
@@ -19,6 +19,11 @@ module Config
      node['class'] = class_list.join(' ')
    end

    UNSUPPORTED_ELEMENTS_TRANSFORMER = lambda do |env|
      return unless %w(h1 h2 h3 h4 h5 h6 blockquote pre li).include?(env[:node_name])
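Since the hunk above is cut off after the guard clause, here is a minimal, self-contained sketch of how such a guard behaves. It assumes the rest of the lambda renames the node to p, as stated in the discussion ("They all get converted to <p>"); the actual body is not shown in this excerpt. FakeNode and SKETCH_TRANSFORMER are hypothetical stand-ins, and the env hash is built by hand rather than supplied by the Sanitize gem.

```ruby
# Hypothetical stand-in for the node object that Sanitize would pass in.
FakeNode = Struct.new(:name)

# Assumed shape of the transformer: only the guard line is from the diff.
SKETCH_TRANSFORMER = lambda do |env|
  # `return` inside a lambda exits only the lambda, so any element that is
  # NOT in the list is left untouched by this transformer.
  return unless %w(h1 h2 h3 h4 h5 h6 blockquote pre li).include?(env[:node_name])

  # Assumption: listed elements are renamed to p.
  env[:node].name = 'p'
end

li = FakeNode.new('li')
SKETCH_TRANSFORMER.call({ node: li, node_name: 'li' })

span = FakeNode.new('span')
SKETCH_TRANSFORMER.call({ node: span, node_name: 'span' })

puts li.name
puts span.name
```

Read this way, the `return unless` line selects the elements to convert rather than the elements to preserve, which is the point of confusion raised in the review.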
fwiw if this line does what i think it does, then pre should be converted instead of maintained (unless you add css along the lines of overflow-x: auto or white-space: pre-wrap).
headers, blockquote, and li are reset by basics.scss regardless. for that matter, ol and ul are also safely reset by the current css.
For the record, I don't see why this is necessary when the user opinion has been massively in favor of full support.
e3267af to 4a20bd4
I would still prefer if we accepted those tags directly, but I think this is the best fallback we can have, even though it still loses information (nested lists for instance).
Also, this PR might be a good place for replacing incoming inlined images with links (as in glitch-soc#1071)
Fix #10834