fix: ignore unparsed entity logic for predefined xml entities #266
+63
−3
There was a bug in the implementation for resolving entities. Predefined XML entities like `&lt;`, `&amp;`, and `&quot;` should not be interpreted, as this breaks parsing of the document.

For example, see the following XML document (an SVG) and render in the browser:
We should resolve custom entities like `entity_reference` and `escaped_entity_reference`, but not `&amp;`, `&lt;`, or `&quot;`; otherwise they get interpreted by the parser instead.

Currently, when parsing this XML document in sax v1.3.0 we get this error:
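Separately from the error itself, the distinction between custom and predefined entities can be sketched in isolation. This is a minimal illustration, not sax's implementation: the helper names `reinjectAll` and `reinjectCustomOnly`, the `PREDEFINED` table, and the replacement text for `entity_reference` are all assumptions made for the example.

```javascript
// The five entities predefined by XML, keyed by name (illustrative table).
const PREDEFINED = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
const REFERENCE = /&([A-Za-z_][\w.-]*);/g;

// Hypothetical "buggy" behaviour: every reference, predefined or not, is
// substituted back into the markup that the parser will read next.
function reinjectAll(xml, custom) {
  const table = { ...PREDEFINED, ...custom };
  return xml.replace(REFERENCE, (ref, name) =>
    hasOwn(table, name) ? table[name] : ref);
}

// Hypothetical "fixed" behaviour: only custom entities are substituted as
// markup; predefined references are left for normal text handling.
function reinjectCustomOnly(xml, custom) {
  return xml.replace(REFERENCE, (ref, name) =>
    hasOwn(custom, name) ? custom[name] : ref);
}

// Assumed entity value, chosen only to show markup inside an entity.
const custom = { entity_reference: '<tspan>hi</tspan>' };
const input = '<text>&entity_reference; 1 &lt; 2</text>';

console.log(reinjectAll(input, custom));
// <text><tspan>hi</tspan> 1 < 2</text>   (the bare "<" now looks like a tag start)
console.log(reinjectCustomOnly(input, custom));
// <text><tspan>hi</tspan> 1 &lt; 2</text>
```

In the first output the escaped `<` has become raw markup, which is the kind of input a parser rejects; in the second it stays escaped while the custom entity is still expanded.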
Related
Just an aside, if you'd like some help with maintaining sax, I'd love to lend a hand. We've been using this in SVGO, and since joining as a maintainer I've been meaning to move away from our (unmaintained) fork back to upstream. No pressure, and I'll see if I can open other pull requests in time.
If you'd accept it, I think it would be very valuable to have a task in CI that attempts to parse the XML Conformance Test Suites just to ensure no errors are thrown. That also would've caught this bug. ^-^'
https://www.w3.org/XML/Test/
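As a rough sketch of what such a CI task could look like, the function below feeds a list of documents to a parser and collects the ones that throw. Everything here is hypothetical: `findParseFailures` is an illustrative name, `toyParse` is only a placeholder for a real parser such as sax, and loading the test suite files from disk is left out. It also assumes the list is restricted to documents that are expected to parse cleanly.

```javascript
// Run `parse` over each document and record the ones that throw.
// `documents` is an array of { name, xml } objects.
function findParseFailures(documents, parse) {
  const failures = [];
  for (const { name, xml } of documents) {
    try {
      parse(xml);
    } catch (error) {
      failures.push({ name, message: error.message });
    }
  }
  return failures;
}

// Placeholder parser for the example only; a real task would call the
// actual parser here instead.
function toyParse(xml) {
  if (!xml.trim().startsWith('<')) {
    throw new Error('document must start with markup');
  }
}

const failures = findParseFailures(
  [
    { name: 'ok.xml', xml: '<svg/>' },
    { name: 'bad.xml', xml: 'not xml' },
  ],
  toyParse
);

// A CI job could fail the build whenever this list is non-empty.
console.log(failures.map((failure) => failure.name)); // [ 'bad.xml' ]
```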
Update
Since opening this pull request, I threw together a script for regression testing.
This actually helped me identify a problem with my initial solution. I didn't account for the fact that `&quot;` isn't the only way to insert quotes. One can use a numeric character reference such as `&#34;` instead, so now I'm comparing against `Object.values(XML_ENTITIES)` rather than checking if the entity string exists as a property of `XML_ENTITIES`.
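The difference between the two checks can be sketched as follows. This is a simplified stand-in rather than the code in this pull request: the `XML_ENTITIES` table is written out locally for the example, and `resolveEntity`, `isPredefinedByName`, and `isPredefinedByValue` are hypothetical helper names.

```javascript
// Local stand-in for the table of predefined XML entities.
const XML_ENTITIES = { amp: '&', gt: '>', lt: '<', quot: '"', apos: "'" };

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Resolve the text between "&" and ";" to its replacement string.
// Handles hex (#x22) and decimal (#34) character references, then names.
function resolveEntity(name, custom = {}) {
  if (name.startsWith('#x')) {
    return String.fromCodePoint(parseInt(name.slice(2), 16));
  }
  if (name.startsWith('#')) {
    return String.fromCodePoint(parseInt(name.slice(1), 10));
  }
  if (hasOwn(XML_ENTITIES, name)) return XML_ENTITIES[name];
  if (hasOwn(custom, name)) return custom[name];
  return undefined;
}

// Initial approach: look the entity string up as a property name.
function isPredefinedByName(name) {
  return hasOwn(XML_ENTITIES, name);
}

// Revised approach: compare the resolved value against the table's values.
function isPredefinedByValue(value) {
  return Object.values(XML_ENTITIES).includes(value);
}

console.log(isPredefinedByName('quot')); // true
console.log(isPredefinedByName('#34')); // false, so the quote would slip through
console.log(isPredefinedByValue(resolveEntity('#34'))); // true
console.log(isPredefinedByValue(resolveEntity('#x22'))); // true
```

Checking the resolved value catches every spelling of the same character, at the cost of also treating a custom entity whose entire value is a single `&`, `<`, `>`, `"`, or `'` as predefined.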