Add XHTML/XML option #15

Open
mathiasbynens opened this issue Dec 18, 2013 · 2 comments

Comments

@mathiasbynens
Owner

This may not be worth it, but here goes…

E.g. `&#x85;` → U+0085 in XHTML, while in HTML it’s U+2026.

http://www.w3.org/TR/xml/#d0e3895

Entities for these symbols are allowed in XML: http://www.w3.org/TR/xml/#NT-Char
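
To make the difference concrete, here is a minimal sketch of how the same numeric character reference could resolve in each mode. The helper name `resolveNumericReference` and its `xml` option are hypothetical (they are not part of `he`), and the table is only an illustrative subset of the Windows-1252 remapping that HTML parsers apply to references in the 0x80 to 0x9F range.

```javascript
// Illustrative SUBSET of the remapping HTML applies to numeric character
// references in the 0x80-0x9F range (Windows-1252 compatibility).
const HTML_REMAP = new Map([
  [0x80, 0x20ac], // euro sign
  [0x85, 0x2026], // horizontal ellipsis
  [0x91, 0x2018], // left single quotation mark
  [0x92, 0x2019], // right single quotation mark
  [0x93, 0x201c], // left double quotation mark
  [0x94, 0x201d], // right double quotation mark
]);

// Hypothetical helper: resolve the code point of a numeric character
// reference. In XML mode the code point is used as-is; in HTML mode the
// remapping table is consulted first.
function resolveNumericReference(codePoint, { xml = false } = {}) {
  const remapped = !xml && HTML_REMAP.has(codePoint);
  return String.fromCodePoint(remapped ? HTML_REMAP.get(codePoint) : codePoint);
}
```

With this sketch, `resolveNumericReference(0x85)` yields U+2026, while `resolveNumericReference(0x85, { xml: true })` yields U+0085.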

@royfielding

It would be nice to have an option (default off) that would exclude the characters not allowed in XML even as an entity reference. These characters cause XML validation to fail. For example,

https://github.com/MylesBorins/xml-sanitizer/

`he` could either strip the invalid characters (as the project above does) or replace them with a non-entity placeholder (like ESC for 0x1B).
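
A rough sketch of what such an option might do, assuming the XML 1.0 `Char` production as the definition of "allowed". The function name `stripInvalidXmlChars` and its `replacement` parameter are made up for illustration; this is not an existing `he` API.

```javascript
// Matches every code point that is NOT allowed by the XML 1.0 Char production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
// The `u` flag makes the class operate on code points, so valid surrogate
// pairs (astral symbols) are kept while lone surrogates are matched.
const INVALID_XML_CHARS =
  /[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/gu;

// Hypothetical helper: remove the invalid characters, or substitute a
// plain-text replacement for each of them when one is given.
function stripInvalidXmlChars(input, replacement = '') {
  return input.replace(INVALID_XML_CHARS, replacement);
}
```

For instance, `stripInvalidXmlChars('a\x1Bb')` drops the escape character, and passing `'ESC'` as the second argument substitutes that text instead.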

@cederberg

Another XML vs. HTML issue is how to encode using only the named XML entities (i.e. `&amp;`, `&lt;`, `&gt;`, `&apos;` and `&quot;`). I don't think this is possible in the API today, but perhaps it should be? Just a very minor issue, of course.

Ended up with a few extra lines of JS to work around this:

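    const he = require('he');
    const ENTITIES = ['&amp;', '&quot;', '&apos;', '&lt;', '&gt;'];
    // Decimal reference that he emits for a named entity, e.g. '&amp;' becomes '&#38;'.
    const recode = (s) => he.encode(he.decode(s), { decimal: true });
    // Swap those decimal references in an already-encoded string back to the named entities.
    const fix = (s) => ENTITIES.reduce((s, e) => s.replaceAll(recode(e), e), s);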
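
For comparison, a dependency-free sketch of the same goal: encode a raw string using only the five predefined XML entities, with decimal references for everything outside ASCII. The name `encodeXmlNamed` is hypothetical and this does not use or mirror `he` internals; it only shows the output one might expect from such an option.

```javascript
// The five entities predefined by XML, keyed by the character they stand for.
const XML_NAMED = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&apos;',
};

// Hypothetical helper: use a named entity where XML predefines one and a
// decimal character reference for any non-ASCII code point. The `u` flag
// ensures astral symbols are matched as a single code point.
function encodeXmlNamed(input) {
  return input.replace(/[&<>"']|[^\x00-\x7F]/gu, (ch) =>
    XML_NAMED[ch] !== undefined ? XML_NAMED[ch] : `&#${ch.codePointAt(0)};`
  );
}
```

Note that, unlike the workaround above, this operates on unencoded text, so it would double-encode a string that already contains entities.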
