Define removal of SCRIPT and STYLE elements everywhere textContent is requested. #17

Closed
Zegnat opened this issue Jan 12, 2018 · 6 comments

Comments

@Zegnat
Member

Zegnat commented Jan 12, 2018

In practice parsers are already doing this everywhere, but that is currently against the specification. I say this is a mistake in the spec and not in parsers.

When the textContent value is used in mf2, we specify the removal of <script> and <style> elements within p-x, u-x, and dt-x parsing, but not for e-x or implied name parsing.

According to spec:

<div class="h-x">Hello <script>beautiful </script>person</div>

Results in an implied name of Hello beautiful person.
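
To make the difference concrete, here is a minimal Python sketch using only the standard library's html.parser. The names TextContent and text_content are hypothetical helpers for illustration, not part of any mf2 parser, and the root class is left out because only text extraction is shown.

```python
from html.parser import HTMLParser


class TextContent(HTMLParser):
    """Collects text nodes, optionally skipping the listed elements."""

    def __init__(self, drop=()):
        super().__init__(convert_charrefs=True)
        self.drop = set(drop)
        self.depth = 0  # > 0 while inside a dropped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.drop:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.drop and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Script and style bodies arrive here as plain data.
        if not self.depth:
            self.parts.append(data)


def text_content(html, drop=()):
    parser = TextContent(drop)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


markup = "<div>Hello <script>beautiful </script>person</div>"
print(text_content(markup))                            # Hello beautiful person
print(text_content(markup, drop=("script", "style")))  # Hello person
```

The first call mirrors what the spec text currently asks for in implied name parsing; the second is what parsers tend to do in practice.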

@gRegorLove
Member

Previous issue and resolution: http://microformats.org/wiki/microformats2-parsing-issues#exclude_style_elements_before_parsing

It appears we just missed some instances in the spec update, but I need to double-check and confirm. See this revision.

@sknebel
Member

sknebel commented Jan 12, 2018

Given what @gRegorLove found, I'd say the missing pieces are:
a) specify the same for the value version of an e- property (which was likely missed since the html was explicitly excluded in the discussion)

b) in the section about implied name properties, make it clear that textContent should be post-processed the same way as for p- properties.

@gRegorLove
Member

Proposed updates, which I believe are in line with the resolution:

parsing a p- property
No content change, just splitting out whitespace trimming into a separate bullet point:

Original:

  • else return the textContent of the element after:
    • dropping any nested <script> & <style> elements;
    • replacing any nested <img> elements with their alt attribute, if present; otherwise their src attribute, if present, adding a space at the beginning and end, resolving any relative URLs, and removing all leading/trailing whitespace.

Updated:

  • else return the textContent of the element after:
    • dropping any nested <script> & <style> elements;
    • replacing any nested <img> elements with their alt attribute, if present; otherwise their src attribute, if present, adding a space at the beginning and end, resolving the URL if it’s relative;
    • removing all leading/trailing whitespace.

parsing an e- property
Original:

  • value: the textContent of the element, replacing any nested <img> elements with their alt attribute if present, or otherwise their src attribute if present, resolving the URL if it’s relative.

Updated:

  • value: the textContent of the element after:
    • dropping any nested <script> & <style> elements;
    • replacing any nested <img> elements with their alt attribute, if present; otherwise their src attribute, if present, adding a space at the beginning and end, resolving the URL if it’s relative;
    • removing all leading/trailing whitespace.

parsing for implied properties
For implied name:

Original:

  • else use the textContent of the .h-x for name

Updated:

  • else return the textContent of the .h-x after:
    • dropping any nested <script> & <style> elements;
    • replacing any nested <img> elements with their alt attribute, if present; otherwise their src attribute, if present, adding a space at the beginning and end, resolving the URL if it’s relative;
    • removing all leading/trailing whitespace.
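
Since the same three steps would now apply to p- properties, the e- value, and implied name, a single shared helper could serve all of them. Below is a rough Python sketch of that idea using only the standard library. The names mf2_text and _TextExtractor are hypothetical, the helper takes the element as an HTML string rather than a DOM node, and it assumes the surrounding spaces apply only to the src replacement (the wording could be read either way).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _TextExtractor(HTMLParser):
    """Collects text, skipping <script>/<style> and substituting <img>."""

    DROPPED = {"script", "style"}

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.dropped_depth = 0  # > 0 while inside a dropped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.DROPPED:
            self.dropped_depth += 1
        elif tag == "img" and not self.dropped_depth:
            attributes = dict(attrs)
            if attributes.get("alt") is not None:
                # Step 2a: prefer the alt attribute when it is present.
                self.parts.append(attributes["alt"])
            elif attributes.get("src") is not None:
                # Step 2b: otherwise use the resolved src.
                # Assumption: only the src replacement is padded with spaces.
                url = urljoin(self.base_url, attributes["src"])
                self.parts.append(" " + url + " ")

    def handle_endtag(self, tag):
        if tag in self.DROPPED and self.dropped_depth:
            self.dropped_depth -= 1

    def handle_data(self, data):
        # Step 1: text inside dropped elements is ignored.
        if not self.dropped_depth:
            self.parts.append(data)


def mf2_text(element_html, base_url):
    extractor = _TextExtractor(base_url)
    extractor.feed(element_html)
    extractor.close()
    # Step 3: remove leading/trailing whitespace.
    return "".join(extractor.parts).strip()


base = "https://example.com/people/"
print(mf2_text('<span> Jane <script>track()</script><img alt="Doe" src="/doe.png"> </span>', base))
print(mf2_text('<span><style>b{}</style><img src="/doe.png"></span>', base))
```

Keeping the steps in one function would also make it harder for one property type to be missed again in a future spec update.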

@Zegnat
Member Author

Zegnat commented Jan 13, 2018

LGTM

@gRegorLove
Member

Updated in spec with this revision: http://microformats.org/wiki/index.php?title=microformats2-parsing&oldid=66660

This issue can be closed now.

@Zegnat
Member Author

Zegnat commented Feb 24, 2018

Closing as this has been updated and I’m not sure why it was kept open anyway.


3 participants