Simplify whitespace collapsing/trimming for text nodes ; Support in Core? #28

Open
fred-wang opened this issue Feb 22, 2019 · 9 comments
Labels
compatibility: Issues affecting backward compatibility
css / html5: Issues related to CSS or HTML5 interoperability
MathML Core: Issues affecting the MathML Core specification
MathML 4: Issues affecting the MathML 4 specification
need polyfill: Issues requiring implementation changes
need tests: Issues related to writing WPT tests

Comments

@fred-wang

"Whitespace collapsing/trimming
https://www.w3.org/TR/MathML/chapter2.html#fund.collapse

Whitespace collapsing is consistent with the default CSS property
"white-space" and people are familiar with it.

Removing "whitespace at the beginning and end of the content" is less
expected. Gecko has some code to handle this but it would be very
helpful to avoid this additional complexity. WebKit does not handle it
at the moment and it's not clear it's worth doing it... Except in the
MathML spec/tests, everybody seems to just write <mo>(</mo> and not
<mo> ( </mo>. Can we deprecate this behavior in MathML4? Or maybe you should
work with the HTML5 WG to define such collapsing rules during document
parsing, so that the MathML rendering code no longer needs to handle it?"

original report: https://lists.w3.org/Archives/Public/www-math/2016Aug/0000.html
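
For readers who want the rule under discussion in concrete form, here is a minimal sketch of collapsing plus trimming as MathML 3 describes it. It assumes the MathML 3 definition of whitespace (space, tab, newline, carriage return); the function name `collapseMathMLWhitespace` is made up for illustration and is not taken from any implementation.

```javascript
// Whitespace characters per MathML 3: U+0020, U+0009, U+000A, U+000D.
// Note that U+00A0 (no-break space) is deliberately not included.
const MATHML_WHITESPACE_RUN = /[ \t\n\r]+/g;

// Hypothetical helper: collapse each run of whitespace to a single space,
// then drop the (at most one) space left at either end.
function collapseMathMLWhitespace(text) {
  return text.replace(MATHML_WHITESPACE_RUN, " ").replace(/^ | $/g, "");
}

console.log(JSON.stringify(collapseMathMLWhitespace(" ( ")));
console.log(JSON.stringify(collapseMathMLWhitespace("sin \n\t x")));
```

The collapsing half matches what CSS `white-space: normal` already does; the trimming half is the part the issue calls "less expected".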

fred-wang added the MathML Core, MathML 4, and css / html5 labels on Feb 22, 2019
@davidcarlisle
Collaborator

davidcarlisle commented Feb 22, 2019

I never like the look of <mo> + </mo> and would always use <mo>+</mo> myself, but... I think the original thinking was that this isn't so different from HTML <td> x </td> being the same as <td>x</td>, or similarly <p> x </p>. But perhaps you are right and the MathML level isn't the right place to specify this.

If this is causing problems in browser implementations I'd agree we should look to drop it from core and then see if that means it should be dropped in full as well.

@fred-wang
Author

> If this is causing problems in browser implementations I'd agree we should look to drop it from core and then see if that means it should be dropped in full as well.

Yes, I think a general rule of thumb for the "simplify" issues I opened is to decide whether we add restrictions to MathML Core, to full MathML 4, or to neither; whether we deprecate or remove; and whether we can write a polyfill or other scripts to help the transition.

fred-wang added the compatibility label on Mar 20, 2019
fred-wang added the need resolution label (Issues needing resolution at MathML Refresh CG meeting) on May 16, 2019
fred-wang changed the title from "Simplify whitespace collapsing/trimming" to "Simplify whitespace collapsing/trimming for text nodes" on Sep 16, 2019
fred-wang added the need tests label on Sep 16, 2019
@NSoiffer
Contributor

This issue has been hanging around for a while. In looking at this it seems that @fred-wang is arguing for two things:

  1. The MathML spec shouldn't specify whitespace handling because that is specified by HTML and CSS. The spec currently has text that says whitespace should be collapsed.
  2. Removing "whitespace at the beginning and end of the content" is less expected.

As with other syntactic issues, I agree that we should leave that to HTML or CSS, whichever one is appropriate and remove it from the spec.

In this case, HTML is the appropriate spec to follow... and it has rules that say whitespace should be collapsed, so I disagree with the second point. The way whitespace is collapsed differs between inline, block, and inline-block elements, but for all types, whitespace at the beginning and end of content is removed. MDN has a nice summary.

Given that MathML has always said whitespace collapsing should happen, HTML does it, and Gecko does it, why would we want something different to happen?

Note: whitespace is apparently specifically included in the DOM. From the MDN article:

Any whitespace characters that are outside of HTML elements in the original document are represented in the DOM. This is needed internally so that the editor can preserve formatting of documents.

@bkardell
Collaborator

I'm slightly confused by your comments @NSoiffer - it's possible I missed something key though... HTML has lots of rules about whitespace handling, but as far as I can tell these aren't the kind you're talking about: they mainly have to do with how to machine-read things that have typed values (boolean attributes, DOMTokenLists and so on), not with display. There are a (very) few exceptions that I know of for historical reasons (like, I think whitespace before or after body might be tossed or something, IIRC), but where are the rules in HTML you are referring to? As you note, spaces are retained in the DOM and display is managed by CSS. This is generally important because you can do things like contenteditable and switch to pre formatting and so on. Even elements that are totally invisible in normal cases, like script, style, and head, can be made visible for these reasons, with whitespace. This seems to be what the MDN article describes too - but as I say, maybe I am missing a key bit?

@davidcarlisle
Collaborator

@bkardell the MDN article Neil referenced describes how the HTML

<p> aaa <span> bbb </span> xxx </p>

ends up rendering like

<p>aaa <span>bbb</span> xxx</p>

I think your point is that this is a rendering feature and the white space is in the DOM.

That's true but the reason MathML3 had a white space stripping rule is to say that

<mi> a </mi> <mo> + </mo> <mi> b </mi>

renders like

<mi>a</mi><mo>+</mo><mi>b</mi>

as it didn't distinguish DOM and rendering (much).

So I don't think there are any compatibility issues if the white space stays in the DOM and the CSS/display rules mean that almost all white space doesn't affect the math rendering. But the two MathML forms above should render the same way, just as the two HTML forms do.

@faceless2

css-text-4 proposes a new text-space-collapse: discard property which removes all whitespace within the element. The example even references MathML white space handling explicitly. In this draft, white-space becomes a shorthand property and text-space-collapse one of its components.

I don't believe this property is implemented by any browser yet, but in theory all that should be needed in the user-agent stylesheet is something like:

math { text-space-collapse: discard }
mtext { white-space: normal }

@fred-wang
Author

To clarify: In native implementations, whitespace will indeed stay in the DOM. Elements that don't allow text layout nodes will also exclude whitespace automatically, so this question is about collapsing/trimming for elements that use text nodes for rendering purposes.

All implementations rely on existing CSS behavior and, although it may depend on which kind of CSS boxes are used, I can definitely say that none of them implements MathML3's behavior by default and that Gecko needs extra work that has caused security issues in the past.

When parsing the content (e.g. the mo text for accessing a single glyph for stretching or reading the operator dictionary) it is however easy to remove the whitespace. WebKit does this but not the former point, which leads to inconsistency.
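
As a rough illustration of the "easy" part described above, the sketch below trims the text of an `mo` before checking whether it is a single glyph and looking it up. Everything named here is hypothetical: `OPERATOR_DICTIONARY` is a two-entry stand-in for the real operator dictionary, and `lookupOperator` is not any engine's actual function.

```javascript
// Hypothetical, heavily abridged stand-in for the MathML operator dictionary.
const OPERATOR_DICTIONARY = new Map([
  ["(", { form: "prefix", stretchy: true }],
  ["+", { form: "infix", stretchy: false }],
]);

// Collapse runs of MathML whitespace and drop leading/trailing space.
function operatorContent(moText) {
  return moText.replace(/[ \t\n\r]+/g, " ").replace(/^ | $/g, "");
}

// Look up the trimmed content; the raw text node itself is left untouched.
function lookupOperator(moText) {
  const content = operatorContent(moText);
  return {
    content,
    // Array.from iterates by code point, so astral characters count once.
    isSingleCodePoint: Array.from(content).length === 1,
    entry: OPERATOR_DICTIONARY.get(content) || null,
  };
}

console.log(lookupOperator(" ( "));
```

Doing only this step, without also trimming what gets painted, is what produces the inconsistency mentioned for WebKit: the operator is recognized as `(` while the surrounding spaces may still take part in text layout.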

AFAIK whitespace trimming in Chromium's MathML implementation has the same limitation as WebKit.

The proposal about relying on the existing CSS white-space property is interesting, but indeed the implementation and CSS WG status seem unclear.

So I don't think the CG is in a position to decide anything here. We can't just rely on an MDN article explaining whitespace trimming to say we should do that. As I keep repeating, we need to know more about the implementation before adding something in MathML Core. Also, this is very low priority IMHO.

@fred-wang
Author

Adding "need polyfill" as it should be easy to implement this in JS.
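
One possible shape for such a polyfill is sketched below. It is not an agreed design: the list of token elements, the restriction to tokens whose children are all text nodes, and the name `normalizeTokenWhitespace` are assumptions made for illustration. The function only relies on the standard DOM fields `nodeType`, `localName`, `childNodes`, and `data`.

```javascript
// Assumption: the token elements whose text content should be normalized.
const TOKEN_ELEMENTS = new Set(["mi", "mn", "mo", "mtext", "ms"]);
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
const TEXT_NODE = 3;    // same value as Node.TEXT_NODE

// Collapse runs of MathML whitespace and drop leading/trailing space.
function trimTokenText(text) {
  return text.replace(/[ \t\n\r]+/g, " ").replace(/^ | $/g, "");
}

// Walk a subtree and rewrite the text of token elements in place.
// Tokens with element children are skipped here and simply recursed into.
function normalizeTokenWhitespace(node) {
  const children = Array.from(node.childNodes || []);
  const isPlainToken =
    node.nodeType === ELEMENT_NODE &&
    TOKEN_ELEMENTS.has(node.localName) &&
    children.length > 0 &&
    children.every((child) => child.nodeType === TEXT_NODE);

  if (isPlainToken) {
    // Put the normalized text in the first text node and empty the rest.
    const text = trimTokenText(children.map((child) => child.data).join(""));
    children.forEach((child, index) => {
      child.data = index === 0 ? text : "";
    });
    return;
  }
  children.forEach(normalizeTokenWhitespace);
}
```

In a page this could be applied with `document.querySelectorAll("math").forEach(normalizeTokenWhitespace)`. Unlike a native implementation, where whitespace stays in the DOM, this approach rewrites the text nodes, so it is a transition aid rather than a model of the intended behavior.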

fred-wang added the need polyfill label on May 19, 2020
@davidcarlisle
Collaborator

@fred-wang Accepted that there have to be implementation details, but I think it is reasonable to push back on the comment with which you opened this issue:

Removing "whitespace at the beginning and end of the content" is less expected.

From an end-user perspective the white space handling of MathML <mi> is like that of HTML <p> or <td>: internal runs of white space act like a single space and white space at either end has no effect.

So it may be low priority and it may not make the cut for a first version of MathML Core, but I do not think it is unexpected by users or unnatural in a web context.

fred-wang changed the title from "Simplify whitespace collapsing/trimming for text nodes" to "Simplify whitespace collapsing/trimming for text nodes ; Support in Core?" on May 22, 2020
fred-wang removed the need resolution label (Issues needing resolution at MathML Refresh CG meeting) on Aug 12, 2020