[Discussion] A proposal for disentangling markup patterns from semantics for accessibility #217
Comments
Here is simultaneously a report on notation used in two The two books are: I will write everything in LaTeX notation.
APEX writes vectors as italic letters with an arrow above.
- |x| -- APEX: absolute value of x
- a^b -- exponent
- \left( \frac{a}{b} \right) -- a fraction in parentheses
- (a, b) -- could mean the open interval a < x < b
- [a, b] -- the closed interval a \le x \le b
- a \cdot b -- single variable calculus: ordinary multiplication
- a \times b -- single variable calculus: ordinary multiplication
- \Delta y -- change in y
- \overline{x}_i -- midpoint of the ith interval
- \overline{PQ} -- line segment from P to Q
- \overrightarrow{PQ} -- vector from point P to point Q
- \{ a_n \} -- the sequence a_n
- P(a, b) -- the point with polar coordinates r=a and \theta=b
- \lVert x \rVert -- the length of the segment x
- a \parallel b -- vector a is parallel to vector b
- \langle a, b \rangle -- the vector with coordinates a, b
- [ a ] -- the 1\by1 matrix with entry a |
@davidfarmer: were the only two differences between the books the first two you mention? The rest are calc notations used by both? |
I've moved this material to a github page at |
which appears as https://mathml-refresh.github.io/mathml/docs/layout-semantics in github pages. |
I completely agree with your concern about numbered arguments being a
slippery slope and also one that could be solved using xpath or CSS
selectors. However, saying those can be used is like saying we could use a
backhoe to accomplish something that just requires a hand shovel. Using one
of them would unify the named argument and numbered arguments in my
proposal and maybe allow some things to be done more easily, so there are
some arguments in favor of using one of them (probably xpath).
As for your proposal, it has the nice feature of only needing to tag the
operator in many cases. However, I believe you need to expand your
statement:
> By default (unless over-ridden by arg= attributes) the arguments of the
> operator consist of the children of the element, in order, but ignoring
> <mo>, <mspace>
Other things that tweak formatting like <mstyle>, <mphantom>, and <mpadded>
may need to appear in the list, along with possibly <mrow> that has a
single child not in the previous list of formatting tweaking elements.
Embellished operators might be problematic also (e.g., +_5 for addition mod
5) because some parts of the embellished operator may want to be arguments.
Maybe the use of formatting-tweaking elements is rare enough that the
solution is simply to say that @arg needs to be used. If so, I'd go with a
simpler definition and just say <mo> is skipped in that case.
The benefit of just naming the operator in many cases has an analogy in my
proposal if we go with the "skip <mo>s" and add something like making "@@"
(my notation for nary args) with nothing following mean "take all non <mo>
args". For example, notation="power(@@)" for a default meaning of <msup>.
With xpath, that's something that can be done easily also.
Comparing the two proposals with this modification:
Mine original:
<mrow notation="factorial(@0)">
<mi>a</mi>
<mo>!</mo>
</mrow>
Mine modified:
<mrow notation="factorial(@@)">
<mi>a</mi>
<mo>!</mo>
</mrow>
Yours:
<mrow notation="operator-args">
<mi>a</mi>
<mo operator="factorial">!</mo>
</mrow>
And for the other notation value you propose:
Mine original:
<msup notation="applicative-power(@0, @1)">
<mi>sin</mi>
<mn>-1</mn>
</msup>
Mine modified:
<msup notation="applicative-power(@@)">
<mi>sin</mi>
<mn>-1</mn>
</msup>
Yours:
<msup notation="args" operator="applicative-power">
<mi>sin</mi>
<mn>-1</mn>
</msup>
On the surface, this simplification seems like it would work most of the
time, but I'm a bit dubious. The open interval is one example where it does
not. You wrote it as "mrow( ] a , b [)", but "proper" mrow structure is "mrow(]
mrow(a , b) [)", so the args would need explicit tagging. Given that mrows
with prefix/infix/postfix operators need proper structure, one should
probably assume that for bracketed notations. With a quick count (I may be
off by one or two), the examples in the table Bruce gave (and I copied)
require explicitly tagging 13 of the 28 assuming proper mrow structure; 9
of 28 with the fenced notations coded as in the table. Of course, the ones
in the table are probably less common notations than those in a standard
web page, but still...
Your handling of "a+b-c+d" has a fair bit of hand waving in it as you
acknowledge; mine does also. It's an area that requires more thought no
matter what proposal we go with.
In the end, I think our proposals are roughly similar if I drop numbered
arguments and add "@@" with no args. There is one area where I think mine
is more powerful -- you can specify implicit function arguments in mine.
For something like derivatives, that allows one to unify first and higher
derivatives into the same function:
D(f) -- notation="newton-derivative(@f, 1)"
D^2(f) -- notation="newton-derivative(@f, @n)"
It's not a big deal, and I don't know how often this happens outside of
derivative notation, but it seems like something that might be useful.
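To make the implicit-argument idea concrete, here is a minimal Python sketch of how a consumer might split such a notation value into a function name and its arguments. The function `parse_notation` and the "ref"/"literal" labels are hypothetical names used only for illustration; the sketch assumes a flat, comma-separated argument list and does not handle the bracketed or "@@" forms.

```python
import re

def parse_notation(value):
    """Split 'name(arg, ...)' into the function name and a list of (kind, text) pairs."""
    match = re.fullmatch(r"\s*([\w-]+)\s*\((.*)\)\s*", value)
    if match is None:
        raise ValueError(f"not a function-style notation: {value!r}")
    name, body = match.groups()
    args = []
    for token in (part.strip() for part in body.split(",")):
        if not token:
            continue
        # '@x' refers to the element tagged x; anything else is a literal value.
        kind = "ref" if token.startswith("@") else "literal"
        args.append((kind, token.lstrip("@")))
    return name, args

print(parse_notation("newton-derivative(@f, 1)"))
print(parse_notation("newton-derivative(@f, @n)"))
```

Under these assumptions both D(f) and D^2(f) resolve to the same function name, and only the second argument differs: a literal 1 in the first case and a reference to the element tagged n in the second.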
For completeness, here's a comparison where arguments need to be labelled:
Mine with numbered attrs:
<msup notation="transpose(@0)">
<mi>A</mi>
<mi>T</mi>
</msup>
Mine with named attrs:
<msup notation="transpose(@arg)">
<mi arg="arg">A</mi>
<mi>T</mi>
</msup>
Yours:
<msup notation="operator-args">
<mi>A</mi>
<mi operator="transpose">T</mi>
</msup>
The numbered one is clearly the least work to write, but as you said, is
more fragile to changes to the MathML. The second and third ones are about
the same amount of work to do, but *I* think my version is easier to
understand.
Neil
On Fri, Jun 19, 2020 at 4:09 PM David Carlisle wrote:
Note @NSoiffer has a modified version of
@brucemiller's proposal at
https://mathml-refresh.github.io/mathml/docs/function-semantics
Neil's version has the advantage of not requiring a fixed enumeration of
notation= layout schema. However I do not think we should rely so heavily
on counting of child elements. If we do count it should be 1-based not
0-based (both xpath and CSS selectors are 1 based) but also once you get
beyond basic child elements it means specifying a third "competing" query
construct for DOM trees
and I think it would be better to avoid that.
Exposing the element nesting ***@***.***@0 ) is tricky as it means that you may
have to remove (or expose by counting) redundant <mrow>. Previously an
mrow that wraps a single child
was always allowed and had no effect. When generating Presentation MathML
(e.g. from Content) it is very natural to generate, for example,
<mfrac>
<mrow> *recursively transform1* </mrow>
<mrow> *recursively transform1* </mrow>
</mfrac>
This ensures the mfrac is well-formed, but if the recursive transformations
just produce <mn>1</mn> and <mn>2</mn> then the <mrow>s are redundant.
Depending on the transformation technology, it isn't always convenient to do
a second pass to remove them.
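For reference, the kind of second pass described here could look like the following Python sketch. It is only an illustration under assumptions: the function name `unwrap_single_child_mrows` is hypothetical, and the sketch only unwraps an mrow that has no attributes, since an mrow carrying attributes might not be redundant.

```python
import xml.etree.ElementTree as ET

def unwrap_single_child_mrows(element):
    """Replace each attribute-free <mrow> that wraps exactly one child with that child."""
    for index, child in enumerate(list(element)):
        # Clean the subtree first, so nested wrappers collapse from the inside out.
        unwrap_single_child_mrows(child)
        while child.tag == "mrow" and len(child) == 1 and not child.attrib:
            child = child[0]
        element[index] = child
    return element

frac = ET.fromstring(
    "<mfrac><mrow><mn>1</mn></mrow><mrow><mn>2</mn></mrow></mfrac>"
)
unwrap_single_child_mrows(frac)
print(ET.tostring(frac, encoding="unicode"))
```

A path-counting scheme would see different child positions before and after this pass, which is the fragility being discussed.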
I think it may be possible to combine the two proposals with the following
(not fully baked) form.
notation as in Bruce's proposal.
The possible values would be fixed, possibly the list as in Bruce's
document, or possibly a shorter list, but extended with a new possibility
of overriding the default determination of the arguments, so the notation
forms can be used with any layout.
Each notation/layout schema would specify the *default* position of the
main operator and arguments.
These could be overridden with operator and arg attributes.
operator takes an operator name, and is essentially a renamed meaning
from Bruce's proposal.
arg takes an integer value. If it is used, then the collection of
descendant arg attributes (ignoring any in nested notation subtrees) should
produce the integer range 1...n for some n, and specify the arguments of
the operator.
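A rough Python sketch of how a consumer might apply this rule follows. The helper names `scoped_arg_elements` and `ordered_args` are hypothetical, and the sketch assumes that "ignoring any in nested notation subtrees" means the search does not descend into a child that carries its own notation attribute.

```python
import xml.etree.ElementTree as ET

def scoped_arg_elements(element):
    """Collect descendants that carry `arg`, without entering nested notation subtrees."""
    found = []
    for child in element:
        if "arg" in child.attrib:
            found.append(child)
        if "notation" not in child.attrib:
            found.extend(scoped_arg_elements(child))
    return found

def ordered_args(element):
    """Return the explicitly numbered arguments in order, or raise if they are not 1..n."""
    tagged = sorted(scoped_arg_elements(element), key=lambda e: int(e.get("arg")))
    numbers = [int(e.get("arg")) for e in tagged]
    if numbers != list(range(1, len(numbers) + 1)):
        raise ValueError(f"arg values {numbers} do not form the range 1..n")
    return tagged

# A binomial written with the arguments in reversed order.
binomial = ET.fromstring(
    '<msubsup notation="operator-args">'
    '<mi operator="binomial">C</mi>'
    '<mi arg="2">m</mi>'
    '<mi arg="1">n</mi>'
    '</msubsup>'
)
print([e.text for e in ordered_args(binomial)])  # ['n', 'm']
```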
So binomial might be
<msubsup notation="operator-args">
<mi operator="binomial">C</mi>
<mi>m</mi>
<mi>n</mi>
</msubsup>
or for a reversed convention
<msubsup notation="operator-args">
<mi operator="binomial">C</mi>
<mi arg="2">m</mi>
<mi arg="1">n</mi>
</msubsup>
sup and sub forms (and mroot and mfrac and 2-child mrow) could, I think, be
combined. Basically, in Bruce's document notation=sup is for a
two-argument operation (power, or specified elsewhere) and sup-operator
is a 1-argument operation specified by the second child. So here rename
these to
args and operator-args
By default (unless over-ridden by arg= attributes) the arguments of the
operator consist of the children of the element, in order, but ignoring
<mo>, <mspace>
so transpose
<msup notation="operator-args">
<mi>A</mi>
<mi operator="transpose">T</mi>
</msup>
power
<msup notation="args" > <!-- operator="power" -->
<mi>x</mi>
<mi>n</mi>
</msup>
factorial
<mrow notation="operator-args">
<mi>a</mi>
<mo operator="factorial">!</mo>
</mrow>
for an element with notation=args the operator should be specified on the
element (rather than on a child) with default values power on <msup>,
division on mfrac , root on ```
the nth derivative case could be marked as
<msup notation="args" operator="derivative-implicit-variable">
<mi>f</mi>
<mrow>
<mo>(</mo>
<mi>n</mi>
<mo>)</mo>
</mrow>
</msup>
with meaning derivative(f,(n)) with the parens around n being taken as
part of the value,
but, especially for notations using fancier decoration, you could use
the arg attributes to explicitly ignore the syntax decoration:
<msup notation="args" operator="derivative-implicit-variable">
<mi arg="1">f</mi>
<mrow>
<mo>(</mo>
<mi arg="2">n</mi>
<mo>)</mo>
</mrow>
</msup>
with meaning derivative(f,n) with the parens around n being ignored,
For binary infix, e.g. dot product, this leads to something closer to Bruce's
form (no @ counting)
<mrow notation="infix">
<mi mathvariant="bold">a</mi>
<mo operator="inner-product">⋅</mo>
<mi mathvariant="bold">b</mi>
</mrow>
Arguably infix isn't needed here and you could use operator-args still,
but maybe that's a simplification too far.
But I think we should distinguish the multiple operator case where the
operators are being combined (with missing mrow) using implied precedence
or associativity rules from the case where it is really an n-ary operator
like (plus... just being written as repeated infix by convention.
Something like
<mrow notation="infix" operator="plus">
<mi>a</mi>
<mo>+</mo>
<mi>b</mi>
<mo>-</mo>
<mi>c</mi>
<mo>+</mo>
<mi>d</mi>
</mrow>
with meaning (plus a b (minus c) d)
and
<mrow notation="infix">
<mi>a</mi>
<mo operator="plus">+</mo>
<mi>b</mi>
<mo operator="minus">-</mo>
<mi>c</mi>
<mo operator="plus">+</mo>
<mi>d</mi>
</mrow>
with meaning a+b-c+d with some implicit disambiguation rules (👋👋)
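As an illustration of the first reading, where the whole row is one n-ary plus, the following Python sketch produces the prefix form (plus a b (minus c) d) from the row's children. The function name `nary_plus` is hypothetical, and the sketch assumes that every operand is a leaf token and that each minus sign negates only the operand that follows it.

```python
import xml.etree.ElementTree as ET

MINUS_SIGNS = {"-", "\u2212"}  # ASCII hyphen-minus and U+2212 MINUS SIGN

def nary_plus(mrow):
    """Render an infix row of + and - as a single n-ary plus in prefix form."""
    terms = []
    negate_next = False
    for child in mrow:
        if child.tag == "mo":
            # Remember whether the operand that follows should be negated.
            negate_next = (child.text or "").strip() in MINUS_SIGNS
        else:
            name = (child.text or "").strip()
            terms.append(f"(minus {name})" if negate_next else name)
            negate_next = False
    return "(plus " + " ".join(terms) + ")"

row = ET.fromstring(
    '<mrow notation="infix" operator="plus">'
    "<mi>a</mi><mo>+</mo><mi>b</mi><mo>-</mo><mi>c</mi><mo>+</mo><mi>d</mi>"
    "</mrow>"
)
print(nary_plus(row))  # (plus a b (minus c) d)
```

The second reading, with an operator attribute on every mo, would still need precedence and associativity rules that this sketch does not attempt.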
Intervals would just need notation=args operator=open interval eg not ***@***.***,
@EnD)"
<mrow notation="args" operator="open-interval">
<mo>]</mo>
<mi>a</mi>
<mo>,</mo>
<mi>b</mi>
<mo>[</mo>
</mrow>
as the ] , [ mo elements would be ignored by the default rule for the args
layout.
Pochhammer could use a named fenced-sub layout as in Bruce's document
(with meaning= replaced by operator=), but if this layout is not thought
sufficiently common, one could use args but would need arg=
attributes to get inside the mrow, so
<msub notation="args" operator="Pochhammer">
<mrow>
<mo>(</mo>
<mi arg="1">a</mi>
<mo>)</mo>
</mrow>
<mi arg="2">n</mi>
</msub>
In fact I now realise there is no need to have a general scheme at all,
the args and operator-args cover the general case if you add one
operator= attribute and enough arg= attributes to specify the mapping to
prefix apply form. (at this point I re-wrote most of the above:-)
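A minimal Python sketch of that mapping to prefix apply form, using only the default argument rule, might look as follows. The function name `to_prefix` is hypothetical. The sketch assumes that a child carrying the operator attribute is not itself an argument, and it leaves out explicit arg= attributes and the per-element default operators (such as power on msup).

```python
import xml.etree.ElementTree as ET

# Elements ignored by the default rule quoted above.
SKIPPED = {"mo", "mspace"}

def to_prefix(element):
    """Render an element marked notation="args" or "operator-args" as operator(arg, ...)."""
    # With "args" the operator sits on the element itself; with
    # "operator-args" it sits on one of the children.
    operator = element.get("operator")
    if operator is None:
        operator = next(
            (child.get("operator") for child in element if "operator" in child.attrib),
            None,
        )
    if operator is None:
        raise ValueError("no operator attribute found")
    args = [
        "".join(child.itertext()).strip()
        for child in element
        if child.tag not in SKIPPED and "operator" not in child.attrib
    ]
    return f"{operator}({', '.join(args)})"

transpose = ET.fromstring(
    '<msup notation="operator-args"><mi>A</mi><mi operator="transpose">T</mi></msup>'
)
interval = ET.fromstring(
    '<mrow notation="args" operator="open-interval">'
    "<mo>]</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo>[</mo></mrow>"
)
print(to_prefix(transpose))  # transpose(A)
print(to_prefix(interval))   # open-interval(a, b)
```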
|
FYI: I updated my proposal
<https://mathml-refresh.github.io/mathml/docs/function-semantics> to
include some of David's comments along with the "@@" with no args notation.
Neil
|
Some preliminary thoughts on @davidcarlisle's suggestions. There's a lot to like -- and dislike -- here. I think rather than replace "meaning" by "operator", you'd want to keep "meaning" (by whatever name), and let
I'm concerned about whether the presence of a "notation" attribute properly scopes the operator/arg attributes; in a complexly nested expression, will it be completely clear which operator/arg belong to which notation? (perhaps)
It's kinda cool having a generic notation (although I'm puzzled by the term "args"), so that you don't have to use any other notation keywords, and conversely that you can override one or all of a notation's default positions.
But this basically puts a lot of work on the agent consuming this: for every node with a notation, it has to search all children for (in scope) operator and arg attributes, which presumably wouldn't be present very often. |
I'm pretty sure that the algorithm I gave in my proposal for how one finds
the "arg" attr (don't look inside elements with arg/notation attrs) works
for David's proposal also, so the search time is pretty trivial.
I realized last night that I left off a use case for accessibility --
synchronized highlighting of speech and text. To do this, one needs to know
the elements to highlight. Those elements often include both the operator
and operands. E.g, notation="factorial(@n)" needs to highlight both the
element referenced by "@n" and the "!". In my proposal, it is possible to
find the "!" because it is an <mo>, but there are likely notations where
there are multiple <mo>s and some shouldn't be read and some where there
are no <mo>s. One example of the latter is A^T. I believe David's proposal
handles this.
To deal with this problem in my proposal, I am considering modifying it so
that the first argument of the function is the operator (list of operators
for n-ary operators). It could be empty for functions like <msup
notation=power([], @base,@exp)>...</msup>. This solution also addresses
Bruce's concern about complicated operators. It also potentially resolves
the problems with "a+b-c+d". That would have markup like:
notation="plus([@op1, @op2, @op3], [@arg1, @arg2, @arg3, @arg4])"
or using the (still fluid) "@@" notation
notation="plus([@@], [@@])"
where "@@" in the operator position gathers up all the <mo> direct children
and "@@" in the operand position gathers up all the non-<mo> direct
children.
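The two meanings of "@@" described here can be sketched in a few lines of Python. The function name `split_operators_and_operands` is hypothetical, and the sketch assumes, as the text says, that only direct children are considered.

```python
import xml.etree.ElementTree as ET

def split_operators_and_operands(element):
    """Return (<mo> direct children, non-<mo> direct children), each in document order."""
    operators = [child for child in element if child.tag == "mo"]
    operands = [child for child in element if child.tag != "mo"]
    return operators, operands

expr = ET.fromstring(
    "<mrow><mi>a</mi><mo>+</mo><mi>b</mi><mo>-</mo><mi>c</mi><mo>+</mo><mi>d</mi></mrow>"
)
ops, operands = split_operators_and_operands(expr)
print([o.text for o in ops])       # ['+', '-', '+']
print([a.text for a in operands])  # ['a', 'b', 'c', 'd']
```

Because the function returns the elements themselves rather than their text, a speech tool could highlight each operator element individually, which is the synchronized-highlighting use case mentioned above.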
For most nary operators, the list of operators would be different *elements*
but would all share the same *text content*. For "plus", you could have a
"-" text content; for "times", there also might be a mix of various times
operators including &InvisibleTimes;. Speech probably cares about these
differences (and definitely needs to be able to point to every "+" element,
etc.), but a translation to content MathML wouldn't care in the times case,
although it would need to deal with the +/- difference in its conversion.
In David's proposal, if all the operators are tagged (or are implicit),
then the two proposals have similar functionality in this respect.
Neil
|
@brucemiller it's a bit of a stretch to call mine a "proposal". I deliberately wrote it as a comment here rather than a new draft doc or a PR on one of the two existing docs, as it was supposed to be just a comment. But it got long, then I copied in code examples from the existing drafts and it got longer and, crucially, I completely changed my idea halfway through writing the comment, as I realised the general form was possibly not too bad and so could replace many of your named forms rather than just being an additional one for special cases. So naming and all details are not fully baked. I think the one-line version of my comment would be: I don't think we should introduce a new selector syntax and I don't think we should use numeric element counting, but I do like from @NSoiffer's proposal that the mechanism is open-ended and doesn't involve enumerating so many layout schemes, so I was trying to do a merge of the two. |
I suppose I can point to a preliminary https://mathml-refresh.github.io/mathml/docs/semantics-mini, although we're still working out the kinks and adding examples. |
@bruce: your doc should address David C's comment "However I do not think we
should rely so heavily on counting of child elements." I do agree with him
that numbering is fragile (see his comments earlier in the thread and in my
document for some reasons why). If you don't agree, then I'd like to see
your reasoning why.
Thanks,
Neil
On Mon, Jun 22, 2020 at 3:46 PM bruce miller wrote:
I suppose I can point to a preliminary
https://mathml-refresh.github.io/mathml/docs/semantics-mini,
although we're still working out the kinks and adding examples.
|
Quick comment: I see the suggestion has evolved to contain the named arguments also in the notation root. If you next rename the attribute "arg" to "id", and avoid value clashes globally, you arrive at the id-based part of our selector draft. That said, the current intuition of the "arg" approach is to give the same argument of the same notation the same value, so you are also really close to the HTML "class" attribute in function. But most importantly, the moment you have annotations at both the root of a notation and the argument nodes, you are functionally equivalent to our id-pointing scheme. And that seems to be getting closer to a consensus position? As to our selector approach, the rest of the child-counting-selection was needed to 1) stay open-ended while 2) still provide a vocabulary of standard notation names. If you want to include e.g. a standard "binomial" in the specification, which assumes reasonable (unannotated) children to include as arguments, you end up having to formalize that relationship. Which, as ugly as it is, looks like a descendant-counting-path selector in the general case. So once you concede that you're introducing names with fixed expectations for positional arguments, might as well expose that capability to document authors, so that they also cover currently out-of-scope syntax. |
@NSoiffer: I've somewhat addressed the issue you raised, and changed a few examples to use ids rather than paths. Feel free to point the discussion group to it. |
I think we are converging towards a solution. At the moment, we have the
following on the table:
https://mathml-refresh.github.io/mathml/docs/semantics-mini
https://mathml-refresh.github.io/mathml/docs/function-semantics
and David C's "not a proposal" (notation="operator-args"/"args";
operator="...")
An area where we are all trying to figure out the right thing is
referencing the children. The following seem to be things that need
discussion/resolving:
- paths
- ids
- "arg" (or "class" or "name") attribute
- nary operators, especially +/- and maybe times and a few others where
symbols are mixed or missing (e.g., "2·3ay")
XPath has been suggested at times as a standard for referencing the
arguments. I definitely agree we shouldn't make up our own method if there
already are existing standards out there. I started to do an experiment
where I showed the XPath equivalent for the examples in my proposal. The
paths (@1@2, etc) trivially convert to XPath, as will using 'id's. I
couldn't figure out how to do the named arguments though, and David C
confirmed they would be a mess. The problem is that you can't simply select
on the "arg" attribute because when a notation is nested (which most will
be), you will gather up all the @arg attributes from all the children. Of
course, what you really want to do is select on 'arg="base"', but if a MathML
generator always uses "base" for powers, then anytime you have a nested
power, you run into the problem of picking up too many children. You need
to stop when you (using the names from my proposal) hit @arg or @notation.
That apparently can be done in XPath, but it is complicated. That
complication can be reduced some by using user-defined functions, but
user-defined functions bring in more complications. If we want to go with
some named args, I don't think XPath is a viable solution.
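To make the "stop at @arg or @notation" rule concrete, here is a minimal Python sketch of the lookup outside XPath. It is only an illustration: the attribute names `arg` and `notation` follow the names used above, while the values `power`, `base` and `exponent`, the sample markup, and the helper `find_args` are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for (x^2)^3: a power nested inside a power.
SAMPLE = """
<msup notation="power">
  <msup arg="base" notation="power">
    <mi arg="base">x</mi>
    <mn arg="exponent">2</mn>
  </msup>
  <mn arg="exponent">3</mn>
</msup>
"""

def find_args(notation_root):
    """Collect only the arguments that belong to notation_root itself."""
    found = {}
    stack = list(notation_root)[::-1]        # children, first child on top
    while stack:
        node = stack.pop()
        name = node.get("arg")
        if name is not None:
            found.setdefault(name, node)     # keep the first match in document order
            continue                         # do not look inside an argument
        if node.get("notation") is not None:
            continue                         # a nested notation owns its own arguments
        stack.extend(list(node)[::-1])       # unmarked wrapper: keep descending
    return found

root = ET.fromstring(SAMPLE)

# Naive selection on arg="base" picks up the nested base as well.
naive = [e.tag for e in root.iter() if e.get("arg") == "base"]

# The bounded walk returns one node per argument name.
args = find_args(root)
```

In this sample the naive selection returns both the inner `msup` and the `mi`, while `find_args` returns the inner `msup` as the base and the `mn` holding 3 as the exponent.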
Looking at the options, I think the following summarizes what people have
said are problems:
- paths -- fragile, especially because of MathML's implicit mrow rule and
equivalence rules involving <mrow>
- ids -- need to be unique across the entire document, so (for example)
every instance of D^2 must be different due to different ids. Makes reusing
content painful.
- "arg" (or "class" or "name") attribute -- won't work with xpath (and
other standards) well -- hence, not standard
- nary operators -- haven't really been thought through
I have a specific comment on a new feature that Bruce/Deyan's proposal
introduced: a literal in the semantic. That can be useful to generalize
semantics where a default value might be valuable (e.g., '1' for an
nth-derivative or '2' for root). However in this example
<msup semantic="transpose(A)">
<mi>A</mi>
<mi>T</mi>
</msup>
using 'A' would be bad for AT that wants to do sync highlighting because
there is no longer a reference to the node for 'A', so it couldn't be
highlighted.
If I missed something in this summary of the proposals, please chime in on
this email thread or on the call on Thursday.
Neil
|
Thanks Neil for this summary. I've had a schedule conflict come up for
today's meeting, so with my regrets, I'd like to respond to Neil's analysis.
I agree that we would like to avoid xpaths and ids for the reasons Neil
gives. The arg attribute looks best to me, especially since in almost all
cases, we should only need to identify the arguments within an enclosing
expression, and they will appear in the same order as they would appear in
the functional notation. So we can avoid arg(1) arg(2) ... and so on, and
simply use arg(), and the notation can refer to them in order. The most
common exceptions would be for sum/product/integral/derivative/partial
derivative, and a very few others, which could fall back to use the
numbered arguments. Finding the arguments is trivial if they are all
marked, and not much harder if we are clear about what presentation forms
imply the location of an argument (mi, mn, others?). The cost of the tree
traversal is small, since each notation just doesn't use that many
presentation elements to render its part of the tree before it gets to a
subexpression.
Apologies if any of this is unclear, I can plan to follow up with more
details as needed.
I agree with Neil that we are converging toward a solution, and I am
encouraged at the progress represented by these new ideas.
Sam
|
Thanks for the summary Neil.
I think the basic idea of my "extended comment" (which I'm not sure can be
made to work) was to not support numeric paths at all.
Using arg= everywhere works but isn't that friendly so the idea was to use
the "named layouts" from Bruce's original document
as a kind of fixed selection paths for the common cases.
So for example if you know you are using `stacked-fenced` then you don't
need to mark the two arguments with arg= and you don't need to use a
fragile *[1]/*[2] or @0@1 numeric path to reference them, as you can simply
ignore the mo for the fences, step inside the mfrac or mtable providing the
stacking structure, and ignore any surplus mrows in the child structure.
So you'd only need to mark elements denoting the arguments in the "general"
case that did not fit one of the named forms.
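
As a rough illustration of how a named layout could act as a fixed selection path, the following Python sketch extracts the arguments of a `stacked-fenced` layout by skipping the fence `mo` elements, stepping into the `mfrac` or `mtable`, and unwrapping surplus `mrow`s. The helper names and the sample markup are hypothetical, and the `mtable` branch assumes one cell per row.

```python
import xml.etree.ElementTree as ET

def unwrap_mrows(node):
    """Skip surplus mrow wrappers that hold a single child."""
    while node.tag == "mrow" and len(node) == 1:
        node = node[0]
    return node

def stacked_fenced_args(node):
    """Return the stacked arguments of a fenced mfrac or mtable layout."""
    inner = [child for child in node if child.tag != "mo"]   # ignore the fences
    if len(inner) != 1:
        raise ValueError("expected exactly one stacking element between the fences")
    stack = unwrap_mrows(inner[0])
    if stack.tag == "mfrac":
        return [unwrap_mrows(child) for child in stack]
    if stack.tag == "mtable":
        # One argument per row; assumes a single mtd with one child in each mtr.
        return [unwrap_mrows(row[0][0]) for row in stack]
    raise ValueError("no mfrac or mtable found inside the fences")

# Hypothetical binomial markup with redundant mrows around the stack and n.
BINOMIAL = """
<mrow>
  <mo>(</mo>
  <mrow><mfrac linethickness="0"><mrow><mi>n</mi></mrow><mi>m</mi></mfrac></mrow>
  <mo>)</mo>
</mrow>
"""

binomial_args = [a.text for a in stacked_fenced_args(ET.fromstring(BINOMIAL))]
```

No arg= marks or numeric paths are needed here; the layout name alone tells the walker where to look.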
|
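I still find myself today wishing that we minimize the learning curve, and domain-specific novelties we introduce, in the annotation scheme. The simple references and function-call style annotations achieve that nicely, since #op(#1,#2,#3) is a form anyone can quickly learn and use. Meanwhile, David's statement of:
> if you know you are using stacked-fenced

goes in the other direction. It assumes annotators will spend a bit of time training themselves into spotting the various notation patterns, from a list of pre-defined patterns the specification offers, and then use them judiciously. That is certainly going to add difficulty to becoming an annotator.
Conversely, given a presentation MathML tree, annotated with this minimal operator structure annotation #op(#1,#2), it is easy to do a tree walk which determines the notation which has been used. Non-exhaustive examples:

| notation | child pattern | required parent |
|---|---|---|
| prefix | siblings annotated: #op #1 | mrow |
| postfix | siblings annotated: #1 #op | mrow |
| binary infix | siblings annotated: #1 #op #2 | mrow |
| n-ary infix | sibling annotated #1 #op #2 [unannotated sibling(s)] #3 [unannotated sibling(s)] ... | mrow |
| n-ary prefix | sibling annotated #op #1 [unannotated sibling(s)] #2 [unannotated sibling(s)] #3 [unannotated sibling(s)] ... | mrow |
| n-ary postfix | sibling annotated #1 [unannotated sibling(s)] #2 [unannotated sibling(s)] #3 [unannotated sibling(s)] ... #op | mrow |
| scripted operator | siblings annotated #1 #op | msup, msub |
| scripted implied op | siblings annotated #1 #2 | msup or msub, semantic someliteral(#1,#2) |
| full scripted | all siblings have arg annotations | msubsup |
| fenced | unannotated open/close fence as first/last child | mrow |
| stacked fenced | siblings <mo>(</mo> <mfrac>...</mfrac> <mo>)</mo> | mrow |
| piecewise | siblings <mo>{</mo><mtable>...</mtable> | mrow |
| atom | semantic attribute only contains a literal | any |
| ... | | |

Where all references such as #op, #1, #2 are used for the sake of example. I am quite open to leaving those open-ended for the convenience of the annotator (#bvar, #denominator etc.), or since Neil expressed a preference to simplify further - to only permit #op and consecutive natural numbers as values. That part should be workable either way.
My main intuition here is that it is a lot easier for an implementer who uses the spec to infer the patterns from the tree, than for an annotator to both predict the final MathML tree (they could be authoring in TeX or Word, using a transformer to MathML), and then to learn the domain-specific terminology needed to annotate it. Worst of all, we cannot realistically expect to capture all notations in the specification, as they are open-ended and too many. I added a piecewise row in my examples to throw in something we've never talked about but we all knew about, and that's a smaller set than the notations we don't even know about yet.
Edit: I missed mentioning another pragmatic point. Since a11y software developers cannot expect all the MathML their users consume to be perfectly annotated, it is realistic that they will retain code which will process classic presentation MathML, with no annotations. That code will have the same task of inferring the notation and its fixity, but would have to do more guessing as it doesn't know the (semantic) operator tree in advance. So I can imagine a generalized software component that can extract a "notation-used" via a pMML tree walk, with and without semantic attribute assistance. Changes would be of the kind: "we know this node is the operator" vs "can this node be the operator?". |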
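The tree walk Deyan describes, inferring the notation from the positions of annotated siblings, can be sketched in a few lines of Python. This is a partial illustration only: it assumes the references are carried by an `arg` attribute with values `op`, `1`, `2`, ... on the children (the thread leaves the exact attribute open), and it covers only a handful of the notation patterns discussed.

```python
import xml.etree.ElementTree as ET

def classify(parent):
    """Guess the notation from the order of annotated children of parent."""
    marks = [child.get("arg") for child in parent]
    annotated = [m for m in marks if m is not None]   # drop unannotated siblings
    if parent.tag in ("msup", "msub"):
        if annotated == ["1", "op"]:
            return "scripted operator"
        if annotated == ["1", "2"]:
            return "scripted implied op"
        return "unknown"
    if parent.tag != "mrow":
        return "unknown"
    if annotated == ["op", "1"]:
        return "prefix"
    if annotated == ["1", "op"]:
        return "postfix"
    if annotated == ["1", "op", "2"]:
        return "binary infix"
    # n-ary infix: #1 #op #2, then #3, #4, ... separated by unannotated siblings
    rest = [str(i) for i in range(3, len(annotated))]
    if len(annotated) > 3 and annotated[:3] == ["1", "op", "2"] and annotated[3:] == rest:
        return "n-ary infix"
    return "unknown"

# Hypothetical annotated fragments.
factorial = ET.fromstring('<mrow><mi arg="1">n</mi><mo arg="op">!</mo></mrow>')
total = ET.fromstring(
    '<mrow><mi arg="1">a</mi><mo arg="op">+</mo><mi arg="2">b</mi>'
    '<mo>+</mo><mi arg="3">c</mi></mrow>'
)
transpose = ET.fromstring('<msup><mi arg="1">A</mi><mi arg="op">T</mi></msup>')
```

With these fragments, `classify` reports postfix for the factorial, n-ary infix for the sum, and scripted operator for the transpose. The same function falls back to "unknown" when annotations are absent, which is where a heuristic guesser would take over.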
On Thu, 23 Jul 2020 at 15:37, Deyan Ginev wrote:
> I still find myself today wishing that we minimize the learning curve, and
> domain-specific novelties we introduce, in the annotation scheme. The
> simple references and function-call style annotations achieve that nicely,
> since #op(#1,#2,#3) is a form anyone can quickly learn and use.
> Meanwhile, David's statement of:
>
>     if you know you are using stacked-fenced
>
> goes in the other direction. It assumes annotators will spend a bit of
> time training themselves into spotting the various notation patterns, from
> a list of pre-defined patterns the specification offers, and then use them
> judiciously. That is certainly going to add difficulty to becoming an
> annotator.

Yes actually I'd agree with that, but at the time I was proposing a small
fixed list of layouts as opposed to proposals for a long open list of
actual mathematical functions, so I'd say it's not going in the opposite
direction, just didn't go as far in this one:-)
> Conversely, given a presentation MathML tree, annotated with this minimal
> operator structure annotation #op(#1,#2), it is easy to do a tree walk
> which determines the notation which has been used. Non-exhaustive examples:
>
> | notation | child pattern | required parent |
> |---|---|---|
> | prefix | siblings annotated: #op #1 | mrow |
> | postfix | siblings annotated: #1 #op | mrow |
> | binary infix | siblings annotated: #1 #op #2 | mrow |
> | n-ary infix | sibling annotated #1 #op #2 [unannotated sibling(s)] #3 [unannotated sibling(s)] ... | mrow |
> | n-ary prefix | sibling annotated #op #1 [unannotated sibling(s)] #2 [unannotated sibling(s)] #3 [unannotated sibling(s)] ... | mrow |
> | n-ary postfix | sibling annotated #1 [unannotated sibling(s)] #2 [unannotated sibling(s)] #3 [unannotated sibling(s)] ... #op | mrow |
> | scripted operator | siblings annotated #1 #op | msup, msub |
> | scripted implied op | siblings annotated #1 #2 | msup or msub, semantic someliteral(#1,#2) |
> | full scripted | all siblings have arg annotations | msubsup |
> | fenced | unannotated open/close fence as first/last child | mrow |
> | stacked fenced | siblings <mo>(</mo> <mfrac>...</mfrac> <mo>)</mo> | mrow |
> | piecewise | siblings <mo>{</mo><mtable>...</mtable> | mrow |
> | atom | semantic attribute only contains a literal | any |
> | ... | | |
>
> Where all references such as #op, #1, #2 are used for the sake of
> example. I am quite open to leaving those open-ended for the convenience of
> the annotator (#bvar, #denominator etc.), or since Neil expressed a
> preference to simplify further - to only permit #op and consecutive
> natural numbers as values. That part should be workable either way.
>
> My main intuition here is that it is a lot easier for an implementer who
> uses the spec to infer the patterns from the tree, than for an annotator to
> both predict the final MathML tree (they could be authoring in TeX or Word,
> using a transformer to MathML), and then to learn the domain-specific
> terminology needed to annotate it. Worst of all, we can not realistically
> expect to capture all notations in the specification, as they are
> open-ended and too many. I added a piecewise row in my examples to throw
> in something we've never talked about but we all knew about, and that's a
> smaller set than the notations we don't even know about yet.
|
My preference is for simplicity. Sam's approach of just an operator name
and numbered arguments is simple. There is no micro-syntax in the semantics
argument that implementers need to parse to know what they need to look
for. Unfortunately, based on last week's discussion, it appears that won't
work for some notations. Maybe Sam has come up with a solution for the
problematic cases so a micro-syntax can be avoided. If we need to go with
an embedded syntax, probably op(arg, ...) is reasonable, but we would be
remiss if we didn't consider other syntaxes such as JSON.
Also, there are two cases to consider as Deyan alluded to:
1. Someone who is generating semantics. They control the MathML tree
generated and can probably generate any format without much trouble.
2. Remediators who need to take markup from Word or MathType or some
non-semantics-aware TeX-to-MathML converter and add semantics. If a
proposal requires a specific form of tree to work, that makes it difficult
for them. As an example, suppose you have f(a+2b) and the generated MathML
is one flat mrow. With a micro syntax, you could add to the single mrow:
semantic = "function-apply(@f, @+(@a, times(@2, @b)))"
where I used @xxx to indicate the operators/operands (assumes no
invisible function applies or times were included in the MathML).
Probably @number would be used in real life. Maybe "@+" is just "plus".
In this example, the accessibility tool needs to do some (simple) recursive
parsing to figure this out, but the remediator doesn't need to do much
work. I think Sam's approach would require the addition of mrows, making
the life of the remediator harder. However, adding mrows would make the
life of implementers using the semantics attr easier. My initial impression
is that we should favor the remediator, but it isn't a slam dunk.
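To give a feel for how much "(simple) recursive parsing" the micro-syntax asks of an accessibility tool, here is a small Python sketch of a recursive-descent parser for the op(arg, ...) form. The grammar is an assumption inferred from the example above (a name, optionally followed by a parenthesized, comma-separated argument list); the function names and the tuple output shape are hypothetical.

```python
import re

# A token is a parenthesis, a comma, or a run of other non-space characters,
# so "@+" and "function-apply" are both single names.
TOKEN = re.compile(r"\s*([(),]|[^\s(),]+)")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_term(tokens):
    """Parse one term: a bare name, or a (head, [arguments]) tuple."""
    if not tokens:
        raise ValueError("unexpected end of input")
    head = tokens.pop(0)
    if head in ("(", ")", ","):
        raise ValueError("expected a name, found " + repr(head))
    if not tokens or tokens[0] != "(":
        return head                       # a bare reference or literal
    tokens.pop(0)                         # consume "("
    args = [parse_term(tokens)]
    while tokens and tokens[0] == ",":
        tokens.pop(0)                     # consume ","
        args.append(parse_term(tokens))
    if not tokens or tokens.pop(0) != ")":
        raise ValueError("missing closing parenthesis")
    return (head, args)

def parse_semantic(text):
    tokens = tokenize(text)
    tree = parse_term(tokens)
    if tokens:
        raise ValueError("unexpected trailing input")
    return tree

tree = parse_semantic("function-apply(@f, @+(@a, times(@2, @b)))")
# tree == ("function-apply", ["@f", ("@+", ["@a", ("times", ["@2", "@b"])])])
```

A value with no parentheses, such as a single literal name, comes back as a plain string, so the same entry point handles both the syntax-free and the functional form.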
Neil
|
Simple is good! But if a syntaxy semantic cannot be avoided in many cases, it's probably better not to switch back and forth between a syntax-free semantic and a syntax based one.
JSON is a nice way to provide a block of (quasi)structured data as a separate file or
I'm not too clear on what @+ is meant to imply in your example semantic = "function-apply(@f, @+(@a, times(@2, @b)))"; are you assuming <mo arg="+">+</mo> within the mrow?
While we can, and perhaps should, still encourage proper mrow structure, I think we'll limit usefulness if we require a very specific mrow structure. There may be other reasons besides laziness of remediators why a given structure is desirable that may not match the accessibility requirements. |
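FWIW, and I am only reminding of this for the sake of technical completeness (not my preference, and potentially eyebrow-raising), the MathML spec conceptually allows a JSON annotation for an individual formula via something akin to: <annotation encoding="application/json"> within a <semantics> parent. You could even host it externally, and link to it via src.
But I also think Neil's JSON point was something different. If I read it right, he's suggesting an alternative syntax for the value of semantic, say:
<mrow semantic='{ "function-apply": ["@f", {"@+": ["@a", {"times": ["@2", "@b"]}]} ] }'>
- pros: no need for custom parsing, JSON is ubiquitous on all potential platforms. We still may need a "JSON Schema" definition to enumerate what we expect though.
- cons: harder to read and write by humans compared to the functional-style syntax. I had two validation errors when quickly writing this by hand (balancing the trailing ]}]} ] } was hard). Remediators will suffer.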
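As a small illustration of the trade-off, the Python sketch below reads the JSON form with the standard `json` module and prints it back in the functional style, which shows that the two carry the same tree. The single-key-object convention is taken from the example above; the helper name `to_functional` is hypothetical, and the reference for a is written `@a`.

```python
import json

def to_functional(node):
    """Render a parsed JSON semantic as the functional op(arg, ...) form."""
    if isinstance(node, str):
        return node                                   # a reference or a literal
    if isinstance(node, dict) and len(node) == 1:
        head, args = next(iter(node.items()))         # {"head": [arguments]}
        return head + "(" + ", ".join(to_functional(a) for a in args) + ")"
    raise ValueError("expected a string or a single-key object")

JSON_TEXT = '{ "function-apply": ["@f", {"@+": ["@a", {"times": ["@2", "@b"]}]} ] }'

parsed = json.loads(JSON_TEXT)       # no custom parser needed for this step
functional = to_functional(parsed)   # "function-apply(@f, @+(@a, times(@2, @b)))"
```

Reading is a single library call, but a consumer would still need to validate the shape of the result, which is the role a JSON Schema would play.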
|
> I'm not too clear on what @+ is meant to imply in your example semantic =
> "function-apply(@f, @+(@a, times(@2, @b)))"; are you assuming <mo
> arg="+">+</mo> within the mrow?

Yes. I don't think the proposals allowed for "+" as a value for arg, which
is why I mentioned that. I still consider the functional (and JSON
equivalent) notation problematic for n-ary notations that mix different
operators. If one of the operators is to be tagged/pointed to, I think they
all should be pointed to or none of them should be.

> Simple is good! But if a syntaxy semantic cannot be avoided in many
> cases, it's probably better not to switch back and forth between a
> syntax-free semantic and a syntax based one.

Agreed -- it's simpler to have a single way to do things. Note though that
for some things, such as ℝ, it would probably be semantic="Reals" (no
parens). I think your original proposal included that situation already.
+1 for Deyan's pros/cons for JSON. I should note that I left out a paren in
my example when I typed it and only caught it proofreading my email before
sending it, so neither is perfect. Having to add ""s everywhere for JSON is
a pain to do by hand!
Neil
On Tue, Jul 28, 2020 at 3:56 PM Deyan Ginev wrote:
> FWIW, and I am only reminding of this for the sake of technical
> completeness (not my preference, and potentially eyebrow-raising), the
> MathML spec conceptually allows a JSON annotation for an individual formula
> via something akin to: <annotation encoding="application/json"> within a
> <semantics> parent. You could even host it externally, and link to it via
> src.
> But I also think Neil's JSON point was something different. If I read it
> right, he's suggesting an alternative syntax for the value of semantic,
> say:
> <mrow semantic='{ "function-apply": ["@f", {"@+": ["@a", {"times": ["@2", "@b"]}]} ] }'>
> - pros: no need for custom parsing, JSON is ubiquitous on all
> potential platforms. We still may need a "JSON Schema" definition to
> enumerate what we expect though.
> - cons: harder to read and write by humans compared to the
> functional-style syntax. I had two validation errors when quickly writing
> this by hand (balancing the trailing ]}]} ] } was hard). Remediators
> will suffer.
|
This discussion seemed to have served its purpose as it gradually led to a more concrete (if perhaps not yet perfect) proposal. The proposal in this issue is several levels too "Meta" to be pursued :> So, shall we close? |
closing looks good to me @brucemiller |
Background
I would like to explore approaches to adding at least minimal semantic information to Presentation MathML, primarily for the purpose of accessibility. The obvious first idea is to encode the "meaning" of each symbol as an attribute on its token. That covers a lot of cases in a natural way, but begins to fail when we encounter the various purposes of sub/superscripts (e.g. powers x^2, operator application like A^T, indexing) or constructs representing special notations such as binomial coefficients. In these cases there may be no (or many) tokens which deserve this meaning attribute, and it fails to capture the fact that the entire construct (e.g. msup, mrow) is relevant.
It is tempting in such cases to assign the meaning at a higher level (say, put "transpose" on the msup instead of the T, or "binomial" on the mrow), but then we must conceive a large dictionary of meanings (e.g. transpose, conjugate, adjoint, ...; binomial, legendre-symbol, ...) along with their corresponding markup patterns. Each such markup pattern must encode which of the node descendants will also need to be translated. Moreover, we would have to distinguish different possible markup patterns associated with the same meaning (e.g. transpose as superscript, transpose as function, ...; different notations for binomial coefficients).
Proposal
I would like to explore here the feasibility of abstracting a (hopefully) small set of markup patterns, separately from the meanings, for distinguishing these cases. The presentation markup would thus be annotated with 2 attributes, which (for purposes of discussion) I'll call "meaning" and "composition". The set of composition keywords would need dictionary entries, with each making clear which children play the role of arguments. For many purposes, however, the set of possible meanings could be open-ended.
Simple example
Superscript seems to be used almost entirely for the purposes of powers, operator application, and (tensor) indexing. Take composition=power to indicate the arg2 power of arg1. So
can be read as "x to the power of n", or an agent may choose to examine the children for special cases, like "x squared".
An example of operator application might be:
which could be read as "transpose of A". It also easily generalizes to meaning="conjugate", "adjoint" or even the not-yet-popular "Tralfamadorian inverse", without needing any additional dictionary entries.
Less simple example
A large set of notations have markup patterns like:
with various delimiters, punctuation, both with/without a visible dividing line, etc. These include common notations for binomial, Jacobi and Legendre symbols, Eulerian numbers, Pochhammer symbols, Clebsch-Gordan coefficients, 3j, 6j, etc symbols, distributions, vectors, matrices, determinants, inner-products, and so on. Of course, many of these also have other commonly used notations, so it would be a shame to lock each "semantic" to a single markup pattern.
Naming each of these patterns might simplify the task. For example, the common binomial notation might be represented by
Presumably the internal representation of the stacked-fenced composition would make clear where the arguments are and that a decent default reading is "binomial of n and m". And of course an implementation is free to special case "binomial" to obtain the reading "n choose m". In any case, this pattern is trivially extended to 2D vectors, Jacobi symbols and Eulerian numbers.
Moreover, we haven't wedded binomials to a single notation, since we can still write:
which could presumably end up with exactly the same readings as above.
Implementation note
A dictionary of the composition keywords would need to encode the paths to the children which act as arguments to whatever semantic is being applied, and provide some sort of template for output. This might be a simple pattern, possibly per-language as well as for Braille or other formats, along the lines of "the #2 power of #1" or "the #1 of #2" or similar. The appropriate translations of the children could simply be inserted into the pattern. Perhaps something fancier is needed?
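A minimal Python sketch of such a dictionary might look as follows. It is an illustration under stated assumptions, not a proposed format: the attribute names `meaning` and `composition` and the keywords `power` and `stacked-fenced` come from the text above, while the keyword `applied-operator`, the child-index paths, the English templates, and the sample markup are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Each composition keyword lists the child-index paths to its arguments
# and an output template (English only here).
COMPOSITIONS = {
    "power":            {"args": [(0,), (1,)],     "template": "{0} to the power of {1}"},
    "applied-operator": {"args": [(0,)],           "template": "{meaning} of {0}"},
    "stacked-fenced":   {"args": [(1, 0), (1, 1)], "template": "{meaning} of {0} and {1}"},
}

def follow(node, path):
    """Walk down from node by child indices."""
    for index in path:
        node = node[index]
    return node

def speak(node):
    """Translate a node, inserting translated arguments into the template."""
    if len(node) == 0:
        return (node.text or "").strip()              # token element: read as-is
    entry = COMPOSITIONS.get(node.get("composition"))
    if entry is None:
        return " ".join(speak(child) for child in node)
    readings = [speak(follow(node, path)) for path in entry["args"]]
    return entry["template"].format(*readings, meaning=node.get("meaning", ""))

power = ET.fromstring('<msup composition="power"><mi>x</mi><mi>n</mi></msup>')
transpose = ET.fromstring(
    '<msup composition="applied-operator" meaning="transpose"><mi>A</mi><mi>T</mi></msup>'
)
binomial = ET.fromstring(
    '<mrow composition="stacked-fenced" meaning="binomial"><mo>(</mo>'
    '<mfrac linethickness="0"><mi>n</mi><mi>m</mi></mfrac><mo>)</mo></mrow>'
)
```

With these entries, `speak` yields "x to the power of n", "transpose of A" and "binomial of n and m". The meaning is read out as-is, so swapping in "adjoint" or "conjugate" needs no new dictionary entry.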
The meaning attribute might well be open-ended, and read out as-is by default. Of course, that doesn't preclude recommending a standard set for common cases, nor does it preclude an implementation including a meaning dictionary to improve translations.
Summary
This idea has a lot of detail to be worked out, including at least
and need corresponding composition keywords
But before going down that road, it would be good to get general reaction and feedback.