A role for indicating whether a given ruby represents phonetics #1620
@murata2makoto How does the T2S engine know which one of the base or the ruby it should render if this new role is encountered? Is it expected that the T2S engine will be supplied both and it decides or would the browser or AT be making this decision? |
This depends on the interface between the T2S engine and the screen reader or user agent. Ideally, the T2S engine should receive the ruby base as such and the ruby text as such. Based on these two pieces of information, the T2S engine should make a good decision. However, when the T2S engine receives flat text and nothing else, or when the accessibility tree does not carry these two pieces of information, this ideal scenario is not possible. "Text to Speech of Electronic Documents Containing Ruby: User Requirements" considers this issue thoroughly. But the rule of thumb is to use the ruby base rather than the ruby text. |
Thanks for speaking with the ARIA WG today @murata2makoto and helping us understand the examples in your slides. The WG discussion continued after you left, and I spent several hours considering your User Requirements page and discussing with colleagues. The Working Group consensus is that the distinctions you're trying to make are not limited to the use case of accessibility APIs or assistive technology, so the ARIA spec isn't the best location for them. We suggest pursuing a new content attribute, most likely on the
I'll separate my later individual suggestions into another comment, to make it clearer that the comment above is from the working group and that my later comment below is just brainstorming. |
The following comment does not represent a consensus view of any organization (WG, employer, etc). It's just a collection of ideas that may be helpful in your progress to find a workable solution. You mentioned three functional distinctions in your use cases, but I see several more semantic content types in the User Requirements page. I'll attempt to outline them here, including similar examples in English. Note that category 2 (phonetic-optional) incorporates both a speech use case and the mainstream use case you mentioned today: allowing the user to toggle the visible display of these optional groupings. The bolded category names are just suggestions to help me keep them straight. I'm not particular about the names.
The semantic categories 3, 4, and 5 equate to the same functional category: "always display and speak both" so they might be combined (parenthetical) if there's no other functional need to keep them separate. What about proposing a new Once the Ruby/HTML working group agrees on a specific attribute, we could map it to accessibility and speech APIs to achieve the correct pronunciations, and use it for the other education use case of a visibility toggle on optional Ruby. |
The other feedback I'd like to share is related to the understandability of your User Requirements document. A few small changes may help others understand it more easily.
Best of luck to you! Thank you. |
+1, nice summary and crisp proposal. |
Is it potentially confusing that complementary is nothing like role=complementary? Is it more like role=note? Perhaps not a big deal. |
Another question: do the semantics only need to support 2 levels of readers (read all annotations or only difficult ones)? Or are there plenty of intermediate readers where neither case fits perfectly? |
I'm not attached to the term "complementary", so we should probably not use that. I was thinking that the base text and the ruby text are complementary to each other, so both should be displayed in all modalities. However, the ruby text is not phonetic, so TTS engines should speak it but not use it to infer pronunciation hints.
Not all of these are notes. If you want to use the semantic distinction rather than a functional one, there may need to be more than 3 values. [Update: maybe "parenthetical"?] |
I assume this question is for @murata2makoto. However, my understanding is that it would be up to the app to display the optional ones or not, based on the in-app user preference, rather than some magic in the user agent. In either case, the TTS engine would rely on the phonetic-required, and could use or ignore the phonetic-optional. |
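As a rough illustration of that split of responsibilities, here is a minimal Python sketch. The values `phonetic-required`, `phonetic-optional`, and `complementary` are only the tentative names floated in this thread, not an agreed attribute vocabulary, and the `Ruby` class, `is_ruby_visible`, `pronunciation_hint`, `show_optional`, and `use_optional` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Tentative category names from this thread (assumption, not a standard).
PHONETIC_REQUIRED = "phonetic-required"
PHONETIC_OPTIONAL = "phonetic-optional"
COMPLEMENTARY = "complementary"


@dataclass
class Ruby:
    base: str   # ruby base text
    text: str   # ruby annotation text
    kind: str   # one of the tentative categories above


def is_ruby_visible(ruby: Ruby, show_optional: bool) -> bool:
    """App-side decision: only optional ruby follows the user preference."""
    if ruby.kind == PHONETIC_OPTIONAL:
        return show_optional
    return True


def pronunciation_hint(ruby: Ruby, use_optional: bool = True) -> Optional[str]:
    """TTS-side decision: return ruby text to use as a reading hint, or None."""
    if ruby.kind == PHONETIC_REQUIRED:
        return ruby.text
    if ruby.kind == PHONETIC_OPTIONAL and use_optional:
        return ruby.text
    # Non-phonetic ruby is spoken as text, not used as a reading hint.
    return None
```

In this sketch the visibility toggle lives entirely in the app, while the speech side reads the same category value independently, which is the separation described in the comment above.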
Thank you for studying this document carefully. Shimono-san of Keio W3C converted it to a ReSpec document, available at https://w3c.github.io/ruby-t2s-req/. We plan to publish it as a W3C Note.
Both the JLreq TF and the Japan DAISY Consortium discussed the idea of creating HTML attributes rather than ARIA roles. Nobody has a strong opinion. We can go either way. In the I18N session after the ARIA WG meeting, we discussed where in WHATWG or W3C we should discuss this issue. Even within W3C, we might want to extend the charter of the HTML WG or create a community group dedicated to this issue. It is not yet clear. |
@cookiecrook and @aleventhal I sincerely welcome your suggestions. The Western examples by James are very useful. But before I indulge in interesting discussions about ruby, I would like to summarize the status quo. In OWP and EPUB, always-double-reading is unfortunately very common. This is very bad for phonetic ruby, which is the most common kind. Always-base-only-reading (hopefully combined with better T2S engines for Japanese) would be a big improvement. It might not be ideal but is OK. Base-only-reading as the default and double-reading for non-phonetic ruby would be better, but improving T2S engines is probably more important. I don't believe ruby-only-reading is the right way to go forward.
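To make the four reading modes named above concrete, here is a minimal Python sketch of how a list of text segments might be flattened into the string handed to a ruby-unaware T2S engine. The `Strategy` enum, the segment tuple layout, and `flatten_for_speech` are hypothetical names for this example, and the sample sentence is only an illustration.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Strategy(Enum):
    ALWAYS_DOUBLE = "always-double-reading"
    ALWAYS_BASE_ONLY = "always-base-only-reading"
    BASE_ONLY_UNLESS_NON_PHONETIC = "base-only default, double for non-phonetic"
    RUBY_ONLY = "ruby-only-reading"


# (base, ruby_text, is_phonetic); ruby_text is None for plain text.
Segment = Tuple[str, Optional[str], bool]


def flatten_for_speech(segments: List[Segment], strategy: Strategy) -> str:
    """Build the flat text sent to a T2S engine under the given strategy."""
    parts = []
    for base, ruby_text, is_phonetic in segments:
        if ruby_text is None:
            parts.append(base)                      # plain text, no ruby
        elif strategy is Strategy.ALWAYS_DOUBLE:
            parts.append(base + ruby_text)          # reads the word twice
        elif strategy is Strategy.ALWAYS_BASE_ONLY:
            parts.append(base)
        elif strategy is Strategy.RUBY_ONLY:
            parts.append(ruby_text)
        else:
            # Base only for phonetic ruby, both for non-phonetic ruby.
            parts.append(base if is_phonetic else base + ruby_text)
    return "".join(parts)


sentence = [("東京", "とうきょう", True), ("に行く", None, False)]
print(flatten_for_speech(sentence, Strategy.ALWAYS_DOUBLE))     # 東京とうきょうに行く
print(flatten_for_speech(sentence, Strategy.ALWAYS_BASE_ONLY))  # 東京に行く
```

The first output shows why always-double-reading is a problem for phonetic ruby: the same word is effectively spoken twice.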
I appreciate this very much! I will incorporate it into the upcoming note.
Although most Japanese think so, I know that some practitioners strongly disagree. I am not sure if I fully understand their reasons, but morphological analysis of kana-only ruby is likely to fail, producing unnatural accents and even incorrect pronunciation of は and へ. |
We can certainly try to introduce more levels. I know that a DAISY reader in Japan allows users to specify a K-12 grade and exposes only kanji beyond that level. But even the developer of that DAISY reader does not think that this has to be captured by markup. Their implementation examines the code points of base characters. Historically, book catalogs in Japan indicate one of three levels: ruby-free, para-ruby, and general-ruby. I thus think that we should stick to this tradition in standardization while implementors try interesting experiments. |
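A code-point heuristic of the kind mentioned above could look roughly like the following Python sketch. It assumes the reader shows ruby only when the base contains a kanji outside the set the user is expected to know. The `known_kanji` set is a hypothetical placeholder (a real reader would load per-grade kanji lists, which are not reproduced here), and the check covers only the basic CJK Unified Ideographs block, ignoring the extension blocks.

```python
from typing import Set


def is_kanji(ch: str) -> bool:
    """True if ch lies in the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def needs_ruby(base: str, known_kanji: Set[str]) -> bool:
    """True if any kanji in the ruby base is outside the user's known set."""
    return any(is_kanji(ch) and ch not in known_kanji for ch in base)


# Hypothetical placeholder for "kanji taught up to the selected grade".
known_kanji = {"山", "川"}

print(needs_ruby("山", known_kanji))    # False: already known
print(needs_ruby("鬱", known_kanji))    # True: not in the known set
print(needs_ruby("かな", known_kanji))  # False: no kanji in the base
```

Because the decision depends only on the base characters and a user setting, nothing in this sketch requires extra markup, which matches the developer's view reported above.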
Can we get an understanding of where this heuristic falls down? Or, if it's highly accurate, then do we actually need markup to differentiate between para-ruby and general-ruby? If the heuristic can be accurate enough, and we only need to know when the ruby is used for a note, then all we really need is to apply role="note" to the Finally, I would like to know what the possibilities are for a heuristic that detects the note/complementary situation. Can we get an evaluation on how accurate that could be? |
I agree there are edge cases, and that the は example is likely to be pronounced better if the base text is sent to the text-to-speech engine. However, once the text-to-speech engines understand ruby context, I think exposing both (to be pronounced as a single instance) is likely to produce better results, not worse. Ruby-unaware speech engines should just attempt to pronounce the base text in those instances of "phonetic-optional." |
@murata2makoto Is it okay to close this issue and #1619, or is there more you'd like to clarify before closing? Please do link any relevant issues in other repositories. |
@murata2makoto, you mentioned that assuming we can announce the |
Should I be asking my questions about semantics in w3c/ruby-t2s-req#7 ? |
I am very sorry for this belated reply. I deeply appreciate all your suggestions. Re: "phonetic-required" and "phonetic-optional": Thank you very much for this suggestion, but I am not sure that they should be separated. First, ruby causes serious problems for some Japanese people with dyslexia. They mistake ruby for a strange radical and fail to recognize the base character. Hiding every ruby is a sensible option. (Using a different color for ruby is also sensible, as is widening the gap between the base and ruby.) Second, when the same word is repeated, it is quite common to make ruby visible only for the first occurrence. Thus, I do not think that any ruby should always be visible. |
On behalf of the Japan DAISY Consortium, I would like to request a role for indicating whether or not a given ruby represents phonetics.
If a ruby represents phonetics, the T2S engine should render either the base or ruby. If not, the T2S engine should render both.
For more on this, see section 4.4 of "Text to Speech of Electronic Documents Containing Ruby: User Requirements".