[DESIGN] Update bidi design document to show proposed design (#871)
* [DESIGN] Update bidi design document to show proposed design

The design I actually think we should adopt is the "hybrid approaches" one. This is a necessary first step on the highway to UAX31 compliance and I think is responsibly contained/managed. It is a hybrid approach, in that it permits testable strict implementations to be created (particularly for message serialization).

This PR consists of moving text around. I added one "pro" to one option also.

* Address comments
aphillips authored Sep 2, 2024
1 parent d4fb35b commit 6657b80
Showing 1 changed file with 49 additions and 22 deletions.
71 changes: 49 additions & 22 deletions exploration/bidi-usability.md
@@ -273,6 +273,39 @@ Not allowing these to mix could produce annoying parse errors.

_Describe the proposed solution. Consider syntax, formatting, errors, registry, tooling, interchange._

I propose adopting a hybrid approach in which we permit "super-loose isolation".
This allows users to include isolates and strongly directional characters in the whitespace
portions of the syntax in order to make messages appear correctly.

The second part of the hybrid approach would be to recommend ("SHOULD") the "strict isolation"
design for serializers.
(Note that "strict" and "super-loose" use non-identical productions with the name `bidi`.
These serve different purposes, and the strict production is narrower than the super-loose one.)
The strict syntax is a subset of the super-loose syntax and can be applied selectively to messages
that have RTL sequences or that display problematically.
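
The difference between the two halves of the hybrid approach can be illustrated with a small pairing check.
This is only a sketch, not the grammar itself: the helper name `isolates_balanced` is hypothetical,
and it tests nothing more than whether every isolate opener (LRI U+2066, RLI U+2067, FSI U+2068)
is closed by a PDI (U+2069). It does not check where in the syntax the characters appear.
A super-loose parser would accept both kinds of input; a strict serializer would emit only the balanced form.

```python
# Sketch only: a nesting check for bidi isolates, not the MessageFormat grammar.
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"
OPENERS = {LRI, RLI, FSI}


def isolates_balanced(text: str) -> bool:
    """Return True if every isolate opener in `text` is closed by a PDI."""
    depth = 0
    for ch in text:
        if ch in OPENERS:
            depth += 1
        elif ch == PDI:
            if depth == 0:
                # A PDI with no matching opener.
                return False
            depth -= 1
    # Any opener still pending means an unpaired isolate.
    return depth == 0
```

A tool claiming strict conformance might run a check along these lines before serializing,
while an editor could keep accepting the unbalanced form while the user is still typing.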


## Alternatives Considered

_What other solutions are available?_
_How do they compare against the requirements?_
_What other properties do they have?_

### Nothing
We could do nothing.

A likely outcome of doing nothing is that RTL users would insert bidi controls into
_messages_ in an attempt to make the _pattern_ and/or _placeholders_ display correctly.
These controls would become part of the output of the _message_,
showing up inappropriately at runtime.
Because these characters are invisible, users might be very frustrated trying to manage
the results or debug what is wrong with their messages.

By contrast, if users insert too many or the wrong controls using the recommended design,
the _message_ would still be functional and would emit no undesired characters.
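
Because the controls described in this section are invisible, a debugging helper that lists them can make the problem easier to see.
The following is a minimal sketch; the function name `find_bidi_controls` and the exact set of characters checked
(the marks, the isolates, and the older embedding/override controls) are choices made for this illustration,
not something defined by this document.

```python
import unicodedata

# Characters checked in this sketch: ALM, LRM, RLM, the embedding/override
# controls (U+202A..U+202E), and the isolates (U+2066..U+2069).
BIDI_CONTROLS = set(
    "\u061c\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)


def find_bidi_controls(text: str) -> list[tuple[int, str, str]]:
    """List (index, code point, Unicode name) for each bidi control in `text`."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]
```

Running such a helper over formatted output would show whether controls typed into a _pattern_ have leaked into the result.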

### LTR Messages with isolating sequences

The syntax of a _message_ assumes a left-to-right base direction
both for the complete text of the _message_ as well as for each line (paragraph)
contained therein.
@@ -383,7 +416,7 @@ ns-separator = [bidi] ":"
bidi = [ %x200E-200F / %x061C ]
```

### Open Issues with Proposed Design
**Open Issues**

The ABNF changes found above put isolates and strongly directional marks into specific locations,
such as directly next to `{`/`}`/`{{`/`}}` markers
@@ -393,32 +426,24 @@ A more permissive design would add the isolates and strongly directional marks t
whitespace in the syntax and depend on users/editors to appropriately pair or position the marks
to get optimal display.

## Alternatives Considered

_What other solutions are available?_
_How do they compare against the requirements?_
_What other properties do they have?_

### Nothing
We could do nothing.

A likely outcome of doing nothing is that RTL users would insert bidi controls into
_messages_ in an attempt to make the _pattern_ and/or _placeholders_ display correctly.
These controls would become part of the output of the _message_,
showing up inappropriately at runtime.
Because these characters are invisible, users might be very frustrated trying to manage
the results or debug what is wrong with their messages.

By contrast, if users insert too many or the wrong controls using the recommended design,
the _message_ would still be functional and would emit no undesired characters.

### Super-loose isolation

Add isolates and strongly directional marks to required and optional whitespace in the syntax.
This would permit users to get the effects described by the above design,
as long as they use isolates/marks in a "responsible" way.

(Omitting other changes found in #673)
The exception to this is the namespace separator, used in `identifier`.
This requires the ability to insert isolates or strongly directional marks
between the namespace and name portions, where whitespace is not permitted.
This is the only location in the syntax where such characters might be needed
but whitespace is not at least optional.
This could be defined as:
```abnf
ns-separator = [bidi] ":" [bidi]
```
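
As a rough illustration of what this production permits, the sketch below matches an `identifier` with an optional
bidi character on either side of the `:`.
Several things here are assumptions for the example: the `BIDI` character class (marks plus isolates),
the simplified ASCII-only `NAME` pattern (the real `name` production is not shown in this section),
and the helper name `split_identifier`.

```python
import re

# Assumed character set for the example: LRM, RLM, ALM and the isolates.
BIDI = "[\u200e\u200f\u061c\u2066-\u2069]"
# Simplified stand-in for the `name` production (ASCII only).
NAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"

# identifier = [(namespace ns-separator)] name
# ns-separator = [bidi] ":" [bidi]
IDENTIFIER = re.compile(
    rf"(?:(?P<namespace>{NAME}){BIDI}?:{BIDI}?)?(?P<name>{NAME})"
)


def split_identifier(text: str):
    """Return (namespace, name), or None if `text` is not an identifier."""
    m = IDENTIFIER.fullmatch(text)
    if m is None:
        return None
    return m.group("namespace"), m.group("name")
```

Under these assumptions, `split_identifier("ns\u200f:\u200ename")` yields the same parts as `split_identifier("ns:name")`,
while whitespace around the `:` is still rejected.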

Here are the other ABNF changes:

```abnf
; strongly directional marks and bidi isolates
@@ -447,7 +472,7 @@ s = ( SP / HTAB / CR / LF / %x3000 )
### Strict isolation all the time

Apply bidi isolates in a strict way.
The main differences to the proposed solution is:
In this design:
1. The open/close isolate characters are syntactically required to be paired.
This introduces parse errors for unpaired invisible characters,
which could lead to bad user experiences.
@@ -467,7 +492,7 @@ markup = "{" [s] "#" identifier [bidi] *(s option) *(s attribute) [s] ["
/ "{" [s] "/" identifier [bidi] *(s option) *(s attribute) [s] "}" ; close
/ "{" LRI [s] "/" identifier [bidi] *(s option) *(s attribute) [s] close-isolate "}" ; close
identifier = [(namespace ns-separator)] name
ns-separator = [bidi] ":"
ns-separator = [bidi] ":" [bidi]
bidi = [ %x200E-200F / %x061C ]
```

@@ -610,6 +635,8 @@ adherence to the stricter grammar.
syntax errors
- Provides a foundation for tools to claim strict conformance and message normalization
as well as guidance to implementers to make them want to adopt it
- Messages are valid while being edited (such as when the open or close isolate has been
inserted but the corresponding opposite isolate hasn't been entered yet)

**Cons**
- Requires additional effort to maintain the grammar