Add base direction as a fourth element of literals. #48

Merged
merged 16 commits into from
Oct 13, 2023
Changes from 11 commits
104 changes: 77 additions & 27 deletions spec/index.html
@@ -73,7 +73,10 @@
</ul>

<p>RDF 1.2 introduces <a>quoted triples</a> as another kind of <a>RDF term</a>
which can be used as the <a>subject</a> or <a>object</a> of another <a>triple</a>.</p>
which can be used as the <a>subject</a> or <a>object</a> of another <a>triple</a>.
RDF 1.2 also introduces <a>directional language-tagged strings</a>
containing a <a>base direction</a> element allowing the
initial text direction to be specified when presented in a user agent.</p>

<p>RDF 1.2 Concepts introduces key concepts and terminology for RDF 1.2, discusses
datatyping, and the handling of <a>fragment identifiers</a> in IRIs within
@@ -153,7 +156,9 @@ <h3>Resources and Statements</h3>
<a>datatypes</a> that define the range of possible
values, such as strings, numbers, and dates. Special kinds of literals,
<a>language-tagged strings</a>, denote
plain-text strings in a natural language.</p>
plain-text strings in a natural language, and
<a>directional language-tagged strings</a>, denote
plain-text strings in a natural language and include an initial text direction.</p>

<p>Asserting an <a>RDF triple</a> says that <em>some relationship,
indicated by the <a>predicate</a>, holds between the
@@ -669,8 +674,8 @@ <h2>Literals</h2>

<p>Literals are used for values such as strings, numbers, and dates.</p>

<p>A <dfn data-local-lt="RDF literal">literal</dfn> in an <a>RDF graph</a> consists of two or three
elements:</p>
<p>A <dfn data-local-lt="RDF literal">literal</dfn> in an <a>RDF graph</a> consists of
two, three, or four elements:</p>

<ul>
<li>a <dfn>lexical form</dfn>, being a Unicode [[!UNICODE]] string,
@@ -684,32 +689,48 @@ <h2>Literals</h2>
language tag MUST be well-formed according to
<a data-cite="bcp47#section-2.2.9">section 2.2.9</a>
of [[!BCP47]].</li>
<li>if and only if the <a>datatype IRI</a> is
<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString</code>,
a non-empty <a>language tag</a>
that MUST be well-formed according to <a data-cite="bcp47#section-2.2.9">section 2.2.9</a>
of [[!BCP47]],
and a <dfn>base direction</dfn> that MUST be either `ltr` or `rtl`.</li>
</ul>
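The element list above can be read as a small data structure with presence rules. The following is a minimal sketch of that reading, not part of the specification: the `Literal` class and its field names are hypothetical, and the well-formedness check against BCP47 is reduced to a non-empty test.

```python
from dataclasses import dataclass
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_LANG_STRING = RDF_NS + "langString"
RDF_DIR_LANG_STRING = RDF_NS + "dirLangString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical container for the two to four elements of a literal."""
    lexical_form: str
    datatype_iri: str
    language_tag: Optional[str] = None    # third element
    base_direction: Optional[str] = None  # fourth element

    def __post_init__(self) -> None:
        needs_tag = self.datatype_iri in (RDF_LANG_STRING, RDF_DIR_LANG_STRING)
        needs_dir = self.datatype_iri == RDF_DIR_LANG_STRING
        # The language tag is present if and only if the datatype requires it.
        if needs_tag != (self.language_tag is not None):
            raise ValueError("language tag does not match the datatype IRI")
        # Assumption: BCP47 well-formedness is not checked, only non-emptiness.
        if self.language_tag == "":
            raise ValueError("language tag must be non-empty")
        # The base direction is present if and only if the datatype is rdf:dirLangString.
        if needs_dir != (self.base_direction is not None):
            raise ValueError("base direction does not match the datatype IRI")
        if needs_dir and self.base_direction not in ("ltr", "rtl"):
            raise ValueError("base direction must be 'ltr' or 'rtl'")
```

For instance, `Literal("פעילות הבינאום, W3C", RDF_DIR_LANG_STRING, "he", "rtl")` is accepted, while the same call with `"RTL"` or with `RDF_LANG_STRING` as the datatype raises `ValueError`.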

<p>A literal is a <dfn>language-tagged string</dfn> if the third element
is present. Lexical representations of language tags MAY be converted
to lower case. The value space of language tags is always in lower
case.</p>
is present and the fourth element is not present.
Lexical representations of language tags MAY be converted
to lower case.
The value of language tags is always treated as being in lower case.</p>

<p>A literal is a <dfn id="dfn-dir-lang-string">directional language-tagged string</dfn>
if both the third and fourth elements are present.
Language tags are treated identically to those of <a>language-tagged strings</a>,
and the fourth element, the <a>base direction</a>, MUST be either `ltr` or `rtl` in lower case.</p>

<p>Please note that concrete syntaxes MAY support
<dfn data-lt="simple literal" class="export">simple literals</dfn> consisting of only a
<a>lexical form</a> without any <a>datatype IRI</a> or <a>language tag</a>.
<a>lexical form</a> without any <a>datatype IRI</a>, <a>language tag</a>, or <a>base direction</a>.
Simple literals are syntactic sugar for abstract syntax
<a>literals</a>
with the <a>datatype IRI</a>
<code>http://www.w3.org/2001/XMLSchema#string</code>
(which is commonly abbreviated as <code>xsd:string</code>).
Similarly, most concrete syntaxes represent
<a>language-tagged strings</a> without
the <a>datatype IRI</a> because it always equals
<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#langString</code> (<code>rdf:langString</code>).</p>
<a>language-tagged strings</a> and <a>directional language-tagged strings</a> without
the <a>datatype IRI</a> because it always equals either
<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#langString</code> (<code>rdf:langString</code>)
or <code>http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString</code> (<code>rdf:dirLangString</code>), respectively.</p>
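The paragraph above implies that a parser for a concrete syntax can infer the omitted datatype IRI from which optional elements are present. A minimal sketch of that inference follows; the function name `implied_datatype_iri` is hypothetical, and rejecting a base direction without a language tag is an assumption drawn from the element list.

```python
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_LANG_STRING = RDF_NS + "langString"
RDF_DIR_LANG_STRING = RDF_NS + "dirLangString"


def implied_datatype_iri(language_tag: Optional[str] = None,
                         base_direction: Optional[str] = None) -> str:
    """Return the datatype IRI implied when a concrete syntax omits it."""
    if base_direction is not None:
        # A base direction only occurs together with a language tag.
        if language_tag is None:
            raise ValueError("a base direction requires a language tag")
        return RDF_DIR_LANG_STRING
    if language_tag is not None:
        return RDF_LANG_STRING
    # A simple literal: only a lexical form was given.
    return XSD_STRING
```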

<p>The <dfn>literal value</dfn> associated with a <a>literal</a> is:</p>

<ol>
<li>If the literal is a <a>language-tagged string</a>,
then the literal value is a pair consisting of its <a>lexical form</a>
and its <a>language tag</a>, in that order.</li>
<li>if the literal is a <a>directional language-tagged string</a>, then the literal value is
a tuple of its <a>lexical form</a>, its <a>language tag</a>, and its <a>base direction</a>,
likewise in that order.</li>

<li>If the literal's <a>datatype IRI</a> is in the set of
<a>recognized datatype IRIs</a>, let <var>d</var> be the
@@ -732,12 +753,16 @@ <h2>Literals</h2>
not defined by this specification.</li>
</ol>
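The literal value rules can be sketched as a single function. This is an illustration under several assumptions: the `RECOGNIZED` table is a hypothetical, deliberately tiny set of recognized datatype IRIs; `None` stands in both for "no value" and for "not defined by this specification"; and the language tag is lower-cased in the value, following the statement that the value of language tags is treated as lower case.

```python
import re
from typing import Optional

XSD_NS = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical lexical-to-value mappings for two recognized datatype IRIs.
RECOGNIZED = {
    XSD_NS + "string": lambda s: s,
    XSD_NS + "integer": lambda s: int(s) if re.fullmatch(r"[+-]?[0-9]+", s) else None,
}


def literal_value(lexical_form: str,
                  datatype_iri: str,
                  language_tag: Optional[str] = None,
                  base_direction: Optional[str] = None):
    """Return the value of a literal, or None when no value is available here."""
    if language_tag is not None and base_direction is not None:
        # Directional language-tagged string: a three-element tuple.
        return (lexical_form, language_tag.lower(), base_direction)
    if language_tag is not None:
        # Language-tagged string: a pair.
        return (lexical_form, language_tag.lower())
    mapping = RECOGNIZED.get(datatype_iri)
    return mapping(lexical_form) if mapping is not None else None
```

With this sketch, `literal_value("1", XSD_NS + "integer")` and `literal_value("01", XSD_NS + "integer")` both yield the integer 1, even though the two literals are different terms.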

<p><dfn data-local-lt="term-equal">Literal term equality</dfn>: Two literals are term-equal (the same
RDF literal) if and only if the two <a>lexical forms</a>,
the two <a>datatype IRIs</a>, and the two
<a>language tags</a> (if any) compare equal,
character by character. Thus, two literals can have the same value
without being the same RDF term. For example:</p>
<p><dfn data-local-lt="term-equal">Literal term equality</dfn>:
Two literals are term-equal (the same <a>RDF literal</a>)
if and only if the two <a>lexical forms</a>,
the two <a>datatype IRIs</a>,
the two <a>language tags</a> (if any), and
the two <a>base directions</a> (if any),
all compare equal, character by character.
Thus, two literals can have the same value
without being the same <a>RDF term</a>.
For example:</p>

<pre>
`"1"^^xs:integer`
@@ -747,7 +772,32 @@ <h2>Literals</h2>
<p>denote the same <a data-lt="literal value">value</a>, but are not the
same literal <a>RDF terms</a> and are not
<a>term-equal</a> because their
<a>lexical form</a> differs.</p>
<a>lexical forms</a> differ.</p>
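Term equality as revised here compares up to four elements. The sketch below is one way to express it, assuming a hypothetical `Literal` tuple in which an absent language tag or base direction is represented by `None`; any case normalization of language tags is assumed to have happened before comparison.

```python
from typing import NamedTuple, Optional

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
RDF_DIR_LANG_STRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString"


class Literal(NamedTuple):
    lexical_form: str
    datatype_iri: str
    language_tag: Optional[str] = None
    base_direction: Optional[str] = None


def term_equal(a: Literal, b: Literal) -> bool:
    """Compare every element; Python's str equality is code point by code point."""
    return (
        a.lexical_form == b.lexical_form
        and a.datatype_iri == b.datatype_iri
        and a.language_tag == b.language_tag      # None only equals None
        and a.base_direction == b.base_direction  # None only equals None
    )


one = Literal("1", XSD_INTEGER)
padded = Literal("01", XSD_INTEGER)
same_value = int(one.lexical_form) == int(padded.lexical_form)  # both denote 1
same_term = term_equal(one, padded)  # lexical forms differ
```

Two directional language-tagged strings that differ only in base direction are likewise not term-equal under this comparison.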

<section id="section-text-direction" class="informative">
<h3>Initial Text Direction</h3>

<p>The <a>base direction</a> of a <a>directional language-tagged string</a>
provides a means of establishing the initial direction of text,
including text which is a mixture of right-to-left and left-to-right scripts.
The [[[?UAX9]]] [[?UAX9]] provides support for automatically rendering
a sequence of characters in logical order,
so that they are visually ordered as expected.
However, this is not sufficient to correctly render bidirectional text.</p>

<p>For example, the text `"פעילות הבינאום, W3C"` with language tag "he" (for Hebrew) may be displayed
incorrectly unless the <a>base direction</a> of `rtl` (for right-to-left)
is also provided.
When this is provided to a user agent,
for example using the HTML `dir` attribute,
it would be correctly presented as follows: <div lang="he" dir="rtl">פעילות הבינאום, W3C</div></p>
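As an illustration of passing the base direction on to a user agent, the sketch below serializes a string with its language tag and optional base direction into HTML markup using the `lang` and `dir` attributes. The function name `to_html` and the choice of a `div` wrapper are assumptions made for this example.

```python
from html import escape
from typing import Optional


def to_html(lexical_form: str,
            language_tag: str,
            base_direction: Optional[str] = None) -> str:
    """Wrap a string in a div carrying its language and, if given, its direction."""
    attrs = f' lang="{escape(language_tag)}"'
    if base_direction is not None:
        if base_direction not in ("ltr", "rtl"):
            raise ValueError("base direction must be 'ltr' or 'rtl'")
        attrs += f' dir="{base_direction}"'
    # Escape the text so that markup characters in the string are not interpreted.
    return f"<div{attrs}>{escape(lexical_form)}</div>"


markup = to_html("פעילות הבינאום, W3C", "he", "rtl")
```

When no base direction is supplied, the `dir` attribute is omitted and the user agent falls back to its own handling.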

<p class="note">The absence of a <a>base direction</a> does not necessarily imply that
the text has no initial text direction;
as described in [[[?UAX9]]],
strings may be embedded within structures which establish an <em>embedding direction</em>,
which determines the default bidirectional orientation of text.</p>
Review comment (Contributor):

This is slightly misleading. The bidi algorithm determines the base direction in any case. And "embedding" is an overloaded term in the bidi algorithm (strings can be "embedded", but "embedding" in bidi refers to stacking bidirectional states...)

I'm not sure what the note is trying to convey. Are you trying to say "if the direction is not provided as metadata, the string can still be rendered"? Generally, what we say is either (a) when there is no base direction provided for a given string, the auto (first-strong detection) direction should be used; or (b) when the base direction is not provided, the direction of the enclosing document (or content??) is used
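Option (a) in the comment above, first-strong detection, can be approximated as follows. This is a simplified sketch, not the full paragraph-level rules of the Unicode Bidirectional Algorithm: it ignores isolates and other explicit formatting characters, and the function name is hypothetical.

```python
import unicodedata
from typing import Optional


def first_strong_direction(text: str) -> Optional[str]:
    """Return 'ltr' or 'rtl' from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character found (for example digits and punctuation only).
    return None
```

A string that begins with a Latin-script word but is meant to be read right-to-left is detected as `ltr` by this heuristic, which is the kind of case where an explicit base direction is needed.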

</section>
</section>

<section id="section-blank-nodes">
@@ -1483,17 +1533,14 @@ <h2>Security Considerations</h2>

<section id="internationalization" class="appendix informative">
<h2>Internationalization Considerations</h2>
<p>RDF is restricted to representing string values with left-to-right or right-to-left direction indicators.
RDF provides a mechanism for specifying the language associated with
a string (<a>language-tagged string</a>),
but does not provide a means of indicating the base direction of the string.</p>

<p>Unicode provides a mechanism for signaling direction within a string
(see [[[UAX9]]] [[UAX9]]),
however, when a string has an overall base direction which cannot be determined by the
(see [[[UAX9]]] [[UAX9]]).
RDF provides a mechanism for specifying the <a>base direction</a>
of a <a>directional language-tagged string</a>
to signal the initial text direction of a string.
When a string has an overall base direction which cannot be determined by the
beginning of the string, an external indicator is required,
such as the [[HTML]] <a data-cite="HTML/dom.html#the-dir-attribute">dir attribute</a>,
which currently has no counterpart for <a>RDF literals</a>.</p>
such as the [[HTML]] <a data-cite="HTML/dom.html#the-dir-attribute">dir attribute</a>.</p>

<p>[[[JSON-LD11]]] [[JSON-LD11]] introduced the
<a data-cite="JSON-LD11#the-i18n-namespace">i18n namespace</a> to use
@@ -1636,6 +1683,9 @@ <h2>Changes between RDF 1.1 and RDF 1.2</h2>
for informative definition of a <a>quad</a>.</li>
<li>Added <a href="#section-quoted-triples" class="sectionRef"></a>
and definitions for <a>quoted triple</a> and <a>asserted triple</a>.</li>
<li>Added the <a>base direction</a> element as part of
a <a>literal</a>,
and a description of its use in <a href="#section-text-direction" class="sectionRef"></a>.</li>
</ul>

<p class="note">A detailed overview of the differences between RDF versions&nbsp;1.0