From 15badae72c45d53da5388ef981ca4ba8045cf878 Mon Sep 17 00:00:00 2001 From: r12a Date: Mon, 9 Dec 2024 09:39:26 +0000 Subject: [PATCH] arb Complete char usage data. Redo maps. --- arab/arb-details.html | 407 +++++++++++++++++++++- arab/arb-langdata.js | 2 +- arab/arb.css | 5 +- arab/arb.html | 791 +++++++++++++++++------------------------- arab/block.html | 6 +- 5 files changed, 725 insertions(+), 486 deletions(-) diff --git a/arab/arb-details.html b/arab/arb-details.html index 4d9bfdc40..a694f894a 100644 --- a/arab/arb-details.html +++ b/arab/arb-details.html @@ -110,6 +110,7 @@ '\u{0609}': `

؉

+

Per mille sign.

`, @@ -139,6 +140,8 @@ '\u{060C}': `

،

+ +

Comma.

`, @@ -149,6 +152,7 @@ '\u{060D}': `

؍

+

Date separator.

`, @@ -2292,6 +2296,7 @@ '\u{0660}': `

٠

+

0 digit.

`, @@ -2302,6 +2307,7 @@ '\u{0661}': `

١

+

1 digit.

`, @@ -2312,6 +2318,7 @@ '\u{0662}': `

٢

+

2 digit.

`, @@ -2322,6 +2329,7 @@ '\u{0663}': `

٣

+

3 digit.

`, @@ -2332,6 +2340,7 @@ '\u{0664}': `

٤

+

4 digit.

`, @@ -2342,6 +2351,7 @@ '\u{0665}': `

٥

+

5 digit.

`, @@ -2352,6 +2362,7 @@ '\u{0666}': `

٦

+

6 digit.

`, @@ -2362,6 +2373,7 @@ '\u{0667}': `

٧

+

7 digit.

`, @@ -2372,6 +2384,7 @@ '\u{0668}': `

٨

+

8 digit.

`, @@ -2382,6 +2395,7 @@ '\u{0669}': `

٩

+

9 digit.

`, @@ -2423,6 +2437,7 @@ '\u{066D}': `

٭

+

Five-pointed star.

`, @@ -4907,7 +4922,8 @@ '\u{08B2}': `

-

Sometimes used for writing Berber sounds.lpz

+ +

consonant. Sometimes used for writing Berber sounds.lpz

`, @@ -6162,6 +6178,8 @@ '\u{2018}': `

+ +

Closing quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, @@ -6171,6 +6189,8 @@ '\u{2019}': `

+ +

Opening quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, @@ -6180,6 +6200,8 @@ '\u{201C}': `

+ +

Closing quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, @@ -6189,6 +6211,8 @@ '\u{201D}': `

+ +

Opening quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, @@ -6198,12 +6222,393 @@ '\u{2014}': `

+ +

Em dash.

+`, + + + +'\u{0021}': ` +

!

+ +

Exclamation mark.

+`, + + + + + +'\u{0025}': ` +

%

+ +

Percent sign.

+`, + + + + + +'\u{0028}': ` +

(

+ +

Opening parenthesis.

+ +

The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.

+`, + + + + + +'\u{0029}': ` +

)

+ +

Closing parenthesis.

+ +

The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.

+`, + + + + + +'\u{002E}': ` +

.

+ +

Full stop.

+`, + + + + + +'\u{0030}': ` +

0

+ +

Digit.

+`, + + + + + +'\u{0031}': ` +

1

+ +

Digit.

+`, + + + + + +'\u{0032}': ` +

2

+ +

Digit.

+`, + + + + + +'\u{0033}': ` +

3

+ +

Digit.

+`, + + + + + +'\u{0034}': ` +

4

+ +

Digit.

+`, + + + + + +'\u{0035}': ` +

5

+ +

Digit.

+`, + + + + + +'\u{0036}': ` +

6

+ +

Digit.

+`, + + + + + +'\u{0037}': ` +

7

+ +

Digit.

+`, + + + + + +'\u{0038}': ` +

8

+ +

Digit.

+`, + + + + + +'\u{0039}': ` +

9

+ +

Digit.

`, +'\u{003A}': ` +

:

+ +

Colon.

+`, + + + + + +'\u{00AB}': ` +

«

+ +

Opening quotation mark.

+ +

The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.

+`, + + + + + +'\u{00BB}': ` +

»

+ +

Closing quotation mark.

+ +

The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.

+`, + + + + + +'\u{034F}': ` +

͏

+ +

Combining grapheme joiner.

+ +

Used to produce special ordering of diacritics. The name is a misnomer, as it is generally used to break the normal sequence of diacritics.

+ +

More details:

+ + +`, + + + + + + + + + +'\u{2013}': ` +

+ +

En dash.

+`, + + + + + +'\u{2026}': ` +

+ +

Ellipsis.

+`, + + + + + + +'\u{2030}': ` +

+ +

Per mille sign.

+`, + + + + + +'\u{2039}': ` +

+ +

Quotation mark.

+`, + + + + + +'\u{203A}': ` +

+ +

Quotation mark.

+`, + + + + + +// zwnj +'\u{200C}': ` +

+ +

Zero-width non-joiner (ZWNJ).

+ +

An invisible character that prevents two adjacent letters from forming a visual connection with each other when rendered. Especially useful for educational illustrations, but also has real-world applications.

+ +

It is used to interrupt the shaping of joining glyphs in cursive scripts, and also used to manage the visual interactions of glyphs in other scripts, eg. to prevent the formation of conjuncts, position diacritics, etc.

+ +

More details:

+ + +`, + + + + + +// zwj +'\u{200D}': ` +

+ +

Zero-width joiner (ZWJ).

+ +

An invisible character that permits a letter to form a cursive connection without a visible neighbour. Especially useful for educational illustrations, but also has some real-world applications.

+ +

Also used with complex scripts to manage the visual representation of glyphs that normally interact, eg. to form conjuncts, position diacritics, etc.

+ +

More details:

+ +`, + + + + + +// LRM +'\u{200E}': ` +

An invisible character with strong LTR directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.

+ +

Generally referred to as LRM.

+`, + + + + +// RLM +'\u{200F}': ` +

An invisible character with strong RTL directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.

+ +

Generally referred to as RLM.

+`, + + + + +// LRE +'\u{202A}': ` +

Sets the start point for a range of inline text when applying a base direction of left-to-right. The range is terminated by 202C (PDF).

+ +

Use 2066 (LRI) rather than this character.

+`, + +// RLE +'\u{202B}': ` +

Sets the start point for a range of inline text when applying a base direction of right-to-left. The range is terminated by 202C (PDF).

+ +

Use 2067 (RLI) rather than this character.

+`, + +// PDF +'\u{202C}': ` +

Sets the end point for a range of inline text when applying a base direction. The range is started with either 202A (LRE) or 202B (RLE).

+ +

Use 2069 (PDI) and its associated range starters rather than this character.

+`, + + + + + +// LRI +'\u{2066}': ` +

Sets the start point for a range of inline text when applying a base direction of left-to-right, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

+ +

This character should be used rather than 202A (LRE).

+`, + + +// RLI +'\u{2067}': ` +

Sets the start point for a range of inline text when applying a base direction of right-to-left, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

+ +

This character should be used rather than 202B (RLE).

+`, + +// FSI +'\u{2068}': ` +

Sets the start point for a range of inline text when applying a base direction, and isolates the text within that range from text outside it. The base direction set is determined by that of the first strong directional character in the range. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

+`, + +// PDI +'\u{2069}': ` +

Sets the end point for a range of inline text when applying a base direction. The range is started with either 2066 (LRI), 2067 (RLI) or 2068 (FSI).

+ +

This character should be used rather than 202C (PDF).

+`, + + + + + + + } \ No newline at end of file diff --git a/arab/arb-langdata.js b/arab/arb-langdata.js index 43eb5717e..480b81a83 100755 --- a/arab/arb-langdata.js +++ b/arab/arb-langdata.js @@ -11,7 +11,7 @@ var langs = { "arb": { name:"Arabic, Standard", local:"العَرَبِيَّة‎", localtrans:"[alʕaraˈbijja]", silcode:"", rtl:true, source:"5d92060d0376b6a659e408ddc1c289cc0cecbba8,cldr_ar,udhr_arb", region:"wasia", countries:"Saudi Arabia, Egypt, Mali, Algeria, Iraq, Sudan, Yemen, Syria, Morocco, etc.", script:"arab", speakers:"273989700", -letter:"ءآأؤإئابةتثجحخدذرزسشصضطظعغـفقكلمنهوىيپچڢڤڧࢲﷲﷺﷻ", letteraux:"ٱ", mark:"͏ًٌٍَُِّْٰ", markaux:"ٕٓٔ", number:"٠١٢٣٤٥٦٧٨٩", numberaux:"", punctuation:"«»؉،؍؛؟٪٫٬٭–—‘’“”…‰‹›﴾﴿", punctuationaux:"۔", symbol:"﷼﷽", symbolaux:"", other:"\u{061C}\u{200C}\u{200D}\u{200E}\u{200F}\u{202A}\u{202B}\u{202C}\u{2066}\u{2067}\u{2068}\u{2069}", otheraux:"", aux:"", deprecated:"", +letter:"ءآأؤإئابةتثجحخدذرزسشصضطظعغـفقكلمنهوىيٱپچڢڤڧࢲﷲﷴﷺﷻ", mark:"͏ًٌٍَُِّْٰؔ", markaux:"ٕٓٔۜ", number:"٠١٢٣٤٥٦٧٨٩", punctuation:"«»؉،؍؛؟٪٫٬٭–—‘’“”…‰‹›﴾﴿", symbol:"﵀﵁﵂﵃﵄﵅﵆﵇﵈﵉﵊﵋﵌﵍﵎﵏﷏﷼﷽﷾﷿", symbolaux:"﮲﮳﮴﮵﮶﮷﮸﮹﮺﮻﮼﮽﮾﮿﯀﯁﯂", other:"\u{061C}\u{200C}\u{200D}\u{200E}\u{200F}\u{202A}\u{202B}\u{202C}\u{2066}\u{2067}\u{2068}\u{2069}", otheraux:"\u{06DD}", aux:"", orth:`Arabic.   Naskh style. Details.`, related:`Macrolanguage is Arabic [ar]. Legacy applications often use [ar] rather than arb.`, linked:"arab/arb", picker:"arab-ar", font:"", diff --git a/arab/arb.css b/arab/arb.css index 5df4aa589..ec75e1af7 100755 --- a/arab/arb.css +++ b/arab/arb.css @@ -88,7 +88,8 @@ figure#vowel_grid { .map .charExample .ex { font-size: 1.6rem; -} - + } +.mapItem .phone + div { width: 100%; } +.mapItem .posn { display: inline-block; } diff --git a/arab/arb.html b/arab/arb.html index 6ac487186..4b48b3330 100755 --- a/arab/arb.html +++ b/arab/arb.html @@ -59,7 +59,7 @@

Contents

Updated - 22 November, 2024 + 9 December, 2024

@@ -77,7 +77,7 @@

Contents

Referencing this document -

Richard Ishida, Modern Standard Arabic Orthography Notes, 22-Nov-2024, https://r12a.github.io/scripts/arab/arb

+

Richard Ishida, Modern Standard Arabic Orthography Notes, 09-Dec-2024, https://r12a.github.io/scripts/arab/arb

@@ -201,17 +201,17 @@

Joining forms

Left-joining glyphs are commonly called initial; dual-joining are called medial; and right-joining are called final. Glyphs that don't join on either side are called isolated. However, these glyph shapes can be found in various places within a single word.

-

Word-initial characters usually have initial glyph shapes (eg. 064A ). However, characters that only join to the right will use an isolated glyph shape (eg. 062F ). +

Word-initial characters usually have initial glyph shapes (eg. 064A ). However, characters that only join to the right will use an isolated glyph shape (eg. 062F ). Furthermore, words beginning with a vowel are always preceded by a vowel carrier, which is normally ا -(eg. 0627 06CC or 0627 064E ).

+(eg. 0627 06CC or 0627 064E ).

Word-medial characters will typically join on both sides -(eg. 064A ) but those that only join to the right will use a final glyph (eg. 062F ). -However, if either of those is preceded by another character that only joins to the right, the glyph shapes rendered will be initial (eg. 064A ) -and isolated (eg. 062F ), respectively.

+(eg. 064A ) but those that only join to the right will use a final glyph (eg. 062F ). +However, if either of those is preceded by another character that only joins to the right, the glyph shapes rendered will be initial (eg. 064A ) +and isolated (eg. 062F ), respectively.

-

Word-final characters will typically use a final glyph shape (eg. 064A and 062F ). -However, if the previous character joins only to the right, they will use isolated glyph shapes (eg.064A and 062F ).

+

Word-final characters will typically use a final glyph shape (eg. 064A and 062F ). +However, if the previous character joins only to the right, they will use isolated glyph shapes (eg. 064A and 062F ).

In all this contextual glyph shaping the basic shapes used for a character can vary significantly in a script like Arabic. This also includes some characters that only have ijam dots in certain contexts.
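
The joining rules described above can be sketched in JavaScript. This is a minimal illustration and not part of the page's own scripts: the function name `joiningForms` and the two letter sets are assumptions made for the example. It treats a small hand-picked set of letters as joining only to the right, treats hamza as non-joining, assumes every other letter joins on both sides, and skips combining marks. A real implementation would take joining types from the Unicode Character Database instead.

```javascript
// Letters that join only to the previous letter (assumed subset for this sketch):
// alef, alef with madda/hamza, dal, thal, reh, zain, waw, waw with hamza, teh marbuta.
const RIGHT_JOINING = new Set([...'\u0627\u0622\u0623\u0625\u062F\u0630\u0631\u0632\u0648\u0624\u0629']);
// Letters that never join (only hamza in this sketch).
const NON_JOINING = new Set(['\u0621']);

// Combining marks (harakat, etc.) do not affect joining, so they are skipped.
const isMark = (ch) => /\p{Mn}/u.test(ch);

// Can this letter connect to the letter before it / after it?
const joinsToPrev = (ch) => !NON_JOINING.has(ch);
const joinsToNext = (ch) => !NON_JOINING.has(ch) && !RIGHT_JOINING.has(ch);

// Returns one { char, form } entry per base letter in the word.
function joiningForms(word) {
  const letters = [...word].filter((ch) => !isMark(ch));
  return letters.map((ch, i) => {
    const prev = letters[i - 1];
    const next = letters[i + 1];
    const linkedPrev = prev !== undefined && joinsToNext(prev) && joinsToPrev(ch);
    const linkedNext = next !== undefined && joinsToNext(ch) && joinsToPrev(next);
    let form = 'isolated';
    if (linkedPrev && linkedNext) form = 'medial';
    else if (linkedPrev) form = 'final';
    else if (linkedNext) form = 'initial';
    return { char: ch, form };
  });
}
```

For example, `joiningForms('\u064A\u062F')` (064A followed by 062F) yields the forms initial and final, while `joiningForms('\u062F\u064A')` (062F followed by 064A) yields isolated for both letters, because 062F does not join to the letter after it.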

@@ -1287,9 +1287,11 @@

Vowel sounds to characters

-

This section maps Modern Standard Arabic vowel sounds to common graphemes in the Arabic orthography. Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

+

This section maps Modern Standard Arabic vowel sounds to common graphemes in the Arabic orthography.

-

The columns run right to left and indicate typical word-initial, word-medial, and word-final usage. The joining forms shown are illustrative; alternative shapes may occur (see joining_forms). They are also fully-vowelled, although the examples show normal unvowelled usage as well as vowelled.

+

The entries show typical word-initial, word-medial, and word-final usage. The joining forms shown are illustrative; alternative shapes may occur (see joining_forms). They are also fully-vowelled, although the examples show normal unvowelled usage as well as vowelled.

+ +

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.