CLDR-15861 Add additional spec text around resource inheritance #2746

Merged: 1 commit, Mar 28, 2023
2 changes: 1 addition & 1 deletion docs/ldml/tr35-info.md
@@ -267,7 +267,7 @@ _For information about preferred units and unit conversion, see [Unit Conversion

### <a name="rgScope" href="#rgScope">`<rgScope>`: Scope of the “rg” Locale Key</a>

The supplemental `<rgScope>` element specifies the data paths for which the region used for data lookup is determined by the value of any “rg” key present in the locale identifier (see [Region Override](tr35.md#RegionOverride)). If no “rg” key is present, the region used for lookup is determined as usual: from the unicode_region_subtag if present, else inferred from the unicode_language_subtag. The DTD structure is as follows:
The supplemental `<rgScope>` element specifies the data paths for which the region used for data lookup is determined by the value of any “rg” key present in the locale identifier (see [Region Override](tr35.md#RegionOverride) and [Region Priority Inheritance](tr35.md#Region_Priority_Inheritance)). If no “rg” key is present, the region used for lookup is determined as usual: from the unicode_region_subtag if present, else inferred from the unicode_language_subtag. The DTD structure is as follows:

```xml
<!ELEMENT rgScope ( rgPath* ) >
33 changes: 32 additions & 1 deletion docs/ldml/tr35.md
@@ -1477,7 +1477,7 @@ If a language has more than one script in customary modern use, then the CLDR fi
lang
lang_script
lang_script_region
lang_region (aliases to lang_script_region)
lang_region (aliases to lang_script_region based on likely subtags)
```

#### <a name="Bundle_vs_Item_Lookup" href="#Bundle_vs_Item_Lookup">Bundle vs Item Lookup</a>
@@ -1490,6 +1490,8 @@ The table [Lookup Differences](#Lookup-Differences) uses the naïve resource bun

If the naïve resource bundle lookup is used, the desired locale needs to be canonicalized using 4.3 [Likely Subtags](#Likely_Subtags) and the supplemental alias information, so that locales that CLDR considers identical are treated as such. Thus eng-Latn-GB should be mapped to en-GB, and cmn-TW mapped to zh-Hant-TW.

The initial bundle accessed during resource bundle lookup should not contain a script subtag unless, according to likely subtags, the script is required to disambiguate the locale. For example, `zh-Hant-TW` should start lookup at `zh-TW` (since `zh-TW` implies `Hant`), and `de-Latn-LI` should start at `de-LI` (since `de` implies `Latn` and `de-LI` does not have its own entry in likely subtags).
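
The rule above can be illustrated with a minimal Python sketch. It is not taken from CLDR or ICU: `LIKELY_SUBTAGS` is a hypothetical, heavily trimmed stand-in for the real likely-subtags data, the function names are invented for illustration, and the input is assumed to have the simple form `language[-script][-region]` with no variants or extensions. Real likely-subtags lookup also tries other key shapes that are omitted here.

```python
# Hypothetical, trimmed stand-in for CLDR likely-subtags data (an assumption).
# Keys are (language, region or None); values are (language, script, region).
LIKELY_SUBTAGS = {
    ("zh", None): ("zh", "Hans", "CN"),
    ("zh", "TW"): ("zh", "Hant", "TW"),
    ("de", None): ("de", "Latn", "DE"),
}


def likely_script(language, region):
    """Script implied by language_region if it has an entry, else by language alone."""
    entry = LIKELY_SUBTAGS.get((language, region)) or LIKELY_SUBTAGS.get((language, None))
    return entry[1] if entry else None


def initial_bundle(locale_id):
    """Drop the script subtag when likely subtags would restore the same script."""
    language, *rest = locale_id.split("-")
    # A four-letter subtag directly after the language is treated as the script.
    script = rest.pop(0) if rest and len(rest[0]) == 4 else None
    region = rest.pop(0) if rest else None
    if script is not None and script == likely_script(language, region):
        script = None  # the script is implied, so it does not disambiguate
    return "-".join(part for part in (language, script, region) if part)
```

With this toy table, `initial_bundle("zh-Hant-TW")` gives `zh-TW` and `initial_bundle("de-Latn-LI")` gives `de-LI`, matching the two examples in the text, while `zh-Hant-CN` keeps its script because `zh` alone implies `Hans`.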

For the purposes of CLDR, everything with the `<ldml>` DTD is treated logically as if it is one resource bundle, even if the implementation separates data into separate physical resource bundles. For example, suppose that there is a main XML file for Nama (naq), but there are no `<unit>` elements for it because the units are all inherited from root. If the `<unit>` elements are separated into a separate data tree for modularity in the implementation, the Nama `<unit>` resource bundle would be empty. However, for purposes of resource bundle lookup, the lookup still stops at naq.xml.

###### Table: <a name="Lookup-Differences" href="#Lookup-Differences">Lookup Differences</a>
@@ -1728,6 +1730,35 @@ There are certain invariants that must always be true:
4. There must never be cycles, such as: X parent of Y ... parent of X.
5. Following the inheritance path, using parentLocale where available and otherwise truncating the locale, must always lead eventually to the root locale.

#### <a name="Region_Priority_Inheritance" href="#Region_Priority_Inheritance">Region-Priority Inheritance</a>

Certain data may be more appropriate to store with the region as the primary key instead of language. This is often needed for regional user preferences, such as week info, calendar system, and measurement system. All resources matched by an entry in <a href="tr35-info.md#rgScope">&lt;rgScope&gt;</a> should use this type of inheritance.

The default search chain for region-priority inheritance removes the language subtag before the region subtag, as follows:

```
en_US_someVariant
en_US
US
001
```

Equivalently as BCP-47:

```
en-US-variant
en-US
und-US
und
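
As a rough illustration, the BCP-47 chain above could be generated as follows. This is a sketch under stated assumptions, not the CLDR or ICU implementation: the function name `region_priority_chain` is hypothetical, the input is assumed to be already normalized to `language-region[-variant...]` with no script subtag or extensions, and dropping multiple variants one at a time from the right is an assumption the text does not spell out.

```python
def region_priority_chain(locale_id):
    """Build the default region-priority search chain for a BCP-47 locale id.

    Assumes the id has the form language-region[-variant...] (no script,
    no extensions), which is a simplification for illustration.
    """
    language, region, *variants = locale_id.split("-")
    chain = []
    # Remove variants from the right, ending with plain language-region.
    for end in range(len(variants), -1, -1):
        chain.append("-".join([language, region, *variants[:end]]))
    # The language subtag is removed before the region subtag.
    chain.append(f"und-{region}")
    # Finally fall back to the root.
    chain.append("und")
    return chain
```

For `en-US-variant` this yields `en-US-variant`, `en-US`, `und-US`, `und`, the same order as the listing above.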
```

Before running region-priority inheritance, the locale should be normalized as follows:

1. If the locale contains the `-u-rg` Unicode BCP-47 locale extension, the region subtag should be set to the `-u-rg` region. For example, `en-US-u-rg-gbzzzz` should normalize to `en-GB` when running region-priority inheritance.
2. If, after performing step 1, the locale is missing the region subtag (that is, it has the form `language` or `language_script`), the region subtag should be filled in from likely subtags data. For example, `en` should become `en-US` before running region-priority inheritance.
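
The two normalization steps above might be sketched in Python like this. Everything named here is an assumption made for illustration: `LIKELY_REGION` is a hypothetical, trimmed stand-in for likely-subtags data, `normalize_for_region_priority` is an invented name, the `rg` value is assumed to have the form region code plus `zzzz`, and the sketch returns only `language-region`, dropping any script or variant subtags.

```python
# Hypothetical, trimmed stand-in for likely-subtags data: language -> likely region.
LIKELY_REGION = {"en": "US", "de": "DE"}


def is_region(subtag):
    """True for a two-letter or three-digit region subtag."""
    return (len(subtag) == 2 and subtag.isalpha()) or (len(subtag) == 3 and subtag.isdigit())


def normalize_for_region_priority(locale_id):
    """Apply the rg override, then fill in a missing region from likely subtags."""
    tag, _, extension = locale_id.partition("-u-")
    subtags = tag.split("-")
    language = subtags[0]
    region = next((s for s in subtags[1:] if is_region(s)), None)

    # Step 1: an rg keyword, if present, overrides the region subtag.
    keywords = extension.split("-")
    if "rg" in keywords[:-1]:
        value = keywords[keywords.index("rg") + 1]
        if value.endswith("zzzz"):
            region = value[:-4].upper()  # "gbzzzz" -> "GB"

    # Step 2: a locale still lacking a region takes its likely region.
    if region is None:
        region = LIKELY_REGION[language]

    return f"{language}-{region}"
```

Here `en-US-u-rg-gbzzzz` normalizes to `en-GB` and `en` to `en-US`, as in the two examples given in the steps.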

Note that region-priority inheritance does not currently make use of parent locales or territory containment, but it may in the future.

### <a name="Inheritance_and_Validity" href="#Inheritance_and_Validity">Inheritance and Validity</a>

The following describes in more detail how to determine the exact inheritance of elements, and the validity of a given element in LDML.