Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure components::Bag will always generate a result #586

Open
Tracked by #645
gregtatum opened this issue Mar 29, 2021 · 10 comments
Open
Tracked by #645

Ensure components::Bag will always generate a result #586

gregtatum opened this issue Mar 29, 2021 · 10 comments
Labels
C-datetime Component: datetime, calendars, time zones help wanted Issue needs an assignee S-medium Size: Less than a week (larger bug fix or enhancement) T-core Type: Required functionality

Comments

@gregtatum
Copy link
Member

gregtatum commented Mar 29, 2021

I'm modifying this big to be a bit more subtle, which is to ensure that every components::Bag returns a result. The appendItems support is one way of doing that, but currently the components bag may not return a result. This wouldn't be compatible with ECMA-402, so we should have a solution for it. This may be appendItems, or something else.


Original discussion:

https://unicode.org/reports/tr35/tr35-dates.html#Matching_Skeletons

In case the best match does not include all the requested calendar fields, the appendItems element describes how to append needed fields to one of the existing formats. Each appendItem element covers a single calendar field. In the pattern, {0} represents the format string, {1} the data content of the field, and {2} the display name of the field (see Calendar Fields).

For the skeleton matching code, there is an appendItems list. It is used to add items to the matching patterns. This is the only way to add time zones to the patterns. However, there was some debate on the quality of results from using the appendItems list. In the 2020-02-05 ICU4X meeting, it was determined to not use this appendItems list. However, we clearly need it for time zones.

I'm labeling this issue as discussion, as we have some outstanding issues to figure out:

  • Which appendItems must we support? Time zones are the minimum as of now.
  • Which appendItems should we not support.
  • What happens with Skeleton missing fields?
    • Right now we are actively not planning on supporting appendItems, as it can generate really bad localizations.
    • In the ICU4C implementation, the skeleton matching attempts a few smart heuristics like adding the minutes field if only hour and seconds are in the skeleton. Should we determine and document if we will fill in "holes" in the components::Bag like this?
@gregtatum gregtatum added discuss Discuss at a future ICU4X-SC meeting C-datetime Component: datetime, calendars, time zones labels Mar 29, 2021
@gregtatum gregtatum self-assigned this Mar 29, 2021
@sffc
Copy link
Member

sffc commented Mar 31, 2021

I like using append items for time zones, since they are their own orthogonal component.

As far as eras, and other things like weekdays and fractional seconds... I would rather see a world where all of those patterns are already present in data, so we don't need to use append items at runtime. Either we get those patterns from translators, or we apply the append items at build time only (in the CLDR transformer) in order to make sure we cover all the key sets of fields.

@gregtatum
Copy link
Member Author

I'm trying wrap my brain around the math of potential combinations. I'm documenting it here.

Combinations for "date, time, era, timezone, weekday"

date, era
date, era, time
date, era, time, timezone
date, era, time, timezone, weekday
date, era, time, weekday
date, era, timezone
date, era, timezone, weekday
date, era, weekday
date, time
date, time, timezone
date, time, timezone, weekday
date, time, weekday
date, timezone
date, timezone, weekday
date, weekday
era, time
era, time, timezone
era, time, timezone, weekday
era, time, weekday
era, timezone
era, timezone, weekday
era, weekday
time, timezone
time, timezone, weekday
time, weekday
timezone, weekday

@gregtatum
Copy link
Member Author

Next, let's assume weekday will be included in either date or time during the initial field symbol matching.

date, era
date, era, time
date, era, time, timezone
date, era, timezone
date, time
date, time, timezone
date, timezone
era, time
era, time, timezone
era, timezone
time, timezone

@sffc
Copy link
Member

sffc commented Mar 31, 2021

Era should go inside date. It's strongly coupled with the year, like day period is strongly coupled with the hour.

I think we have either three of four orthogonal components:

  1. Date
  2. Time
  3. Time Zone
  4. Weekday (could be part of Date)

Then, within each component, we just need to exhaustively enumerate all valid sets of fields.

@sffc sffc removed the discuss Discuss at a future ICU4X-SC meeting label Apr 2, 2021
@gregtatum
Copy link
Member Author

2020-04-02: Deep dive conclusion: Overall, appendItems for every field is something we do not want to support. Instead, the API should be a clear cartesian product of available inputs. We will still support appendItems as necessary "glue patterns" for certain fields, although it may eventually be rolled up into the data provider to generate the full cartesian product.

From my own perspective, this means that we can use them as needed for things like time zone, but to keep an eye out for ways to factor them out.

@gregtatum gregtatum added the S-medium Size: Less than a week (larger bug fix or enhancement) label Apr 7, 2021
@sffc
Copy link
Member

sffc commented Apr 19, 2021

@gregtatum: Please file smaller follow-up issues to track the appendItems that we actually want to add. I am putting this issue on the v1 backlog to revisit once we have figured out the larger story for datetime skeletons.

@sffc sffc added backlog question Unresolved questions; type unclear labels Apr 19, 2021
@gregtatum
Copy link
Member Author

gregtatum commented Sep 28, 2021

Here are the current CLDR append items.

{
 "appendItems": {
   "Day": "{0} ({2}: {1})",
   "Day-Of-Week": "{0} {1}",
   "Era": "{0} {1}",
   "Hour": "{0} ({2}: {1})",
   "Minute": "{0} ({2}: {1})",
   "Month": "{0} ({2}: {1})",
   "Quarter": "{0} ({2}: {1})",
   "Second": "{0} ({2}: {1})",
   "Timezone": "{0} {1}",
   "Week": "{0} ({2}: {1})",
   "Year": "{0} {1}"
 }
}

I'm not going to file the bugs quite yet, as I'd like to go through the next deep dive. But I believe the valid list is:

  • Timezone
  • Weekday

@gregtatum gregtatum changed the title In skeleton matching, use the appendItems field where appropriate Ensure components::Bag will always generate a result Nov 18, 2021
@gregtatum
Copy link
Member Author

I didn't file follow-up issues as the new requirements for this bug are to ensure that the components::Bag always returns a result, rather than specifically implementing appendItems (which may be the actual work, depending on consensus and design.) The Ideal Components Bag #1317 may work around this issue by having higher quality results.

@gregtatum gregtatum removed their assignment Nov 18, 2021
@sffc sffc added the help wanted Issue needs an assignee label Dec 9, 2021
@sffc sffc added this to the ICU4X 1.0 milestone Apr 1, 2022
@sffc sffc added T-core Type: Required functionality and removed backlog labels Apr 1, 2022
@sffc sffc removed help wanted Issue needs an assignee question Unresolved questions; type unclear labels Apr 1, 2022
@dminor
Copy link
Contributor

dminor commented May 13, 2022

We've agreed to postpone implementing this until post ICU4X 1.0

@sffc
Copy link
Member

sffc commented May 23, 2024

With neo skeleta (#1317), the place where it seems we may still need AppendItems is when constructing patterns that are omitted from certain calendars, such as a week calendar in Chinese, for which there simply isn't data. We could also consider just falling back to Generic/Gregorian or erroring for these cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-datetime Component: datetime, calendars, time zones help wanted Issue needs an assignee S-medium Size: Less than a week (larger bug fix or enhancement) T-core Type: Required functionality
Projects
None yet
Development

No branches or pull requests

3 participants