Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to implement ordering on AnyCalendarInner? #3469

Open
sffc opened this issue Jun 1, 2023 · 3 comments
Open

How to implement ordering on AnyCalendarInner? #3469

sffc opened this issue Jun 1, 2023 · 3 comments
Labels
C-calendar Component: Calendars S-medium Size: Less than a week (larger bug fix or enhancement) T-core Type: Required functionality

Comments

@sffc
Copy link
Member

sffc commented Jun 1, 2023

There are a few general approaches:

  1. Compare calendar first, calendar date second
  2. Compare ISO date first, calendar second
  3. Compare ISO date and don't look at the calendar

Temporal uses approach (3), but it's a non-starter for the Ord impl because it must be a total order.

If we can get to (2), that's probably the best for ICU4X's purposes, but it's actually a bit tricky because there is not currently a space into which we can project the full range of dates in all calendars. fixed_date is limited to i32, and the range of years in some calendars covers a different span than the range of years in others. One way to fix that problem is to change fixed_date to be an i64, which I think is a change we should make anyway.

However, approach (1) works around both of these problems, and we can even use derive(Ord). It's maybe not that bad from a user perspective, too.

Another way to get out of this conundrum is to add different types of Ord functions like we did to Locale (#1709).

Discuss with:

Optional:

@sffc sffc added discuss Discuss at a future ICU4X-SC meeting C-calendar Component: Calendars labels Jun 1, 2023
sffc added a commit to sffc/omnicu that referenced this issue Jun 1, 2023
@robertbastian
Copy link
Member

  1. Implement correct comparisons for all combinations of calendars

@sffc sffc added discuss-triaged The stakeholders for this issue have been identified and it can be discussed out-of-band and removed C-calendar Component: Calendars labels Jun 15, 2023
@Manishearth
Copy link
Member

Manishearth commented Jun 16, 2023

I think approach (3) should be a separate function. I think we should use preferably (2) but (1) is also fine.

I think date comparison is a complex situation anyway and we should have good docs telling people what the different options are. In such a scenario derive(Ord) doesn't seem too bad.

But also, the RataDie type fixed the problem wrt bounds.

If people want to use these dates as BTreeMap keys then derive(Ord) is probably better.

@sffc
Copy link
Member Author

sffc commented Jul 6, 2023

  • @robertbastian - Do we need Ord? With PartialOrd we can just say you don't compare across calendars.
  • @sffc - We could not implement Ord and just let people compare them on their own, but if we can get an answer here, we may as well implement it.
  • @sffc - Temporal does strict equality and fuzzy (option 3) ordering.
  • @robertbastian - What do we do for equality?
  • @Manishearth - The calendars don't all consistently implement Eq. Also, if we were to do so for Japanese, two dates created with different data could not be fully equal.
  • @Manishearth - Putting a date in a map seems like a reasonable thing to do, so I would like that to work. I'm okay exposing two comparison functions.
  • @Manishearth - Note that the ISO comparison is going to be slower.

Proposals:

  1. Derive Ord; Expose the ISO comparison function 2/3 as its own function
  2. Do not derive Ord; expose both comparison functions (the ones corresponding to options 1 and 2/3). If the user wants to stick it in a BTreeMap, they need to write their own Ord.
  • @robertbastian - I don't buy the BTreeMap argument. There are many Rust types you can't put in a BTreeMap. But these are not physical dates in time; they are calendared objects.
  • @robertbastian - I'm not a big fan of exposing comparison functions. They can write them on their own by converting to ISO.
  • @sffc - That mostly works but note that not all dates can be round-tripped to ISO due to range issues.
  • @atcupps - It's not clear why we need comparison across calendars.
  • @sotam1069 - Yeah, I need more information
  • @echeran - +1
  • @Manishearth - It's not clear that a semantic comparator is necessary. The use case for having an Ord impl, which may or may not make sense semantically, is for you to put these things inside BTreeMaps. Is it actually useful to put calendared dates in a BTree? Not clear if that use case is motivated.
  • @sotam1069 - When would a BTreeMap be used?
  • @Manishearth - If you want to create a Map object, and you have various dates you've written about, you may want to map dates to strings. It makes sense to put dates as map keys. It's not clear though whether you want to put AnyCalendar dates as map keys.
  • @robertbastian - I find myself putting things in a BTreeMap and then iterating over them in semantic order. So I think it's not correct to separate Ord from semantic ordering.
  • @sffc - I think there are abolutely use cases for putting these as keys in a map. (1) Temporal needs it; they are always AnyCalendar. (2) A calendar app with events indexed in different calendar systems.
  • @robertbastian - The Eq and Ord traits are very strong and there are ways to put these in maps without having to implement those traits.
  • @sffc - If we implement Eq on individual calendars, then we can also trivially implement it on the Any calendar.
  • @Manishearth - For Japanese, there are ways to get data that can cause the inner representation to be different. The design in calendar code has been, if you have a DateTime Japanese and another DateTime Japanese, we assume they are the same thing. But in terms of calendars that load tablular data, this becomes a big deal. We have to decide when people try comparing one Islamic date to another when they are based on different data tables. We can just decide whether or not to compare the data. For example, we just assume there's only one set of data loaded in the application, and just don't try to load more than one set of data.
  • @echeran - If ECMAScript only implements dates as AnyCalendar, then that's a use case. If you have a pile of Japanese dates, it makes sense to order them. And we should then implement Ord for Any calendar dates to support ECMAScript. I hope that the calendar system combined with RataDie would be enough.
  • @robertbastian - I'm fine with implementing the Eq trait.
  • @sffc - The question of whether to compare data is really an edge case. It seems that if it's efficient, we should do it for maximal correctness. Is it slow?
  • @Manishearth - Tying Ord to ECMAScript doesn't entirely make sense; ECMAScript can have its own rules for ordering the objects, which could be different from some other standard library.
  • @echeran - Makes sense; I agree
  • @robertbastian - Japanese changes so slowly that there's a very small risk that we have 2 sets of data at runtime.
  • @Manishearth - In a HashMap, you call Hash once and Eq one or more times, but not too many. In a BTreeMap, you need to call Ord a whole lot. Hash can be very fast. For Eq, should we ignore the data? Probably.
  • @robertbastian - If we implement Eq, it should be the fast one. Hash on the calendar can return the constant; we can ignore fields.
  • @sffc - I would like equality to be strict. We can optimize cases where the data pointer is the same. Fine with Hash ignoring it.
  • @Manishearth - I'm okay with Eq iff it is optimized.

Consensus:

  1. Do not implement Ord on Date<AnyCalendar>; we are not convinced that there are sufficiently motivated use cases for a general implementation, and clients like ECMAScript can implement their own.
  2. Implement Eq and Hash on Date<AnyCalendar>, including data comparison, including on whichever internal types are required to get it to bubble up to Date<AnyCalendar>: AnyDateInner and AnyCalendar, including all DateInner types and all calendar types, with or without data. Make sure they are implemented optimally. We are assuming that we can make an efficient Eq by comparing data pointers; if not, we need to revisit the implementation with the group to decide the best path forward. OK to ignore the data in Hash. If a DateInner has a cached field, we should ignore the cache.
  3. We are okay implementing PartialOrd that returns None if the calendar types are different.

LGTM: @robertbastian @Manishearth @sffc @echeran @sotam1069 @atcupps

@sffc sffc added T-core Type: Required functionality S-medium Size: Less than a week (larger bug fix or enhancement) C-calendar Component: Calendars and removed discuss Discuss at a future ICU4X-SC meeting discuss-triaged The stakeholders for this issue have been identified and it can be discussed out-of-band labels Jul 6, 2023
@sffc sffc added this to the Priority Backlog ⟨P3⟩ milestone Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-calendar Component: Calendars S-medium Size: Less than a week (larger bug fix or enhancement) T-core Type: Required functionality
Projects
None yet
Development

No branches or pull requests

3 participants