Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache the era code maps #4199

Merged
merged 5 commits into from
Oct 21, 2023
Merged

Conversation

Manishearth
Copy link
Member

No description provided.

sffc
sffc previously approved these changes Oct 20, 2023
.collect()
});
map.get(calendar)
.unwrap_or_else(|| panic!("Unknown calendar {}", calendar))
Copy link
Member

@sffc sffc Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Keep the error message "Era map unknown for {calendar}" to make it more greppable

("0".to_string(), tinystr!(16, "roc-inverse")),
("1".to_string(), tinystr!(16, "roc")),
fn get_era_code_map(calendar: &str) -> &'static BTreeMap<String, TinyStr16> {
static MAP: once_cell::sync::OnceCell<BTreeMap<String, BTreeMap<String, TinyStr16>>> =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't really do static caching, DatagenProviders can be mutated and are not singletons, this might bite us later. We have some caches on CldrCache, which itself is immutable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but this data is static. There's no way to affect this via changing CLDRCache

(The japanese era data used for symbols is static; it's verified by datagen but it's statically loaded)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right. Wouldn't it suffice to cache the result of crate::transform::cldr::calendar::japanese::compute_era_code_map? This is parsing the japanese file even if only gregorian is requested. Also, this doesn't need to return maps, the only call site just iterates over the pairs, so building BTreeMaps is overkill. For everything except japanese this can even be completely allocation-free if you pass a Fn((&str, TinyAsciiStr<16>)) closure, and then japanese can be cached only if it's requested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is being reused in #4192 . We might be able to clean this up, I don't want to do that now; ideally we can do a second cleanup pass once the old symbols datagen code is removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of scope of improving this, I'm trying to incrementally improve its performance given that I'm writing code that will end up calling this map 6 times as often as we do today.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, fair.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I remember now, I didn't want to allocate the small BTreeMaps a ton of times.

We can switch to a different model without the BTreeMaps but I'd like to do that later

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try a little bit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it to return an iterator

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice solution!

@Manishearth Manishearth requested a review from a team as a code owner October 20, 2023 23:31
@Manishearth Manishearth merged commit a88ee84 into unicode-org:main Oct 21, 2023
26 of 27 checks passed
@Manishearth Manishearth deleted the era-code-cache branch October 21, 2023 00:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants