-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to pre-fetching model for thread-safety #382
Conversation
@NHDaly let me know what you think of this alternate approach. |
Codecov Report
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. @@ Coverage Diff @@
## master #382 +/- ##
==========================================
- Coverage 95.58% 95.57% -0.02%
==========================================
Files 38 38
Lines 1766 1762 -4
==========================================
- Hits 1688 1684 -4
Misses 78 78
|
2d1e733
to
ec35171
Compare
Single nightly failure is:
Time to remove these tests |
@NHDaly let me know if you have any feedback |
Yeah from the description that sounds excellent! @nickrobinson251 told me that we can also construct new timezones if the user constructs a custom timezone or something? is that something we need to consider? Otherwise, if not, i FULLY agree that preloading all of these makes sense! One other possibility to add is that maybe you can make this an optional prefetch() / init() method that users can call, and they can optionally provide which timezones to load, if some users don't want to load all of them? But that seems like overkill and i think i agree this sounds great to me! I haven't looked at the code yet, but the description sounds good |
if isfile(tz_path) | ||
open(TZJFile.read, tz_path, "r")(str) | ||
elseif occursin(FIXED_TIME_ZONE_REGEX, str) | ||
tz, class = get(_TZ_CACHE, str) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused: why is there still a get() do here?
Shouldn't all of the timezones be pre-fetched now? Or do we only cache the variable time zones, not the FixedTimeZones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use get
here as we previously supported TimeZone("UTC+4")
which isn't a time zone which would be automatically added to the cache. This allows this behaviour to continue but with the caveat that the construction of this new FixedTimeZone
would have previously have been added to the cache but no longer. I very much suspect this is a non-issue as the caching would only be beneficial when calling TimeZone("UTC+4")
multiple times.
It is possible that users can create custom
That's a reasonable idea if people really want to minimize the memory footprint or initialization time of the package. Since we're talking about KB here I think we can always add this functionality in later. |
thanks, that's a helpful clarification. This looks good then! :) |
I've self-reviewed this code again and it seems like a better path forward for handling thread safety. I'm re-running the tests again here to verify things are working with the latest version of Julia and then will proceed with a new release. |
While working on #359 I took a look over our caching logic for compiled tzdata. We are currently using a cache per-thread to avoid situations where read/writes for the same entry would occur concurrently. This approach ended up being the fastest as it avoided using any locks but does introduce some memory bloat as we have a cache per-thread.
As the compiled tzdata isn't all that large it seems more sensible to load the entirety of the contents into memory upon package initialization and have all threads just read from a single cache. The situation where we would want to write to this cache is rather rare so this seems like a simpler approach. The one issue I can see is that the only other time we mutate the cache is when
TimeZones.build
is called. As this function really only used to: solve build failures or experiment with other tzdata versions it seems reasonable to mark this function as thread-unsafe and call it a day.