Faster cache in find
#561
How does this compare in run time to the original? I am happy with the implementation, though the little extra complexity is a shame. @ianthomas23, do you think there is an argument for implementing dircache with dictionaries rather than lists in general? We mostly iterate over these, but also
Almost identical: it brought the 20s -> 10m regression on my use case down to about 21s. As you suggested, membership checks on a dict (or set) are very fast compared to a list once the size grows large. As mentioned I tried the Here's what the cache entries actually look like for a file:
These all look hashable which would allow converting to a and a directory:
I used |
@martindurant It looks like but there are some uses of it that are |
It is a dict of lists; I am talking about a dict of dicts.
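A minimal sketch of the two layouts being contrasted here, for illustration only. The paths, the entry fields, and the variable names are assumptions, not the actual dircache contents.

```python
# Hypothetical cache contents; paths and fields are illustrative only.
entries = [
    {"name": "bucket/dir/a.txt", "type": "file"},
    {"name": "bucket/dir/b.txt", "type": "file"},
]

# Dict of lists: one list of entry dicts per directory path.
cache_of_lists = {"bucket/dir": list(entries)}

# Dict of dicts: entries within a directory are keyed by name.
cache_of_dicts = {"bucket/dir": {e["name"]: e for e in entries}}

# Looking up an entry by name scans the list in the first layout...
in_list = any(
    e["name"] == "bucket/dir/b.txt" for e in cache_of_lists["bucket/dir"]
)
# ...but is a single hash lookup in the second.
in_dict = "bucket/dir/b.txt" in cache_of_dicts["bucket/dir"]

# Iteration, the common case, works on both layouts.
names_from_dicts = [e["name"] for e in cache_of_dicts["bucket/dir"].values()]
```

Since Python dicts preserve insertion order, iterating over `.values()` yields entries in the same order as the original list.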
I think using the name is fine. In this case, we are not doing versioned listing, which isn't available via the list API anyway.
Fixes #560.

`set` didn't immediately work because `previous` is a dict and not hashable. Converting to a `frozenset` is also not straightforward because these are nested dicts.

Couple questions (haven't even run tests yet so these may be answered by CI):

- do we need to convert this back to a `list` in `self.dircache.update(cache_entries)`? (yes, done)
- `previous["id"]` or some other/better attribute to handle versioned objects? I see `id` has this information for files, but `DIRECTORY` doesn't have an `id` attr so we would need a conditional there to use `name`.
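For readers following along, here is a rough sketch (not the code in this PR) of why `set` fails on dict entries and how keying on `name` gives fast de-duplication before converting back to a list. The entry fields and the helper `dedupe_by_name` are hypothetical.

```python
# Hypothetical listing entries; real entries carry more fields.
file_entry = {"name": "bucket/dir/a.txt", "type": "file", "size": 3}
dir_entry = {"name": "bucket/dir/sub", "type": "directory", "size": 0}

# Dicts are unhashable, so they cannot be placed in a set directly.
try:
    {file_entry}
    set_failed = False
except TypeError:
    set_failed = True


def dedupe_by_name(entries):
    """Keep the first entry seen for each name, preserving order."""
    seen = {}
    for entry in entries:
        # Average O(1) membership test, unlike `entry in some_list`.
        seen.setdefault(entry["name"], entry)
    # Convert back to a list for callers that expect one.
    return list(seen.values())


unique = dedupe_by_name([file_entry, dir_entry, dict(file_entry)])
```

Keying on `name` works for both files and directories, which avoids the conditional that `id` would require.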