limited effect of caching #3
The search does not fetch all package data that is visible in the details panel. With the default config, the most performant `type=suggest` API call is used to get a list of packages from the AUR.

When you navigate through the search result (package) list, a delay of 500 ms is applied before further information is requested (this only applies to AUR packages), so that there are no "unnecessary" API calls to fetch package information while you quickly move through the list. This delay is 500 ms by default and can be changed in the setting "AUR search delay (ms)".

You might ask: why not use the `type=search` API when doing the search and immediately get the information for all packages, instead of querying the package data individually? Because it only contains very limited information: Dependencies, Provides, Conflicts, etc. are not included.

Caching: once you have performed a search or navigated to another package to show its details, the data is stored in a cache. When you navigate back to a package you have already looked at, the data is retrieved from the cache instead of performing the AUR lookup again. How long this data remains in the cache before it is requested from the /rpc endpoint again can be configured in the setting `Cache expiry (m)`.
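The expiry behavior described above can be sketched as a small TTL cache. This is a minimal Python illustration, not the project's actual implementation; the class and method names are hypothetical, and the clock is injectable purely so the expiry logic can be exercised without waiting.

```python
import time


class PackageCache:
    """Toy TTL cache illustrating the "Cache expiry (m)" idea (hypothetical names)."""

    def __init__(self, expiry_minutes=10, clock=time.monotonic):
        self.expiry = expiry_minutes * 60  # setting is in minutes, compare in seconds
        self.clock = clock                 # injectable for testing
        self._store = {}

    def get(self, pkg):
        """Return cached data, or None if missing/expired (forcing a fresh /rpc lookup)."""
        entry = self._store.get(pkg)
        if entry is None:
            return None
        data, stored_at = entry
        if self.clock() - stored_at > self.expiry:
            del self._store[pkg]  # expired: caller should query the /rpc endpoint again
            return None
        return data

    def put(self, pkg, data):
        self._store[pkg] = (data, self.clock())
```

A cache miss (including an expired entry) is exactly the case where the 500 ms delayed AUR request would be made.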
Thanks for providing details. Regarding type=search: then why not use type=info in combination with type=suggest? You can put up to 5000 terms in one POST info request, and have all metadata available.
About the caching, I guess my point is that with the metadata archives available, an instantaneous lookup is possible in every case. The required metadata archive can either be retrieved at application startup, or at a given interval. Though with the size of the archive (9 MB compressed), the on-demand approach using requests might be preferable in some cases.
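The suggest-then-info combination discussed here maps onto the AUR RPC interface roughly as follows. This is a hedged sketch of request construction only (no network calls); the helper names are made up, and the batch limit is the ~5000-term figure quoted above, not something verified here.

```python
from urllib.parse import urlencode

AUR_RPC = "https://aur.archlinux.org/rpc/"


def suggest_url(term):
    # type=suggest returns a plain JSON array of matching package names
    return AUR_RPC + "?" + urlencode({"v": 5, "type": "suggest", "arg": term})


def info_post_body(names):
    # type=info takes repeated arg[] parameters; sending them as a POST body
    # (rather than in the URL) is what allows many terms in a single request
    return urlencode({"v": 5, "type": "info"}) + "".join(
        "&" + urlencode({"arg[]": n}) for n in names
    )
```

With this shape, one suggest call plus one info call covers a whole search result, which is the two-request flow the thread converges on.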
Good point. I more or less want to request / transfer information only if needed.

Regarding the metadata: I never really liked that approach because, as you mentioned, you'd transfer 9 MB of data. Even with a reasonably fast machine and internet connection, that process (download, decompress and load the data) already takes about 2 to 3 seconds. That's quite a waste of resources if you just intend to do a quick search for a single package or so (where you only need a couple of KB of data).

Actually, Manjaro does it that way with their "pamac". Once that got implemented, the traffic limit of the AUR webserver was reached within a week or so. Since then, they self-host.

IMHO the /rpc endpoint is the way to go, where you just fetch what you need.
Instead of fetching package info data when a package is highlighted, the lookup is now performed immediately after searching. Once the search result is returned, another request is made to retrieve data for ALL packages returned by the search. Technically this seems to be a better approach than firing off several queries for individual packages. See issue #3 for more information on this topic.
I ran a couple of benchmarks and it turns out that fetching the info for multiple packages at once is the better option.

Benchmarking locally (without network transfer): fetching information for 100 packages vs. 1 single package is about 10 times slower per request (1800 requests/s vs. 18000); 10 vs. 1 is 3x slower (6000 r/s vs. 18000). In terms of packages per second the batched call still wins clearly (1800 req/s × 100 packages ≈ 180,000 packages/s vs. 18,000). And server to server (2.5 GBit/s) with reverse proxy and TLS encryption, 100 vs. 1 is just about 2 times slower.

So I've now changed the implementation to immediately run the info call with all packages returned by the suggest/search call and put the results into the cache. With that, only 2 calls are performed per search, no matter how much you navigate through the result list.

Most likely I'll push another release today with the changes.
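The "2 calls per search" flow implies the whole result set fits into one info request; if a result ever exceeded the per-request term limit, it would simply be split into a few batches. A minimal sketch of that planning step (hypothetical function name; the 5000 default is the cap mentioned earlier in the thread):

```python
def plan_info_batches(search_results, batch_size=5000):
    """Group the package names from one search into info-request batches.

    With batch_size >= len(search_results) this yields a single batch,
    i.e. the whole search costs exactly two requests: one suggest + one info.
    """
    return [
        search_results[i:i + batch_size]
        for i in range(0, len(search_results), batch_size)
    ]
```

Since `type=suggest` returns at most a few dozen names, the single-batch case is the norm and the batching only matters as a safety margin.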
Awesome, thanks!
According to the README, some kind of caching is implemented for package data. My expectation would be
What I get instead is: