Improve Search performance by introducing per-tag index hashes #555

Difegue · 2021-11-26T13:45:00Z

I don't think search can get any faster than it currently is as long as we keep looking at every file on each search.
The simplest solution, imo, is to build index tables for each namespace. The structure for this already exists in the stats minion job; it's mostly a matter of extending it so that Search can rely on those stat tables as well.
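
As a rough illustration of the idea, the sketch below builds one index table per tag from archive metadata. It uses plain Python dicts and sets as a stand-in for Redis; `build_tag_index`, the comma-separated tag strings, and the sample archive IDs are hypothetical and not taken from the LANraragi codebase.

```python
from collections import defaultdict


def build_tag_index(archives: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercased tag (e.g. "artist:foo") to the IDs of archives carrying it.

    `archives` maps an archive ID to its comma-separated tag string
    (an assumed storage format, used here only for illustration).
    """
    index: dict[str, set[str]] = defaultdict(set)
    for arc_id, tag_string in archives.items():
        for tag in tag_string.split(","):
            tag = tag.strip().lower()
            if tag:
                index[tag].add(arc_id)
    return dict(index)


archives = {
    "id1": "artist:foo, category:doujinshi",
    "id2": "artist:bar, category:doujinshi",
    "id3": "artist:foo, category:manga",
}
index = build_tag_index(archives)
print(sorted(index["category:doujinshi"]))  # ['id1', 'id2']
```

With a table like this, a tag lookup is a single dictionary access instead of a pass over every archive.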

In case the stat tables aren't built yet (first searches), we'll probably have to fall back on the existing look-in-every-redis-key algorithm.
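
A minimal sketch of that fallback, assuming the index is either a ready dict of tag-to-ID sets or `None` while it hasn't been built. `scan_all` and `search_tag` are hypothetical names, and the in-memory `archives` dict stands in for the Redis keys.

```python
from typing import Optional


def scan_all(archives: dict[str, str], tag: str) -> set[str]:
    """Slow path: inspect every archive's tag string."""
    wanted = tag.strip().lower()
    return {
        arc_id
        for arc_id, tag_string in archives.items()
        if wanted in (t.strip().lower() for t in tag_string.split(","))
    }


def search_tag(
    archives: dict[str, str],
    tag: str,
    index: Optional[dict[str, set[str]]] = None,
) -> set[str]:
    """Use the index when it exists, otherwise fall back to a full scan."""
    if index is None:
        return scan_all(archives, tag)
    return set(index.get(tag.strip().lower(), set()))


archives = {
    "id1": "artist:foo, category:doujinshi",
    "id2": "artist:bar, category:doujinshi",
}
prebuilt = {"category:doujinshi": {"id1", "id2"}, "artist:foo": {"id1"}}

slow = search_tag(archives, "artist:foo")            # no index yet: full scan
fast = search_tag(archives, "artist:foo", prebuilt)  # index available
```

Both paths return the same result; only the amount of work differs.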

Title sorting is optimized thanks to the title sorted set; other namespaces less so.

Difegue · 2022-12-16T01:09:14Z

Some notes on the search rewrite that's currently landing (wow, it's been more than a year...):

Spaces are no longer a delimiter between tags in a search prompt; commas should now be used instead. This matches what the autocompletion box does and how metadata as a whole is entered into the app, so I don't think this will cause too much pain.
Tags will for now only be accepted as exact matches without any wildcards, e.g. category:doujinshi (or doujinshi if there's no namespace). Spaces before/after a tag will be accepted. You might have to rewrite some of your dynamic categories.
Title searches will work the same as before (quotes or $ for an exact match; ? * _ % as wildcards).
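
To make the syntax above concrete, here is a small sketch of how such a query could be tokenized. It is not the actual parser: `split_terms` and `title_pattern` are hypothetical helpers, and the sketch assumes that `?` and `_` match a single character, that `*` and `%` match any run of characters, and that "exact" means the whole title must match.

```python
import re


def split_terms(query: str) -> list[str]:
    """Split a search prompt on commas, dropping surrounding spaces and empty terms."""
    return [term.strip() for term in query.split(",") if term.strip()]


def title_pattern(term: str) -> re.Pattern:
    """Turn a title term into a case-insensitive regex (assumed semantics, see above)."""
    exact = False
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        term, exact = term[1:-1], True
    elif term.endswith("$"):
        term, exact = term[:-1], True

    parts = []
    for ch in term:
        if ch in "?_":
            parts.append(".")      # single-character wildcard
        elif ch in "*%":
            parts.append(".*")     # multi-character wildcard
        else:
            parts.append(re.escape(ch))
    body = "".join(parts)
    return re.compile(f"^{body}$" if exact else body, re.IGNORECASE)


terms = split_terms("category:doujinshi,  artist:foo ")
print(terms)  # ['category:doujinshi', 'artist:foo']
```

Because spaces no longer separate terms, a multi-word tag such as `artist:some name` survives as one term without needing quotes.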

The new system makes heavy use of Redis sets and sorted sets, which are all stored in a separate, third database.
See the DB Architecture document. (Moving the config keys to a fourth database is a pending change.)
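
The following sketch models how sets and a sorted set can combine to answer a sorted search. It is an in-memory approximation, not the real layout: Python sets stand in for Redis sets (intersection playing the role of `SINTER`), a sorted list stands in for a title sorted set, and all names and sample data are invented for illustration.

```python
# One set of archive IDs per tag (stand-in for one Redis set per tag).
tag_sets = {
    "category:doujinshi": {"id1", "id2"},
    "artist:foo": {"id1", "id3"},
}

titles = {"id1": "Banana", "id2": "apple", "id3": "Cherry"}

# Stand-in for a sorted set of titles: kept in order once, reused by every search.
title_order = sorted((title.lower(), arc_id) for arc_id, title in titles.items())


def search(tags: list[str], descending: bool = False) -> list[str]:
    """Intersect the per-tag sets, then emit matches in title order."""
    if tags:
        matches = set.intersection(*(tag_sets.get(t, set()) for t in tags))
    else:
        matches = set(titles)  # no filter: every archive matches
    ordered = [arc_id for _, arc_id in title_order if arc_id in matches]
    return ordered[::-1] if descending else ordered


print(search(["category:doujinshi"]))                # ['id2', 'id1']
print(search(["artist:foo", "category:doujinshi"]))  # ['id1']
```

Sorting by title costs nothing extra at query time because the order is precomputed; sorting by another namespace would need its own ordering step.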

Difegue · 2022-12-16T01:33:02Z

Additional comment: Search now requires the index tables to be built (this happens on app start, and they are then updated as tags/titles/etc. are modified), so large instances might take a few minutes to get a usable search again after updating.
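
A sketch of that lifecycle, assuming a full rebuild at startup followed by incremental updates whenever an archive's tags change. The `TagIndex` class and its `ready` flag are hypothetical; they only illustrate why search is unavailable until the first build finishes.

```python
class TagIndex:
    """In-memory stand-in for the index tables: full build once, then small updates."""

    def __init__(self) -> None:
        self.ready = False
        self._tags_by_id: dict[str, set[str]] = {}
        self._ids_by_tag: dict[str, set[str]] = {}

    @staticmethod
    def _parse(tag_string: str) -> set[str]:
        return {t.strip().lower() for t in tag_string.split(",") if t.strip()}

    def rebuild(self, archives: dict[str, str]) -> None:
        """Full build, e.g. on app start. Search is unusable until this completes."""
        self.ready = False
        self._tags_by_id.clear()
        self._ids_by_tag.clear()
        for arc_id, tag_string in archives.items():
            self.update(arc_id, tag_string)
        self.ready = True

    def update(self, arc_id: str, tag_string: str) -> None:
        """Incremental update: touch only the tags that were added or removed."""
        new = self._parse(tag_string)
        old = self._tags_by_id.get(arc_id, set())
        for tag in old - new:
            ids = self._ids_by_tag[tag]
            ids.discard(arc_id)
            if not ids:
                del self._ids_by_tag[tag]
        for tag in new - old:
            self._ids_by_tag.setdefault(tag, set()).add(arc_id)
        self._tags_by_id[arc_id] = new

    def lookup(self, tag: str) -> set[str]:
        if not self.ready:
            raise RuntimeError("index is still being built")
        return set(self._ids_by_tag.get(tag.strip().lower(), set()))


idx = TagIndex()
idx.rebuild({"id1": "artist:foo", "id2": "artist:foo, category:manga"})
idx.update("id1", "artist:bar")  # id1 was re-tagged
```

After the update, `idx.lookup("artist:foo")` contains only `id2`, without the rest of the index being recomputed.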

I haven't added a visual warning for this, but it'd certainly be a possibility.

Difegue added enhancement [BIG SHOT] feature aw heck that's a lotta work chief API archive index labels Nov 26, 2021

Difegue added this to the 2021 User Survey Requests milestone Nov 26, 2021

Difegue self-assigned this Nov 26, 2021

Difegue pinned this issue Nov 26, 2021

Difegue unpinned this issue Dec 9, 2021

Difegue mentioned this issue Dec 13, 2021

Total entries doesn't match the showing archives and actual archives inside the folder. #563

Closed

Difegue pinned this issue Jan 9, 2022

Difegue added a commit that referenced this issue Dec 11, 2022

(#555) Calculate tag indexes and spec out new DB layout

9bf09e1

Difegue added a commit that referenced this issue Dec 14, 2022

(#555) Rewrite search to use all our new indexes (WIP, no sort)

b2e26d6

Difegue added a commit that referenced this issue Dec 16, 2022

(#555) Add sort functionality back

af6e4d4

Difegue closed this as completed Dec 16, 2022

This was referenced Dec 16, 2022

LRR not handling exact tag namespaces when searching #444

Closed

Search function in LR doesn't return correct results; often misses entries in the filesystem #657

Closed

Difegue mentioned this issue Dec 18, 2022

Please add a grammar to search archives untagged #709

Closed

kw4s mentioned this issue Jan 1, 2023

LRR doesn't start after updating to a newest version #737

Closed

Difegue unpinned this issue Jan 6, 2023
