This repository has been archived by the owner on Apr 4, 2023. It is now read-only.
Reduce memory usage of the MatchingWords structure #708
Merged
loiclec
added
no breaking
The related changes are not breaking (DB nor API)
performance
Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption
labels
Nov 24, 2022
loiclec
commented
Nov 24, 2022
I left a few comments in the code to help with the review :))
ManyTheFish
suggested changes
Nov 24, 2022
loiclec force-pushed the cache-matching-words branch from 71bf18c to 80588da on November 28, 2022 09:27
loiclec force-pushed the cache-matching-words branch from 0049dcb to f70856b on November 28, 2022 11:55
ManyTheFish
suggested changes
Nov 28, 2022
ManyTheFish
approved these changes
Nov 30, 2022
Great!
Bors merge
Build succeeded:
Oh no! We should not merge new PRs on this main branch until we release v0.30.1 of Meilisearch and fix the current bugs!
No, it's OK, leave it as it is. I will create a custom branch and cherry-pick the commits onto it.
Pull Request
Related issue
Fixes (partially) meilisearch/meilisearch#3115
What does this PR do?
Reduces the memory usage caused by the creation of a 10-word query tree by 20x. This is done by deduplicating the MatchingWord values, which are heavy because of their inner DFA. The deduplication works by wrapping each MatchingWord in a reference-counted box and using a hash map to determine whether a MatchingWord DFA already exists for a certain signature, or whether a new one needs to be built.
Avoids the worst-case scenario of creating a MatchingWord for extremely long words that cannot be indexed by milli.
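The deduplication described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the MatchingWordCache type, its get_or_build method, the (word, typos, prefix) signature, the byte-vector stand-in for the DFA, and the 250-byte length limit are all assumptions made for the example.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical length limit (in bytes) above which a word is assumed not to
/// be indexable. The actual limit used by milli is not stated in this PR.
const HYPOTHETICAL_MAX_WORD_LENGTH: usize = 250;

/// Simplified stand-in for milli's MatchingWord. The real type holds a DFA;
/// here a plain byte buffer plays the role of the heavy part.
#[allow(dead_code)]
#[derive(Debug)]
struct MatchingWord {
    word: String,
    typos: u8,
    prefix: bool,
    dfa_stand_in: Vec<u8>,
}

impl MatchingWord {
    fn new(word: String, typos: u8, prefix: bool) -> Self {
        // Pretend that building the automaton allocates a large buffer.
        let dfa_stand_in = vec![0u8; 1024];
        MatchingWord { word, typos, prefix, dfa_stand_in }
    }
}

/// The signature that identifies a MatchingWord (an assumption for this sketch).
type Signature = (String, u8, bool);

/// Hypothetical cache handing out shared, reference-counted MatchingWord values.
#[derive(Default)]
struct MatchingWordCache {
    map: HashMap<Signature, Rc<MatchingWord>>,
}

impl MatchingWordCache {
    /// Returns a shared MatchingWord for this signature, building it only if
    /// none exists yet. Returns None for words that are too long to be indexed.
    fn get_or_build(&mut self, word: String, typos: u8, prefix: bool) -> Option<Rc<MatchingWord>> {
        if word.len() > HYPOTHETICAL_MAX_WORD_LENGTH {
            return None;
        }
        let key = (word.clone(), typos, prefix);
        let shared = self
            .map
            .entry(key)
            .or_insert_with(|| Rc::new(MatchingWord::new(word, typos, prefix)));
        // Cloning the Rc only bumps a reference count; the DFA is not copied.
        Some(Rc::clone(shared))
    }

    /// Number of distinct MatchingWord values actually built.
    fn len(&self) -> usize {
        self.map.len()
    }
}
```

With a cache like this, requesting the same signature twice yields two handles to one allocation, so a query tree that mentions the same word in many branches pays for its automaton only once.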