Skip to content
This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Reduce memory usage of the MatchingWords structure #708

Merged
merged 7 commits into from
Nov 30, 2022
Merged

Conversation

loiclec
Copy link
Contributor

@loiclec loiclec commented Nov 24, 2022

Pull Request

Related issue

Fixes (partially) meilisearch/meilisearch#3115

What does this PR do?

  1. Reduces the memory usage caused by the creation of a 10-word query tree by 20x.
    This is done by deduplicating the MatchingWord values, which are heavy because of their inner DFA. The deduplication works by wrapping each MatchingWord in a reference-counted box and using a hash map to determine whether a MatchingWord DFA already exists for a certain signature, or whether a new one needs to be built.

  2. Avoid the worst-case scenario of creating a MatchingWord for extremely long words that cannot be indexed by milli.

@loiclec loiclec added no breaking The related changes are not breaking (DB nor API) performance Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption labels Nov 24, 2022
Copy link
Contributor Author

@loiclec loiclec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments in the code to help with the review :))

milli/src/lib.rs Show resolved Hide resolved
milli/src/search/matches/matching_words.rs Show resolved Hide resolved
milli/src/search/query_tree.rs Show resolved Hide resolved
milli/src/search/query_tree.rs Outdated Show resolved Hide resolved
milli/src/search/query_tree.rs Outdated Show resolved Hide resolved
milli/src/search/query_tree.rs Show resolved Hide resolved
milli/src/search/query_tree.rs Outdated Show resolved Hide resolved
milli/src/search/query_tree.rs Outdated Show resolved Hide resolved
milli/src/search/query_tree.rs Outdated Show resolved Hide resolved
milli/src/search/query_tree.rs Outdated Show resolved Hide resolved
milli/src/search/query_tree.rs Show resolved Hide resolved
milli/src/search/matches/matching_words.rs Show resolved Hide resolved
milli/src/search/query_tree.rs Outdated Show resolved Hide resolved
Copy link
Member

@ManyTheFish ManyTheFish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

Bors merge

@bors
Copy link
Contributor

bors bot commented Nov 30, 2022

Build succeeded:

@bors bors bot merged commit 5e754b3 into main Nov 30, 2022
@bors bors bot deleted the cache-matching-words branch November 30, 2022 18:02
@Kerollmops
Copy link
Member

Ho no! We should not merge new PRs on this main branch until we release v0.30.1 of Meilisearch and fix the current bugs!
Should we revert this PR on main @curquiza?

@curquiza
Copy link
Member

curquiza commented Dec 1, 2022

No, it's ok, let is as it is. I will create a custom branch and I will cherry pick the commits on it.
Merge everything your need to merge on main

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
no breaking The related changes are not breaking (DB nor API) performance Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants