Could you fix the benchmarks, please?
I don't think declaring these new databases is an issue; I find this is a good way to do it. I haven't finished my review yet, I still need to go through the transform.rs file, but I prefer publishing this rather than nothing.
Could you please publish the benchmarks you ran recently here? And remove all of the `unwrap`s (or at least the problematic ones); this is probably something you forgot to change 😃
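For context, the usual fix for a problematic `unwrap` is to propagate the error with `?` instead of panicking. A generic illustration (not code from this PR; the function names are invented for the example):

```rust
use std::num::ParseIntError;

// Before: panics on bad input.
fn parse_id_panicking(raw: &str) -> u32 {
    raw.parse().unwrap()
}

// After: propagates the error to the caller with `?`.
fn parse_id(raw: &str) -> Result<u32, ParseIntError> {
    Ok(raw.parse()?)
}

fn main() {
    assert_eq!(parse_id("42"), Ok(42));
    assert!(parse_id("oops").is_err());
    // parse_id_panicking("oops") would panic instead of returning an error.
    let _ = parse_id_panicking("7");
}
```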
```diff
 /// Generate the `TransformOutput` based on the given sorter that can be generated from any
 /// format like CSV, JSON or JSON stream. This sorter must contain a key that is the document
 /// id for the user side and the value must be an obkv where keys are valid fields ids.
 pub(crate) fn output_from_sorter<F>(
     self,
     wtxn: &mut heed::RwTxn,
-    progress_callback: F,
+    _progress_callback: F,
```
It could probably be an issue if we no longer call this callback, don't you think? It is even easier to use now that we only have one loop, no?
I put the call back in the loop that computes the field distribution. Not sure it's the best place, but it's something.
Maybe we should actually remove this callback entirely? Wdyt?
Yeah, remove it if you don't use it; that would be better :)
I approve this PR, it looks good to me! Thanks for adding the `flatten-serde-json` crate to the repo along with the fuzzer! Now we will see what people think about it in the RC0+ of Meilisearch 😃
bors merge
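The kind of flattening the `flatten-serde-json` crate performs can be sketched roughly as follows. This is a simplified illustration, not the crate's actual code: a minimal hand-rolled `Value` enum stands in for `serde_json::Value` so the example has no dependencies.

```rust
use std::collections::BTreeMap;

// Minimal JSON-like value type, standing in for serde_json::Value
// (illustration only; not the flatten-serde-json crate's real code).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Num(i64),
    Str(String),
    Object(BTreeMap<String, Value>),
}

/// Recursively flatten nested objects into a single-level map whose
/// keys are the dot-joined paths of the original nested keys.
fn flatten(value: &Value) -> BTreeMap<String, Value> {
    let mut out = BTreeMap::new();
    flatten_into(None, value, &mut out);
    out
}

fn flatten_into(prefix: Option<&str>, value: &Value, out: &mut BTreeMap<String, Value>) {
    match value {
        Value::Object(map) => {
            for (key, sub) in map {
                let path = match prefix {
                    Some(p) => format!("{}.{}", p, key),
                    None => key.clone(),
                };
                flatten_into(Some(&path), sub, out);
            }
        }
        leaf => {
            // Only reached for non-object leaves; the top-level call is
            // expected to receive an object.
            out.insert(prefix.unwrap_or_default().to_owned(), leaf.clone());
        }
    }
}

fn main() {
    let mut person = BTreeMap::new();
    person.insert("name".to_owned(), Value::Str("jean".to_owned()));
    person.insert("age".to_owned(), Value::Num(42));
    let mut doc = BTreeMap::new();
    doc.insert("id".to_owned(), Value::Num(1));
    doc.insert("person".to_owned(), Value::Object(person));

    let flat = flatten(&Value::Object(doc));
    let keys: Vec<&str> = flat.keys().map(|k| k.as_str()).collect();
    // Nested keys become dot-joined paths.
    assert_eq!(keys, vec!["id", "person.age", "person.name"]);
}
```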
For the following document:
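The example document itself did not survive extraction here; a hypothetical reconstruction, with field names taken from the surrounding discussion and invented placeholder values, would be:

```json
{
  "id": 1,
  "person": {
    "name": "jean",
    "age": 42
  }
}
```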
Suppose the user sets `person` as a filterable attribute. We obviously need to store `person` in the filterable fields, but we also need to keep track of `person.name` and `person.age` somewhere. That's where I changed the logic of the engine a little bit.
Currently, we have a function called `faceted_field` that returns the union of the filterable and sortable fields. I renamed this function to `user_defined_faceted_field`. And now, when we finish indexing documents, we look at all the fields and see if they « match » a `user_defined_faceted_field`. So in our case:

- `id` matches `person`: 🔴
- `person.name` matches `person`: 🟢
- `person.age` matches `person`: 🟢

And thus, we insert the following faceted fields in the database: `person`, `person.name`, `person.age`.

The good thing about that solution is that we generate everything during the indexing phase, and then during the search we can access our fields without recomputing too much globbing.
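The « match » rule above boils down to a path-prefix check. A simplified sketch of the idea (not the actual engine code):

```rust
/// Returns true when `field` is the user-defined faceted field itself
/// or one of its nested sub-fields (dot-separated path).
fn matches(user_defined: &str, field: &str) -> bool {
    field == user_defined
        || (field.starts_with(user_defined)
            && field[user_defined.len()..].starts_with('.'))
}

fn main() {
    assert!(!matches("person", "id"));         // 🔴
    assert!(matches("person", "person.name")); // 🟢
    assert!(matches("person", "person.age"));  // 🟢
    assert!(matches("person", "person"));      // the field itself matches too
    assert!(!matches("person", "personal"));   // no false positive on plain string prefixes
}
```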
Now, the bad thing is that I had to create a new db. If it was only one db, that would be OK, but I actually need to do the same for the:
@Kerollmops
Do you think there is a better way to do it?
Apart from the code itself, could having this many dbs cause a problem?