Nested fields #458

irevoire · 2022-03-01T17:05:37Z

For the following document:

{
  "id": 1,
  "person": {
    "name": "tamo",
    "age": 25
  }
}

Suppose the user sets person as a filterable attribute. We obviously need to store person in the filterable attributes, but we also need to keep track of person.name and person.age somewhere.
That’s where I changed the logic of the engine a little.
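To picture where names such as person.name and person.age come from, here is a minimal sketch of a recursive flattening pass that turns nested objects into dotted keys. It is an illustration only: the `Value` enum and the `flatten` function are hypothetical stand-ins written for this example, not the API of the flatten-serde-json crate or of milli, and the real crate may treat arrays and parent keys differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified JSON-like value used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(i64),
    Text(String),
    Object(BTreeMap<String, Value>),
}

/// Walk `value` recursively and record every leaf under its dotted path.
/// `prefix` is the path accumulated so far ("" at the root).
fn flatten(prefix: &str, value: &Value, out: &mut BTreeMap<String, Value>) {
    if let Value::Object(map) = value {
        for (key, child) in map {
            // Join parent and child names with a dot, except at the root.
            let path = if prefix.is_empty() {
                key.clone()
            } else {
                format!("{prefix}.{key}")
            };
            flatten(&path, child, out);
        }
    } else {
        // Leaf value: store it under the full dotted path.
        out.insert(prefix.to_string(), value.clone());
    }
}
```

Calling `flatten("", &document, &mut out)` on the document above would fill `out` with the keys id, person.age and person.name. In this simplified version the parent key person is not emitted on its own.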

Currently, we have a function called faceted_field that returns the union of the filterable and sortable attributes.
I renamed this function to user_defined_faceted_field. Now, when we finish indexing documents, we look at all the fields and check whether they "match" a user_defined_faceted_field.
So in our case:

does id match person: 🔴
does person.name match person: 🟢
does person.age match person: 🟢

And thus, we insert in the database the following faceted fields: person, person.name, person.age.
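The matching step described above can be sketched roughly as follows. This is a simplified illustration, assuming that "match" means "is the selector itself or a dotted sub-field of it"; the names `field_matches` and `expand_faceted_fields` are hypothetical and are not taken from the milli code base.

```rust
/// Returns true when `field` is `selector` itself or one of its nested
/// sub-fields, i.e. `selector` followed by a `.` and further path segments.
fn field_matches(selector: &str, field: &str) -> bool {
    match field.strip_prefix(selector) {
        // Exact match, or the remainder starts a new path segment.
        Some(rest) => rest.is_empty() || rest.starts_with('.'),
        None => false,
    }
}

/// Keep every known field that matches at least one user-defined selector.
/// The result is what would be stored as the full list of faceted fields.
fn expand_faceted_fields<'a>(user_defined: &[&str], all_fields: &[&'a str]) -> Vec<&'a str> {
    all_fields
        .iter()
        .copied()
        .filter(|field| user_defined.iter().any(|sel| field_matches(sel, field)))
        .collect()
}
```

With the selector person and the fields id, person, person.name and person.age, this keeps everything except id. Checking for the `.` separator rather than a bare prefix avoids treating an unrelated field such as personal as a sub-field of person.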

The good thing about this solution is that we generate everything during the indexing phase, so during the search we can access our fields without recomputing much globbing.

Now the bad thing is that I had to create a new db.

And if that was only one db, that would be ok, but actually, I need to do the same for the:

Displayed attributes
Attributes to retrieve
Attributes to highlight
Attributes to crop

@Kerollmops
Do you think there is a better way to do it?
Apart from all the code, can we have a problem because we have too many dbs?

Kerollmops · 2022-03-24T14:48:26Z

Could you fix the benchmarks, please?
https://github.com/meilisearch/milli/runs/5676613548?check_suite_focus=true#step:8:248

Kerollmops

I think that declaring these new databases isn't an issue; I find this a good way to do it. I haven't finished my review and still need to go through the transform.rs file, but I prefer publishing this now rather than nothing.

milli/Cargo.toml

milli/src/lib.rs

milli/src/update/index_documents/mod.rs

milli/src/update/index_documents/transform.rs

Kerollmops

Could you please publish the benchmarks you ran recently here? Also, please remove all of the unwraps (or at least the problematic ones); this is probably something that you forgot to change 😃

milli/src/update/index_documents/transform.rs

Kerollmops · 2022-04-06T23:15:03Z

milli/src/update/index_documents/transform.rs

    /// Generate the `TransformOutput` based on the given sorter that can be generated from any
    /// format like CSV, JSON or JSON stream. This sorter must contain a key that is the document
    /// id for the user side and the value must be an obkv where keys are valid fields ids.
    pub(crate) fn output_from_sorter<F>(
        self,
        wtxn: &mut heed::RwTxn,
-        progress_callback: F,
+        _progress_callback: F,


It could probably be an issue if we no longer call this callback, don't you think? It is even easier to use it now that we only have one loop, no?

I put the call back in the loop that computes the field distribution. Not sure it's the best place, but it's something.

Maybe we should remove this callback entirely actually? Wdyt?

Yeah, remove it if you don't use it, it would be better :)

milli/src/update/index_documents/transform.rs

Kerollmops

I approve this PR, it looks good to me! Thanks for adding the flatten-serde-json crate to the repo along with the fuzzer! Now, we will see what people think about it in the RC0+ of Meilisearch 😃

bors merge

bors · 2022-04-07T17:09:28Z

Build succeeded:

487: Update version (v0.26.0) r=Kerollmops a=curquiza breaking because of #458 Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>

curquiza mentioned this pull request Mar 7, 2022

Support nested fields meilisearch/meilisearch#2211

Closed

3 tasks

irevoire force-pushed the nested_fields branch 4 times, most recently from 64a6154 to 0031e6f Compare March 17, 2022 16:13

irevoire force-pushed the nested_fields branch 3 times, most recently from 7e3811d to 1a44e33 Compare March 24, 2022 12:54

irevoire requested a review from Kerollmops April 6, 2022 16:40

irevoire marked this pull request as ready for review April 6, 2022 16:41

Kerollmops suggested changes Apr 6, 2022

irevoire added the DB breaking The related changes break the DB label Apr 6, 2022

Kerollmops suggested changes Apr 6, 2022

curquiza mentioned this pull request Apr 7, 2022

Update version (v0.26.0) #487

Merged

irevoire force-pushed the nested_fields branch from 00ea696 to 1147438 Compare April 7, 2022 14:57

irevoire added 2 commits April 7, 2022 16:58

nested fields

4f3ce6d

fix tests after rebase

ab458d8

irevoire force-pushed the nested_fields branch from 1147438 to ab458d8 Compare April 7, 2022 15:00

move the flatten-serde-json crate inside of milli

bab898c

Kerollmops approved these changes Apr 7, 2022

Kerollmops mentioned this pull request Apr 7, 2022

Nested fields meilisearch/meilisearch#2298

Merged

irevoire mentioned this pull request Apr 7, 2022

List of improvements on the nested field post v0.27.0 RC0 #488

Closed

9 tasks

bors bot merged commit 80ae020 into main Apr 7, 2022

bors bot deleted the nested_fields branch April 7, 2022 17:09

bors bot added a commit that referenced this pull request Apr 7, 2022

Merge #487

9ac2fd1

487: Update version (v0.26.0) r=Kerollmops a=curquiza breaking because of #458 Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
