Skip to content
This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Only flatten the required objects #494

Merged
merged 3 commits into from
Apr 14, 2022
Merged

Only flatten the required objects #494

merged 3 commits into from
Apr 14, 2022

Conversation

irevoire
Copy link
Member

@irevoire irevoire commented Apr 11, 2022

Instead of flattening every object to write them in the flatenned sorter we now check if we needs to flatten the object or if we can insert it as-is.

group                                                             indexing_flatten-what-is-needed_a6031f9a    indexing_main_6b073738
-----                                                             ----------------------------------------    ----------------------
indexing/Indexing geo_point                                       1.00      25.1±0.20s        ? ?/sec         1.00      25.2±0.20s        ? ?/sec
indexing/Indexing movies in three batches                         1.01      18.3±0.12s        ? ?/sec         1.00      18.2±0.10s        ? ?/sec
indexing/Indexing movies with default settings                    1.00      17.6±0.11s        ? ?/sec         1.00      17.5±0.09s        ? ?/sec
indexing/Indexing songs in three batches with default settings    1.00      66.4±0.46s        ? ?/sec         1.03      68.3±1.01s        ? ?/sec
indexing/Indexing songs with default settings                     1.00      55.7±1.15s        ? ?/sec         1.14      63.2±0.78s        ? ?/sec
indexing/Indexing songs without any facets                        1.00      51.6±1.04s        ? ?/sec         1.16      59.6±1.00s        ? ?/sec
indexing/Indexing songs without faceted numbers                   1.00      55.3±1.09s        ? ?/sec         1.13      62.8±0.38s        ? ?/sec
indexing/Indexing wiki                                            1.00   1006.6±26.89s        ? ?/sec         1.00   1009.2±25.25s        ? ?/sec
indexing/Indexing wiki in three batches                           1.00   1140.5±11.97s        ? ?/sec         1.00    1142.0±9.97s        ? ?/sec

We now have performance similar to what we had before for the non nested datasets 🎉

@irevoire irevoire added the no breaking The related changes are not breaking (DB nor API) label Apr 12, 2022
@irevoire irevoire marked this pull request as ready for review April 12, 2022 21:24
milli/src/update/index_documents/transform.rs Outdated Show resolved Hide resolved
json-depth-checker/Cargo.toml Outdated Show resolved Hide resolved
json-depth-checker/benches/depth.rs Show resolved Hide resolved
json-depth-checker/src/lib.rs Outdated Show resolved Hide resolved
milli/src/update/index_documents/transform.rs Outdated Show resolved Hide resolved
milli/src/update/index_documents/transform.rs Show resolved Hide resolved
Kerollmops
Kerollmops previously approved these changes Apr 13, 2022
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thank you!
bors merge

bors bot added a commit that referenced this pull request Apr 13, 2022
494: Only flatten the required objects r=Kerollmops a=irevoire

Instead of flattening every object to write them in the flatenned sorter we now check if we needs to flatten the object or if we can insert it as-is.

```
group                                                             indexing_flatten-what-is-needed_a6031f9a    indexing_main_6b073738
-----                                                             ----------------------------------------    ----------------------
indexing/Indexing geo_point                                       1.00      25.1±0.20s        ? ?/sec         1.00      25.2±0.20s        ? ?/sec
indexing/Indexing movies in three batches                         1.01      18.3±0.12s        ? ?/sec         1.00      18.2±0.10s        ? ?/sec
indexing/Indexing movies with default settings                    1.00      17.6±0.11s        ? ?/sec         1.00      17.5±0.09s        ? ?/sec
indexing/Indexing songs in three batches with default settings    1.00      66.4±0.46s        ? ?/sec         1.03      68.3±1.01s        ? ?/sec
indexing/Indexing songs with default settings                     1.00      55.7±1.15s        ? ?/sec         1.14      63.2±0.78s        ? ?/sec
indexing/Indexing songs without any facets                        1.00      51.6±1.04s        ? ?/sec         1.16      59.6±1.00s        ? ?/sec
indexing/Indexing songs without faceted numbers                   1.00      55.3±1.09s        ? ?/sec         1.13      62.8±0.38s        ? ?/sec
indexing/Indexing wiki                                            1.00   1006.6±26.89s        ? ?/sec         1.00   1009.2±25.25s        ? ?/sec
indexing/Indexing wiki in three batches                           1.00   1140.5±11.97s        ? ?/sec         1.00    1142.0±9.97s        ? ?/sec
```

We now have performance similar to what we had before for the non nested datasets 🎉 

Co-authored-by: Tamo <tamo@meilisearch.com>
@bors
Copy link
Contributor

bors bot commented Apr 13, 2022

This PR was included in a batch that successfully built, but then failed to merge into main. It will not be retried.

Additional information:

{"message":"Required status check \"Specify breaking\" is expected.","documentation_url":"https://docs.github.com/articles/about-protected-branches"}

@curquiza
Copy link
Member

Could you rebase? This PR does not have the latest changes regarding the bors settings

@irevoire
Copy link
Member Author

I rebased on main

Kerollmops
Kerollmops previously approved these changes Apr 13, 2022
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors merge

bors bot added a commit that referenced this pull request Apr 13, 2022
494: Only flatten the required objects r=Kerollmops a=irevoire

Instead of flattening every object to write them in the flatenned sorter we now check if we needs to flatten the object or if we can insert it as-is.

```
group                                                             indexing_flatten-what-is-needed_a6031f9a    indexing_main_6b073738
-----                                                             ----------------------------------------    ----------------------
indexing/Indexing geo_point                                       1.00      25.1±0.20s        ? ?/sec         1.00      25.2±0.20s        ? ?/sec
indexing/Indexing movies in three batches                         1.01      18.3±0.12s        ? ?/sec         1.00      18.2±0.10s        ? ?/sec
indexing/Indexing movies with default settings                    1.00      17.6±0.11s        ? ?/sec         1.00      17.5±0.09s        ? ?/sec
indexing/Indexing songs in three batches with default settings    1.00      66.4±0.46s        ? ?/sec         1.03      68.3±1.01s        ? ?/sec
indexing/Indexing songs with default settings                     1.00      55.7±1.15s        ? ?/sec         1.14      63.2±0.78s        ? ?/sec
indexing/Indexing songs without any facets                        1.00      51.6±1.04s        ? ?/sec         1.16      59.6±1.00s        ? ?/sec
indexing/Indexing songs without faceted numbers                   1.00      55.3±1.09s        ? ?/sec         1.13      62.8±0.38s        ? ?/sec
indexing/Indexing wiki                                            1.00   1006.6±26.89s        ? ?/sec         1.00   1009.2±25.25s        ? ?/sec
indexing/Indexing wiki in three batches                           1.00   1140.5±11.97s        ? ?/sec         1.00    1142.0±9.97s        ? ?/sec
```

We now have performance similar to what we had before for the non nested datasets 🎉 

Co-authored-by: Tamo <tamo@meilisearch.com>
@bors
Copy link
Contributor

bors bot commented Apr 13, 2022

This PR was included in a batch that successfully built, but then failed to merge into main. It will not be retried.

Additional information:

{"message":"Required status check \"Specify breaking\" is expected.","documentation_url":"https://docs.github.com/articles/about-protected-branches"}

@curquiza
Copy link
Member

bors try

bors bot added a commit that referenced this pull request Apr 14, 2022
@bors
Copy link
Contributor

bors bot commented Apr 14, 2022

@irevoire
Copy link
Member Author

bors merge

bors bot added a commit that referenced this pull request Apr 14, 2022
494: Only flatten the required objects r=irevoire a=irevoire

Instead of flattening every object to write them in the flatenned sorter we now check if we needs to flatten the object or if we can insert it as-is.

```
group                                                             indexing_flatten-what-is-needed_a6031f9a    indexing_main_6b073738
-----                                                             ----------------------------------------    ----------------------
indexing/Indexing geo_point                                       1.00      25.1±0.20s        ? ?/sec         1.00      25.2±0.20s        ? ?/sec
indexing/Indexing movies in three batches                         1.01      18.3±0.12s        ? ?/sec         1.00      18.2±0.10s        ? ?/sec
indexing/Indexing movies with default settings                    1.00      17.6±0.11s        ? ?/sec         1.00      17.5±0.09s        ? ?/sec
indexing/Indexing songs in three batches with default settings    1.00      66.4±0.46s        ? ?/sec         1.03      68.3±1.01s        ? ?/sec
indexing/Indexing songs with default settings                     1.00      55.7±1.15s        ? ?/sec         1.14      63.2±0.78s        ? ?/sec
indexing/Indexing songs without any facets                        1.00      51.6±1.04s        ? ?/sec         1.16      59.6±1.00s        ? ?/sec
indexing/Indexing songs without faceted numbers                   1.00      55.3±1.09s        ? ?/sec         1.13      62.8±0.38s        ? ?/sec
indexing/Indexing wiki                                            1.00   1006.6±26.89s        ? ?/sec         1.00   1009.2±25.25s        ? ?/sec
indexing/Indexing wiki in three batches                           1.00   1140.5±11.97s        ? ?/sec         1.00    1142.0±9.97s        ? ?/sec
```

We now have performance similar to what we had before for the non nested datasets 🎉 

Co-authored-by: Tamo <tamo@meilisearch.com>
@bors
Copy link
Contributor

bors bot commented Apr 14, 2022

This PR was included in a batch that successfully built, but then failed to merge into main. It will not be retried.

Additional information:

{"message":"Required status check \"Specify breaking\" is expected.","documentation_url":"https://docs.github.com/articles/about-protected-branches"}

Copy link
Member

@curquiza curquiza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging without bors since we have issues with it and we don't know why...
The PR is up-to-date with main and all the tests have passed.
Also, the PR status check (Specify breaking) is validated as well.

@curquiza curquiza merged commit d362278 into main Apr 14, 2022
@curquiza curquiza deleted the flatten-what-is-needed branch April 14, 2022 09:43
@irevoire irevoire added the performance Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption label Jul 5, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
no breaking The related changes are not breaking (DB nor API) performance Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants