Skip to content
This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Reapply fixes for v0.37.1 #725

Merged
merged 4 commits into from
Dec 6, 2022
Merged

Reapply fixes for v0.37.1 #725

merged 4 commits into from
Dec 6, 2022

Conversation

Kerollmops
Copy link
Member

@Kerollmops Kerollmops commented Dec 6, 2022

This PR reapplies #712, #720, #722, and #723.
We needed the #720 as it was bringing more tests and snap-related improvements.

bors bot and others added 4 commits December 6, 2022 16:11
712: Fix bulk facet indexing bug r=Kerollmops a=loiclec

# Pull Request

## Related issue
Fixes (partially, until merged into meilisearch) meilisearch/meilisearch#3165

## What does this PR do?
Fixes a bug where indexing certain numbers of filterable attribute values in bulk led to corrupted facet databases. This was due to a lossy integer conversion which would ultimately prevent entire levels of the facet database to be written into LMDB.

More specifically, this change was made:
```diff
      - if cur_writer_len as u8 >= self.min_level_size {
      + if cur_writer_len >= self.min_level_size as usize {
```
I also checked other comparisons to `min_level_size` and other conversions such as `x as u8` in this part of the codebase.



Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
722: Geosearch for zero radius r=irevoire a=amab8901

# Pull Request

## Related issue
Fixes #3167 (meilisearch/meilisearch#3167)

## What does this PR do?
- allows Geosearch with zero radius to return the specified location when the coordinates match perfectly (instead of returning nothing). See link for more details.
- new attempt on #713

## PR checklist
Please check if your PR fulfills the following requirements:
- [ X ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ X ] Have you read the contributing guidelines?
- [ X ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: amab8901 <amab8901@protonmail.com>
Co-authored-by: Tamo <irevoire@protonmail.ch>
720: Make soft deletion optional in document addition and deletion + add lots of tests r=irevoire a=loiclec

# Pull Request

## What does this PR do?
When debugging recent issues, I created a few unit tests in the hopes reproducing the bugs I was looking for. In the end, I didn't find any, but I thought it would still be good to keep those tests. 

More importantly, I added a field to the `DeleteDocuments` and `IndexDocuments` builders, called `disable_soft_deletion`. If set to `true`, the indexing/deletion will never add documents to the `soft_deleted_documents_ids` and instead perform a real deletion of the documents from the databases.

For the new tests, I have:
- Improved the insta-snapshot format of the `external_documents_ids` structure
- Added more tests for the facet DB indexing, deletion, and search algorithms, making sure to test them when the facet DB contains strings (instead of numbers) as well.
- Added more tests for the incremental indexing of the prefix proximity databases. For example, to see if documents are replaced correctly and if common prefixes are deleted correctly.
- Added tests that mix soft deletion and hard deletion, including when processing batches of document updates. 


Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
723: Fix bug in handling of soft deleted documents when updating settings r=Kerollmops a=loiclec

# Pull Request

## Related issue
Fixes (partially, until merged into meilisearch) meilisearch/meilisearch#3021

## What does this PR do?
This PR fixes the bug where a `missing key in documents database` internal error message could appear when indexing documents.

When updating the settings, before clearing the database and before creating the transform output, we now modify the `ExternalDocumentsIds` structure to get rid of all references to soft deleted document ids in its FSTs.

It used to be that updating the settings would clear the soft-deleted document ids, but keep the original `ExternalDocumentsIds` structure. As a consequence of this, when processing a future document addition, we could wrongly believe that a document was being replaced when, in fact, it was a completely new document. See the tests `bug_3021_first`, `bug_3021_second`, and `bug_3021` for a minimal test case that would have reproduced the issue.
 
We need to take special care to:
- evaluate how users should update to v0.30.1 (containing this fix): dump? reimporting all documents from scratch?
- understand IF/HOW this bug could have caused duplicate documents to be returned 
- and evaluate the correctness of the fix, of course :)


Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
@Kerollmops Kerollmops added the no breaking The related changes are not breaking (DB nor API) label Dec 6, 2022
Copy link
Member

@curquiza curquiza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok for me 👌

@curquiza
Copy link
Member

curquiza commented Dec 6, 2022

bors merge

@bors
Copy link
Contributor

bors bot commented Dec 6, 2022

@bors bors bot merged commit e1b2113 into release-v0.37.1 Dec 6, 2022
@bors bors bot deleted the reapply-fixes branch December 6, 2022 16:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
no breaking The related changes are not breaking (DB nor API)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants