Make soft deletion optional in document addition and deletion + add lots of tests #720
Also allow disabling soft-deletion in the IndexDocumentsConfig
I wrote comments to highlight where the soft deletion changes were made, to focus the reviewing efforts on the consequential parts of the PR.
I also don't expect the reviewer to check the content of the .snap files, of course. You can trust that I have read them and didn't see anything wrong with their content :) Don't forget you can filter the files to review by their extension, and thus exclude all .snap files.
@@ -26,7 +26,6 @@ pub struct DeleteDocuments<'t, 'u, 'i> {
    index: &'i Index,
    external_documents_ids: ExternalDocumentsIds<'static>,
    to_delete_docids: RoaringBitmap,
    #[cfg(test)]
This is an important change that does not only affect tests.
@@ -88,6 +88,7 @@ pub struct IndexDocumentsConfig {
    pub words_positions_level_group_size: Option<NonZeroU32>,
    pub words_positions_min_level_size: Option<NonZeroU32>,
    pub update_method: IndexDocumentsMethod,
    pub disable_soft_deletion: bool,
This is the second important change that does not only affect the tests.
@@ -331,6 +332,7 @@ where
    // able to simply insert all the documents even if they already exist in the database.
    if !replaced_documents_ids.is_empty() {
        let mut deletion_builder = update::DeleteDocuments::new(self.wtxn, self.index)?;
        deletion_builder.disable_soft_deletion(self.config.disable_soft_deletion);
This is the only place where IndexDocumentsConfig.disable_soft_deletion is used.
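For readers without the full diff at hand, here is a minimal sketch of that forwarding. The `Toy*` types and `deletion_builder_for` are hypothetical stand-ins written for this illustration; only the field name and the `disable_soft_deletion(...)` setter call are taken from the hunk above, and the real builders carry much more state.

```rust
/// Hypothetical stand-in for `IndexDocumentsConfig`; only the new field is modeled.
#[derive(Debug, Default, Clone)]
struct ToyIndexDocumentsConfig {
    /// `false` by default here (an assumption of this sketch).
    disable_soft_deletion: bool,
}

/// Hypothetical stand-in for the `DeleteDocuments` builder.
#[derive(Debug, Default)]
struct ToyDeleteDocuments {
    disable_soft_deletion: bool,
}

impl ToyDeleteDocuments {
    /// Mirrors the setter call shown in the diff hunk.
    fn disable_soft_deletion(&mut self, disable: bool) {
        self.disable_soft_deletion = disable;
    }
}

/// Builds the deletion step used when replaced documents must be removed,
/// forwarding the flag from the indexing configuration.
fn deletion_builder_for(config: &ToyIndexDocumentsConfig) -> ToyDeleteDocuments {
    let mut deletion_builder = ToyDeleteDocuments::default();
    deletion_builder.disable_soft_deletion(config.disable_soft_deletion);
    deletion_builder
}
```

The point of the sketch is that the indexing side does not interpret the flag itself; it only passes it along to the deletion builder.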
Nice, thanks for the tests!
bors merge
@@ -80,6 +80,8 @@ impl<'t, 'e> Iterator for AscendingFacetSort<'t, 'e> {
    // that we found all the documents in the sub level iterations already,
    // we can pop this level iterator.
    if documents_ids.is_empty() {
        // break our of the for loop into the end of the 'outer loop, which
Suggested change:
- // break our of the for loop into the end of the 'outer loop, which
+ // break out of the for loop into the end of the 'outer loop, which
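As a rough illustration of the control flow that comment describes, the sketch below uses a plain `break` to leave the inner `for` loop and fall through to the end of the labeled `'outer` loop, where the current level is popped. It is a toy model, not the actual `AscendingFacetSort` code: `next_group`, the tuple layout of the stack, and the use of `BTreeSet<u32>` in place of a bitmap are all assumptions made for the example.

```rust
use std::collections::BTreeSet;

/// One stack entry: the candidates still to return, and the remaining groups of this level.
type Level = (BTreeSet<u32>, std::vec::IntoIter<BTreeSet<u32>>);

/// Returns the next non-empty group of candidate ids, in group order.
fn next_group(stack: &mut Vec<Level>) -> Option<BTreeSet<u32>> {
    'outer: loop {
        // Stop once every level has been popped.
        let (documents_ids, groups) = stack.last_mut()?;
        for group in groups {
            if documents_ids.is_empty() {
                // Every candidate was already returned: break out of the `for`
                // loop into the end of the 'outer loop, which pops this level.
                break;
            }
            let found: BTreeSet<u32> = group.intersection(&*documents_ids).copied().collect();
            if !found.is_empty() {
                for id in &found {
                    documents_ids.remove(id);
                }
                // Leave both loops at once, yielding the group.
                break 'outer Some(found);
            }
        }
        // Reached when the level is exhausted or when the `break` above fired.
        stack.pop();
    }
}
```

Because the level iterator stays on the stack between calls, each call resumes where the previous one stopped.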
720: Make soft deletion optional in document addition and deletion + add lots of tests r=irevoire a=loiclec

# Pull Request

## What does this PR do?

When debugging recent issues, I created a few unit tests in the hopes of reproducing the bugs I was looking for. In the end, I didn't find any, but I thought it would still be good to keep those tests.

More importantly, I added a field to the `DeleteDocuments` and `IndexDocuments` builders, called `disable_soft_deletion`. If set to `true`, the indexing/deletion will never add documents to the `soft_deleted_documents_ids` and instead perform a real deletion of the documents from the databases.

For the new tests, I have:
- Improved the insta-snapshot format of the `external_documents_ids` structure
- Added more tests for the facet DB indexing, deletion, and search algorithms, making sure to test them when the facet DB contains strings (instead of numbers) as well.
- Added more tests for the incremental indexing of the prefix proximity databases. For example, to see if documents are replaced correctly and if common prefixes are deleted correctly.
- Added tests that mix soft deletion and hard deletion, including when processing batches of document updates.

Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
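To make the difference between the two deletion modes concrete, here is a small self-contained sketch. It is not the crate's implementation: `ToyIndex`, its fields, and its methods are hypothetical, a `BTreeSet<u32>` stands in for the bitmap of soft-deleted ids, and only the names `disable_soft_deletion` and `soft_deleted_documents_ids` come from the description above.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy stand-in for an index; not the real `Index` type.
#[derive(Default)]
struct ToyIndex {
    /// Document payloads keyed by internal document id.
    documents: BTreeMap<u32, String>,
    /// Ids that are marked as deleted but whose data is still stored.
    soft_deleted_documents_ids: BTreeSet<u32>,
}

impl ToyIndex {
    /// Deletes the live documents among `ids` and returns how many were deleted.
    fn delete(&mut self, ids: &[u32], disable_soft_deletion: bool) -> usize {
        let mut deleted = 0;
        for &id in ids {
            let is_live = self.documents.contains_key(&id)
                && !self.soft_deleted_documents_ids.contains(&id);
            if !is_live {
                // Unknown or already soft-deleted ids are skipped (a simplification).
                continue;
            }
            deleted += 1;
            if disable_soft_deletion {
                // Hard deletion: the data is removed immediately.
                self.documents.remove(&id);
            } else {
                // Soft deletion: the data stays, the id is only marked.
                self.soft_deleted_documents_ids.insert(id);
            }
        }
        deleted
    }

    /// Ids that a search would still be allowed to return.
    fn live_ids(&self) -> Vec<u32> {
        self.documents
            .keys()
            .copied()
            .filter(|id| !self.soft_deleted_documents_ids.contains(id))
            .collect()
    }
}
```

In this model both modes hide the document from `live_ids`. They differ only in whether the stored data is removed right away, which is what the tests mixing soft and hard deletion need to exercise.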