Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace deleted elements at addition #418

Merged
merged 28 commits into from
Jan 12, 2023
Merged

Conversation

dyashuni
Copy link
Contributor

@dyashuni dyashuni commented Sep 20, 2022

Added a new method to replace a deleted element when a new one is added:
labeltype addPointToVacantPlace(const void* data_point, labeltype label)

The method is the same as addPoint but allows to replace a deleted element with a new one to save memory. Note: addPointToVacantPlace is much slower than addPoint, because replacement is a heavy operation.

Edit:

  • Added replace a deleted element when a new one is added
  • Fix of deadlock when add duplicated labels by multithreads #420 by introducing locks by label
  • Added a multi thread test to test against concurrent operations with the same labels
  • Added timeout to test jobs in actions in case if we have deadlock
  • Added locks by label
  • Removed python 3.6 tests as it is not available in Ubuntu 22.04
  • Fix multithread update of elements
  • Update readme and refactoring

@dyashuni dyashuni changed the title Replace deleted elements at insertion Replace deleted elements at addition Sep 20, 2022
Copy link
Member

@yurymalkov yurymalkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the code is not thread safe.

I guess we should write parallel stress tests to find those bugs.

hnswlib/hnswalg.h Outdated Show resolved Hide resolved
hnswlib/hnswalg.h Outdated Show resolved Hide resolved
hnswlib/hnswalg.h Outdated Show resolved Hide resolved
hnswlib/hnswalg.h Outdated Show resolved Hide resolved
Copy link
Member

@yurymalkov yurymalkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!
I think the main thing to consider is the interface - using flag instead of separate method.

hnswlib/hnswalg.h Show resolved Hide resolved
hnswlib/hnswalg.h Outdated Show resolved Hide resolved
static const unsigned char DELETE_MARK = 0x01;

size_t max_elements_{0};
size_t cur_element_count{0};
mutable std::atomic<size_t> cur_element_count{0}; // current number of elements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it atomic, shouldn't it be locked all the times by external locks?

hnswlib/hnswalg.h Outdated Show resolved Hide resolved
Copy link
Member

@yurymalkov yurymalkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the parameter in the index should be renamed as there is a name collision. Maybe also run the performance regression tests. Otherwise looks great to me! Thank you!

.github/workflows/build.yml Show resolved Hide resolved
hnswlib/hnswalg.h Outdated Show resolved Hide resolved
@dyashuni dyashuni merged commit 3e006ea into nmslib:develop Jan 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants