Parallelize Vamana indexing #2
Comments
How does DiskANN deal with it? First, it updates p's out edges under a lock. Second, at the end of robust prune (in
This is actually not a complete solution!
Delaying pruning of back-edge nodes can be a good solution for annindex. Need to be careful not to let them grow too far though.
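One way to picture the delayed-pruning idea is the sketch below. It is only an illustration under assumptions: the names `add_back_edge`, `finalize`, and the `slack` factor are hypothetical, and `prune_to_nearest` is a simple stand-in for robust prune, not the annindex or DiskANN implementation. A back-edge is appended without pruning until the node's out-degree exceeds `R * slack`, which bounds how far the list can grow; a final pass brings every node back to at most `R`. The sketch is single-threaded and shows only the degree-cap policy; in a multi-threaded build the append would still need synchronization.

```python
import math


def prune_to_nearest(points, q, neighbours, R):
    """Stand-in for robust prune (assumption): keep the R neighbours closest to q."""
    unique = set(neighbours) - {q}
    return sorted(unique, key=lambda i: (math.dist(points[i], points[q]), i))[:R]


def add_back_edge(points, graph, q, p, R, slack=2.0):
    """Add the back-edge q -> p, pruning only once q exceeds the slack cap.

    Returns True if a prune was triggered, False otherwise.
    """
    if p == q or p in graph[q]:
        return False
    graph[q].append(p)
    if len(graph[q]) > int(R * slack):
        graph[q] = prune_to_nearest(points, q, graph[q], R)
        return True
    return False


def finalize(points, graph, R):
    """Final pass: prune every node that is still above the degree bound R."""
    for q in graph:
        if len(graph[q]) > R:
            graph[q] = prune_to_nearest(points, q, graph[q], R)


# Usage: six points on a line; node 0 receives back-edges from nodes 1..5.
points = [(float(i),) for i in range(6)]
graph = {i: [] for i in range(6)}
pruned = [add_back_edge(points, graph, 0, p, R=2) for p in range(1, 6)]
```

With `R=2` and the default `slack=2.0`, node 0 is allowed to hold up to four neighbours, so only the fifth back-edge triggers a prune. A larger `slack` means fewer prunes during the build but more memory and longer adjacency lists during search.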
Another alternative is to do that part of building the index in one core, perhaps using minibatches whose size depends on the number of cores:
Another option: build and merge, similar to what DiskANN does for large collections that don't fit in memory.
incorrect tag in comment :(
See also the ParlayANN paper at https://dl.acm.org/doi/abs/10.1145/3627535.3638475
Parallelize Vamana indexing over multiple cores.
Superficially, it should be "easy". Indexing simply means iterating over all points p and (1) doing a greedy search for p, and (2) running robust prune: update p's out-neighbours using the nodes visited during the search, and create the corresponding back-edges to p.
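For reference, the sequential loop described above can be sketched as follows. This is a minimal single-threaded illustration in plain Python, not the project's code: the function names, the parameters `R` (degree bound), `L` (search list size), and `alpha`, the random initial graph, and the centroid-based start node are all assumptions made for the example.

```python
import math
import random


def greedy_search(points, graph, start, query, L):
    """Beam search from start towards query; returns the set of visited nodes."""
    def key(i):
        return (math.dist(points[i], query), i)

    candidates = {start}
    visited = set()
    while candidates - visited:
        p = min(candidates - visited, key=key)
        visited.add(p)
        candidates.update(graph[p])
        if len(candidates) > L:
            # Keep only the L candidates closest to the query.
            candidates = set(sorted(candidates, key=key)[:L])
    return visited


def robust_prune(points, graph, p, candidates, alpha, R):
    """Replace p's out-neighbours with at most R diverse nearby candidates."""
    pool = (set(candidates) | set(graph[p])) - {p}
    new_out = []
    while pool and len(new_out) < R:
        best = min(pool, key=lambda i: (math.dist(points[i], points[p]), i))
        new_out.append(best)
        # Drop candidates that are better reached through best than from p.
        pool = {c for c in pool
                if alpha * math.dist(points[best], points[c])
                > math.dist(points[p], points[c])}
    graph[p] = new_out


def build(points, R=4, L=8, alpha=1.2, seed=0):
    """Sequential build: one greedy search and one robust prune per point."""
    rng = random.Random(seed)
    n = len(points)
    # Assumption: start from a random graph with out-degree min(R, n - 1).
    graph = {i: rng.sample([j for j in range(n) if j != i], min(R, n - 1))
             for i in range(n)}
    # Assumption: use the point nearest the centroid as the search entry point.
    dim = len(points[0])
    centroid = [sum(pt[d] for pt in points) / n for d in range(dim)]
    start = min(range(n), key=lambda i: math.dist(points[i], centroid))

    order = list(range(n))
    rng.shuffle(order)
    for p in order:
        visited = greedy_search(points, graph, start, points[p], L)
        robust_prune(points, graph, p, visited, alpha, R)
        for q in graph[p]:  # create back-edges q -> p
            if p in graph[q]:
                continue
            if len(graph[q]) + 1 > R:
                robust_prune(points, graph, q, graph[q] + [p], alpha, R)
            else:
                graph[q].append(p)
    return graph, start
```

Note that each iteration writes to `graph[p]` and also to `graph[q]` for every new out-neighbour `q`, while the greedy search reads arbitrary adjacency lists. Those shared reads and writes are where a naive parallel loop over `p` would conflict.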
In reality, this is more challenging, for several reasons:
We must be careful with tweaks, but something should be done, as indexing is currently not fast enough.