-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Is NGT::Index::search thread-safe? #117
Comments
NGT::Index::search is thread-safe. There are cases where search returns different results for the same query, because search uses randomly select nodes from the leaf of the tree as seeds to explore the graph. However, the possibility is quite low. Also the possibility depends on the indexed dataset. I would like to confirm that you don't find the issue when you don't use OpenMP. If so, could you provide your simplified search source code and your NGT index? I will check them. |
Thanks! I have not been able to reproduce the issue without OpenMP, but it's possible that I'm missing something. My simplified search code looks like: struct Edge {
std::size_t id;
std::size_t neighborId;
float distance;
};
#pragma omp declare reduction (merge: std::vector<Edge> : omp_out.insert(omp_out.end(), std::make_move_iterator(omp_in.begin()), std::make_move_iterator(omp_in.end())))
#pragma omp parallel default(none) \
shared(edges, points, numPoints, radius, indexPath)
{
NGT::Index localIndex(indexPath, true);
#pragma omp for nowait reduction(merge : edges)
for (std::size_t i = 0; i < points.size(); ++i) {
const auto &point = points[i];
NGT::SearchQuery q(point);
NGT::ObjectDistances neighbors;
q.setResults(&neighbors);
// To ensure that we get everything within range
q.setSize(numPoints);
q.setRadius(radius);
// NOTE: Including this critical section removes the bug behavior in my testing - it just also removes the parallelism
// #pragma omp critical
// {
localIndex.search(q);
// }
for (const auto &neighbor : neighbors) {
// NGT indices start at 1; ours start at 0
const auto neighborId = neighbor.id - 1;
if (idx != neighborId) {
edges.emplace_back(Edge{idx, neighborId, neighbor.distance});
}
}
}
}
I can also provide generation code to produce The index: ngt_index.tar.gz |
Thank you for providing your code and index. However, I could not reproduce differences between parallelism and non-parallelism even by using points extracted from your index. It might depend on the environment. In addition, I am wondering if emplace_back is thread-safe, even though there is no problem on my environment. Apart from this issue, the opened NGT index placed outside of the parallel section can be shared. |
Interesting! Thank you for trying to reproduce the issue. The |
First off, thank you for NGT! It's quite impressive.
I'm trying to parallelize a large number of queries with OpenMP. My parallelized loop calls
search()
on a read-only index, then further processes the results. However, I seem to get nondeterministic results from this, and I'm wondering if the problem is that this method is not thread-safe.I should also note that I've tried both with and without the shared memory allocator - I did not observe a difference in behavior.
The text was updated successfully, but these errors were encountered: