Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initialize fields in constructor #396

Merged
merged 2 commits into from
Aug 5, 2022

Conversation

jlmelville
Copy link
Contributor

Fixes #393.

Testing: I ran make test and got no different results between the output in this PR and the current state of develop. That said I do see these warnings in the output (these are also present in develop):

.Adding all elements (10000)
.Running pickle tests for <hnswlib.Index(space='cosine', dim=32)>
.Running pickle tests for <hnswlib.Index(space='ip', dim=16)>
Warning: 1 labels are missing from ann results (k=25, err_thresh=5)
Warning: 1 labels are missing from ann results (k=25, err_thresh=5)
Warning: 1 labels are missing from ann results (k=25, err_thresh=5)
Warning: 1 ann distance values are different from brute-force values (total # of values=2500, dists_thresh=50)
Warning: 1 labels are missing from ann results (k=25, err_thresh=5)
Warning: 1 labels are missing from ann results (k=25, err_thresh=5)
Warning: 1 labels are missing from ann results (k=25, err_thresh=5)
Warning: 1 ann distance values are different from brute-force values (total # of values=2500, dists_thresh=50)

I am not sure if this is an actual problem or not?

Also, I confirmed that the valgrind error disappears via testing through the R bindings I maintain.

@jlmelville jlmelville changed the base branch from master to develop August 2, 2022 04:18
@yurymalkov
Copy link
Member

Cool! Thanks!

Before merging, I wonder if the constructor initialization is the preferred way (compared to C++ 11 initialization at the variable declaration)?

@jlmelville
Copy link
Contributor Author

Before merging, I wonder if the constructor initialization is the preferred way (compared to C++ 11 initialization at the variable declaration)?

I can do that. The body of the third constructor HierarchicalNSW(SpaceInterface<dist_t> *, size_t, size_t, size_t, size_t) (not changed in this PR) sets some of those fields to other values (e.g. ef_construction and M_). Would you want those fields also changed to default member initializers (presumably paying the minimal price of setting them twice) or leave those ones alone?

If we are talking constructor improvements then that third constructor would gain the most from changes, as it is the least C++ idiomatic as it sets most of its fields in the constructor body rather than using initializer lists. This PR will leave the constructor styles a bit inconsistent which might be confusing for readers. But I wanted to keep the scope of this PR as small as I could, especially as clang-tidy has a lot to say about converting hnswlib to C++11 form and it's hard to know where to stop once you start down that path...

Anyway let me know if you would like me to go ahead and use the default initialization approach (I see that clang-tidy may have missed some fields that should be default initialized).

@yurymalkov
Copy link
Member

Thanks! I am not a C++ expert, so not sure what is the best.
From my experience with other languages (java mostly) it is convenient to have declaration and default initialization at the same point of code (e.g. adding the default values in the block https://github.com/nmslib/hnswlib/blob/master/hnswlib/hnswalg.h#L88-L125 and not touching the existing constructors - they would only override the default values), so it would be also visible if some of the variables are not initialized without cross-checking. I thought that was introduced in C++11 (which is already used in hnswlib), though not sure if that is the right thing to do.

@jlmelville
Copy link
Contributor Author

@yurymalkov I have made the changes to use direct initialization for the non-class-type fields like ints and size_ts (I have also used braces for the initialization which seems to be preferred by the C++ Core Guidelines). I believe more complex member types like std::vector will be default initialized without having to do anything but I am happy to be corrected on that.

Valgrind (via R) is still happy and make test gives the same output.

@yurymalkov
Copy link
Member

Great! Thank you so much!

@yurymalkov yurymalkov merged commit fdb1632 into nmslib:develop Aug 5, 2022
@jlmelville jlmelville deleted the bug/ctor-init branch August 5, 2022 05:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

valgrind problems when searching a reloaded index
2 participants