-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate IVF-PQ from RAFT to cuVS #86
Migrate IVF-PQ from RAFT to cuVS #86
Conversation
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
/ok to test |
Signed-off-by: Mickael Ide <mide@nvidia.com>
/ok to test |
/ok to test |
Signed-off-by: Mickael Ide <mide@nvidia.com>
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good start, but the layout and design of cuvs code is fundamentally quite different than what we've been doing in RAFT because it's not a header-only library. As such, we need to treat the src/
and include/
directories as a first-class C++ library, which means defining a very clear separation between interface and implementation. The include/
directory should contain only definitions and interfaces and all implementations should be compiled into source files in src/
. Any and all implementation headers that are not part of the user-visible public APIs belong in src/
.
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider keeping this in raft with the other vocabulary types so it can continue to be used across different libraries that use raft. Also- we can't be defining device functions in an hpp file in cuVS, since it's not header-only. The only APIs users should be interacting with in cuVS should be pre-compiled runtime APIs.
cpp/include/cuvs/core/nvtx.hpp
Outdated
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be a user-facing API and thus shouldn't be defined in include/
. We should no longer be using include/
for headers just because they need to be used across cuvs. Use src/
for those headers. Only headers in include/
should be those that we expect the user to interact with directly (thus the number of headers in include/
should be very small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having all of the declaration in src/
will lead to #include
with long relative paths that can be problematic in the future.
To avoid that we can add the src/
directory to CMake target_include_directories()
but this would lead to includes such as src/core/nvtx.hpp
. Are we fine with that or do we want to keep something like cuvs/core/nvtx.hpp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done relative paths for cagra so far and haven't seen them to be too terrible. Is ivf-pq somehow making it more challenging to work with?
From a development perspective, I generally tend to prefer the use of the relative quotationed paths for things that are local to src/, (and thus internal) rather than muddying the line between the two and making it harder to discern which things are public APIs and which aren't.
template <typename filter_t, typename = void> | ||
struct takes_three_args : std::false_type {}; | ||
template <typename filter_t> | ||
struct takes_three_args< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitino in include/
, implementation compiled into source file(s) in src/
.
cpp/src/core/bitset.cu
Outdated
return raft::make_device_vector_view<const bitset_t, index_t>(bitset_ptr_, n_elements()); | ||
} | ||
|
||
template <typename bitset_t, typename index_t> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind compiling bitset
into cuvs, but i still think much of the implementation should live in headers in raft. I think what we can do is use the using
clause in order to expose a cuvs::bitset
while still having the bitset code officially live in raft. This just makes things more flexible for users of raft.
/ok to test |
…-2406-cagra_from_raft
/ok to test |
/ok to test |
/ok to test |
/ok to test |
1 similar comment
/ok to test |
/ok to test |
2 similar comments
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
* @param[in] label the id of the target list (cluster). | ||
*/ | ||
void erase_list(raft::resources const& res, index<int64_t>* index, uint32_t label); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we are missing the reset_index
helper. That helper is needed by FAISS. It can also be moved to the index class instead of being an external helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are missing unpack_contiguous_list_data
and recompute_internal_state
in the helpers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract_centers
is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you ensure that all the helpers have been migrated? They are essential for FAISS' future dependence on cuVS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @lowener- we will need to add these in follow-on PRs. We needed to merge this PR for other reasons.
/merge |
List of changes I made during the migration: - The unit tests are testing IVFPQ codepacker and build functions. So these functions had to be exposed (through `cuvs/neighbors/details`) - The bitset filter is now located under `cuvs/neighbors/bitset_filter.cuh" - search_with_filter is added to the public API, with bitset_filter - Bitset is exposed in public API in a `.hpp` header, for inclusion in CUDA-free code. `bitset::test()` function has been temporarily disabled due to the switch from `.cuh` to `.hpp`. Authors: - Micka (https://github.com/lowener) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai/cuvs#86
List of changes I made during the migration:
cuvs/neighbors/ivf_pq_helpers.hpp
)search_with_filter
is added to the public API, with bitset_filter