Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate IVF-PQ from RAFT to cuVS #86

Merged
merged 47 commits into from
May 10, 2024

Conversation

lowener
Copy link
Contributor

@lowener lowener commented Apr 19, 2024

List of changes I made during the migration:

  • The unit tests are testing IVFPQ codepacker and build functions. So these functions had to be exposed (through cuvs/neighbors/ivf_pq_helpers.hpp)
  • search_with_filter is added to the public API, with bitset_filter

lowener added 9 commits April 11, 2024 11:16
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>
@lowener lowener requested review from a team as code owners April 19, 2024 14:28
Copy link

copy-pr-bot bot commented Apr 19, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@lowener lowener added feature request New feature or request non-breaking Introduces a non-breaking change labels Apr 19, 2024
@lowener
Copy link
Contributor Author

lowener commented Apr 19, 2024

/ok to test

Signed-off-by: Mickael Ide <mide@nvidia.com>
@lowener
Copy link
Contributor Author

lowener commented Apr 23, 2024

/ok to test

@lowener
Copy link
Contributor Author

lowener commented Apr 23, 2024

/ok to test

Signed-off-by: Mickael Ide <mide@nvidia.com>
@lowener
Copy link
Contributor Author

lowener commented Apr 23, 2024

/ok to test

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good start, but the layout and design of cuvs code is fundamentally quite different than what we've been doing in RAFT because it's not a header-only library. As such, we need to treat the src/ and include/ directories as a first-class C++ library, which means defining a very clear separation between interface and implementation. The include/ directory should contain only definitions and interfaces and all implementations should be compiled into source files in src/. Any and all implementation headers that are not part of the user-visible public APIs belong in src/.

*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider keeping this in raft with the other vocabulary types so it can continue to be used across different libraries that use raft. Also- we can't be defining device functions in an hpp file in cuVS, since it's not header-only. The only APIs users should be interacting with in cuVS should be pre-compiled runtime APIs.

* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a user-facing API and thus shouldn't be defined in include/. We should no longer be using include/ for headers just because they need to be used across cuvs. Use src/ for those headers. Only headers in include/ should be those that we expect the user to interact with directly (thus the number of headers in include/ should be very small.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having all of the declaration in src/ will lead to #include with long relative paths that can be problematic in the future.
To avoid that we can add the src/ directory to CMake target_include_directories() but this would lead to includes such as src/core/nvtx.hpp. Are we fine with that or do we want to keep something like cuvs/core/nvtx.hpp?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done relative paths for cagra so far and haven't seen them to be too terrible. Is ivf-pq somehow making it more challenging to work with?

From a development perspective, I generally tend to prefer the use of the relative quotationed paths for things that are local to src/, (and thus internal) rather than muddying the line between the two and making it harder to discern which things are public APIs and which aren't.

cpp/include/cuvs/neighbors/detail/ivf_common.cuh Outdated Show resolved Hide resolved
cpp/include/cuvs/neighbors/ivf_list.hpp Outdated Show resolved Hide resolved
cpp/include/cuvs/neighbors/ivf_pq.hpp Show resolved Hide resolved
cpp/include/cuvs/neighbors/ivf_pq_helpers.cuh Outdated Show resolved Hide resolved
template <typename filter_t, typename = void>
struct takes_three_args : std::false_type {};
template <typename filter_t>
struct takes_three_args<
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitino in include/, implementation compiled into source file(s) in src/.

return raft::make_device_vector_view<const bitset_t, index_t>(bitset_ptr_, n_elements());
}

template <typename bitset_t, typename index_t>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind compiling bitset into cuvs, but i still think much of the implementation should live in headers in raft. I think what we can do is use the using clause in order to expose a cuvs::bitset while still having the bitset code officially live in raft. This just makes things more flexible for users of raft.

@lowener
Copy link
Contributor Author

lowener commented May 8, 2024

/ok to test

@cjnolet
Copy link
Member

cjnolet commented May 9, 2024

/ok to test

@cjnolet
Copy link
Member

cjnolet commented May 9, 2024

/ok to test

@github-actions github-actions bot added the Python label May 9, 2024
@cjnolet
Copy link
Member

cjnolet commented May 9, 2024

/ok to test

@cjnolet
Copy link
Member

cjnolet commented May 9, 2024

/ok to test

1 similar comment
@cjnolet
Copy link
Member

cjnolet commented May 9, 2024

/ok to test

@lowener
Copy link
Contributor Author

lowener commented May 10, 2024

/ok to test

2 similar comments
@lowener
Copy link
Contributor Author

lowener commented May 10, 2024

/ok to test

@cjnolet
Copy link
Member

cjnolet commented May 10, 2024

/ok to test

@cjnolet
Copy link
Member

cjnolet commented May 10, 2024

/ok to test

@cjnolet
Copy link
Member

cjnolet commented May 10, 2024

/ok to test

@cjnolet
Copy link
Member

cjnolet commented May 10, 2024

/ok to test

* @param[in] label the id of the target list (cluster).
*/
void erase_list(raft::resources const& res, index<int64_t>* index, uint32_t label);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are missing the reset_index helper. That helper is needed by FAISS. It can also be moved to the index class instead of being an external helper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are missing unpack_contiguous_list_data and recompute_internal_state in the helpers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract_centers is missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you ensure that all the helpers have been migrated? They are essential for FAISS' future dependence on cuVS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @lowener- we will need to add these in follow-on PRs. We needed to merge this PR for other reasons.

@cjnolet
Copy link
Member

cjnolet commented May 10, 2024

/merge

@rapids-bot rapids-bot bot merged commit 22f107c into rapidsai:branch-24.06 May 10, 2024
54 checks passed
@lowener lowener deleted the 24.06-ivfpq-migration branch May 14, 2024 11:11
difyrrwrzd added a commit to difyrrwrzd/cuvs that referenced this pull request Aug 10, 2024
List of changes I made during the migration:
- The unit tests are testing IVFPQ codepacker and build functions. So these functions had to be exposed (through `cuvs/neighbors/details`)
- The bitset filter is now located under `cuvs/neighbors/bitset_filter.cuh"
- search_with_filter is added to the public API, with bitset_filter
- Bitset is exposed in public API in a `.hpp` header, for inclusion in CUDA-free code. `bitset::test()` function has been temporarily disabled due to the switch from `.cuh` to `.hpp`.

Authors:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai/cuvs#86
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cpp feature request New feature or request non-breaking Introduces a non-breaking change Python
Projects
Development

Successfully merging this pull request may close these issues.

3 participants