Lucene CuVS Integration

This repo shows an end-to-end way to get started with GPU-accelerated vector search. It combines CuVS (formerly RAFT) from NVIDIA's RAPIDS AI, NVIDIA's brev.dev for one-click GPU deployment, Apache Lucene, and Trace Machina's NativeLink, which lets users target specialized hardware.

Tutorial

This is a quick tutorial to get started with GPU-accelerated Lucene.

Pre-requisites:

a brev.dev account with credits for GPUs (let me know if you need help)
a GitHub account

Bootstrap Brev's NVIDIA GPUs

Log into your brev.dev account.
Select "New" in the upper-right
Click "Container Mode" (in production, please use VM mode)
Scroll down to the bottom right and select NVIDIA Rapids
Select the NVIDIA T4. It's plenty. A similarly sized instance from another manufacturer would get smoked in this evaluation.
Deploy
Click this link to get your instance in the browser
(there are other ways but this is the easiest and the least amount of drawing I need to do to obscure details)
Look down and click the "Terminal" option

Set Up the Instance

Sorry, but we need to set up Java. 😅

curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install
# The installer prints a command to refresh your shell; run it before continuing.
git clone https://github.com/TraceMachina/lucene-on-gpu
cd lucene-on-gpu
nix develop
bazel run lucene

Again, text me at 510-495-5257 if you have any questions.
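
If one of the setup commands above fails with "command not found", a small pre-flight check can narrow down which tool is missing. The following is a minimal sketch, not part of the repo: require_tools is a hypothetical helper name, and which tools you pass to it is up to you.

```shell
#!/bin/sh
# require_tools TOOL...
# Prints "missing: TOOL" to stderr for every command not found on PATH.
# Returns 0 if all tools are present, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_tools git curl` before installing Nix, and `require_tools nix bazel` once you are inside nix develop. Adding nvidia-smi to the list assumes the instance image exposes the NVIDIA driver tools on PATH.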

🏛️ Architecture

As an initial integration, the CuVS library is plugged in as an IndexSearcher. This project has two layers:

Java/JNI layer in the lucene directory.
CuVS/C++ layer in the cuda directory.

By way of a working example, OpenAI's Wikipedia corpus (25k documents) can be indexed, each document having a content vector. A provided sample query (query.txt) can be executed after the indexing.

Caution

This is not production ready yet.

🚀 Benchmarks

Wikipedia (768 dimensions, 1M vectors):

Configuration	Indexing	Indexing improvement	Search	Search improvement
CuVS (RTX 4090, NN_DESCENT)	38.80 sec	25.6x	2 ms	4x
CuVS (RTX 2080 Ti, NN_DESCENT)	47.67 sec	20.8x	3 ms	2.7x
Lucene HNSW (Ryzen 7700X, single thread)	992.37 sec	-	8 ms	-

Wikipedia (2048 dimensions, 1M vectors):

Configuration	Indexing	Indexing improvement
CuVS (RTX 4090, NN_DESCENT)	55.84 sec	23.8x
Lucene HNSW (Ryzen 7950X, single thread)	1329.9 sec	-
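
The improvement factors in these tables appear to be the Lucene HNSW baseline time divided by the CuVS time. The sketch below recomputes them from the reported numbers; speedup is a hypothetical helper name, and rounding to one decimal place is an assumption chosen to match the tables.

```shell
#!/bin/sh
# speedup BASELINE ACCELERATED
# Prints BASELINE / ACCELERATED rounded to one decimal place.
speedup() {
  awk -v base="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", base / fast }'
}

speedup 992.37 38.80   # 768 dimensions, indexing, RTX 4090 vs. Lucene HNSW
speedup 992.37 47.67   # 768 dimensions, indexing, RTX 2080 Ti vs. Lucene HNSW
speedup 1329.9 55.84   # 2048 dimensions, indexing, RTX 4090 vs. Lucene HNSW
```

The three calls print 25.6, 20.8 and 23.8, which agrees with the indexing figures listed above.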

❄️ Setup

Install Nix and enable flake support. The new experimental nix installer does this for you: https://github.com/NixOS/experimental-nix-installer

Install direnv and add the direnv hook to your .bashrc:

nix profile install nixpkgs#direnv

# For hooks into shells other than bash see https://direnv.net/docs/hook.html.
echo 'eval "$(direnv hook bash)"' >> ~/.bashrc

source ~/.bashrc

Now clone the repository, cd into it, and run direnv allow:

git clone git@github.com:TraceMachina/lucene-on-gpu.git
cd lucene-on-gpu
direnv allow

Note

If you don't want to use direnv, you can run nix develop manually, which is the command that direnv would otherwise call for you.

Now run the example:

bazel run lucene

The above command will fetch the dataset, build the CUDA and Java code and run a script which invokes a search on the dataset:

Dataset file used is: /xxx/external/_main~_repo_rules~dataset/file/dataset.zip
Index of vector field is: 5
Name of the vector field is: content_vector
Number of documents to be indexed are: 25000
Number of dimensions are: 768
Query file used is: /xxx/lucene-cuvs/lucene/query.txt
May 30, 2024 3:25:14 PM org.apache.lucene.internal.vectorization.VectorizationProvider lookup
INFO: Java vector incubator API enabled; uses preferredBitSize=256; FMA enabled
5000 docs indexed ...
10000 docs indexed ...
15000 docs indexed ...
20000 docs indexed ...
25000 docs indexed ...
Time taken for index building (end to end): 48656
Time taken for copying data from IndexReader to arrays for C++: 154
CUDA devices: 1
Data copying time (CPU to GPU): 87
[I] [15:27:45.929751] optimizing graph
[I] [15:27:47.860877] Graph optimized, creating index
Cagra Index building time: 104303
[I] [15:27:47.896854] Saving CAGRA index with dataset
Time taken for index building: 104722
Time taken for cagra::search: 7
Time taken for searching (end to end): 8
Found 5 hits.
DocID: 1461, Score: 0.12764463
DocID: 1472, Score: 0.16027361
DocID: 4668, Score: 0.16650483
DocID: 1498, Score: 0.1781094
DocID: 1475, Score: 0.18247437
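
The timing lines in this output are interleaved with progress and logger messages. To compare runs, it can help to pull out just the phase timings. The following is a minimal sketch: phase_times is a hypothetical helper, it assumes every timing line contains the word "time" and ends in ": <integer>" as in the output above, and the units are not stated in the log.

```shell
#!/bin/sh
# phase_times: read log text on stdin and print "label<TAB>value" for each
# timing line, e.g. "Cagra Index building time: 104303".
# Lines that do not end in ": <integer>" are ignored.
phase_times() {
  awk -F': ' '/[Tt]ime.*: [0-9]+$/ { printf "%s\t%s\n", $1, $NF }'
}
```

One way to use it is `bazel run lucene 2>&1 | tee run.log | phase_times`, where run.log is an arbitrary file name for keeping the full output.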

🥾 Next steps

Instead of extending the IndexSearcher, create a KnnVectorFormat and corresponding KnnVectorsWriter and KnnVectorsReader for tighter integration.

🌱 Contributors

Aaron Siddhartha Mondal, Trace Machina
Blake Hatch, Trace Machina
Marcus Eagan, Trace Machina & Committer, Apache Solr (Later Heavy Lifting)
Dr. Andrew Shipley, Trace Machina
Tim Potter, Trace Machina & Committer, Apache Lucene & Solr
Vivek Narang, SearchScale
Ishan Chattopadhyaya, SearchScale & Committer, Apache Lucene & Solr
Corey Nolet, NVIDIA
Puneet Ahuja, SearchScale
Kishore Angani, SearchScale
Noble Paul, SearchScale & Committer, Apache Lucene & Solr
