Due to restrictions imposed by the company's agreement, we are unable to release the internal code for building indexes and conducting the NN search. However, we now release a unified NN search framework based on hnswlib. One advantage is that it allows a fairer comparison of search performance with hnswlib, even though the latter employs a hierarchical indexing structure.
- Supports various indexing graphs (TSDG$^+$, DPG, NSG, SSG, Vamana, and HNSW) with only one universal API.
- Fast NN search performance and a fair comparison with hnswlib.
- No changes to HNSW indexing or NN search.
- Our NN search = HNSW layer-0 NN search + random seeds (see the sketch after this list).
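
The last item above is the core idea: rather than descending the HNSW hierarchy, the framework searches only the flat base-layer (layer-0) graph and enters it from randomly sampled vertices. The sketch below is a minimal, illustrative version of that idea; it is not the code shipped in `uni_nnsearch.cpp`, and identifiers such as `search_layer0`, `num_seeds`, and `ef` are placeholders.

```cpp
// Conceptual sketch (not the repository's implementation) of
// "layer-0 NN search + random seeds": best-first search over a flat
// neighbor graph, starting from randomly chosen vertices instead of
// descending an HNSW hierarchy.
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <random>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>;           // adjacency lists of the layer-0 graph
using DistFn = float (*)(const float*, const float*, size_t);

// Returns the ids of (at most) k nearest base vectors found with beam width ef.
std::vector<int> search_layer0(const Graph& graph,
                               const std::vector<std::vector<float>>& base,
                               const float* query, size_t dim,
                               DistFn dist, size_t k, size_t ef,
                               size_t num_seeds, std::mt19937& rng) {
    using Pair = std::pair<float, int>;                 // (distance, vertex id)
    std::priority_queue<Pair, std::vector<Pair>, std::greater<Pair>> candidates;  // closest first
    std::priority_queue<Pair> results;                  // keeps the ef best found so far
    std::vector<char> visited(graph.size(), 0);

    // Random seeds replace HNSW's hierarchical entry point.
    std::uniform_int_distribution<int> pick(0, static_cast<int>(graph.size()) - 1);
    for (size_t s = 0; s < num_seeds; ++s) {
        int v = pick(rng);
        if (visited[v]) continue;
        visited[v] = 1;
        float d = dist(query, base[v].data(), dim);
        candidates.emplace(d, v);
        results.emplace(d, v);
    }

    // Standard greedy expansion on the flat graph.
    while (!candidates.empty()) {
        auto [d, v] = candidates.top();
        candidates.pop();
        if (results.size() >= ef && d > results.top().first) break;  // no closer candidate remains
        for (int u : graph[v]) {
            if (visited[u]) continue;
            visited[u] = 1;
            float du = dist(query, base[u].data(), dim);
            if (results.size() < ef || du < results.top().first) {
                candidates.emplace(du, u);
                results.emplace(du, u);
                if (results.size() > ef) results.pop();
            }
        }
    }

    // Extract the k closest ids in ascending distance order.
    std::vector<Pair> pool;
    while (!results.empty()) { pool.push_back(results.top()); results.pop(); }
    std::sort(pool.begin(), pool.end());
    std::vector<int> ids;
    for (size_t i = 0; i < pool.size() && i < k; ++i) ids.push_back(pool[i].second);
    return ids;
}
```

Because the routine never touches the upper HNSW layers, the same search loop can presumably serve any of the flat indexing graphs listed above, which is what makes a single universal API possible.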
- Please make sure Boost is installed. On Ubuntu, it can be installed by `sudo apt install libboost-all-dev`.
```bash
cd hnswlib
mkdir build && cd build
cmake .. && make -j
cd ../..
```
- Download the indexing graphs from Google Drive.
- Only SIFT1M and DEEP1M have indexing graphs for all comparison methods; the other datasets only have indexing graphs for TSDG$^+$.
- Run NN search to get the performance curve.
- This will load the indexing graph, run the NN search, and save the records.
Warning: NN search runs on a single thread; it may take several minutes.
```bash
./hnswlib/build/uni_nnsearch \
    --base_path=/home/jfwang/data/sift1m/sift1m_base.fvecs \
    --query_path=/home/jfwang/data/sift1m/sift1m_query.fvecs \
    --gt_path=/home/jfwang/data/sift1m/sift1m_gt.ivecs \
    --index_path=/home/jfwang/data/sift1m/sift1m_TSDGed_K200_SavMem_XNDesc_SCG_1.2_4_50.ivecs \
    >> ./nns_records/sift1m_TSDG_plus.record
```
- Visualization
- After all records are placed in the `./nns_records` folder, we can call `draw_fig.py` to show the curves.
```bash
python draw_fig.py
```
- We release the parameters we used for indexing.
- LSH-APG's parameters follow the original implementation, except for the searched candidate pool size, which is set to `512`. Since LSH-APG cannot be integrated into this NN search framework, its parameters are not listed here; they can be found in its literature.
- Only the `float` datatype is supported, as in hnswlib.
- Only the search performance curves in Fig. 6 - 8 of our paper can be reproduced. We have an unreleased version of the code that optimizes the memory cost during NN search.
- For the full API, please run `./hnswlib/build/uni_nnsearch -h`.
- Our API source code is located at `hnswlib/examples/cpp/uni_nnsearch.cpp` and `hnswlib/hnswlib/dataloader.h`. Have fun with it.
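
If you want to inspect the data files directly, `.fvecs` and `.ivecs` use the common TEXMEX layout: each record is a 4-byte integer count followed by that many 4-byte floats (or ints). The reader below is a self-contained sketch that assumes only this standard layout; it is not taken from `dataloader.h`.

```cpp
// Stand-alone sketch of an .fvecs reader (TEXMEX layout: per vector,
// an int32 dimension followed by `dim` float32 values). Not copied
// from dataloader.h; it only assumes the standard file format.
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::vector<float>> read_fvecs(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    std::vector<std::vector<float>> vecs;
    int32_t dim = 0;
    while (in.read(reinterpret_cast<char*>(&dim), sizeof(dim))) {
        std::vector<float> v(static_cast<size_t>(dim));
        if (!in.read(reinterpret_cast<char*>(v.data()), dim * sizeof(float)))
            throw std::runtime_error("truncated file: " + path);
        vecs.push_back(std::move(v));
    }
    return vecs;
}

// .ivecs files (e.g. the ground-truth file above) use the same layout
// with int32 payloads instead of floats.
```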