A curated collection of awesome papers in the field of vector search, also known as approximate nearest neighbor search (ANN search, ANNS). This repository gathers high-quality research papers, articles, and resources on the topic. Vector search is a critical component of vector databases, retrieval-augmented generation (RAG), large-scale information retrieval, recommendation systems, drug discovery, image search, and more.
First of all, what is vector search, and why is it so important in the booming age of AI?
Simple explanations:
- what-is-vector-search
- a-gentle-introduction-to-vector-search
- Explanation in Quora
- k-nn-vs-approximate-nearest-neighbors
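To make the exact versus approximate distinction concrete, here is a minimal sketch in plain Python. It is not taken from any of the linked resources or listed papers. It compares a brute-force k-NN scan with a toy single-table random-hyperplane LSH index and reports recall@k. The class name `HyperplaneLSH`, the helper names, the dataset size, and the number of hash bits are all assumptions chosen for illustration. Vectors are normalized to unit length so that Euclidean ranking agrees with the angular similarity that random hyperplanes are designed for.

```python
import heapq
import math
import random


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(data, query, k):
    """Brute force: score every vector and keep the ids of the k closest."""
    return heapq.nsmallest(k, range(len(data)), key=lambda i: euclidean(data[i], query))


class HyperplaneLSH:
    """Toy single-table LSH with random hyperplanes (hypothetical, illustrative only)."""

    def __init__(self, dim, n_bits, rng):
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = {}

    def _hash(self, v):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(sum(p * x for p, x in zip(plane, v)) >= 0.0 for plane in self.planes)

    def add(self, idx, v):
        self.buckets.setdefault(self._hash(v), []).append(idx)

    def candidates(self, query):
        # Only vectors in the query's bucket are considered.
        return self.buckets.get(self._hash(query), [])


def approx_knn(index, data, query, k):
    """Rank only the vectors that share the query's bucket."""
    return heapq.nsmallest(k, index.candidates(query), key=lambda i: euclidean(data[i], query))


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true k nearest neighbors that the approximate search found."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


rng = random.Random(0)
dim, n, k = 16, 2000, 10
data = [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)]) for _ in range(n)]
query = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])

index = HyperplaneLSH(dim, n_bits=4, rng=rng)
for i, v in enumerate(data):
    index.add(i, v)

exact_ids = exact_knn(data, query, k)
approx_ids = approx_knn(index, data, query, k)
print("candidates scanned:", len(index.candidates(query)), "of", n)
print("recall@%d: %.2f" % (k, recall_at_k(approx_ids, exact_ids)))
```

The approximate search scans only one bucket instead of the whole dataset, which is where the speedup comes from, and it may miss true neighbors that hashed to a different bucket. Production indexes such as the graph-based and quantization-based methods in the table below make this trade-off far more effectively than a single hash table.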
Applications:
Title | URL | High-Level Category | Remarks |
---|---|---|---|
RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search | Link | Graph-based | Out-of-distribution |
On Efficient Retrieval of Top Similarity Vectors | Link | MIPS | MIPS for top-1 |
In-Storage Acceleration of Graph-Traversal-Based Approximate Nearest Neighbor Search | Link | NAND-Flash acceleration | Using storage compute |
DESSERT: An Efficient Algorithm for Vector Set Search with Vector Set Queries | Link | multi-vector | |
Approximate Nearest Neighbor Search on High Dimensional Data - Experiments, Analyses, and Improvement | Link | Survey | |
Graph-based Nearest Neighbor Search: From Practice to Theory | Link | Theoretical | |
FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search | Link | Graph-based | |
HVS: hierarchical graph structure based on Voronoi diagrams for solving approximate nearest neighbor search | Link | Graph-based | |
DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node | Link | Graph-based | SSD-based |
Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs | Link | Graph-based | |
SONG: Approximate Nearest Neighbor Search on GPU | Link | Graph-based | |
Graph-based Nearest Neighbor Search: Promises and Failures | Link | Graph-based | |
Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination | Link | Graph-based | |
A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search | Link | Survey | |
Fast approximate nearest neighbor search with the navigating spreading-out graph | Link | Graph-based | |
Non-metric Similarity Graphs for Maximum Inner Product Search | Link | Graph-based | |
Understanding and Improving Proximity Graph-based Maximum Inner Product Search | Link | Graph-based | |
Learning to Route in Similarity Graphs | Link | Graph-based+DeepLearning(GCN) | |
Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data | Link | Graph-based | |
Fast Approximate Nearest Neighbor Search with a Dynamic Exploration Graph using Continuous Refinement | Link | Graph-based | |
Efficient Approximate Nearest Neighbor Search in Multi-dimensional Databases | Link | Graph-based | |
Scaling Graph-Based ANNS Algorithms to Billion-Size Datasets: A Comparative Analysis | Link | Graph-based | |
SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search | Link | Graph-Tree-based | SSD-based |
Hierarchical Clustering-Based Graphs for Large Scale Approximate Nearest Neighbor Search | Link | Graph-based | |
Fusion of graph-based indexing and product quantization for ANN search | Link | Graph-based | |
Towards Efficient Index Construction and Approximate Nearest Neighbor Search in High-Dimensional Spaces | Link | Graph-based | |
Scaling Graph-Based ANNS Algorithms to Billion-Size Datasets: A Comparative Analysis | Link | Survey | |
Automating Nearest Neighbor Search Configuration with Constrained Optimization | Link | Learning | |
Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation | Link | Graph-based | |
Norm Adjusted Proximity Graph for Fast Inner Product Retrieval | Link | Graph-based | |
On Efficient Retrieval of Top Similarity Vectors | Link | Graph-based | |
SONG: Approximate Nearest Neighbor Search on GPU | Link | GPU | |
RTNN: Accelerating Neighbor Search Using Hardware Ray Tracing | Link | GPU | |
Billion-scale similarity search with GPUs | Link | GPU | |
Fast neural ranking on bipartite graph indices | Link | Neural Rank | |
Fast Item Ranking under Neural Network based Measures | Link | Neural Rank | |
Non-metric Similarity Graphs for Maximum Inner Product Search | Link | MIPS | |
Möbius Transformation for Fast Inner Product Search on Graph | Link | MIPS | |
Understanding and Improving Proximity Graph-based Maximum Inner Product Search | Link | MIPS | |
Reinforcement Routing on Proximity Graph for Efficient Recommendation | Link | Learning | |
From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective | Link | Learning | |
Constructing Tree-based Index for Efficient and Effective Dense Retrieval | Link | Learning | |
Reverse Maximum Inner Product Search: Formulation, Algorithms, and Analysis | Link | MIPS | |
FARGO: Fast Maximum Inner Product Search via Global Multi-Probing | Link | LSH | |
SRS: Solving c-Approximate Nearest Neighbor Queries in High Dimensional Euclidean Space with a Tiny Index | Link | LSH | |
From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective | Link | LSH | |
LazyLSH: Approximate Nearest Neighbor Search for Multiple Distance Functions with a Single Index | Link | LSH | |
HD-index: pushing the scalability-accuracy boundary for approximate kNN search in high-dimensional spaces | Link | LSH | |
Falconn++: A Locality-sensitive Filtering Approach for Approximate Nearest Neighbor Search | Link | LSH | |
Deep Semantic-Preserving Ordinal Hashing for Cross-Modal Similarity Search | Link | LSH | |
Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval | Link | LSH | |
A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search | Link | Survey | |
Transformer Memory as a Differentiable Search Index | Link | Model-as-Index | |
Recommender Systems with Generative Retrieval | Link | Model-as-Index | |
Spreading Vectors for Similarity Search | Link | Learning + Dimensionality Reduction | |
Model-enhanced Vector Index | Link | Fusion Retrieval | |
GraSP: Optimizing Graph-based Nearest Neighbor Search with Subgraph Sampling and Pruning | Link | Prune edges with learning | |
Low-Precision Quantization for Efficient Nearest Neighbor Search | Link | Scalar quantization | |
Please note that some entries may require access or membership to view the full content.
We welcome contributions to expand and improve this collection. If you have any papers or resources that you believe should be included, please follow these guidelines:
- Fork the repository.
- Add your paper/resource to the appropriate category or create a new category if needed.
- Include a link to the paper/resource (if available) or any relevant information.
- Submit a pull request.
This project is released under the MIT License.