You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
IVF index is a partition type index, which consists of a set of inverted buckets. In the retrieval stage, an inverted bucket with a certain amount of data is selected, and then scanned in the bucket to obtain several candidate points that finally meet the nearest neighbors.
Compared with HNSW type algorithms, IVF often requires a certain amount of data for training bucketing. The usual bucketing method is to use K-means clustering to generate K centroids. For the bucket selection strategy during query, we support indexing these K centroids, such as graph indexing for routing.
The vectors in the bucket support multiple encoding methods. Due to the continuous arrangement of data, the access overhead is relatively low, but the total computational effort is higher than that of graph algorithms with similar configurations.
Below we will introduce the basic design framework of IVF index
Construct BucketDatacell to manage the data in the bucket (excluding the centroid)
Construct a data structure called Router to manage the centroid and the corresponding routing method. A simple implementation of Router is composed of centroids. Its classic construction method is k-means. Of course, it also supports importing from the outside. It can contain an Index entity
IVF also supports a reordering mechanism
The relevant Pull Requests are as follows:
The text was updated successfully, but these errors were encountered:
IVF index is a partition type index, which consists of a set of inverted buckets. In the retrieval stage, an inverted bucket with a certain amount of data is selected, and then scanned in the bucket to obtain several candidate points that finally meet the nearest neighbors.
Compared with HNSW type algorithms, IVF often requires a certain amount of data for training bucketing. The usual bucketing method is to use K-means clustering to generate K centroids. For the bucket selection strategy during query, we support indexing these K centroids, such as graph indexing for routing.
The vectors in the bucket support multiple encoding methods. Due to the continuous arrangement of data, the access overhead is relatively low, but the total computational effort is higher than that of graph algorithms with similar configurations.
Below we will introduce the basic design framework of IVF index
Construct
BucketDatacell
to manage the data in the bucket (excluding the centroid)Construct a data structure called
Router
to manage the centroid and the corresponding routing method. A simple implementation of Router is composed of centroids. Its classic construction method is k-means. Of course, it also supports importing from the outside. It can contain an Index entityIVF also supports a reordering mechanism
The relevant Pull Requests are as follows:
The text was updated successfully, but these errors were encountered: