Package hnsw implements Hierarchical Navigable Small World graphs in Go. You can read up on how they work here. In essence, they allow for fast approximate nearest neighbor searches over high-dimensional vector data.
This package can be thought of as an in-memory alternative to your favorite vector database (e.g. Pinecone, Weaviate). It implements just the essential operations:
Operation | Complexity | Description
---|---|---
Insert | $O(M^2 \cdot \log(n))$ | Insert a vector into the graph
Delete | $O(M^2 \cdot \log(n))$ | Delete a vector from the graph
Search | $O(M \cdot \log(n))$ | Search for the nearest neighbors of a vector
Lookup | $O(1)$ | Retrieve a vector by ID
Note: complexities are approximate, where $n$ is the number of vectors in the graph and $M$ is the maximum number of neighbors each node can have.
```
go get github.com/coder/hnsw@main
```
```go
g := hnsw.NewGraph[int]()
g.Add(
	hnsw.MakeNode(1, []float32{1, 1, 1}),
	hnsw.MakeNode(2, []float32{1, -1, 0.999}),
	hnsw.MakeNode(3, []float32{1, 0, -0.5}),
)

neighbors := g.Search(
	[]float32{0.5, 0.5, 0.5},
	1,
)
fmt.Printf("best friend: %v\n", neighbors[0].Vec)
// Output: best friend: [1 1 1]
```
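The quick start above only exercises Insert and Search. Here is a minimal sketch of the remaining two operations from the table; it assumes Delete takes a key and reports whether anything was removed, and that Lookup returns the stored vector plus an ok flag (check the package docs for the exact signatures):

```go
// Retrieve the vector stored under key 2 (assumed signature: Lookup(key) (Vector, bool)).
if vec, ok := g.Lookup(2); ok {
	fmt.Printf("vector for key 2: %v\n", vec)
}

// Remove the node stored under key 3 (assumed signature: Delete(key) bool).
if g.Delete(3) {
	fmt.Println("deleted key 3")
}
```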
While all graph operations are in-memory, hnsw provides facilities for loading from and saving to persistent storage. For an io.Reader/io.Writer interface, use Graph.Export and Graph.Import.
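As a rough sketch, assuming Export writes to any io.Writer and Import reads from any io.Reader (as the names suggest), a graph can be round-tripped through an in-memory buffer:

```go
// Requires the standard library "bytes" package for the buffer.
var buf bytes.Buffer

// Serialize the graph to any io.Writer.
if err := g.Export(&buf); err != nil {
	panic(err)
}

// Rebuild an equivalent graph from any io.Reader.
g2 := hnsw.NewGraph[int]()
if err := g2.Import(&buf); err != nil {
	panic(err)
}
```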
If you're using a single file as the backend, hnsw provides a convenient SavedGraph type instead:
```go
path := "some.graph"
g1, err := hnsw.LoadSavedGraph[int](path)
if err != nil {
	panic(err)
}
// Insert some vectors
for i := 0; i < 128; i++ {
	g1.Add(hnsw.MakeNode(i, []float32{float32(i)}))
}
// Save to disk
err = g1.Save()
if err != nil {
	panic(err)
}

// Later...
// g2 is a copy of g1
g2, err := hnsw.LoadSavedGraph[int](path)
if err != nil {
	panic(err)
}
```
See the package reference for Graph.Export, Graph.Import, and SavedGraph for more detail.
We use a fast binary encoding for the graph, so you can expect to save/load nearly at disk speed. On my M3 MacBook I get these benchmark results:
```
goos: darwin
goarch: arm64
pkg: github.com/coder/hnsw
BenchmarkGraph_Import-16    4029    259927 ns/op     796.85 MB/s    496022 B/op    3212 allocs/op
BenchmarkGraph_Export-16    7042    168028 ns/op    1232.49 MB/s    239886 B/op    2388 allocs/op
PASS
ok      github.com/coder/hnsw    2.624s
```
These numbers are from saving/loading a graph of 100 vectors with 256 dimensions.
By and large, the greatest effect you can have on the performance of the graph is reducing the dimensionality of your data. At 1536 dimensions (the OpenAI default), 70% of the query process under default parameters is spent in the distance function.
If you're struggling with slowness / latency, consider:

- Reducing dimensionality
- Increasing $M$
And, if you're struggling with excess memory usage, consider (a brief tuning sketch follows this list):

- Reducing $M$, a.k.a. Graph.M (the maximum number of neighbors each node can have)
- Reducing $m_L$, a.k.a. Graph.Ml (the level generation parameter)
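The sketch below uses hypothetical values and assumes Graph.M and Graph.Ml are plain exported fields that should be set before any vectors are inserted:

```go
g := hnsw.NewGraph[int]()

// Hypothetical tuning values: a lower M trades recall for less graph memory,
// and a smaller Ml produces fewer upper layers.
g.M = 8
g.Ml = 0.2

// Parameters should be in place before the first insert.
g.Add(hnsw.MakeNode(1, []float32{0.1, 0.2, 0.3}))
```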
The memory overhead of a graph looks like:

$$
mem_{graph} = n \cdot \log(n) \cdot M \cdot \text{size(key)}
$$

$$
mem_{base} = n \cdot d \cdot 4
$$

where:
- $n$ is the number of vectors in the graph
- $\text{size(key)}$ is the average size of the key in bytes
- $M$ is the maximum number of neighbors each node can have
- $d$ is the dimensionality of the vectors
- $mem_{graph}$ is the memory used by the graph structure across all layers
- $mem_{base}$ is the memory used by the vectors themselves in the base or 0th layer
You can infer that:
- Connectivity ($M$) is very expensive if keys are large
- If $d \cdot 4$ is far larger than $M \cdot \text{size(key)}$, you should expect linear memory usage spent on representing vector data
- If $d \cdot 4$ is far smaller than $M \cdot \text{size(key)}$, you should expect $n \cdot \log(n)$ memory usage spent on representing graph structure
In the example of a graph with 256 dimensions, $M = 16$, and 8-byte keys, you would have:

- $256 \cdot 4 = 1024$ data bytes
- $16 \cdot 8 = 128$ metadata bytes

and memory growth is mostly linear.
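To make the growth concrete, here is a small back-of-the-envelope calculation that plugs those numbers into the formulas above (a standalone sketch, not part of the library; the logarithm base is assumed to be 2):

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const (
		d       = 256.0 // vector dimensionality
		m       = 16.0  // maximum neighbors per node (M)
		keySize = 8.0   // bytes per key
	)

	for _, n := range []float64{1e3, 1e5, 1e7} {
		memBase := n * d * 4                       // vector data in the base layer
		memGraph := n * math.Log2(n) * m * keySize // graph structure across all layers
		fmt.Printf("n=%.0e  base=%.0f MiB  graph=%.0f MiB\n",
			n, memBase/(1<<20), memGraph/(1<<20))
	}
}
```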