pgvector support for Elixir
Add this line to your application’s mix.exs under deps:
{:pgvector, "~> 0.3.0"}
And follow the instructions for your database library, Ecto or Postgrex, in the sections that follow.
Or check out some examples:
- Embeddings with OpenAI
- Binary embeddings with Cohere
- Sentence embeddings with Bumblebee
- Hybrid search with Bumblebee (Reciprocal Rank Fusion)
- Sparse search with Bumblebee
- Horizontal scaling with Citus
- Bulk loading with COPY
Ecto
Create lib/postgrex_types.ex with:
Postgrex.Types.define(MyApp.PostgrexTypes, Pgvector.extensions() ++ Ecto.Adapters.Postgres.extensions(), [])
And add to config/config.exs:
config :my_app, MyApp.Repo, types: MyApp.PostgrexTypes
Create a migration
mix ecto.gen.migration create_vector_extension
with:
defmodule MyApp.Repo.Migrations.CreateVectorExtension do
  use Ecto.Migration

  def up do
    execute "CREATE EXTENSION IF NOT EXISTS vector"
  end

  def down do
    execute "DROP EXTENSION vector"
  end
end
Run the migration
mix ecto.migrate
You can now use the vector type in future migrations
create table(:items) do
  add :embedding, :vector, size: 3
end
Also supports :halfvec, :bit, and :sparsevec
Update the model
schema "items" do
  field :embedding, Pgvector.Ecto.Vector
end
Also supports Pgvector.Ecto.HalfVector, Pgvector.Ecto.Bit, and Pgvector.Ecto.SparseVector
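As a sketch of how the other types might fit together, the migration and schema below declare one column of each kind. Every column name other than `embedding` is hypothetical, the migration module name is made up for illustration, and the size of 3 simply mirrors the example above.

```elixir
# Hypothetical migration: one column per supported pgvector type.
defmodule MyApp.Repo.Migrations.CreateItems do
  use Ecto.Migration

  def change do
    create table(:items) do
      add :embedding, :vector, size: 3
      add :half_embedding, :halfvec, size: 3      # assumed column name
      add :binary_embedding, :bit, size: 3        # assumed column name
      add :sparse_embedding, :sparsevec, size: 3  # assumed column name
    end
  end
end

# Matching schema: each field uses the Ecto type for its column type.
defmodule MyApp.Item do
  use Ecto.Schema

  schema "items" do
    field :embedding, Pgvector.Ecto.Vector
    field :half_embedding, Pgvector.Ecto.HalfVector
    field :binary_embedding, Pgvector.Ecto.Bit
    field :sparse_embedding, Pgvector.Ecto.SparseVector
  end
end
```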
Insert a vector
alias MyApp.{Repo, Item}
Repo.insert(%Item{embedding: [1, 2, 3]})
Get the nearest neighbors
import Ecto.Query
import Pgvector.Ecto.Query
Repo.all(from i in Item, order_by: l2_distance(i.embedding, ^Pgvector.new([1, 2, 3])), limit: 5)
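If the distance itself is wanted alongside each row, the same function can also be used in select. This is a sketch that assumes `MyApp.Repo` is running and that the `items` table has the default `id` primary key; `query_vector` is a name chosen here for illustration.

```elixir
import Ecto.Query
import Pgvector.Ecto.Query

alias MyApp.{Repo, Item}

# Build the query vector once and pin it wherever it is needed.
query_vector = Pgvector.new([1, 2, 3])

# Returns up to five {id, distance} tuples, nearest first.
Repo.all(
  from i in Item,
    order_by: l2_distance(i.embedding, ^query_vector),
    limit: 5,
    select: {i.id, l2_distance(i.embedding, ^query_vector)}
)
```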
Also supports max_inner_product, cosine_distance, l1_distance, hamming_distance, and jaccard_distance
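To make these names concrete, here is a plain-Elixir sketch of what each measure computes on ordinary lists. The `DistanceSketch` module is purely illustrative and not part of this library; in a real query the database evaluates the distance, and nothing below is needed to use the Ecto functions.

```elixir
defmodule DistanceSketch do
  @moduledoc "Illustrative reference versions of the distance measures."

  # Euclidean (L2) distance: square root of the summed squared differences.
  def l2_distance(a, b) do
    Enum.zip(a, b)
    |> Enum.map(fn {x, y} -> (x - y) * (x - y) end)
    |> Enum.sum()
    |> :math.sqrt()
  end

  # Taxicab (L1) distance: sum of absolute differences.
  def l1_distance(a, b) do
    Enum.zip(a, b) |> Enum.map(fn {x, y} -> abs(x - y) end) |> Enum.sum()
  end

  # Inner (dot) product. pgvector's <#> operator returns the negative of
  # this value, so ascending order puts the largest product first.
  def inner_product(a, b) do
    Enum.zip(a, b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
  end

  # Cosine distance: 1 minus the cosine similarity.
  # Not defined for all-zero vectors (division by zero).
  def cosine_distance(a, b) do
    norms = :math.sqrt(inner_product(a, a)) * :math.sqrt(inner_product(b, b))
    1 - inner_product(a, b) / norms
  end

  # Hamming distance: number of positions where two bit lists differ.
  def hamming_distance(a, b) do
    Enum.zip(a, b) |> Enum.count(fn {x, y} -> x != y end)
  end

  # Jaccard distance: 1 - |bits set in both| / |bits set in either|.
  # Not defined when neither list has a bit set.
  def jaccard_distance(a, b) do
    pairs = Enum.zip(a, b)
    both = Enum.count(pairs, fn {x, y} -> x == 1 and y == 1 end)
    either = Enum.count(pairs, fn {x, y} -> x == 1 or y == 1 end)
    1 - both / either
  end
end
```

For example, `DistanceSketch.l2_distance([1, 2, 3], [1, 2, 4])` returns `1.0`, since the vectors differ by one unit in a single coordinate.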
Convert a vector to a list or Nx tensor
item.embedding |> Pgvector.to_list()
item.embedding |> Pgvector.to_tensor()
Add an approximate index in a migration
create index("items", ["embedding vector_l2_ops"], using: :hnsw)
# or
create index("items", ["embedding vector_l2_ops"], using: :ivfflat, options: "lists = 100")
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
Postgrex
Register the extension
Postgrex.Types.define(MyApp.PostgrexTypes, Pgvector.extensions(), [])
And pass it to start_link
{:ok, pid} = Postgrex.start_link(types: MyApp.PostgrexTypes)
Enable the extension
Postgrex.query!(pid, "CREATE EXTENSION IF NOT EXISTS vector", [])
Create a table
Postgrex.query!(pid, "CREATE TABLE items (embedding vector(3))", [])
Insert a vector
Postgrex.query!(pid, "INSERT INTO items (embedding) VALUES ($1)", [[1, 2, 3]])
Get the nearest neighbors
Postgrex.query!(pid, "SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5", [[1, 2, 3]])
Convert a vector to a list or Nx tensor
vector |> Pgvector.to_list()
vector |> Pgvector.to_tensor()
Add an approximate index
Postgrex.query!(pid, "CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)", [])
# or
Postgrex.query!(pid, "CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)", [])
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
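The operator class chosen for the index should correspond to the operator used in ORDER BY, or the index will not help that query. The sketch below keeps the two in one place. `IndexSketch` and its functions are hypothetical helpers, not part of this library, and the operator pairs follow the pgvector documentation (`<->` for L2, `<#>` for inner product, `<=>` for cosine).

```elixir
defmodule IndexSketch do
  # Distance measure => {SQL operator, index operator class}.
  @ops %{
    l2: {"<->", "vector_l2_ops"},
    inner_product: {"<#>", "vector_ip_ops"},
    cosine: {"<=>", "vector_cosine_ops"}
  }

  # Builds a CREATE INDEX statement for the given distance measure.
  # Identifiers are interpolated directly, so pass trusted names only.
  def create_index_sql(table, column, distance, method \\ "hnsw") do
    {_operator, opclass} = Map.fetch!(@ops, distance)
    "CREATE INDEX ON #{table} USING #{method} (#{column} #{opclass})"
  end

  # Builds the nearest-neighbor query that the index above can serve.
  def nearest_sql(table, column, distance) do
    {operator, _opclass} = Map.fetch!(@ops, distance)
    "SELECT * FROM #{table} ORDER BY #{column} #{operator} $1 LIMIT 5"
  end
end

# Possible usage with an open Postgrex connection `pid`:
#   Postgrex.query!(pid, IndexSketch.create_index_sql("items", "embedding", :cosine), [])
#   Postgrex.query!(pid, IndexSketch.nearest_sql("items", "embedding", :cosine), [[1, 2, 3]])
```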
Upgrading
Lists must be converted to Pgvector structs for Ecto distance functions.
# before
l2_distance(i.embedding, [1, 2, 3])
# after
l2_distance(i.embedding, ^Pgvector.new([1, 2, 3]))
Vectors are now returned as Pgvector structs instead of lists. Get a list with:
vector |> Pgvector.to_list()
or an Nx tensor with:
vector |> Pgvector.to_tensor()
View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/pgvector/pgvector-elixir.git
cd pgvector-elixir
mix deps.get
createdb pgvector_elixir_test
mix test
To run an example:
cd examples/loading
mix deps.get
createdb pgvector_example
mix run example.exs