Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(fdw): How to be compatible with new pgvector types #459

Open
2 of 11 tasks
cutecutecat opened this issue Apr 8, 2024 · 1 comment
Open
2 of 11 tasks

feat(fdw): How to be compatible with new pgvector types #459

cutecutecat opened this issue Apr 8, 2024 · 1 comment
Assignees

Comments

@cutecutecat
Copy link
Member

cutecutecat commented Apr 8, 2024

pgvector/pgvector#508

For:

Possible Methods:

  • Hook / Rewrite --> Different name mapping will fail to push down to index
  • Symbol renaming
@cutecutecat cutecutecat self-assigned this Apr 8, 2024
@cutecutecat cutecutecat changed the title feat(fdw): How to be compatible with pgvector symbol feat(fdw): How to be compatible with new pgvector types Apr 8, 2024
@cutecutecat
Copy link
Member Author

cutecutecat commented Apr 9, 2024

FDW problems

At step CREATE SERVER:

CREATE SERVER foreign_server
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'host.docker.internal', port '28816', dbname 'vectors', sslmode 'disable', use_remote_estimate 'true', extensions 'vector');
GRANT USAGE ON FOREIGN SERVER foreign_server TO local_user;
  • If not config the extensions='vector' at CREATE SERVER step of local machine
    • Then a full table scan will be sent to remote, unable to use an index
postgres=> EXPLAIN verbose select * from foreign_test ORDER BY embedding <-> '[0.40671515, 0.24202824, 0.37059402]' LIMIT 1;
                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 Limit  (cost=4881.00..4881.00 rows=1 width=44)
   Output: id, embedding, ((embedding <-> '[0.40671515,0.24202824,0.37059402]'::vector))
   ->  Sort  (cost=4881.00..5131.00 rows=100000 width=44)
         Output: id, embedding, ((embedding <-> '[0.40671515,0.24202824,0.37059402]'::vector))
         Sort Key: ((foreign_test.embedding <-> '[0.40671515,0.24202824,0.37059402]'::vector))
         ->  Foreign Scan on public.foreign_test  (cost=100.00..4381.00 rows=100000 width=44)
               Output: id, embedding, (embedding <-> '[0.40671515,0.24202824,0.37059402]'::vector)
               Remote SQL: SELECT id, embedding FROM public.test
  • If config the extensions='vector' at CREATE SERVER step of local machine
    • It requires the extension at "local" and "remote" is installed in same schema
postgres=> EXPLAIN verbose select * from foreign_test ORDER BY embedding <-> '[0.40671515, 0.24202824, 0.37059402]' LIMIT 1;
ERROR:  operator does not exist: vectors.vector public.<-> unknown
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.
CONTEXT:  remote SQL command: EXPLAIN SELECT id, embedding FROM vectors.test ORDER BY (embedding OPERATOR(public.<->) '[0.40671515,0.24202824,0.37059402]') ASC NULLS LAST

The remote operator should be vectors.<-> instead of public.<->, but the local pgvector is installed at public.
https://github.com/postgres/postgres/blob/43a9cab4844b9c933f1a24e24a38311ee24deefd/contrib/postgres_fdw/deparse.c#L3418

pgvecto.rs can be only installed at schema vectors, but pgvector can be installed any schema.

For example, in supabase, pgvector is installed at schema extensions.

Alternatives

  • multiply images

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants