Embed RAG data closer to source data? #675
Comments
cc @jmartisk
Quickly updating the corresponding embeddings if one of the source documents changes is unfortunately not trivial, because there's no 'equality' operator in this world. Could you perhaps, in the
This also depends on the particular embedding store implementing the
I've just noticed there's also the related langchain4j/langchain4j#1299 (except it's about using the embedding ID instead of one stored as part of the metadata)
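As a rough illustration of the metadata-based approach discussed above, here is a minimal plain-Java sketch. Everything in it is hypothetical: `InMemoryStore`, `removeByMetadata`, `reindex`, and the `source_id` metadata key are made-up stand-ins, not the langchain4j or Quarkus API, and no real embeddings are computed. It only shows the idea of tagging each stored segment with the ID of its source row, so that updating one row means removing that row's segments and adding fresh ones instead of re-indexing the whole store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ReindexSketch {

    // Hypothetical stand-in for a stored segment; a real store would also hold a vector.
    public static final class Entry {
        final String text;
        final Map<String, String> metadata;

        Entry(String text, Map<String, String> metadata) {
            this.text = text;
            this.metadata = metadata;
        }
    }

    // Hypothetical in-memory store, used only to illustrate the bookkeeping.
    public static class InMemoryStore {
        private final List<Entry> entries = new ArrayList<>();

        public void add(String text, Map<String, String> metadata) {
            entries.add(new Entry(text, metadata));
        }

        // Removes every entry whose metadata contains the given key/value pair
        // and returns how many entries were removed.
        public int removeByMetadata(String key, String value) {
            int before = entries.size();
            entries.removeIf(e -> value.equals(e.metadata.get(key)));
            return before - entries.size();
        }

        // Returns the texts of all entries matching the given key/value pair.
        public List<String> textsFor(String key, String value) {
            List<String> result = new ArrayList<>();
            for (Entry e : entries) {
                if (value.equals(e.metadata.get(key))) {
                    result.add(e.text);
                }
            }
            return result;
        }

        public int size() {
            return entries.size();
        }
    }

    // Re-indexes a single source row: drop its old segments, then add the new ones.
    public static void reindex(InMemoryStore store, String sourceId, List<String> segments) {
        store.removeByMetadata("source_id", sourceId);
        for (String segment : segments) {
            store.add(segment, Map.of("source_id", sourceId));
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        reindex(store, "talk-1", List.of("first version, part 1", "first version, part 2"));
        reindex(store, "talk-2", List.of("another talk"));
        // Updating talk-1 replaces only its own segments; talk-2 is untouched.
        reindex(store, "talk-1", List.of("second version"));
        System.out.println(store.textsFor("source_id", "talk-1"));
        System.out.println(store.size());
    }
}
```

Whether a real embedding store can do the removal step this way depends on the operations it implements, as noted above.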
The current RAG model for pgvector is to store the documents in their own table.
In my application my source documents already have a table:
So, when I iterate those to index them, they all go in their separate table:
This leads me to wonder how I can keep my model and the index in sync. What do I do when I update a single `Talk` entity? Do I need to re-index the entire store?
Intuitively, I was expecting to be able to do something like:
But I'm not too sure how to wire this up.
I suppose that a `PanacheEmbeddingStore` could have this sort of API for batch reindex, given this: