Add delete and ensure add_texts performs upsert (w/ ID optional) #6126
langchain/vectorstores/supabase.py
Outdated
    ]

    # Handle each insert individually to avoid conflicting IDs
    for row in rows:
This is unfortunate. May slow adds considerably. Need to see if there is a better way.
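One possible direction for the concern above, sketched with Python's built-in sqlite3 module as a stand-in for the Supabase table (an assumption; the Supabase client is not used here): deduplicate IDs within the batch, then upsert the whole batch in one call. Deduplicating first matters because a single PostgreSQL INSERT ... ON CONFLICT DO UPDATE statement cannot update the same row twice. The documents table layout and the helper names dedupe_by_id and upsert_rows are hypothetical and are not part of this PR.

```python
import sqlite3
from typing import Dict, List, Tuple

Row = Tuple[str, str]  # (id, content); hypothetical row shape for this sketch


def dedupe_by_id(rows: List[Row]) -> List[Row]:
    """Keep only the last content seen for each ID."""
    latest: Dict[str, str] = {}
    for row_id, content in rows:
        latest[row_id] = content
    return list(latest.items())


def upsert_rows(conn: sqlite3.Connection, rows: List[Row]) -> None:
    """Upsert a whole batch in one call instead of one call per row."""
    conn.executemany(
        "INSERT INTO documents (id, content) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET content = excluded.content",
        dedupe_by_id(rows),
    )
    conn.commit()


# In-memory database standing in for the real table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, content TEXT)")

upsert_rows(conn, [("a", "first"), ("b", "second")])
# "a" already exists and "c" appears twice in the same batch.
upsert_rows(conn, [("a", "revised"), ("c", "third"), ("c", "third, again")])

stored = conn.execute("SELECT id, content FROM documents ORDER BY id").fetchall()
```

Whether a batched upsert is actually faster against Supabase would still need to be measured; the sketch only shows that conflicting IDs do not by themselves force row-by-row inserts.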
lgtm! 👍 🎉
langchain/vectorstores/redis.py
Outdated
@@ -461,19 +464,24 @@ def from_texts(

    @staticmethod
    def delete(
        keys: List[str],
        ids: Optional[List[str]] = None,
Let's just change this. Since it's an arg, and idk how widely it's used, I don't think we need to worry about backwards compat.
@rlancemartin Is this change also intended to standardize the vector DB IDs as UUIDs? The documentation for the Supabase integration (Python / JavaScript) suggests BIGSERIAL as the ID type. The underlying database is PostgreSQL with the pgvector extension. At the moment, calling the add_documents method does not pass a list of ids to the add_texts method.
base.py:
    def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
        """Run more documents through the embeddings and add to the vectorstore.

        Args:
            documents (List[Document]): Documents to add to the vectorstore.

        Returns:
            List[str]: List of IDs of the added texts.
        """
        # TODO: Handle the case where the user doesn't provide ids on the Collection
        texts = [doc.page_content for doc in documents]
        metadatas = [doc.metadata for doc in documents]
        return self.add_texts(texts, metadatas, **kwargs)
supabase.py:
    def add_texts(
        self,
        texts: Iterable[str],
        metadatas: Optional[List[dict[Any, Any]]] = None,
        ids: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> List[str]:
        ids = ids or [str(uuid.uuid4()) for _ in texts]
        docs = self._texts_to_documents(texts, metadatas)
        vectors = self._embedding.embed_documents(list(texts))
        return self.add_vectors(vectors, docs, ids)
The add_texts method then creates UUIDs, and the documentation suggests this table definition:
-- Create a table to store your documents
create table documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(1536)
);
Any recommendations here?
Ah, I see. Will BIGSERIAL always be used as the Supabase primary key? If so, we can modify the ID. Feel free to put up a PR!
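To make the mismatch discussed above concrete, here is a small check showing that an ID generated the way add_texts generates it (str(uuid.uuid4())) cannot be stored in a BIGSERIAL column, which holds positive 64-bit integers. The helper fits_bigserial is hypothetical and exists only for illustration.

```python
import uuid

# Upper bound of a PostgreSQL bigserial column (signed 64-bit integer).
BIGSERIAL_MAX = 2**63 - 1


def fits_bigserial(value: str) -> bool:
    """Return True if value parses as an integer in the bigserial range."""
    try:
        number = int(value)
    except ValueError:
        return False
    return 1 <= number <= BIGSERIAL_MAX


# Same generation strategy as the add_texts snippet quoted above.
generated_id = str(uuid.uuid4())

uuid_fits = fits_bigserial(generated_id)  # hyphenated UUID string, not an integer
int_fits = fits_bigserial("42")           # a plain integer ID is accepted
```

Under this check, either the table's id column would need a UUID or text type, or the store would need to stop generating UUID strings for Supabase; which of the two is preferable is left open in the thread.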
Goal
We want to ensure consistency across vectordbs:
1/ add delete by ID method to the base vectorstore class
2/ ensure add_texts performs upsert with ID optionally passed
Testing
langchain_test vectorstore.
langchain_test table.
langchain_test index.
langchain_test table.
delete method added: redis method to delete entries by keys #6222
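The two goals stated in the PR description (delete by ID, and add_texts acting as an upsert with IDs optionally passed) can be sketched with a toy in-memory store. InMemoryStore is a hypothetical stand-in and not a LangChain class; it stores raw text only and skips embeddings and metadata entirely.

```python
import uuid
from typing import Dict, Iterable, List, Optional


class InMemoryStore:
    """Hypothetical stand-in for a vector store; not part of LangChain."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def add_texts(
        self, texts: Iterable[str], ids: Optional[List[str]] = None
    ) -> List[str]:
        """Insert texts, overwriting any entry that already uses the same ID."""
        texts = list(texts)  # materialize once so a generator is not consumed twice
        if ids is None:
            ids = [str(uuid.uuid4()) for _ in texts]
        if len(ids) != len(texts):
            raise ValueError("ids and texts must have the same length")
        for id_, text in zip(ids, texts):
            self._rows[id_] = text  # upsert: insert or overwrite
        return ids

    def delete(self, ids: List[str]) -> None:
        """Delete entries by ID; unknown IDs are ignored."""
        for id_ in ids:
            self._rows.pop(id_, None)

    def get(self, id_: str) -> Optional[str]:
        return self._rows.get(id_)


store = InMemoryStore()
first_ids = store.add_texts(["alpha", "beta"])            # IDs generated
store.add_texts(["alpha, revised"], ids=[first_ids[0]])   # same ID: overwrite
store.delete([first_ids[1]])                              # delete by ID

revised = store.get(first_ids[0])
deleted = store.get(first_ids[1])
```

Ignoring unknown IDs in delete is a design choice of this sketch; a real store might prefer to report them.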