Allow to specify ID when adding to the FAISS vectorstore. #5190
This change allows unique IDs to be specified when adding documents / embeddings to a FAISS vectorstore. This mirrors the current approach in the Chroma vectorstore. It allows inserts with duplicate IDs to be rejected and will allow deletion / update by looking up a deterministic ID (such as a hash). This commit solves #5065 and #3896 and should solve #2699 indirectly.
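To illustrate the behaviour described above, here is a minimal sketch in plain Python. It does not use the FAISS vectorstore or its API: `TinyStore`, `content_id`, and their methods are hypothetical names invented for this illustration, and the choice of a SHA-256 digest as the deterministic ID is an assumption.

```python
import hashlib
from typing import Dict, List


def content_id(text: str) -> str:
    """Derive a deterministic ID from the text (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class TinyStore:
    """Hypothetical in-memory stand-in for a vectorstore's ID -> document map."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def add_texts(self, texts: List[str], ids: List[str]) -> List[str]:
        """Store texts under caller-supplied IDs, rejecting any duplicate ID."""
        if len(ids) != len(texts):
            raise ValueError("ids and texts must have the same length")
        if len(set(ids)) != len(ids):
            raise ValueError("duplicate IDs within the batch")
        clashes = [i for i in ids if i in self._docs]
        if clashes:
            raise ValueError(f"duplicate IDs rejected: {clashes}")
        self._docs.update(zip(ids, texts))
        return ids

    def delete(self, doc_id: str) -> None:
        """Remove a document by its ID; raises KeyError if the ID is unknown."""
        del self._docs[doc_id]

    def __len__(self) -> int:
        return len(self._docs)
```

Because the ID is a function of the text alone, re-adding the same text produces the same ID and is rejected, and a document can later be deleted by recomputing its hash rather than by remembering a random identifier.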
I think that all the Specifying the ids in the
Why? They all pass through the Is it for the documentation?
This gives visibility in the documentation.
I added the change you suggested.
I mean how do you specify the ids if you use add_texts?
@@ -432,6 +443,7 @@ def from_embeddings(
text_embeddings: List[Tuple[str, List[float]]],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
ids: Optional[List[str]] = None,
this should be passed to __from right?
done
@@ -161,6 +167,7 @@ def add_embeddings(
text_embeddings: Iterable pairs of string and embedding to
add to the vectorstore.
metadatas: Optional list of metadatas associated with the texts.
ids: Optional list of unique IDs.
this should be passed to __add right?
done
**kwargs: Any,
) -> List[str]:
"""Run more texts through the embeddings and add to the vectorstore.

Args:
texts: Iterable of strings to add to the vectorstore.
metadatas: Optional list of metadatas associated with the texts.
ids: Optional list of unique IDs.
this should be passed to __add right?
done
This is now necessary because they are listed in the input arguments explicitly.
I know.
This wasn't necessary when the ids were not explicitly listed as arguments for the public access methods, because the ids arguments would pass through to the private After the change you requested, yes, I should have passed them explicitly.
thanks!
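The point settled in this thread can be shown with a small sketch in plain Python. The names below (`Store`, `_add`, `add_texts_implicit`, `add_texts_forgetful`, `add_texts_explicit`) are hypothetical and are not the vectorstore's actual code. The sketch only demonstrates the language behaviour: once `ids` is a named parameter of the public method, Python no longer places it in `**kwargs`, so it has to be forwarded by hand.

```python
from typing import Any, List, Optional


class Store:
    """Hypothetical class showing how `ids` reaches a private helper."""

    def _add(
        self, texts: List[str], ids: Optional[List[str]] = None, **kwargs: Any
    ) -> List[str]:
        # Fall back to placeholder IDs when none were supplied.
        if ids is None:
            return [f"auto-{i}" for i in range(len(texts))]
        return ids

    def add_texts_implicit(self, texts: List[str], **kwargs: Any) -> List[str]:
        # `ids` is not a named parameter here, so it travels inside **kwargs.
        return self._add(texts, **kwargs)

    def add_texts_forgetful(
        self, texts: List[str], ids: Optional[List[str]] = None, **kwargs: Any
    ) -> List[str]:
        # Bug: `ids` is now captured by name and is no longer in **kwargs,
        # so the caller's IDs are silently dropped.
        return self._add(texts, **kwargs)

    def add_texts_explicit(
        self, texts: List[str], ids: Optional[List[str]] = None, **kwargs: Any
    ) -> List[str]:
        # Fix: forward the named parameter explicitly.
        return self._add(texts, ids=ids, **kwargs)
```

Listing `ids` in the signature makes it visible in the documentation, at the cost of having to pass it along explicitly in each public method.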
…ai#5190)
# Allow to specify ID when adding to the FAISS vectorstore
This change allows unique IDs to be specified when adding documents / embeddings to a faiss vectorstore.
- This reflects the current approach with the chroma vectorstore.
- It allows rejection of inserts on duplicate IDs
- will allow deletion / update by searching on deterministic ID (such as a hash).
- If not specified, a random UUID is generated (as per previous behaviour, so non-breaking).
This commit fixes langchain-ai#5065 and langchain-ai#3896 and should fix langchain-ai#2699 indirectly. I've tested adding and merging.
Kindly tagging @Xmaster6y @dev2049 for review.
---------
Co-authored-by: Ati Sharma <ati@agalmic.ltd>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
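The non-breaking default mentioned in the commit message can be sketched as follows. `resolve_ids` is a hypothetical helper written for this illustration, not a function from the library, and the use of `uuid.uuid4()` as the source of random IDs is an assumption.

```python
import uuid
from typing import List, Optional, Sequence


def resolve_ids(texts: Sequence[str], ids: Optional[List[str]] = None) -> List[str]:
    """Return caller-supplied IDs, or one random UUID string per text."""
    if ids is None:
        # No IDs given: keep the old behaviour of generating random UUIDs.
        return [str(uuid.uuid4()) for _ in texts]
    if len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    return list(ids)
```

Callers that never pass `ids` see no change, while callers that do pass them get exactly the IDs they asked for.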