When trying to reingest in privateGPT #1103

Closed
Musthafamoz opened this issue Oct 24, 2023 · 1 comment
Comments

@Musthafamoz

Creating a new vectorstore
Loading documents from source_documents
Loading new documents: 100%|████████████████████| 48/48 [00:10<00:00, 4.71it/s]
Loaded 994 new documents from source_documents
Split into 11798 chunks of text (max. 166 tokens each)
Creating embeddings. May take some minutes...
Traceback (most recent call last):
  File "C:\Users\mustafa\Desktop\moos\ingest.py", line 172, in <module>
    main()
  File "C:\Users\mustafa\Desktop\moos\ingest.py", line 164, in main
    db = Chroma.from_documents(texts, embeddings, persist_directory=persist_directory, client_settings=CHROMA_SETTINGS, client=chroma_client)
  File "C:\Users\mustafa\Desktop\moos\privateGPT\lib\site-packages\langchain\vectorstores\chroma.py", line 612, in from_documents
    return cls.from_texts(
  File "C:\Users\mustafa\Desktop\moos\privateGPT\lib\site-packages\langchain\vectorstores\chroma.py", line 576, in from_texts
    chroma_collection.add_texts(texts=texts, metadatas=metadatas, ids=ids)
  File "C:\Users\mustafa\Desktop\moos\privateGPT\lib\site-packages\langchain\vectorstores\chroma.py", line 222, in add_texts
    raise e
  File "C:\Users\mustafa\Desktop\moos\privateGPT\lib\site-packages\langchain\vectorstores\chroma.py", line 208, in add_texts
    self._collection.upsert(
  File "C:\Users\mustafa\Desktop\moos\privateGPT\lib\site-packages\chromadb\api\models\Collection.py", line 298, in upsert
    self._client._upsert(
  File "C:\Users\mustafa\Desktop\moos\privateGPT\lib\site-packages\chromadb\api\segment.py", line 290, in _upsert
    self._producer.submit_embeddings(coll["topic"], records_to_submit)
  File "C:\Users\mustafa\Desktop\moos\privateGPT\lib\site-packages\chromadb\db\mixins\embeddings_queue.py", line 127, in submit_embeddings
    raise ValueError(
ValueError:
Cannot submit more than 166 embeddings at once.
Please submit your embeddings in batches of size
166 or less.

@imartinez
Collaborator

This was fixed in #1087.
Please make sure you pull the latest version of the main branch and try again.
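
For anyone who cannot update right away, the error message itself points at a workaround: submit the chunks in batches no larger than the reported limit (166 in the traceback above). The following is only a minimal sketch of that idea, not the change made in #1087. `iter_batches` and `submit_in_batches` are hypothetical helper names, and `submit` stands in for whatever call adds one batch to the vector store; which call that is in `ingest.py` is an assumption left to the reader.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], max_batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `max_batch_size` long."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]


def submit_in_batches(
    items: Sequence[T],
    submit: Callable[[Sequence[T]], None],
    max_batch_size: int,
) -> int:
    """Call `submit` once per batch and return the number of calls made.

    `submit` is a placeholder (assumption): in a real ingest script it would
    wrap the vector store call that adds a list of chunks.
    """
    calls = 0
    for batch in iter_batches(items, max_batch_size):
        submit(batch)
        calls += 1
    return calls


if __name__ == "__main__":
    # Same numbers as in the log above: 11798 chunks, limit of 166 per call.
    chunks = [f"chunk {i}" for i in range(11798)]
    batch_sizes = []
    calls = submit_in_batches(chunks, lambda b: batch_sizes.append(len(b)), 166)
    print(calls, max(batch_sizes), sum(batch_sizes))
```

With the numbers from the log, this makes 72 calls: 71 full batches of 166 and a final batch of 12. The limit is best read from the error message (or from the client, if it exposes one) rather than hard-coded, since it can differ between environments.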
