VectorDB support (pgvector) for archival memory #226
Conversation
NICE WORK!!
Can we make it possible to pass a connection string instead? It would also be nice to set overrides for the table and column names, since I have preloaded vectors.
Also, I saw a comment that says it will mess up existing data, but I didn't look into why... that's not ideal.
I will try to test tomorrow and add some feedback from experience.
@silence48 what kind of database do you have? I'm using the LlamaIndex wrappers for retrieving from Postgres/Chroma for now, but it seems like they don't allow for configuring the column names (run-llama/llama_index#6058). We can add a non-LlamaIndex connector that is more configurable, though. Another issue with bringing your own table is that I'm not sure how we should store new archival memories generated by the agent. Presumably you don't want MemGPT to insert new documents into your existing table? There are two possible solutions I can think of for this:
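Whichever option we go with, a more configurable non-LlamaIndex connector could look roughly like the sketch below. This is only an illustration of the idea, not actual MemGPT code: the class name, defaults, and query shape are assumptions, and it assumes `psycopg2`, `numpy`, and the `pgvector` Python package are installed.

```python
import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector


class ConfigurablePgVectorReader:
    """Read-only retrieval over an existing pgvector table with user-defined names."""

    def __init__(self, connection_string, table="docvecs",
                 text_column="text", vector_column="embedding"):
        self.conn = psycopg2.connect(connection_string)
        register_vector(self.conn)  # lets us pass numpy arrays as vector parameters
        # table/column names come from trusted config, not user input
        self.table = table
        self.text_column = text_column
        self.vector_column = vector_column

    def query(self, query_embedding, top_k=10):
        """Return the text of the top_k rows closest to query_embedding (cosine distance)."""
        with self.conn.cursor() as cur:
            cur.execute(
                f"SELECT {self.text_column} FROM {self.table} "
                f"ORDER BY {self.vector_column} <=> %s LIMIT %s",
                (np.array(query_embedding), top_k),
            )
            return [row[0] for row in cur.fetchall()]
```

Something along these lines would let users point MemGPT at pre-existing tables without touching the write path for new archival memories.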
This would work for me, I think. It's in a Postgres database, but I have 2 tables with vectors: one called codevecs and one called docvecs. I had set it up following the README in the pgvector repo.
Addresses #142
To-Dos:
- [ ] `memgpt load` into pgvector
- [ ] `memgpt list sources` to look at DB tables

CLI Updates
Configuration
When running `memgpt configure`, the user will have the option to choose between `local`/`postgres`/`chroma`, and provide their Chroma/Postgres URI which will be used to store data. This will configure the data backend that MemGPT uses to save and read archival storage.
NOTE: Users can swap out the data backend, but a previously loaded data source or saved agent will no longer be accessible until they switch back to the original backend.
NOTE: Do not add a storage backend to MemGPT unless you are okay with MemGPT writing new tables/data.
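For illustration, selecting Postgres might look like this from the terminal (the prompt wording and example URI are assumptions, not the exact CLI output):

```sh
memgpt configure
# choose "postgres" when asked for the archival storage backend,
# then supply a connection URI, e.g.
#   postgresql://memgpt:memgpt@localhost:5432/memgpt
```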
Importing data
Imported data and agent archival data will be stored in the chosen data backend, not just locally.
So if Postgres is the data backend, running `memgpt load` will write the loaded data into Postgres rather than local storage.
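For example, loading a directory of files might look like this (the `directory` loader and flag names follow the existing `memgpt load` usage, but may differ slightly):

```sh
# the documents end up in the configured Postgres backend instead of local storage
memgpt load directory --name my_docs --input-dir /path/to/docs
```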
An agent can access the data in multiple ways. Both methods will copy the attached data into the agent's index (one of them is attaching the data from within the chat while running `memgpt run`).

Once data is attached to an agent, it cannot be removed. We can address this in a future issue.
Loading from a VectorDB
Users should not provide their vector DB as a storage backend unless they are okay with MemGPT writing to the DB. If you want to make read-only data accessible to MemGPT, you must load the data with `memgpt load`, so that it is copied into MemGPT's own storage.
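One way this could look, assuming a `vector-database` loader with configurable table and column flags (the flag names below are illustrative assumptions, not the final CLI):

```sh
# illustrative only -- flag names may differ in the merged CLI
memgpt load vector-database \
  --name docvecs \
  --uri postgresql://user:pass@localhost:5432/mydb \
  --table_name docvecs \
  --text_column text \
  --embedding_column embedding
```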
Future Issues