A LangChain-based RAG application that uses AnglE embeddings and the Cohere Reranker to fetch relevant emails based on a query.

Run `docker compose up` to start the stack. The app is served at http://localhost:5173/.
The Gmail API is used to download emails into a specified MongoDB database.
Once the Gmail API is set up, place the credential files in a folder titled `res` inside `backend`.
The default behavior is to collect all emails from the Primary folder over the last 60 days. This can be customized by modifying lines 49-52 of the `get_emails` script; a sketch of the kind of Gmail API call involved follows.
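For reference, a download step like this typically looks roughly as follows with the official Google client. This is a hedged sketch, not the project's actual code: the token file name, scopes, and query string are assumptions for illustration.

```python
# Hedged sketch: fetching Primary-category messages from the last 60 days.
# File paths and the query string are illustrative assumptions.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

creds = Credentials.from_authorized_user_file(
    "backend/res/token.json",  # hypothetical token file in the res folder
    scopes=["https://www.googleapis.com/auth/gmail.readonly"],
)
service = build("gmail", "v1", credentials=creds)

# Gmail search operators: category:primary restricts to the Primary tab,
# newer_than:60d restricts to the last 60 days.
results = service.users().messages().list(
    userId="me", q="category:primary newer_than:60d"
).execute()
message_ids = [m["id"] for m in results.get("messages", [])]
```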
The custom retriever is composed of an AnglE embedding model and a Pinecone vector database. Pinecone offers a free index tier, which was used to build this project.
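A minimal sketch of how such a retriever can be wired up in LangChain, assuming the AnglE UAE-Large-V1 checkpoint loaded through the generic Hugging Face wrapper and an existing Pinecone index; the index name and `k` value are illustrative assumptions.

```python
# Hedged sketch of the retriever wiring; not the project's exact code.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Pinecone as PineconeVectorStore

# AnglE's UAE-Large-V1 checkpoint, loaded via the generic HF wrapper.
embeddings = HuggingFaceEmbeddings(model_name="WhereIsAI/UAE-Large-V1")

# Assumes the Pinecone client is configured (API key in the environment)
# and the index already holds the embedded emails.
vectorstore = PineconeVectorStore.from_existing_index(
    index_name="emails",  # hypothetical index name
    embedding=embeddings,
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 20})  # top-20 for reranking
```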
The Cohere reranker re-ranks the 20 retrieved documents down to the 10 most relevant, which are stored in the MongoDB database and displayed in the web application.
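In LangChain, this reranking step is commonly expressed as a contextual compression retriever wrapping the base retriever. A hedged sketch, reusing the `retriever` from the previous snippet and assuming a Cohere API key in the environment; the example query is illustrative.

```python
# Hedged sketch: Cohere reranking of the top-20 hits down to 10.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank

compressor = CohereRerank(top_n=10)  # keep the 10 most relevant documents
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever,  # the AnglE + Pinecone retriever above
)
docs = compression_retriever.get_relevant_documents("upcoming campus events")
```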
The frontend container serves a React app built with ViteJS, with custom CSS for styling. The app makes fetch requests based on the query context and displays the result in a read-only text box.
The Flask API serves two distinct endpoints (a minimal sketch follows):

- `/get_emails`: invokes the Gmail API to collect emails from the inbox.
- `/rag`: performs the retrieval pipeline.
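A minimal Flask sketch of these two routes. The helper functions `download_emails()` and `run_rag_pipeline()` are hypothetical stand-ins, and the payload shapes are illustrative, not the project's actual code.

```python
# Hedged sketch of the two backend routes; helpers are hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/get_emails", methods=["POST"])
def get_emails():
    count = download_emails()  # hypothetical: Gmail download into MongoDB
    return jsonify({"downloaded": count})

@app.route("/rag", methods=["POST"])
def rag():
    query = request.get_json()["query"]
    docs = run_rag_pipeline(query)  # hypothetical: retrieve + rerank + store
    return jsonify({"results": docs})
```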
The Docker containers need increased memory to perform the retrieval tasks (for example via `mem_limit` or `deploy.resources.limits.memory` in docker-compose.yml); otherwise the backend container may crash.
The MongoDB v7.0 database interacts solely with the backend container, storing all emails in a collection titled `emails` and the retrieved emails in a collection titled `event-emails`.
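For illustration, the backend's interaction with these collections might look like the following PyMongo sketch; the connection URI, database name, and document fields are assumptions, with only the two collection titles taken from the project.

```python
# Hedged sketch of the MongoDB wiring; names other than the two
# collection titles are illustrative assumptions.
from pymongo import MongoClient

client = MongoClient("mongodb://mongo:27017/")  # "mongo" = compose service name
db = client["golden_retriever"]                 # hypothetical database name

db["emails"].insert_one({"subject": "...", "body": "..."})        # all downloaded emails
db["event-emails"].insert_one({"subject": "...", "body": "..."})  # top-10 reranked results
```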
Golden Retriever was a personal project I undertook after missing an event at Northeastern on a Wednesday in February 2024. The project involves designing and developing a RAG pipeline based on personal research, containerized into three containers bridged by an underlying Docker network and orchestrated with docker-compose. The query-based email retrieval step is currently slow. There are security considerations, such as API keys being passed via REST API calls, which are vulnerable to interception; since this deployment is entirely local, that risk was deemed acceptable. An LLM can be retrofitted with an additional line of Python using LangChain's ConversationalRetrievalChain (a hedged sketch follows), but this has not been implemented as part of the project due to the compute restrictions of running it locally. LLMs such as Llama 2 7B have been tested and work as expected.
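A sketch of what that retrofit could look like, assuming a locally served Llama 2 7B via Ollama and the `compression_retriever` from the reranking snippet above; the model name and example question are illustrative assumptions.

```python
# Hedged sketch: retrofitting a local LLM onto the existing retriever.
from langchain.chains import ConversationalRetrievalChain
from langchain_community.llms import Ollama

llm = Ollama(model="llama2")  # assumes a local Ollama instance serving Llama 2 7B
chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=compression_retriever)

result = chain.invoke({"question": "What events are happening this week?", "chat_history": []})
print(result["answer"])
```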