PageSage is a web application that uses Retrieval-Augmented Generation (RAG) to let users chat with a PDF document.
It works by vectorizing uploaded PDFs, retrieving the chunks most relevant to each query, and passing them to a pre-trained model to generate an answer.
- The frontend is built with ReactJS and TypeScript.
- The backend is built with FastAPI.
- Embeddings are generated using LangChain Sentence Transformers.
- Vector storage uses ChromaDB.
- Prompts are processed in context by the Google Gemini API.
- Upload your PDF file using drag and drop.
- The PDF is uploaded to the backend, split into chunks, and stored in the vector database.
- The text area then becomes enabled, and you can send queries to the backend, which answers them in the context of the uploaded document.
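
The chunk, store, and query flow described above can be sketched in plain Python. This is only an illustration under stated assumptions, not PageSage's actual backend code: the word-count `embed` function stands in for the Sentence Transformers embeddings, `ToyVectorStore` stands in for ChromaDB, and every function and class name here is hypothetical. The final prompt is what would be sent to the language model.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real setup would use a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database (hypothetical, not the ChromaDB API)."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question: str, k: int = 2) -> list[str]:
        """Return the k stored chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add(["Cats purr when they are happy.", "The invoice is due in March."])
    question = "When is the invoice due?"
    print(build_prompt(question, store.query(question, k=1)))
```

Overlapping chunks are a common choice because a sentence cut at one chunk boundary still appears whole in the neighbouring chunk.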
- Clone the repo
- `cd` into the `frontend` directory, and run the command `npm i`.
- `cd` into the `backend` directory, and run `pipenv install`, `pipenv shell` and then `fastapi dev`.
  - If you don't have `pipenv` installed, run `pip install pipenv`
- You should have a `.env` file, with the API key set as shown in the `.env.example` file.
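
As a rough illustration of how a backend might pick up the key from `.env`, here is a minimal standard-library sketch. It is an assumption, not PageSage's actual loading code: the helper names are hypothetical, and the variable name you pass in should be whatever `.env.example` specifies (`EXAMPLE_API_KEY` below is only a placeholder).

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(name: str, path: str = ".env") -> str:
    """Return the key from the process environment, falling back to the .env file."""
    value = os.environ.get(name) or load_env_file(path).get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; see .env.example")
    return value


if __name__ == "__main__":
    # EXAMPLE_API_KEY is a placeholder name, not the project's real variable.
    try:
        require_key("EXAMPLE_API_KEY")
        print("API key found")
    except RuntimeError as err:
        print(err)
```

Failing fast with a clear message at startup tends to be easier to debug than an authentication error on the first query.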