Chat with your PDF

This project lets you hold interactive conversations with your PDF documents using LangChain, ChromaDB, and OpenAI's API. You ask questions in a chat session, and the answers draw on text retrieved from your PDFs.

Article - https://medium.com/@yash9439/unleashing-the-power-of-chat-with-multiple-pdfs-a-beginners-guide-fa1b6d4d6b89

Architecture

The architecture of this project involves several components working together:

LangChain: It serves as the interface for communication with OpenAI's API. LangChain handles rephrasing, retrieves relevant text chunks, and manages the conversation flow.
ChromaDB: A vector database used to store and query high-dimensional vectors. It helps in efficiently searching for and retrieving relevant text chunks during conversations.
OpenAI's API: The API provides access to OpenAI's language models, such as GPT-3.5 Turbo. It processes prompts, generates responses, and incorporates retrieved text chunks to ensure accurate and context-aware conversations.
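
To make the retrieval step concrete, here is a minimal, self-contained sketch of how a vector store picks the most relevant chunk for a question. It is not this project's code: the bag-of-words `embed` function and the `top_k` helper are hypothetical stand-ins for OpenAI embeddings and ChromaDB's similarity search, chosen so the example runs with the standard library alone.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Toy corpus; real chunks would come from the ingested PDFs.
chunks = [
    "cats sleep most of the day",
    "vector databases store embeddings",
    "pdf files can contain text and images",
]
best = top_k("where are embeddings stored", chunks, k=1)
print(best)
```

In the real pipeline the retrieved chunks are passed to the language model along with the question, so the answer is grounded in the PDF text rather than in the model's general knowledge.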

Getting Started

To get started with this project, follow the steps below:

Prerequisites

Python
Pipenv

Installation

Clone the repository:

```
git clone https://github.com/yash9439/chat-with-multiple-pdf
```

Navigate to the project directory:
```
cd chat-with-multiple-pdf
```
Install the required dependencies using Pipenv:
```
pipenv install
```
Activate the Pipenv shell:
```
pipenv shell
```
Create a .env file, replacing the placeholder value in OPENAI_API_KEY="sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXX" with your OpenAI API key:
```
echo 'OPENAI_API_KEY="sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXX"' > .env
```
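
How the scripts read this file is not shown here; a library such as python-dotenv is a common choice, but that is an assumption. As a rough illustration of what such a loader does, the sketch below parses `KEY="value"` lines with the standard library only. The `load_env_file` helper is hypothetical and is not part of this repository.

```python
import os
import tempfile


def load_env_file(path):
    """Parse simple KEY="value" lines and return them as a dict."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo with a throwaway file so a real .env is never touched.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as fh:
        fh.write('OPENAI_API_KEY="sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXX"\n')
    settings = load_env_file(env_path)

# A key that still starts with the placeholder has not been replaced yet.
key_is_placeholder = settings["OPENAI_API_KEY"].startswith("sk-XXXX")
```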
Run the merge script to combine all the PDFs so you can chat with them simultaneously:
```
python src/merge.py
```
Run the ingestion script to parse and extract text from the PDF:
```
python src/ingest.py
```
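
Ingestion pipelines of this kind usually split the extracted text into overlapping chunks before embedding them, so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The splitter and sizes that `src/ingest.py` actually uses are not shown here; the `chunk_text` function and its parameters below are hypothetical, and the sketch only illustrates the idea.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghij"
pieces = chunk_text(sample, chunk_size=4, overlap=2)
print(pieces)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Larger overlaps cost more storage and embedding calls but reduce the chance that an answer is split across two chunks.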
Start the conversation script to interact with the PDF:
```
python src/chat-with-multiple-pdf.py
```
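
Each chat turn combines three things: the chunks retrieved for the question, the conversation so far, and the new question. The sketch below shows one plausible way to assemble them into a single prompt. The `build_prompt` function and the prompt wording are hypothetical; LangChain uses its own templates, which will differ.

```python
def build_prompt(question, retrieved_chunks, history):
    """Assemble one prompt string from context chunks, past turns, and the new question."""
    context = "\n\n".join(retrieved_chunks)
    past = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in history)
    parts = [
        "Answer using only the context below.",
        "Context:\n" + context,
    ]
    if past:
        # Only include the history section once there is something to show.
        parts.append("Conversation so far:\n" + past)
    parts.append("Question: " + question)
    return "\n\n".join(parts)


# Placeholder turns and chunks, purely for illustration.
history = [("first question", "first answer")]
prompt = build_prompt(
    "second question",
    ["chunk one text", "chunk two text"],
    history,
)
print(prompt)
```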

Useful Links

The PDFs used here are an AI Development Index report 2023 and a research paper: https://www.researchgate.net/publication/323498156_Artificial_Intelligence
OpenAI: OpenAI's platform provides access to powerful language models and APIs.
LangChain: LangChain is the library used for communication and interaction with OpenAI's API.
ChromaDB: ChromaDB is a vector database used to store and query high-dimensional vectors efficiently.

Feel free to explore this project and enhance it further to suit your needs. Enjoy chatting with your PDFs and extracting valuable insights!
