This project is a Document Question-Answering (QA) Chatbot built using the following technologies:
- LangChain: For building the retrieval-augmented generation (RAG) pipeline.
- Pinecone: As the vector store for document embedding storage and retrieval.
- Amazon Bedrock:
  - `mistral.mixtral-8x7b-instruct`: A powerful large language model (LLM) for generating responses.
  - `amazon.titan-embed-text-v1`: An embedding model used to generate vector representations of the documents.
- Streamlit: For creating an interactive frontend for document upload and chatting.
This chatbot allows users to upload documents, index them into Pinecone, and then ask questions that the chatbot answers using the uploaded content.
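The retrieve-then-answer flow described above can be sketched in plain Python. This toy uses bag-of-words counts as "embeddings", an in-memory cosine-similarity search, and a simple prompt template; in the actual app, `amazon.titan-embed-text-v1` produces the vectors, Pinecone performs the similarity search, and the Mixtral model generates the answer. All names and sample texts below are illustrative only:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words count vector. The real app uses
    # amazon.titan-embed-text-v1 to produce dense vectors instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the question, which is what
    # Pinecone does over dense vectors at scale.
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question: str, context: list[str]) -> str:
    # The retrieved chunks become the context the LLM answers from.
    return ("Answer using only this context:\n"
            + "\n".join(context)
            + f"\nQuestion: {question}")

chunks = [
    "The warranty lasts two years from purchase.",
    "Returns are accepted within 30 days.",
    "Shipping is free for orders over $50.",
]
top = retrieve("How long is the warranty?", chunks, k=1)
print(top[0])  # -> "The warranty lasts two years from purchase."
```

The real pipeline follows the same three steps, just with learned embeddings and an LLM call at the end.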
- Document Upload: Users can upload `.pdf` documents, which are processed and stored in a vectorized format.
- Contextual Question Answering: Questions are answered using the uploaded documents as the primary source of context.
- Streamlit Frontend: Provides a simple and user-friendly interface for document upload and chat functionality.
- Persistent Vector Storage: Indexed documents are stored in Pinecone for efficient retrieval.
- Robust LLM Integration: Utilizes Amazon Bedrock's state-of-the-art models for embeddings and language understanding.
- Python 3.8+ installed on your system.
- Pinecone Account: Sign up for Pinecone and get an API key.
- Amazon Bedrock Access: Ensure you have access to Amazon Bedrock and necessary credentials.
- Streamlit: Install Streamlit for running the frontend.
Clone the repository:

```shell
git clone https://github.com/sayan-dg/ask-knowledge-base
cd ask-knowledge-base
```

Install the required libraries using `pip`:

```shell
pip install -r requirements.txt
```
Create a `.env` file or set the following environment variables in your system:

```shell
export aws_access_key_id=<your-aws-access-key-id>
export aws_secret_access_key=<your-aws-secret-access-key>
export pinecone_api_key=<your-pinecone-api-key>
```
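A `.env` file of `KEY=value` lines is usually loaded with the python-dotenv package; the minimal parser below is only a sketch of what that loading does, and the sample values are placeholders:

```python
import os

def load_dotenv_text(text: str) -> dict[str, str]:
    # Minimal .env parser: KEY=value lines, blank lines and '#'
    # comments ignored. python-dotenv would normally handle this.
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """
# credentials (values are placeholders)
aws_access_key_id=AKIA_EXAMPLE
pinecone_api_key=pc-example
"""
creds = load_dotenv_text(sample)
# Make the values visible to libraries that read os.environ.
os.environ.setdefault("pinecone_api_key", creds["pinecone_api_key"])
```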
Log in to Pinecone, create an index named `vectorstore`, and note the environment.
Start the Streamlit app:

```shell
streamlit run app.py
```
- Document Upload:
  - Upload a `.pdf` file.
  - Press `Process document`.
  - The document is processed, vectorized using `amazon.titan-embed-text-v1`, and stored in Pinecone under a specific namespace.
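Before embedding, the extracted PDF text is split into overlapping chunks; in the app, a LangChain text splitter typically does this. The fixed-size sliding-window sketch below only illustrates the idea, and the sizes are illustrative defaults, not this repo's settings:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Fixed-size sliding window with overlap, so a sentence cut at a
    # chunk boundary still appears intact in the neighbouring chunk.
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

pages = "A" * 1200  # stand-in for text extracted from the uploaded PDF
chunks = chunk_text(pages, size=500, overlap=50)
# Each chunk would then be embedded with amazon.titan-embed-text-v1 and
# upserted to Pinecone under a namespace, e.g. one per uploaded document.
```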
- Chat Interface:
  - Enter your question in the input box.
  - Press `Generate response`.
  - The chatbot retrieves relevant context from Pinecone and generates a response using `mistral.mixtral-8x7b-instruct`.
| Component | Technology |
|---|---|
| Frontend | Streamlit |
| LLM | `mistral.mixtral-8x7b-instruct` (Amazon Bedrock) |
| Embedding Model | `amazon.titan-embed-text-v1` (Amazon Bedrock) |
| Vector Store | Pinecone |
| Backend Pipeline | LangChain |
Contributions are welcome! Please fork the repository, make changes, and submit a pull request.
This project is licensed under the Apache 2.0 License.
For any questions or issues, feel free to contact:
Email: sayandg05@gmail.com
GitHub: sayan-dg