Erudite says - Transform Your PDF Conversations!
Erudite is not just another chatbot; it is your intelligent assistant for navigating multiple PDF documents. With Erudite, you can dive deep into your PDFs, extract relevant information, and have meaningful conversations about your documents. Powered by the latest AI technologies, Erudite makes document interaction seamless and insightful.
Erudite, your AI-powered PDF ChatBot, is designed to help you interact with multiple PDF documents effortlessly. Utilizing state-of-the-art Large Language Models and sophisticated text processing techniques, Erudite enables you to upload, analyze, and converse about your documents, providing accurate and contextually relevant answers.
- Document Upload: Upload multiple PDF documents through our intuitive Streamlit interface.
- Text Extraction: The `get_pdf_text()` function extracts text from the PDFs using PyPDF2.
- Text Chunking: Text is divided into manageable chunks with `get_text_chunks()` for better analysis.
- Vectorization: Create vector representations of text chunks with `get_vectorstore()` using OpenAI embeddings and store them in a FAISS index for swift retrieval.
- Conversation Chain: The `get_conversation_chain()` function sets up a dynamic conversational retrieval chain featuring:
  - A custom prompt template
  - The ChatOpenAI language model
  - A conversation memory buffer
  - FAISS vector store for efficient document retrieval
- User Interaction: Ask questions about your PDFs through the Streamlit interface.
- Answer Generation: The `handle_userinput()` function processes queries, retrieves relevant information, and crafts responses using the conversation chain.
- Source Attribution: Get source information with each response to trace back the data origins.
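
The chunking and source-attribution steps above can be sketched without any third-party libraries. The snippet below is a minimal illustration, not Erudite's actual code: the `Chunk` class, the `chunk_with_sources` helper, its `chunk_size` and `overlap` parameters, and the sample file names are all hypothetical. It splits already-extracted text into overlapping character windows and tags each chunk with the file it came from, so an answer can later point back to its origin.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # name of the PDF the text came from
    start: int   # character offset of the chunk within that document
    text: str


def chunk_with_sources(docs, chunk_size=1000, overlap=200):
    """Split each document's text into overlapping character windows.

    docs maps a file name to its already-extracted text. This is a
    hypothetical helper for illustration; the defaults are assumptions.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for source, text in docs.items():
        for start in range(0, len(text), step):
            chunks.append(Chunk(source, start, text[start:start + chunk_size]))
            if start + chunk_size >= len(text):
                break  # this window already reached the end of the text
    return chunks


# Sample input standing in for text extracted from two uploaded PDFs.
docs = {"report.pdf": "abcdefghij" * 3, "notes.pdf": "short"}
chunks = chunk_with_sources(docs, chunk_size=12, overlap=4)
for chunk in chunks:
    print(chunk.source, chunk.start, repr(chunk.text))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice. Keeping `source` and `start` alongside the text is one simple way to support source attribution later in the pipeline.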
- Multi-PDF Uploads: Seamlessly handle multiple PDFs 📄
- Advanced Text Analysis: Extract and analyze document content 🔍
- Conversational Q&A: Ask detailed questions and get precise answers 💬
- Source Attribution: Track the origins of information 📚
- User-Friendly Interface: Enjoy a smooth experience with our Streamlit app 🖥️
- Streamlit: For creating an interactive user interface
- LangChain: For building the conversational AI pipeline
- OpenAI: For generating embeddings and powering the language model
- ChatGroq: For an alternative language model
- FAISS: For fast similarity search and clustering of vectors
- Hugging Face: For open-source embeddings and additional AI tools
- PyPDF2: For extracting text from PDFs
- Alternative Tools: Open-source embeddings from Hugging Face, ChatGroq, Llama, PaLM, NLTK, etc.
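
To make the role of embeddings and similarity search in this stack more concrete, here is a toy stand-in that runs on the Python standard library alone. It is only a sketch under simplifying assumptions: word-count vectors replace real OpenAI or Hugging Face embeddings, a brute-force scan replaces a FAISS index, and the names `embed`, `cosine`, and `top_k` are hypothetical rather than taken from Erudite.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


# Made-up chunks standing in for text split from uploaded PDFs.
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The warranty covers parts and labor for two years.",
    "Late invoices incur a 2% monthly fee.",
]
results = top_k("When are invoices due?", chunks, k=2)
print(results)
```

In a real pipeline the retrieved chunks would be passed to the language model as context for the answer. A dedicated index such as FAISS matters once there are many chunks, because scanning every vector for every question becomes slow.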
Contributions are welcome! Feel free to fork the repository, make improvements, and submit pull requests to enhance Erudite’s capabilities.
Erudite is developed and maintained by Pramod Koujalagi. Connect with me to provide feedback, suggestions, or ideas for future enhancements.
Elevate your PDF interactions with Erudite and experience a new level of document engagement! 🌟📚
This project is licensed under the MIT License - see the LICENSE file for details.