This project contains code for updating a Pinecone index with data from PDF documents and for answering questions by querying the Pinecone vector store and passing the matches to GPT-3.5.
Before running the project, make sure you have the following:
- Node.js installed on your machine
- Pinecone API credentials (API key, environment, and index name)
- OpenAI API credentials
- Clone the project repository to your local machine.
- Install the required dependencies by running the following command in the project directory:
npm install
- Create a .env file in the project directory and add the following environment variables:
OPENAI_API_KEY=<your-openai-api-key>
PINECONE_API_KEY=<your-pinecone-api-key>
PINECONE_ENVIRONMENT=<your-pinecone-environment>
PINECONE_INDEX=<your-pinecone-index-name>
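A missing or empty variable usually surfaces later as an opaque API error, so it can help to check the four values before any request is made. The following is a minimal sketch, not part of the project code: the helper names `findMissingEnvVars` and `assertEnvIsComplete` are hypothetical, and it assumes the variables have already been loaded into `process.env` (for example by a package such as dotenv; which loader the project uses is not stated here).

```javascript
// Names taken from the .env listing above.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX",
];

// Hypothetical helper: returns the required variables that are unset or blank.
function findMissingEnvVars(env = process.env) {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Hypothetical helper: throws with a readable message if anything is missing.
function assertEnvIsComplete(env = process.env) {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnvIsComplete()` at the top of a script would stop it early with a list of the variables that still need to be set.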
To use the code files in this project, follow the instructions below:
npm run query <pdf-file-path> <question>
- Replace <pdf-file-path> with the path to the PDF file you want to add to the Pinecone index. The code splits the PDF into chunks, embeds the chunks using OpenAI's Embedding endpoint, and updates the Pinecone index with the generated vectors.
- Replace <question> with the question you want to ask. The code queries the Pinecone vector store with the question, retrieves the top matches, and, if any matches are found, asks GPT-3.5 for the answer.
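The two steps above (chunking the document, then retrieving the closest chunks for a question) can be illustrated locally without any API calls. This is a rough sketch under stated assumptions, not the project's implementation: the function names, the chunk size, and the overlap are all hypothetical, and the similarity search is done in memory with cosine similarity, whereas in the project Pinecone performs the search on the server using whatever metric the index was created with.

```javascript
// Split text into fixed-size chunks that overlap, so a sentence cut at a
// boundary still appears whole in one of the neighbouring chunks.
// The default sizes are illustrative, not the project's settings.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK records most similar to the query vector, best first.
// Each record is assumed to look like { id, values }.
function topMatches(queryVector, records, topK = 3) {
  return records
    .map((record) => ({
      ...record,
      score: cosineSimilarity(queryVector, record.values),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

In the real pipeline the vectors would come from the embedding endpoint and the text of the top matches would be passed to GPT-3.5 as context for the question.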
Here are some sample queries you can run with the provided PDFs:
- This query will search for the author(s) of the study in the "quantom-computation.pdf" PDF.
npm run query /docs/quantom-computation.pdf "who wrote this study?"
- This query will provide an explanation of quantum computation suitable for a beginner, based on the "quantom-computation.pdf" PDF.
npm run query /docs/quantom-computation.pdf "Explain quantum computation to a first timer"
- This query will ask what YouTube is, based on the content of the "youtube.pdf" PDF.
npm run query /docs/youtube.pdf "what is YouTube?"
- This query will retrieve the references listed in the "socialmedia.pdf" PDF.
npm run query /docs/socialmedia.pdf "What references are listed in this paper?"
- This query will provide the definition of social media based on the content of the "socialmedia.pdf" PDF.
npm run query /docs/socialmedia.pdf "What is the definition of social media?"
- This query will retrieve the names of the authors of the paper mentioned in the "Towards_Data_Science.pdf" PDF.
npm run query /docs/Towards_Data_Science.pdf "Who are the authors for this paper?"
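The sample commands wrap each question in quotes so that the shell passes it to the script as a single argument. As a minimal sketch of how a script could read the two positional arguments, assuming they arrive through `process.argv` (the function name `parseQueryArgs` is hypothetical and is not taken from the project code):

```javascript
// Hypothetical helper: extracts the PDF path and the question from argv.
function parseQueryArgs(argv) {
  // argv[0] is the node binary and argv[1] is the script path,
  // so user-supplied arguments start at index 2.
  const [pdfPath, ...questionParts] = argv.slice(2);
  // Joining the remaining parts tolerates a question typed without quotes.
  const question = questionParts.join(" ").trim();
  if (!pdfPath || question === "") {
    throw new Error("Usage: npm run query <pdf-file-path> <question>");
  }
  return { pdfPath, question };
}
```

A script would typically call `parseQueryArgs(process.argv)` once at startup and exit with the usage message if it throws.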