The dataset comprises 12 publicly available documents related to insurance policies and campaigns from Sigortam.net. These documents present various contextual challenges: some contain numeric values for insurance costs and fees that depend on vehicle information, and several include unstructured tables, which complicate retrieval and accurate generation in RAG-based applications.
The chunk_up_documents function processes the PDF documents in a specified directory, chunking their text into smaller, manageable segments:
- File Reading: The function iterates through all files in the given file_path, keeping only PDF files. It uses PyPDFLoader to load the content of each PDF and appends the loaded documents to a list.
- Text Splitting: A RecursiveCharacterTextSplitter is initialized with two parameters: chunk_size, the maximum size of each text chunk, and chunk_overlap, how much text from the end of one chunk overlaps with the beginning of the next. Splitting is performed on the defined separators (here, double newlines).
- Returning Chunks: Finally, the function returns the list of chunked documents for further processing or analysis.
# Chunking Methodology
import os
from typing import List

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

def chunk_up_documents(
    file_path: str,
    chunk_size: int = 1000,
    chunk_overlap: int = 100,
) -> List:
    # Load every PDF page in the directory into a single document list.
    documents = []
    for file in os.listdir(file_path):
        if file.endswith(".pdf"):
            pdf_path = os.path.join(file_path, file)
            loader = PyPDFLoader(pdf_path)
            documents.extend(loader.load())

    # Token-based splitter; separators must be a list of strings, and
    # consecutive chunks overlap by chunk_overlap tokens.
    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        separators=["\n\n"],
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
    chunked_docs = text_splitter.split_documents(documents)
    return chunked_docs
# Example Usage
document_chunks = chunk_up_documents(
    file_path="/content/rag_documents/",
    chunk_size=2000,
    chunk_overlap=0,
)
print(len(document_chunks))
FAISS was used as the vector store across all of the optimization and comparison runs.
# Example Usage
faissdb_cohere = FAISS.from_documents(document_chunks, cohere_embedding)
faissdb_cohere.save_local("faiss_cohere")
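For context, here is a minimal sketch of how the cohere_embedding object referenced above might be created and the saved index reloaded; the import paths, the embed-multilingual-v3.0 model name, and the k value are assumptions rather than the original code.

# Hypothetical setup; model name and import paths are assumptions.
from langchain_cohere import CohereEmbeddings
from langchain_community.vectorstores import FAISS

cohere_embedding = CohereEmbeddings(model="embed-multilingual-v3.0")

# Reload the persisted index and expose it as a retriever.
faissdb_cohere = FAISS.load_local(
    "faiss_cohere",
    cohere_embedding,
    allow_dangerous_deserialization=True,  # required by recent LangChain releases
)
retriever = faissdb_cohere.as_retriever(search_kwargs={"k": 4})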
RAG (Retrieval-Augmented Generation) optimization was performed as a grid search over several retrieval methodologies (Stuff, Query Step-Down, Multi-Query, Contextual Compression, and Reciprocal) and different embedding models, including ada-002 (OpenAI), cohere-v3-multilingual (Cohere), and bge-en-small (BGE). A total of 378,286 tokens (prompt and completion combined) was processed to determine which combination of RAG method and embedding model yielded the highest accuracy.
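To make the grid concrete, the following sketch shows how two of the compared strategies, Multi-Query and Contextual Compression, can be assembled in LangChain on top of the FAISS retriever from the sketch above; this is an assumed reconstruction, not the exact experiment code.

# Sketch of two retrieval strategies; assumes `retriever` from the
# earlier sketch. Model names and imports are assumptions.
from langchain_openai import ChatOpenAI
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)

# Multi-Query: the LLM rephrases the question several ways and the
# results of all variants are merged before generation.
multi_query = MultiQueryRetriever.from_llm(retriever=retriever, llm=llm)

# Contextual Compression: retrieved chunks are post-processed so that
# only the passages relevant to the query reach the prompt.
compressor = LLMChainExtractor.from_llm(llm)
compression = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever,
)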
The performance comparison was based on the evaluation dataset available here, graded by GPT-4, with the answers generated by GPT-3.5-Turbo-0125. The evaluation focused on several LLM-based metrics: Coherence, Conciseness, Contextual Accuracy, Helpfulness, and Relevance. Detailed LangEval results are available here.
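Since these metric names correspond to LangChain's built-in criteria evaluators, the GPT-4 grading step can be sketched roughly as follows; the qa_pairs variable is a hypothetical stand-in for the evaluation dataset.

# Sketch of GPT-4-graded criteria evaluation; qa_pairs is hypothetical.
from langchain_openai import ChatOpenAI
from langchain.evaluation import load_evaluator

grader = ChatOpenAI(model="gpt-4", temperature=0)

qa_pairs = [
    {"question": "...", "prediction": "...", "reference": "..."},
]

for criterion in ["coherence", "conciseness", "relevance", "helpfulness"]:
    evaluator = load_evaluator("labeled_criteria", criteria=criterion, llm=grader)
    for pair in qa_pairs:
        result = evaluator.evaluate_strings(
            input=pair["question"],
            prediction=pair["prediction"],
            reference=pair["reference"],
        )
        print(criterion, result["score"])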
The test dataset consists of frequently asked questions sourced from the Sigortam.net website. The Q&A pairs cover ad campaigns and promotions, as well as numerical values (fees, charges, etc.) that are essential for accurate response generation in a RAG system.
Embedding | RAG Method | Coherence | Conciseness | CoT Contextual Accuracy | Relevance | Helpfulness |
---|---|---|---|---|---|---|
OpenAI | Step-Down | 10.0 | 3.0 | 6.0 | 6.0 | 8.0 |
BGE | Step-Down | 12.0 | 4.0 | 8.0 | 7.0 | 11.0 |
Cohere | Step-Down | 11.0 | 3.0 | 8.0 | 7.0 | 9.0 |
BGE | Multi-Query | 12.0 | 5.0 | 9.0 | 8.0 | 12.0 |
Cohere | Reciprocal | 11.0 | 7.0 | 9.0 | 9.0 | 11.0 |
BGE | Stuff Method | 11.0 | 7.0 | 10.0 | 10.0 | 10.0 |
Cohere | Multi-Query | 12.0 | 7.0 | 10.0 | 9.0 | 10.0 |
BGE | Reciprocal | 12.0 | 6.0 | 10.0 | 10.0 | 11.0 |
OpenAI | Stuff Method | 11.0 | 7.0 | 10.0 | 7.0 | 11.0 |
OpenAI | Multi-Query | 12.0 | 5.0 | 11.0 | 8.0 | 11.0 |
OpenAI | Reciprocal | 12.0 | 6.0 | 11.0 | 8.0 | 12.0 |
BGE | Contextual Compression | 12.0 | 7.0 | 12.0 | 10.0 | 12.0 |
Cohere | Contextual Compression | 12.0 | 6.0 | 12.0 | 9.0 | 12.0 |
Cohere | Stuff Method | 12.0 | 8.0 | 12.0 | 10.0 | 11.0 |
OpenAI | Contextual Compression | 12.0 | 9.0 | 12.0 | 9.0 | 12.0 |
Embedding | RAG Method | P50 Latency (s) | P99 Latency (s) | Error Rate (%) |
---|---|---|---|---|
BGE | Stuff Method | 2.42 | 4.55 | 0.0 |
OpenAI | Stuff Method | 2.53 | 4.48 | 0.0 |
Cohere | Stuff Method | 2.68 | 4.97 | 0.0 |
OpenAI | Contextual Compression | 2.71 | 4.75 | 0.0 |
Cohere | Contextual Compression | 3.58 | 6.11 | 0.0 |
OpenAI | Reciprocal | 3.78 | 7.15 | 0.0 |
OpenAI | Multi-Query | 3.91 | 5.81 | 0.0 |
Cohere | Multi-Query | 4.72 | 12.62 | 0.0 |
Cohere | Reciprocal | 4.77 | 9.90 | 0.0 |
BGE | Reciprocal | 5.25 | 13.57 | 0.0 |
BGE | Multi-Query | 5.83 | 14.78 | 0.0 |
OpenAI | Step-Down | 6.54 | 15.80 | 17.0 |
Cohere | Step-Down | 6.79 | 10.38 | 0.0 |
BGE | Step-Down | 7.28 | 12.48 | 0.0 |
BGE | Contextual Compression | 31.08 | 40.42 | 0.0 |