Analyse Retriever performance #1

Open
dasgoutam opened this issue Mar 20, 2024 · 0 comments
Assignees
Labels
analysis Analyse/comparative study of features component:chat Chat Back End

Comments

@dasgoutam
Collaborator

In a standard RAG pipeline, the first two steps are usually 'Storage' and 'Retrieval':

  1. Storage - Load data from a source --> split the data into chunks --> create embeddings --> store them in a data store (vector DB)
  2. Retrieval - For a given query, retrieve the relevant chunks from the data store
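The two steps above can be sketched end to end in a few lines. This is only an illustrative sketch, not the project's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, `InMemoryStore` is a hypothetical stand-in for a vector DB, and every function name and default value here is an assumption made for the example.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=8, overlap=2):
    """Split text into chunks of `chunk_size` words that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy embedding: lowercase word counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Hypothetical stand-in for a vector DB: keeps (chunk, embedding) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((chunk, embed(chunk)))

    def retrieve(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_store(text, chunk_size=8, overlap=2):
    """Storage step: split -> embed -> store."""
    store = InMemoryStore()
    for chunk in split_into_chunks(text, chunk_size, overlap):
        store.add(chunk)
    return store
```

Calling `build_store(text).retrieve("some query", k=3)` then covers the retrieval step. Swapping `embed` or `split_into_chunks` for another implementation is the kind of change whose effect on retrieval quality this issue asks to measure.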

The performance of the retrieval process is expected to depend on the following parameters:

  • chunking strategy, chosen in accordance with the source data
  • type of embedding model used
  • retrieval mechanism used in the vector db
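One possible way to organise the comparison is to enumerate every combination of the three parameters and evaluate each in turn. The sketch below assumes a plain grid search; the chunk sizes, model names, and retrieval mechanism labels are hypothetical placeholders and do not come from the project.

```python
from itertools import product

# Hypothetical values, for illustration only.
CHUNK_SIZES = [256, 512, 1024]
EMBEDDING_MODELS = ["embedding-model-a", "embedding-model-b"]
RETRIEVAL_MECHANISMS = ["similarity", "mmr"]


def experiment_grid(chunk_sizes, embedding_models, retrieval_mechanisms):
    """Return one configuration dict per combination of the three parameters."""
    return [
        {"chunk_size": size, "embedding_model": model, "retrieval": mechanism}
        for size, model, mechanism in product(
            chunk_sizes, embedding_models, retrieval_mechanisms
        )
    ]


GRID = experiment_grid(CHUNK_SIZES, EMBEDDING_MODELS, RETRIEVAL_MECHANISMS)
```

Each entry in `GRID` describes one run. Holding two parameters fixed while varying the third makes it easier to attribute a change in performance to a single parameter.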

Using a single data source, 'muenchen-en', show whether and how retrieval performance varies when these parameters are changed, and formulate acceptable success criteria.
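The issue does not say how performance should be measured. As one possible starting point, the sketch below computes two common retrieval metrics, hit rate at k and mean reciprocal rank (MRR), over a labelled set of queries. The metric choice, the function names, and the threshold check are all assumptions for illustration; no threshold is proposed here.

```python
def hit_rate_at_k(retrieved, relevant, k):
    """Fraction of queries with at least one relevant chunk among the top k results.

    retrieved: one ranked list of chunk ids per query
    relevant:  one list of relevant chunk ids per query (ground truth)
    """
    if not retrieved:
        return 0.0
    hits = sum(1 for got, rel in zip(retrieved, relevant) if set(got[:k]) & set(rel))
    return hits / len(retrieved)


def mean_reciprocal_rank(retrieved, relevant):
    """Average of 1/rank of the first relevant chunk per query (0 if none is found)."""
    if not retrieved:
        return 0.0
    total = 0.0
    for got, rel in zip(retrieved, relevant):
        rel_set = set(rel)
        for rank, chunk_id in enumerate(got, start=1):
            if chunk_id in rel_set:
                total += 1.0 / rank
                break
    return total / len(retrieved)


def meets_criterion(score, threshold):
    """Hypothetical success check: the threshold itself is left for the analysis to decide."""
    return score >= threshold
```

Both metrics need a small ground-truth set that maps each test query to the chunks that should be retrieved, so building that set for 'muenchen-en' would be a prerequisite.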

@dasgoutam dasgoutam added the analysis Analyse/comparative study of features label Mar 20, 2024
@dasgoutam dasgoutam self-assigned this Mar 20, 2024
@svenseeberg svenseeberg added this to the Answer Retrieval milestone May 22, 2024
@svenseeberg svenseeberg added the component:chat Chat Back End label Oct 4, 2024
2 participants