Improve recall speed #175432
Pinging @elastic/obs-knowledge-team (Team:obs-knowledge)
Steps to set up search connector
Done in #176428
In #173710 and #172164 we extended the capabilities of recall by including data from Search Connectors and asking the LLM to score the documents.
In our experience this improves the relevance of the Knowledge Base hits that are included, but it comes at the cost of performance (the time it takes to perform a recall).
We would like to see if we can strike a better balance between speed and accuracy. One idea is to ask the LLM to evaluate the documents in parallel, which should make that part faster. There may be other things we can do as well, such as asking the LLM to only report back the IDs of documents above a certain score, to minimize the number of tokens being generated. A sketch of both ideas follows below.
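
Below is a minimal sketch of what parallel scoring with a score cutoff could look like. The `ChatClient` interface, the `RecalledDocument` type, the prompt wording, and the 0–7 scale are all illustrative assumptions, not the assistant's actual API:

```ts
// Minimal sketch: score recalled documents in parallel batches and ask the
// LLM to return only the IDs above a cutoff. All names here are illustrative.

interface RecalledDocument {
  id: string;
  text: string;
}

interface ChatClient {
  // Assumed generic chat-completion call that returns the raw model output.
  complete(prompt: string): Promise<string>;
}

const SCORE_THRESHOLD = 4; // hypothetical cutoff on an assumed 0–7 scale

async function scoreBatch(
  chatClient: ChatClient,
  userPrompt: string,
  batch: RecalledDocument[]
): Promise<string[]> {
  // Ask the model to return only the IDs of documents scoring above the
  // threshold, so it generates fewer output tokens than a full score list.
  const prompt = `Score each document from 0 to 7 for relevance to: "${userPrompt}".
Return ONLY the ids of documents scoring above ${SCORE_THRESHOLD}, as a JSON array of strings.

${batch.map((doc) => `id: ${doc.id}\n${doc.text}`).join('\n---\n')}`;

  const response = await chatClient.complete(prompt);
  // In practice the output would need validation; this sketch assumes valid JSON.
  return JSON.parse(response) as string[];
}

export async function scoreDocumentsInParallel(
  chatClient: ChatClient,
  userPrompt: string,
  documents: RecalledDocument[],
  batchSize = 5
): Promise<string[]> {
  // Split the top hits into batches and score the batches concurrently
  // instead of sending them all to the LLM in a single sequential call.
  const batches: RecalledDocument[][] = [];
  for (let i = 0; i < documents.length; i += batchSize) {
    batches.push(documents.slice(i, i + batchSize));
  }

  const results = await Promise.all(
    batches.map((batch) => scoreBatch(chatClient, userPrompt, batch))
  );

  return results.flat();
}
```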
UPDATE: We should also try asking the LLM to return only the index of each document rather than its actual ID, to see whether we can save a few tokens that way (see the second sketch below).
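
A sketch of the index-instead-of-ID variant, reusing the `ChatClient` and `RecalledDocument` types from the sketch above; again, the prompt and function names are assumptions:

```ts
// Minimal sketch: show the model documents numbered 0..N-1, ask it to return
// the numbers of the relevant ones, and map them back to the real IDs locally,
// so the model never has to echo potentially long document IDs.

async function scoreByIndex(
  chatClient: ChatClient,
  userPrompt: string,
  documents: RecalledDocument[]
): Promise<string[]> {
  const prompt = `Score each document for relevance to: "${userPrompt}".
Return ONLY the numeric indexes of the relevant documents as a JSON array, e.g. [0, 3].

${documents.map((doc, i) => `index: ${i}\n${doc.text}`).join('\n---\n')}`;

  const response = await chatClient.complete(prompt);
  const indexes = JSON.parse(response) as number[];

  // Map the short indexes back to the document IDs on our side.
  return indexes
    .filter((i) => i >= 0 && i < documents.length)
    .map((i) => documents[i].id);
}
```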
AC

- The end-to-end recall time is reduced (query against `search-*` + `field_caps` call, passing the top 20 hits to the LLM, the LLM generating its scores and output).
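
One rough way to verify the AC is to time each of the listed stages before and after the changes. This is only a sketch: `runSearch` and `runFieldCaps` are stand-ins for the actual recall steps, and `scoreDocumentsInParallel`, `ChatClient`, and `RecalledDocument` come from the first sketch above.

```ts
// Minimal sketch: time each stage named in the AC so the effect of the
// changes can be compared. performance.now() is available globally in Node.

async function timedStage<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  const result = await fn();
  console.log(`${label} took ${Math.round(performance.now() - start)}ms`);
  return result;
}

async function measureRecall(
  chatClient: ChatClient,
  userPrompt: string,
  runSearch: () => Promise<RecalledDocument[]>, // stand-in for the search-* query
  runFieldCaps: () => Promise<unknown> // stand-in for the field_caps call
): Promise<string[]> {
  const hits = await timedStage('search-* query', runSearch);
  await timedStage('field_caps call', runFieldCaps);

  // Pass the top 20 hits to the LLM for scoring, as described in the AC.
  return timedStage('LLM scoring', () =>
    scoreDocumentsInParallel(chatClient, userPrompt, hits.slice(0, 20))
  );
}
```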