[FR] Vault QA Hybrid Search (I): retrieval by explicit note title #331

Closed
logancyang opened this issue Mar 5, 2024 · 11 comments

@logancyang
Owner

logancyang commented Mar 5, 2024

Right now, if you ask the AI to do something with a [[note title]] directly in Vault QA, it does not work every time. This is because retrieval relies solely on embedding search.

The solution is to parse the user message for note titles and run a full-text search for them.

  • [[note title]] retrieval: when explicit [[note title]] references are present in the user query, the mentioned notes should be included at the top of the sources list (see the sketch after this list)

Next step:

  • Salient term extraction + full-text search
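Roughly, the title-based retrieval could look like the sketch below. This is a hypothetical TypeScript illustration, not the plugin's actual code; the `Source` shape, the `lookupByTitle` helper, and the `retrieved` list from embedding search are all assumed names.

```ts
// Hypothetical sketch: not the plugin's actual implementation.
interface Source {
  title: string;
  content: string;
  score: number;
}

// Matches [[Note Title]], [[Note Title|alias]], and [[Note Title#heading]].
const WIKILINK_RE = /\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]/g;

// Pull explicit [[note title]] mentions out of the user message.
function extractNoteTitles(userMessage: string): string[] {
  const titles: string[] = [];
  for (const match of userMessage.matchAll(WIKILINK_RE)) {
    titles.push(match[1].trim());
  }
  return [...new Set(titles)];
}

// Put explicitly mentioned notes at the top of the sources list,
// ahead of whatever embedding search returned.
function prependMentionedNotes(
  userMessage: string,
  retrieved: Source[],
  lookupByTitle: (title: string) => Source | undefined
): Source[] {
  const mentioned = extractNoteTitles(userMessage)
    .map(lookupByTitle)
    .filter((s): s is Source => s !== undefined);
  const mentionedTitles = new Set(mentioned.map((s) => s.title));
  const rest = retrieved.filter((s) => !mentionedTitles.has(s.title));
  return [...mentioned, ...rest];
}
```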
@logancyang changed the title from "Vault QA Hybrid Search" to "[FR] Vault QA Hybrid Search" on Mar 5, 2024
@wwjCMP

wwjCMP commented Mar 11, 2024

I support your idea. Use the LLM to read the question and generate search keywords, then call Obsidian's full-text search, and finally run embedding search over those results. I think this is an effective solution.
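A rough sketch of that pipeline, assuming Obsidian's `prepareSimpleSearch` helper for the full-text step; `generateKeywords` and `embeddingSearch` are placeholders for an LLM call and the existing embedding retrieval, not actual plugin APIs:

```ts
import { App, TFile, prepareSimpleSearch } from "obsidian";

// Hypothetical sketch of the keyword -> full-text -> embedding flow.
async function hybridRetrieve(
  app: App,
  question: string,
  generateKeywords: (q: string) => Promise<string>,
  embeddingSearch: (q: string, candidates: TFile[]) => Promise<TFile[]>
): Promise<TFile[]> {
  // 1. Let the LLM turn the question into search keywords.
  const keywords = await generateKeywords(question);

  // 2. Full-text match over the vault using Obsidian's simple search helper.
  const search = prepareSimpleSearch(keywords);
  const candidates: TFile[] = [];
  for (const file of app.vault.getMarkdownFiles()) {
    const text = await app.vault.cachedRead(file);
    if (search(text)) {
      candidates.push(file);
    }
  }

  // 3. Rank the full-text candidates with embedding search.
  return embeddingSearch(question, candidates);
}
```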

@logancyang changed the title from "[FR] Vault QA Hybrid Search" to "[FR] Vault QA Hybrid Search (I): retrieval by explicit note title" on Mar 12, 2024
@wwjCMP

wwjCMP commented Mar 12, 2024

I think it is necessary to support Obsidian's built-in search queries. Only in this way can the two systems, LLM and PKM, be connected.

@wwjCMP

wwjCMP commented Mar 13, 2024

I have an idea: instead of sending all the retrieved fragments directly to the LLM, let the LLM score each fragment by its relevance to the question, and then use only the highly relevant fragments to answer it. This way we can down-weight irrelevant material while drawing on a larger corpus. I don't know whether this is already the current workflow.

@logancyang
Owner Author

> I have an idea: instead of sending all the retrieved fragments directly to the LLM, let the LLM score each fragment by its relevance to the question, and then use only the highly relevant fragments to answer it. This way we can down-weight irrelevant material while drawing on a larger corpus. I don't know whether this is already the current workflow.

This is actually a standard step in RAG called LLM reranking. It's a good idea, but it relies on predictable LLM behavior. If this plugin worked by calling my backend, where I set all the parameters for the users, there would not be much of a problem. But this plugin is completely local, with parameters set on the user side, which means someone may be running the reranking with a <3B local LLM that isn't fit for the task.

In short, although it's a good idea, I'd like to keep the moving parts minimal to avoid people using it the wrong way.
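For illustration, an LLM reranking step could look like the sketch below. This is not the plugin's code; `callLLM` stands in for whichever chat model the user has configured, and, as noted above, the scoring only works if that model follows the instruction reliably.

```ts
// Hypothetical illustration of an LLM reranker over retrieved fragments.
interface Fragment {
  id: string;
  text: string;
}

async function llmRerank(
  question: string,
  fragments: Fragment[],
  callLLM: (prompt: string) => Promise<string>, // placeholder for the user's chat model
  keep = 5
): Promise<Fragment[]> {
  const scored: { frag: Fragment; score: number }[] = [];
  for (const frag of fragments) {
    const prompt =
      `Question: ${question}\n\nPassage:\n${frag.text}\n\n` +
      `On a scale of 0-10, how relevant is this passage to the question? ` +
      `Answer with a single number.`;
    const raw = await callLLM(prompt);
    // Parse the first number in the response; default to 0 if the model rambles.
    const score = parseFloat(raw.match(/\d+(\.\d+)?/)?.[0] ?? "0");
    scored.push({ frag, score });
  }
  // Keep only the most relevant fragments; the rest are dropped
  // before the final answer prompt is built.
  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, keep)
    .map((s) => s.frag);
}
```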

@wwjCMP

wwjCMP commented Mar 14, 2024

At present, retrieving question-relevant content from the vault is a key issue; at least Gemini Pro does not perform very well in this regard. Do you mean this standard step is not performed when using a local model?

@logancyang
Owner Author

@wwjCMP When you say Gemini Pro does not perform very well, do you mean that when irrelevant notes are retrieved, the Gemini Pro chat model produces low-quality answers?

If I can use a fixed reranker whose quality I can trust, reranking is definitely going to help a ton. I should probably look into Cohere's new offerings like Command R.
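A hosted reranker could be wired in with something like the sketch below, which assumes Cohere's public /v1/rerank REST endpoint; the model name and response fields reflect Cohere's documented API at the time, not anything in this plugin.

```ts
// Hypothetical sketch of calling Cohere's rerank endpoint.
async function cohereRerank(
  apiKey: string,
  query: string,
  documents: string[],
  topN = 5
): Promise<{ index: number; relevanceScore: number }[]> {
  const res = await fetch("https://api.cohere.ai/v1/rerank", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "rerank-english-v2.0", // assumed model name; check Cohere's docs
      query,
      documents,
      top_n: topN,
    }),
  });
  const data = await res.json();
  // Each result carries the index of the original document and a relevance score.
  return data.results.map((r: any) => ({
    index: r.index,
    relevanceScore: r.relevance_score,
  }));
}
```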

@wwjCMP

wwjCMP commented Mar 14, 2024

> @wwjCMP When you say Gemini Pro does not perform very well, do you mean that when irrelevant notes are retrieved, the Gemini Pro chat model produces low-quality answers?
>
> If I can use a fixed reranker whose quality I can trust, reranking is definitely going to help a ton. I should probably look into Cohere's new offerings like Command R.

It mostly refuses to answer because it believes that the provided content is irrelevant to the question.

@wwjCMP

wwjCMP commented Mar 15, 2024

I have found that when using an LLM to explore the vault, excluding irrelevant information is more effective than providing relevant information.

@logancyang
Owner Author

@wwjCMP Great observation.

In the near term I'm going to make the "similarity threshold" a user setting so you can tune it up to exclude irrelevant docs. Since the right threshold can differ between embedding models, users will need to experiment a bit.

This is the downside of customizability: I don't have much control over how users configure it, so it requires more know-how on their part. If instead I provided a fixed setting and a fixed provider, I could control how it works much better.
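The proposed setting could amount to a filter like the sketch below (hypothetical; the `ScoredChunk` shape and the example cutoff are assumptions, and the right cutoff depends on the embedding model):

```ts
// Hypothetical sketch of a user-tunable similarity threshold:
// drop any retrieved chunk whose similarity to the query falls below the cutoff.
interface ScoredChunk {
  text: string;
  similarity: number; // e.g. cosine similarity between query and chunk embeddings
}

function filterBySimilarity(
  chunks: ScoredChunk[],
  threshold: number // user setting, e.g. 0.4; varies by embedding model
): ScoredChunk[] {
  return chunks.filter((c) => c.similarity >= threshold);
}
```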

@wwjCMP

wwjCMP commented Mar 22, 2024

> @wwjCMP Great observation.
>
> In the near term I'm going to make the "similarity threshold" a user setting so you can tune it up to exclude irrelevant docs. Since the right threshold can differ between embedding models, users will need to experiment a bit.
>
> This is the downside of customizability: I don't have much control over how users configure it, so it requires more know-how on their part. If instead I provided a fixed setting and a fixed provider, I could control how it works much better.

https://github.com/logancyang/obsidian-copilot/releases/tag/2.5.2
Now we can specify notes in Vault QA through [[ ]]. I wonder if we can also exclude notes through [[ ]]?

@wwjCMP

wwjCMP commented Apr 16, 2024

> @wwjCMP When you say Gemini Pro does not perform very well, do you mean that when irrelevant notes are retrieved, the Gemini Pro chat model produces low-quality answers?
>
> If I can use a fixed reranker whose quality I can trust, reranking is definitely going to help a ton. I should probably look into Cohere's new offerings like Command R.

Is it possible to use a local rerank model? I'm not sure whether Ollama currently supports rerank models.
