[BUG] Cannot find highlights in the reference doc #469

Open
Ruoyu-y opened this issue Nov 6, 2024 · 0 comments
Labels
bug Something isn't working


Ruoyu-y commented Nov 6, 2024

Description

After I ask a question about a document I uploaded, the answer is relevant and accurate. However, no highlights appear on the references in the information panel, which makes it hard to find the exact passage the answer is based on.
I can also see errors like this in the log:

CitationPipeline: {"evidences":"[\"CAGRA stands for Center of Analysis and Graphics Research.\", \"It focuses on advanced research in computer graphics, visualization, and related fields.\"]"}
1 validation error for CiteEvidence
evidences
  Input should be a valid list [type=list_type, input_value='["CAGRA stands for Cente..., and related fields."]', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/list_type
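
Judging from the message, the LLM seems to return `evidences` as a JSON-encoded string instead of a list, so validation of `CiteEvidence` fails, which may be why the log ends with "Got 0 cited docs". Below is a minimal sketch of a possible workaround, assuming the arguments can be normalized before they are validated. `normalize_evidences` is a hypothetical helper I wrote for illustration, not part of Kotaemon, and I have not checked where in the citation pipeline it would need to be called.

```python
import json
from typing import Any


def normalize_evidences(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `payload` whose "evidences" value is a list of strings.

    Hypothetical helper (not Kotaemon code): it unwraps the case from the log,
    where the list arrives as a JSON-encoded string.
    """
    value = payload.get("evidences", [])
    if isinstance(value, str):
        try:
            # '["a", "b"]' -> ["a", "b"]
            value = json.loads(value)
        except json.JSONDecodeError:
            # Not JSON at all: treat the whole string as one evidence.
            value = [value]
    if not isinstance(value, list):
        value = [value]
    return {**payload, "evidences": [str(item) for item in value]}


# The exact payload printed by CitationPipeline in the log.
raw = r'{"evidences":"[\"CAGRA stands for Center of Analysis and Graphics Research.\", \"It focuses on advanced research in computer graphics, visualization, and related fields.\"]"}'

fixed = normalize_evidences(json.loads(raw))
print(type(fixed["evidences"]).__name__, len(fixed["evidences"]))
```

With the payload from my log this produces a list with two evidence strings, which is the shape the validation error says it expects.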

Any suggestions?

Reproduction steps

1. Set up Kotaemon following the guide
2. Upload your own files
3. Ask a question related to an uploaded file
4. Observe that no highlights appear in the references

Screenshots

![DESCRIPTION](LINK.png)

Logs

User-id: 1, can see public conversations: True
Session reasoning type None
Session LLM None
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x748972b54310>, FSPath=PosixPath('/home/sdp/kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x748972b56470>, get_extra_table=False, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x74894d3fef20>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x74894d3fd6c0>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x74894d3fce20>), mmr=False, rerankers=[CohereReranking(cohere_api_key='<COHERE_API_KEY>', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x748af1dd25c0>, FSPath=<theflow.base.unset_ object at 0x748af1dd25c0>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x748af1dd25c0>, VS=<theflow.base.unset_ object at 0x748af1dd25c0>, file_ids=[], user_id=<theflow.base.unset_ object at 0x748af1dd25c0>)]
searching in doc_ids ['9f0e4d1f-2f61-4f7a-8e3b-dab5ababf92f', '47f769f5-a12e-4543-9e99-9b05b2a1fd5e']
retrieval_kwargs: dict_keys(['do_extend', 'scope', 'filters'])
Number of requested results 100 is greater than number of elements in index 43, updating n_results = 43
Got 43 from vectorstore
Got 43 from docstore
Cohere API key not found. Skipping rerankings.
Got raw 10 retrieved documents
thumbnail docs 3 non-thumbnail docs 7 raw-thumbnail docs 0
retrieval step took 1.082975149154663
Got 10 retrieved documents
len (original) 24156
len (trimmed) 24156
Got 3 images
Trying LLM streaming
CitationPipeline: invoking LLM
CitationPipeline: finish invoking LLM
CitationPipeline: {"evidences":"[\"CAGRA stands for Center of Analysis and Graphics Research.\", \"It focuses on advanced research in computer graphics, visualization, and related fields.\"]"}
1 validation error for CiteEvidence
evidences
  Input should be a valid list [type=list_type, input_value='["CAGRA stands for Cente..., and related fields."]', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/list_type
LLM rerank scores [1.0, 1.0, 0.9, 0.9, 0.9, 0.9, 0.9, 0.8, 0.7, 0.7]
Got 0 cited docs

Browsers

Chrome

OS

Linux

Additional information

No response

Ruoyu-y added the bug label Nov 6, 2024