Knowledge Retrieval with external knowledge base #9142

Closed
5 tasks done
akkaz opened this issue Oct 10, 2024 · 12 comments · Fixed by #9241
Labels
🐞 bug Something isn't working

Comments


akkaz commented Oct 10, 2024

Self Checks

  • This is only for bug reports. If you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.9.1

Cloud or Self Hosted

Cloud

Steps to reproduce

I successfully connected an external knowledge base using the new feature provided in 0.9.1.

I can also test the knowledge base in the dedicated section, and it works fine.
image

When I try to insert the "Knowledge Retrieval" node into a workflow, I get this error:
image

✔️ Expected Behavior

No response

❌ Actual Behavior

No response


dosubot bot commented Oct 10, 2024

To resolve the "'NoneType' object has no attribute 'provider'" error when inserting "Knowledge Retrieval" in a workflow using Dify version 0.9.1, ensure that the reranking_model is properly initialized before accessing its provider attribute. This error likely occurs in the following section:

if node_data.multiple_retrieval_config.reranking_mode == "reranking_model":
    reranking_model = {
        "reranking_provider_name": node_data.multiple_retrieval_config.reranking_model.provider,
        "reranking_model_name": node_data.multiple_retrieval_config.reranking_model.model,
    }

Make sure that node_data.multiple_retrieval_config.reranking_model is not None and is correctly set up with the necessary attributes (provider and model). You might need to add a check or initialize it properly before this code block.

Additionally, review the _convert_to_knowledge_retrieval_node function in the workflow_converter.py file to ensure that the multiple_retrieval_config and its reranking_model are set up correctly, especially when the retrieval mode is MULTIPLE [1][2].
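The check suggested above could look roughly like the following. This is a minimal sketch only: the two dataclasses are hypothetical stand-ins for Dify's node configuration objects, `build_reranking_model` is an illustrative helper name rather than a function from the Dify codebase, and the provider and model strings in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RerankingModelConfig:  # hypothetical stand-in for the real config class
    provider: str
    model: str


@dataclass
class MultipleRetrievalConfig:  # hypothetical stand-in for the real config class
    reranking_mode: str = "reranking_model"
    reranking_model: Optional[RerankingModelConfig] = None


def build_reranking_model(config: MultipleRetrievalConfig) -> Optional[dict]:
    """Return the reranking settings, or None when no model is configured."""
    if config.reranking_mode != "reranking_model":
        return None
    model = config.reranking_model
    if model is None:
        # The switch is on but nothing is configured: skip reranking
        # instead of raising AttributeError on `.provider`.
        return None
    return {
        "reranking_provider_name": model.provider,
        "reranking_model_name": model.model,
    }
```

With this guard, `build_reranking_model(MultipleRetrievalConfig())` returns `None` instead of raising, while a config built with `RerankingModelConfig("example-provider", "example-model")` yields the usual dictionary.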


dosubot bot added the 🐞 bug (Something isn't working) label on Oct 10, 2024
@Yawen-1010

@akkaz This error may occur when the rerank model switch is turned on but no rerank model has been successfully configured. We are working on a fix. The feature can still be used normally: for now, you can either turn off the rerank model switch or configure an available rerank model.


BGFGB commented Oct 11, 2024

@akkaz This error may occur when the rerank model switch is turned on but no rerank model has been successfully configured. We are working on a fix. The feature can still be used normally: for now, you can either turn off the rerank model switch or configure an available rerank model.

In my case, the issue occurred because knowledge retrieval returned an empty value even though the response status was a successful 200. Reviewing the code, I identified a potential problem in this line:

.join(subquery, Dataset.id == subquery.c.dataset_id)

After commenting out that line, the issue was resolved.

Additionally, the documentation says that metadata should be a string (https://docs.dify.ai/zh-hans/guides/knowledge-base/external-knowledge-api-documentation), but in practice the backend treats it as a dictionary:

metadata: Optional[dict] = Field(default_factory=dict)

These inconsistencies didn't raise any errors during recall tests, but in actual workflows they caused failures, with empty data being returned.
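For anyone implementing an external knowledge API who wants to be defensive about this mismatch, the sketch below coerces a metadata value into a dictionary whether it arrives as a dict or as a JSON object string. `normalize_metadata` is a hypothetical helper written for illustration and is not part of Dify. It assumes that a string value, if sent, is meant to be JSON.

```python
import json
from typing import Any, Optional


def normalize_metadata(raw: Any) -> Optional[dict]:
    """Coerce metadata to a dict; return None when it cannot be interpreted."""
    if raw is None:
        # Mirror a `default_factory=dict` style default: missing means empty.
        return {}
    if isinstance(raw, dict):
        return raw
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            return None
        # Only a JSON object maps onto a dict; reject numbers, lists, etc.
        return parsed if isinstance(parsed, dict) else None
    return None
```

Returning `None` for uninterpretable input lets the caller decide whether to reject the record or fall back to empty metadata, rather than silently passing a string where a dictionary is expected.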

After fixing these two points, everything is functioning correctly. However, regarding the first issue, simply commenting out that line might introduce other problems, so a more robust solution may be needed.
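One plausible reading of the first issue, which this thread does not confirm, is that an inner join against a per-dataset subquery silently drops any dataset that has no rows in that subquery. The sketch below illustrates that general behaviour with SQLite from the standard library. The table names, column names, and the assumption that an external knowledge base has no locally stored documents are all hypothetical and are not taken from Dify's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE datasets (id TEXT PRIMARY KEY, provider TEXT);
    CREATE TABLE documents (id INTEGER PRIMARY KEY, dataset_id TEXT);
    INSERT INTO datasets VALUES ('internal-1', 'vendor'), ('external-1', 'external');
    -- Only the internal dataset has locally stored documents (assumption).
    INSERT INTO documents (dataset_id) VALUES ('internal-1'), ('internal-1');
    """
)

# Per-dataset document counts, analogous to a counting subquery.
SUBQUERY = (
    "SELECT dataset_id, COUNT(id) AS document_count "
    "FROM documents GROUP BY dataset_id"
)

# Inner join: datasets without a matching subquery row disappear.
inner_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT d.id FROM datasets d JOIN ({SUBQUERY}) s "
        "ON d.id = s.dataset_id ORDER BY d.id"
    )
]

# Left outer join: every dataset is kept, with NULL counts where none match.
outer_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT d.id FROM datasets d LEFT JOIN ({SUBQUERY}) s "
        "ON d.id = s.dataset_id ORDER BY d.id"
    )
]
```

Here `inner_ids` contains only `internal-1`, while `outer_ids` contains both datasets. If the same pattern applies, an outer join or a separate branch for external datasets may be a safer change than removing the join outright.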

@Yawen-1010

@JohnJyong


sepa85 commented Oct 11, 2024

For me also, enabling the rerank fixes the error, but always gives empty results.

@JohnJyong
Collaborator

The cloud service has been updated to the latest version. Please try again, thanks!


sepa85 commented Oct 12, 2024

I still get an empty result, even though the knowledge test returns results. I'm using the cloud version.

Screenshot_20241012-025710.jpg

Screenshot_20241012-025747.jpg


BGFGB commented Oct 12, 2024

The cloud service has been updated to the latest version. Please try again, thanks!

I see that your fix has been merged into the main branch, but the cloud service is running version 0.9.1-fix1, and it seems the fix hasn't been integrated yet.


BGFGB commented Oct 12, 2024

@JohnJyong I locally merged your fix branch into version 0.9.1, and so far the issue appears to be resolved. Thank you for the fix!


sepa85 commented Oct 14, 2024

After updating to version 0.9.2, I'm still encountering issues with the workflow.

Now, whenever I click "Run," I receive a "Rerank model is required" error, regardless of whether rerank is enabled or disabled.

Screenshot_20241014-231240~2.jpg

Screenshot_20241014-231235~2.jpg

Collaborator

YIXIAO0 commented Oct 15, 2024

Hello, thank you for bringing this issue to our attention. After reviewing it, it seems to be a frontend saving bug. For now, please try returning to the studio and then selecting the app. This should prevent the 'rerank model is required' issue from appearing. We'll address this bug in the next release. Thank you for your patience!


sepa85 commented Oct 15, 2024

Going to studio and back to workflow seems to "fix" the front end issue, but the result array is still empty.

Screenshot_20241015-171020.jpg

6 participants