parallel iteration detail log bug report #11117

JungHyunMoon · 2024-11-26T06:32:30Z

Self Checks

This is only for bug report, if you would like to ask a question, please head to Discussions.
I have searched for existing issues, including closed ones.
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.

Dify version

0.12

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Receive multiple images from the user.
Use the OCR API to extract text from each image.
Use the extracted text to perform a knowledge-based search to retrieve related documents.
Retrieve 5 related documents (Top-k = 5).
Perform masking and relevance judgment (isRelevant) on each of the retrieved documents.
In the execution log, the first iteration node should show 3 iterations (one per image) and the second iteration node should show 5 iterations (one per retrieved document).
However, in the detailed tracing logs, the second iteration node shows 8 iterations: the OCR API call count from the first node (3) is added to its own 5, giving an incorrect total.

✔️ Expected Behavior

The OCR API should be called once for each input image.
After searching for related documents, 5 documents should be retrieved (Top-k = 5).
The log should show 3 iterations for the 3 images, and 5 iterations for the 5 retrieved documents.

❌ Actual Behavior

In the detailed tracing logs, the second iteration incorrectly adds the OCR API call count (3), resulting in a total of 8 iterations instead of 5 for document processing.
This discrepancy does not appear in the preview logs, but it is visible in the tracing logs.
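
The arithmetic behind the reported numbers can be sketched as follows. This is only an illustration of one way such a total could arise, not Dify's actual tracing code: the event dictionaries, the node ids `ocr_iteration` and `doc_iteration`, and both helper functions are hypothetical. A counter shared across iteration nodes reports 8 for the second node, while a counter keyed by node reports 3 and 5.

```python
from collections import Counter

# Hypothetical trace events: one per completed iteration step.
# "ocr_iteration" and "doc_iteration" are made-up node ids.
events = (
    [{"node_id": "ocr_iteration", "index": i} for i in range(3)]
    + [{"node_id": "doc_iteration", "index": i} for i in range(5)]
)


def count_shared(events):
    """Buggy variant: one running counter shared by all iteration nodes."""
    seen = 0
    totals = {}
    for event in events:
        seen += 1
        totals[event["node_id"]] = seen  # carries over the earlier node's count
    return totals


def count_per_node(events):
    """Intended variant: count iterations separately for each node."""
    return dict(Counter(event["node_id"] for event in events))


print(count_shared(events))    # {'ocr_iteration': 3, 'doc_iteration': 8}
print(count_per_node(events))  # {'ocr_iteration': 3, 'doc_iteration': 5}
```

Keying the count by node id is the design point here: each iteration node's total then stays independent of whatever ran before it.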

dosubot · 2024-11-26T06:33:49Z

The issue you're experiencing with the iteration count in the detailed tracing logs seems to be related to how iteration metadata is being handled and displayed. In Dify, iteration counts are managed through events and metadata, specifically using the IterationNode class, which generates events like IterationRunStartedEvent and IterationRunNextEvent to track iteration progress ^[1].

The problem might be occurring due to incorrect handling of iteration metadata, where the OCR API call count is being incorrectly added to the document processing iteration count. This could be a bug in how the iteration index is being updated or managed in parallel execution paths.

To address this, you might want to check the implementation of the IterationNode class and how it handles event metadata, particularly focusing on the _handle_event_metadata method and the VariablePool management. Ensuring that the iteration index is correctly reset or managed between different stages (OCR and document processing) could resolve the discrepancy in the iteration count.

To continue talking to Dosu, mention @dosu.

JungHyunMoon · 2024-11-26T06:40:33Z

@dosu
The max parallelism option is set to 2 for the first iteration node and 3 for the second.

An additional question: the YML file is too large and contains private information, so how can I get it to the author?

dosubot · 2024-11-26T06:42:49Z

The max parallelism settings you've mentioned (2 for the iteration before and 3 for the iteration after) could be affecting how iterations are processed and logged. When parallelism is enabled, iterations can run concurrently, which might lead to complexities in how iteration counts are tracked and logged, especially if there are issues with synchronization or metadata handling ^[1].

Regarding sharing YML files that are large and contain private information, here are some best practices:

Anonymize Data: Remove or obfuscate any sensitive information before sharing the files. This can include replacing private data with placeholders or generic data.
Use Secure Channels: Share the files through secure, encrypted channels to ensure that they are not intercepted or accessed by unauthorized parties.
Limit Access: Only share the files with individuals who absolutely need access, and ensure that they understand the sensitivity of the information.
File Encryption: Encrypt the YML files before sharing them. This adds an additional layer of security, ensuring that only those with the decryption key can access the contents.
Version Control: If possible, use a version control system with access controls to manage who can view or edit the files.

These steps can help you securely share YML files while protecting private information.
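
As a rough illustration of the "Anonymize Data" step, the sketch below replaces the values of a few sensitive keys in YAML text before it is shared. It is a line-based approximation that uses only the standard library and does not parse YAML, so multi-line values and nested secrets are not handled. The key names in `SENSITIVE_KEYS`, the function `redact_yaml_text`, and the sample content are all assumptions made for the example, not anything taken from a real Dify export.

```python
import re

# Hypothetical key names to treat as sensitive; adjust for the real file.
SENSITIVE_KEYS = ("api_key", "password", "token", "secret")

# Matches single-line `key: value` pairs (optionally a list item) for the keys above.
_PATTERN = re.compile(
    r"^([ \t]*(?:-[ \t]+)?(?:%s)[ \t]*:[ \t]*).+$"
    % "|".join(map(re.escape, SENSITIVE_KEYS)),
    re.IGNORECASE | re.MULTILINE,
)


def redact_yaml_text(text: str, placeholder: str = "REDACTED") -> str:
    """Replace the values of sensitive single-line `key: value` pairs."""
    return _PATTERN.sub(lambda m: m.group(1) + placeholder, text)


sample = """\
app:
  name: demo-workflow
  api_key: sk-1234567890
  nodes:
    - type: http-request
      token: abcdef
"""

print(redact_yaml_text(sample))
```

The redacted output should still be reviewed by hand before sharing, since a key list like this one cannot anticipate every field that holds private data.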


laipz8200 · 2024-11-26T07:46:04Z

Did any errors occur during the execution process, or are the errors only shown in the logs?

JungHyunMoon · 2024-11-26T07:58:12Z

@laipz8200
I suspect it only happens in the logs.

I have attached a screenshot in English for reference.

dosubot bot added the 🐞 bug Something isn't working label Nov 26, 2024

laipz8200 assigned Nov1c444 Nov 26, 2024

laipz8200 removed the 🐞 bug Something isn't working label Nov 26, 2024

Nov1c444 mentioned this issue Nov 27, 2024

fix: Incorrect iteration log display in workflow with multiple parallel mode iteartaion nodes #11158

Merged

5 tasks

Nov1c444 closed this as completed in #11158 Nov 27, 2024
