feat: allow updating chunk settings for the existing documents #12833

Merged (5 commits) on Jan 21, 2025

kurokobo
Contributor

Summary

This PR allows the chunk settings of existing documents to be changed via the Web GUI.

It includes the following changes:

  • API
    • Made the data_source of KnowledgeConfig optional.
    • This change is necessary because when modifying the chunk settings of existing documents, the POST to /documents will fail validation since data_source is not included.
  • Web
    • The hidden Save & Process button is now displayed.
    • Additionally, after pressing the Save button and finishing the embedding, the list of segments will be automatically updated by triggering resetList with usePathname.
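The API change above can be sketched roughly as follows. This is a minimal illustration only, not the actual Dify code: it uses a standard-library dataclass as a stand-in for `KnowledgeConfig`, and the names `KnowledgeConfigSketch`, `parse_payload`, `process_rule`, and `max_tokens` are assumptions made for the example. Only `data_source` becoming optional comes from the description above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class KnowledgeConfigSketch:
    """Hypothetical stand-in for KnowledgeConfig; not the real Dify model."""

    # Hypothetical field carrying the chunk settings being updated.
    process_rule: dict[str, Any]
    # Optional, as described in this PR: a request that only updates the
    # chunk settings of an existing document does not include data_source.
    data_source: Optional[dict[str, Any]] = None


def parse_payload(payload: dict[str, Any]) -> KnowledgeConfigSketch:
    """Build the config from a request body, tolerating a missing data_source."""
    if "process_rule" not in payload:
        raise ValueError("process_rule is required")
    return KnowledgeConfigSketch(
        process_rule=payload["process_rule"],
        data_source=payload.get("data_source"),  # None when omitted
    )


# Updating chunk settings for an existing document: no data_source in the body.
update = parse_payload({"process_rule": {"max_tokens": 500}})
```

With a required `data_source`, the same payload would be rejected during validation, which matches the failure described for the POST to /documents.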

Fixes #12794

Screenshots

[Before and after screenshots; images not preserved]

Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

Tests

  • Before updating settings
  • Change chunk length to 500 and save
  • Start embedding
  • Updated chunks are displayed without refreshing browser

@kurokobo kurokobo marked this pull request as draft January 17, 2025 17:16
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Jan 17, 2025
@kurokobo
Contributor Author

Will fix the lint errors and tests later 😃

@dosubot dosubot bot added the 💪 enhancement New feature or request label Jan 17, 2025
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Jan 17, 2025
@kurokobo kurokobo marked this pull request as ready for review January 17, 2025 17:55
@kurokobo
Contributor Author

@douxc @WTW0313
Marked as ready for review. Please let me know if you have any concerns about these changes. Thanks!

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. size:M This PR changes 30-99 lines, ignoring generated files. labels Jan 17, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jan 21, 2025
@crazywoola crazywoola merged commit 3defd24 into langgenius:main Jan 21, 2025
7 checks passed
@WTW0313
Collaborator

WTW0313 commented Jan 21, 2025

> @douxc @WTW0313 Marked as ready for review. Please let me know if you have any concerns about these changes. Thanks!
#12794 (comment)

@kurokobo Actually, the missing Save & Process button was a bug. We intended to disable all settings except Chunk Settings and allow users to modify the chunk settings. Thank you for the fix. 🫡

@kurokobo kurokobo deleted the save_step_two branch January 21, 2025 03:41
Scorpion1221 added a commit to yybht155/dify that referenced this pull request Jan 21, 2025
* commit '6db3ae9b8ec2f8491e2c9355056a8693ecd67f47': (22 commits)
  chore: remove webapp ga (langgenius#12909)
  fix: variable panel scrollable (langgenius#12769)
  fix: OpenAI o1 Bad Request Error (langgenius#12839)
  Update deepseek model configuration (langgenius#12899)
  fix: external dataset hit test display issue(langgenius#12564) (langgenius#12612)
  add deepseek-reasoner (langgenius#12898)
  chore(fix): Invalid quotes for using Array[String] in HTTP request node as JSON body (langgenius#12761)
  fix: Issues related to the deletion of conversation_id (langgenius#12488) (langgenius#12665)
  chore(lint): fix quotes for f-string formatting by bumping ruff to 0.9.x (langgenius#12702)
  feat:Support Minimax-Text-01 (langgenius#12763)
  fix: serply credential check query might return empty records (langgenius#12784)
  feat: allow updating chunk settings for the existing documents (langgenius#12833)
  fix: SparkLite API Auth error (langgenius#12781) (langgenius#12790)
  fix: "parmas" spelling mistake. (langgenius#12875)
  Fix suggested_question_prompt (langgenius#12738)
  fix(i18n): correct typo in zh-Hant translation (langgenius#12852)
  chore: fix chinese translation for 'recall' (langgenius#12772)
  fix: DeepSeek API Error with response format active (text and json_object)  (langgenius#12747)
  feat: enhance credential extraction logic based on configurate method (langgenius#12853)
  fix: Fix rerank model switching issue (langgenius#12721)
  ...

# Conflicts:
#	api/core/tools/utils/message_transformer.py
#	api/poetry.lock
Labels: 💪 enhancement, lgtm, size:M
Linked issue (may be closed by this pull request): change and save Chunk Settings
3 participants