Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix MessageDeduplication replay timeout cause topic loading stuck #23004

Merged

Conversation

TakaHiR07
Copy link
Contributor

@TakaHiR07 TakaHiR07 commented Jul 4, 2024

Fixes #23003

Motivation

As shown in the issue. fix the risk of topic loading stuck.

Modifications

  1. make takeSnapshot() async , and if topic loading timeout, topic close should wait until takeSnapshot finish. This would not influence the other area where using takeSnapshot()
  2. add test for this issue scene

Verifying this change

  • Make sure that the change passes the CI checks.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jul 4, 2024
@TakaHiR07 TakaHiR07 force-pushed the fix_dedup_replay_cause_topic_unavailable branch from 35df8ea to 3268d0d Compare July 4, 2024 12:54
@shibd shibd added release/3.3.1 release/3.0.6 release/3.2.4 type/bug The PR fixed a bug or issue reported a bug labels Jul 5, 2024
@shibd shibd self-requested a review July 5, 2024 01:10
@TakaHiR07 TakaHiR07 force-pushed the fix_dedup_replay_cause_topic_unavailable branch from 3268d0d to 8cb4559 Compare July 5, 2024 02:36
@codecov-commenter
Copy link

codecov-commenter commented Jul 5, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 3 lines in your changes missing coverage. Please review.

Project coverage is 73.44%. Comparing base (bbc6224) to head (155e0c3).
Report is 438 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #23004      +/-   ##
============================================
- Coverage     73.57%   73.44%   -0.13%     
- Complexity    32624    33298     +674     
============================================
  Files          1877     1910      +33     
  Lines        139502   143106    +3604     
  Branches      15299    15590     +291     
============================================
+ Hits         102638   105108    +2470     
- Misses        28908    29980    +1072     
- Partials       7956     8018      +62     
Flag Coverage Δ
inttests 27.61% <0.00%> (+3.03%) ⬆️
systests 24.70% <0.00%> (+0.37%) ⬆️
unittests 72.49% <66.66%> (-0.36%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...roker/service/persistent/MessageDeduplication.java 80.88% <66.66%> (-0.04%) ⬇️

... and 491 files with indirect coverage changes

@dao-jun dao-jun merged commit 41ef3f6 into apache:master Jul 5, 2024
51 checks passed
shibd pushed a commit that referenced this pull request Jul 6, 2024
…ding stuck (#23004)

Co-authored-by: fanjianye <fanjianye@bigo.sg>
(cherry picked from commit 41ef3f6)
shibd pushed a commit that referenced this pull request Jul 6, 2024
…ding stuck (#23004)

Co-authored-by: fanjianye <fanjianye@bigo.sg>
(cherry picked from commit 41ef3f6)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Jul 10, 2024
…ding stuck (apache#23004)

Co-authored-by: fanjianye <fanjianye@bigo.sg>
(cherry picked from commit 41ef3f6)
(cherry picked from commit 0252671)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Jul 15, 2024
…ding stuck (apache#23004)

Co-authored-by: fanjianye <fanjianye@bigo.sg>
(cherry picked from commit 41ef3f6)
(cherry picked from commit 0252671)
@lhotari
Copy link
Member

lhotari commented Jul 29, 2024

@TakaHiR07 would you mind backporting this change to branch-3.2? The PR can be applied to branch-3.2, but the test TopicDuplicationTest.testFinishTakeSnapshotWhenTopicLoading fails consistently.

@TakaHiR07
Copy link
Contributor Author

TakaHiR07 commented Jul 29, 2024

@TakaHiR07 would you mind backporting this change to branch-3.2? The PR can be applied to branch-3.2, but the test TopicDuplicationTest.testFinishTakeSnapshotWhenTopicLoading fails consistently.

@lhotari The reason is pr-22860 not in branch-3.2. Do I change the test "testFinishTakeSnapshotWhenTopicLoading" in branch-3.2, or wait until cherry-pick pr-22860?

@lhotari
Copy link
Member

lhotari commented Jul 29, 2024

@lhotari The reason is pr-22860 not in branch-3.2. Do I change the test "testFinishTakeSnapshotWhenTopicLoading" in branch-3.2, or wait until cherry-pick pr-22860?

Thanks for checking that @TakaHiR07 . I'll cherry-pick PR 22860 to branch-3.2 and check if things pass after that.

lhotari pushed a commit that referenced this pull request Jul 29, 2024
…ding stuck (#23004)

Co-authored-by: fanjianye <fanjianye@bigo.sg>
(cherry picked from commit 41ef3f6)
@lhotari
Copy link
Member

lhotari commented Jul 29, 2024

@TakaHiR07 Cherry-picking PR 22860 fixed the issue. Thanks for the guidance.

nodece pushed a commit to ascentstream/pulsar that referenced this pull request Sep 4, 2024
…ding stuck (apache#23004)

Co-authored-by: fanjianye <fanjianye@bigo.sg>

(cherry picked from commit 41ef3f6)
Signed-off-by: Zixuan Liu <nodeces@gmail.com>
@lhotari lhotari added this to the 4.0.0 milestone Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug][broker] MessageDeduplication replay timeout would cause topic loading stuck and become unavailable
6 participants