Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a cache eviction policy:Evicting cache data by the slowest markDeletedPosition #14985

Merged
merged 8 commits into from
Apr 6, 2022

Conversation

lordcheng10
Copy link
Contributor

@lordcheng10 lordcheng10 commented Apr 1, 2022

Motivation

We found a problem: even if we adjusted the cache eviction size and time a lot, there would still be some misscaches when the client restarted.The corresponding configuration is as follows:
managedLedgerCacheSizeMB=40960
managedLedgerCacheEvictionTimeThresholdMillis=1800000

The current cache eviction strategy is to evict according to the read position, so there may be a miss cache when the data is pushed again.

In our scenario, there are tens of thousands of clients, and they are deployed on k8s. In order to save resources, they may dynamically expand or shrink, so that data re-pushing will happen from time to time:
image
image

Modifications

Therefore, to improve the cache hit rate , I think a new cache eviction strategy should be added: evict data according to the markdelete position.

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 1, 2022
@lordcheng10
Copy link
Contributor Author

@eolivelli @Technoboy- @codelipenghui
PTAL,thanks!

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, especially if you have some numbers to share.

Can you please add a test case ?

cc @merlimat @rdhabalia WDTY?

@lordcheng10
Copy link
Contributor Author

Make sense, especially if you have some numbers to share.

Can you please add a test case ?

cc @merlimat @rdhabalia WDTY?

OK ,I will fix!

Copy link
Contributor

@Jason918 Jason918 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. IMO, markDeletedPosition is better than readPosition in normal case. Maybe we can use markDeletedPosition by default, and use readPosition iff overall cache usage is over some threshold (like 85%)?

@lordcheng10
Copy link
Contributor Author

lordcheng10 commented Apr 2, 2022

markDeletedPosition is better than readPosition in normal case. Maybe we can use markDeletedPosition by default.

OK, I aggree.

@aloyszhang
Copy link
Contributor

Make sense to me.
A test to cover this is better.

@lordcheng10
Copy link
Contributor Author

use readPosition iff overall cache usage is over some threshold (like 85%)?

In the scenario of message re-pushing, this way of dynamically adjusting the eviction policy is not suitable. I can reopen a discussion about dynamically adjusting the eviction policy later.

@lordcheng10
Copy link
Contributor Author

lordcheng10 commented Apr 2, 2022

Make sense. IMO, markDeletedPosition is better than readPosition in normal case. Maybe we can use markDeletedPosition by default, and use readPosition iff overall cache usage is over some threshold (like 85%)?

Fixed. PTAL,thanks! @Jason918 @eolivelli @aloyszhang
1.add unit test;
2.fix evictionPos: use markDeletedPosition.next;

@lordcheng10 lordcheng10 requested a review from eolivelli April 2, 2022 08:05
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

2 similar comments
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

7 similar comments
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

5 similar comments
@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

/pulsarbot run-failure-checks

@lordcheng10
Copy link
Contributor Author

@eolivelli PTAL,thanks!

@lordcheng10
Copy link
Contributor Author

ping

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eolivelli eolivelli merged commit 9b36dcd into apache:master Apr 6, 2022
@codelipenghui codelipenghui added this to the 2.11.0 milestone Apr 6, 2022
@codelipenghui codelipenghui added type/feature The PR added a new feature or issue requested a new feature area/broker labels Apr 6, 2022
aparajita89 pushed a commit to aparajita89/pulsar that referenced this pull request Apr 18, 2022
Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Apr 20, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 7, 2022
nicoloboschi added a commit to datastax/pulsar that referenced this pull request Jul 8, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 8, 2022
dragonls pushed a commit to dragonls/pulsar that referenced this pull request Oct 21, 2022
congbobo184 pushed a commit that referenced this pull request Nov 17, 2022
congbobo184 pushed a commit that referenced this pull request Dec 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker cherry-picked/branch-2.10 doc-not-needed Your PR changes do not impact docs release/2.10.3 type/feature The PR added a new feature or issue requested a new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants