DefaultRedisCacheWriter.clean() uses blocking KEYS command [DATAREDIS-1151] #1721

Closed
spring-projects-issues opened this issue May 21, 2020 · 9 comments

Labels: in: cache (RedisCache and CacheManager), type: enhancement (A general enhancement)

Comments


eyison opened DATAREDIS-1151 and commented

When I use the clean method in DefaultRedisCacheWriter, it executes the KEYS command, which is blocking. It may take a long time and cause other commands to time out.


Affects: 2.1.17 (Lovelace SR17), 2.2.7 (Moore SR7), 2.3 GA (Neumann)

Reference URL:

byte[][] keys = Optional.ofNullable(connection.keys(pattern)).orElse(Collections.emptySet()).toArray(new byte[0][]);

Issue Links:

  • DATAREDIS-1113 Use SCAN instead KEYS when clean cache
    ("is duplicated by")
  • DATAREDIS-1176 org.springframework.data.redis.cache.DefaultRedisCacheWriter#clean method may ruin performance
    ("is duplicated by")
  • DATAREDIS-1152 change clean method keys command to scan in DefaultRedisCacheWriter
    ("is duplicated by")

Referenced from: pull request #532

1 vote, 4 watchers


chengchen commented

Hello, any chance of moving this small fix forward? We would very much like to have this improvement as well.

We have configured our Redis timeout at 1s, and cleaning the cache is causing timeouts quite often.

It would be much appreciated if you could prioritize it, many thanks!


Mark Paluch commented

This is indeed a small change looking at it from a command perspective, but it introduces another level of complexity. KEYS + DEL is atomic while SCAN is not: keys added while a SCAN is in progress can be left behind in the cache. Additionally, we need to partition the keys so that we don't run into a timeout when issuing DEL with a huge number of keys. So from that perspective, it's a pretty significant change. Feel free to submit a pull request.
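
For illustration, here is a minimal sketch of the SCAN + partitioned DEL approach discussed above, written against the Spring Data Redis connection API. The class name and the fixed batch size are made up for the example, and the sketch deliberately ignores the atomicity caveat Mark raises:

```java
import java.util.ArrayList;
import java.util.List;

import org.springframework.data.redis.connection.RedisConnection;
import org.springframework.data.redis.core.Cursor;
import org.springframework.data.redis.core.ScanOptions;

// Illustrative sketch only, not the Spring Data Redis implementation.
class ScanningCacheCleaner {

    private static final int BATCH_SIZE = 1000; // made up; would need to be configurable

    void clean(RedisConnection connection, String pattern) {
        List<byte[]> batch = new ArrayList<>(BATCH_SIZE);

        // SCAN iterates the keyspace in chunks instead of blocking like KEYS does
        try (Cursor<byte[]> cursor = connection.scan(
                ScanOptions.scanOptions().match(pattern).count(BATCH_SIZE).build())) {

            while (cursor.hasNext()) {
                batch.add(cursor.next());
                if (batch.size() == BATCH_SIZE) {
                    connection.del(batch.toArray(new byte[0][])); // partitioned DEL
                    batch.clear();
                }
            }
        }

        if (!batch.isEmpty()) {
            connection.del(batch.toArray(new byte[0][])); // delete the remainder
        }
    }
}
```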


chengchen commented

Mark Paluch Thanks for the quick feedback!

 KEYS + DEL is atomic

It seems to me that each operation is atomic, but between the two operations you could still leak some entries in the situation you described for SCAN, right? So IMHO this issue is only slightly amplified with SCAN.

 Additionally, we need to partition keys to not run into a timeout when issuing DEL with a huge number of keys

This is the main issue from my perspective. Since different users configure different timeouts based on their own criteria, it's probably better to make the batch size configurable. For example, on our side we batch DELs in groups of 10k keys; that doesn't time out and yields relatively good performance.

 Feel free to submit a pull request.

There is already a PR proposed by someone, but it has an issue (it issues DEL one key at a time).


chengchen commented

Forgot to mention: in some cases it's simply not possible to load all the keys in one shot (it could consume too much memory); at least that's the case for us. That's why we are bypassing the Redis cache and removing entries ourselves. Obviously, it would be better if this could be addressed by Spring Data Redis.


chengchen commented

Mark Paluch I think this issue is not as minor as you might think :) If the dataset is big enough, you get timeouts and/or OOM errors.


shenjianeng commented

Spring Data Redis 1.5.x maintained the set of known keys, which was a good approach.


chengchen commented

Hello Mark Paluch, hope you are doing well.

I provided an example (not a definitive PR) to show what I have in mind for improving this part:

https://github.com/spring-projects/spring-data-redis/pull/547/files

Even though we hacked our own way to remove entries for the caches with large key counts, we still rely on Spring Data Redis for the caches with small key counts, and now those caches are starting to hit this issue as well... So it's quite important for us to solve this issue at the root. Hope you understand!


Mark Paluch commented

After iterating on this a few times, it makes sense to reconsider this requirement. The pull request at #532 issues a DEL per key, while https://github.com/spring-projects/spring-data-redis/pull/547 applies batching. PR 547 also shows the complexity we would need to introduce for batching.

It would make sense to have a way to configure the SCAN batch size, plus the option to retain the old KEYS behavior. Perhaps the best approach would be introducing a doWithKey method on RedisCacheWriter that accepts a BiFunction<String, RedisConnection> and is guarded with a lock (if the cache writer is a locking one).

Then we would extract the current KEYS+DEL code and the SCAN+DEL code into utility methods, and custom implementations could choose between the variants (a rough sketch of the idea follows below).
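
To make the shape of that proposal concrete, here is a rough sketch; the interface name and generics are my interpretation of the comment above, not a shipped API (note that BiFunction needs a third type parameter for the result):

```java
import java.util.function.BiFunction;

import org.springframework.data.redis.connection.RedisConnection;

// Hypothetical callback hook on the cache writer, as proposed above.
interface KeyAwareRedisCacheWriter {

    // Runs the callback with the cache name and a live connection. A locking
    // cache writer implementation would guard this call with its cache lock.
    <T> T doWithKey(String cacheName, BiFunction<String, RedisConnection, T> callback);
}
```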

@spring-projects-issues added the in: repository (Repositories abstraction) and type: enhancement (A general enhancement) labels on Dec 30, 2020
@mp911de added the in: cache (RedisCache and CacheManager) label and removed the in: repository (Repositories abstraction) label on Apr 7, 2021

mp911de commented Apr 7, 2021

A potential way out could be introducing a clean strategy that uses either KEYS or SCAN.

@mp911de mp911de assigned mp911de and unassigned christophstrobl Apr 7, 2021
mp911de added a commit that referenced this issue Apr 21, 2021
We now support a configurable BatchStrategy for RedisCache. The implementations consist of KEYS (default) and SCAN. Since SCAN is not supported with Jedis Cluster and SCAN requires a batch size, we default to KEYS.

Closes #1721.
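
For reference, the change that shipped makes the strategy selectable when building the cache writer. A minimal configuration sketch (the config class name is made up; the BatchStrategies factory methods are the ones introduced by the commit above):

```java
import org.springframework.data.redis.cache.BatchStrategies;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.cache.RedisCacheWriter;
import org.springframework.data.redis.connection.RedisConnectionFactory;

class RedisCacheConfig {

    RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        // BatchStrategies.keys() retains the old KEYS + DEL behavior (the default);
        // BatchStrategies.scan(1000) iterates with SCAN and deletes in batches of 1000.
        RedisCacheWriter cacheWriter = RedisCacheWriter.nonLockingRedisCacheWriter(
                connectionFactory, BatchStrategies.scan(1000));

        return RedisCacheManager.builder(cacheWriter).build();
    }
}
```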