
[Tiered Caching] Performance improvement for TieredCaching #13989

Closed
sgup432 opened this issue Jun 5, 2024 · 2 comments · Fixed by #16047
Assignees
sgup432
Labels
enhancement · Search:Performance · v2.18.0

Comments

@sgup432
Contributor

sgup432 commented Jun 5, 2024

Is your feature request related to a problem? Please describe

The current implementation of TieredCaching.computeIfAbsent() performs poorly, which we discovered while running some performance benchmarks. Before we delve into the root cause, let me elaborate on the current implementation.

Current implementation:

As of now, the TieredCaching implementation has an on-heap and a disk tier which are individually thread safe. But we need to ensure that the TieredCache as a whole is thread safe and doesn't allow any key to be cached in both tiers.

So we implemented a single global read/write lock (link) which ensures that any write operation on the TieredCache is performed by only one thread at a time, while multiple readers can access either cache via the read lock.

Performance issues with the current approach:

  1. Since we have a single read/write lock, the TieredCache effectively behaves as a single-segment cache and doesn't provide the write-throughput benefits of a segmented cache (unlike the underlying on-heap cache, which has 256 segments). Because of this, only one thread can write to the cache at a time, which impacts write/read throughput.
  2. In case of a cache miss, we recompute the query under the write lock (link), which makes the other threads wait and makes performance much worse, as sketched below.
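
For illustration, here is a minimal sketch of the pattern described above. This is not the actual OpenSearch code; the tier accessors and the loader are hypothetical placeholders.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

class GlobalLockTieredCacheSketch<K, V> {
    // Single global lock guarding both tiers, as described above.
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    V computeIfAbsent(K key, Function<K, V> loader) {
        lock.readLock().lock();
        try {
            V value = getFromHeapOrDiskTier(key); // placeholder tier lookup
            if (value != null) {
                return value;
            }
        } finally {
            lock.readLock().unlock();
        }
        lock.writeLock().lock(); // only one writer at a time across the whole cache
        try {
            V value = getFromHeapOrDiskTier(key); // re-check after acquiring the write lock
            if (value == null) {
                // Problem: the expensive query recompute runs while holding the
                // write lock, so every other reader and writer is blocked.
                value = loader.apply(key);
                putOnHeapTier(key, value); // placeholder tier write
            }
            return value;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Placeholders standing in for the real tier accessors.
    private V getFromHeapOrDiskTier(K key) { return null; }
    private void putOnHeapTier(K key, V value) { }
}
```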

Describe the solution you'd like

To address the above performance issues:

1. First, we need to move the query recompute part (during a cache miss) outside the write lock in computeIfAbsent.

This can probably be achieved as below (reusing Cache.java logic); see the sketch after this list.

  • Create a concurrent hashmap of Map<Key, CompletableFuture>. This will temporarily hold in-flight queries for a short duration.
  • Multiple requests for the same key A land in computeIfAbsent. Key A is not present in the cache yet.
  • Only one thread will succeed in putting a future object in the map for key A; the rest will reuse the same future.
  • The thread that succeeded in putting the future in the map will load the query. The rest of the threads will wait via future.get().
  • Once the query is loaded, that thread will put the item into the on-heap cache via a put call. The rest of the threads will return the value.
  • Finally, remove the key from the map.
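
The sketch below illustrates this flow. It assumes the on-heap tier is exposed as a plain ConcurrentMap and the query load as a plain Function; both are placeholders rather than the real OpenSearch interfaces.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

class LoadDeduperSketch<K, V> {
    // Temporarily holds in-flight loads so each key is computed only once.
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    V computeIfAbsent(K key, Function<K, V> loader, ConcurrentMap<K, V> onHeapCache) {
        V cached = onHeapCache.get(key);
        if (cached != null) {
            return cached;
        }
        CompletableFuture<V> future = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, future);
        if (existing != null) {
            // Another thread is already loading this key; wait for its result
            // (join() rethrows a CompletionException if the load failed).
            return existing.join();
        }
        try {
            // Re-check in case another thread finished loading between our
            // cache lookup and winning the putIfAbsent race.
            V value = onHeapCache.get(key);
            if (value == null) {
                value = loader.apply(key);   // expensive query recompute, no global lock held
                onHeapCache.put(key, value); // only the winning thread writes to the on-heap tier
            }
            future.complete(value);
            return value;
        } catch (RuntimeException e) {
            future.completeExceptionally(e); // wake up waiters with the failure
            throw e;
        } finally {
            inFlight.remove(key);            // temporary entry is removed once the load finishes
        }
    }
}
```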

2. Avoid using a single read/write lock

Possible solutions:

  • Segmented cache: Instead of using a single read/write lock and a single segment, we can use a segmented tiered cache to improve read/write performance (see the striped-lock sketch after this list).
  • Lock per key: Considering the lock in TieredCache is there to ensure that the same key doesn't land in both caches, we can consider creating locks per key and then deleting them once the request completes.
    • There might be some overhead in creating locks frequently. (Will run performance tests.)
    • One benefit of this is that we don't really need to move the query recompute part outside the write lock, since requests for different keys will not be blocked.
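
Below is a rough sketch of the segmented (striped-lock) variant. The segment count of 256 simply mirrors the on-heap cache mentioned above, and the class is a standalone illustration rather than the actual TieredCache code.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class StripedLocksSketch<K> {
    private final ReadWriteLock[] segments;

    StripedLocksSketch(int segmentCount) { // e.g. 256, matching the on-heap cache
        segments = new ReadWriteLock[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new ReentrantReadWriteLock();
        }
    }

    ReadWriteLock lockFor(K key) {
        // Spread the hash bits and map the key to one of the segments.
        int h = key.hashCode();
        h ^= (h >>> 16);
        return segments[Math.floorMod(h, segments.length)];
    }
}

// Usage idea: instead of a single global lock, the tiered cache would acquire
// lockFor(key).writeLock() for writes, so only threads whose keys hash to the
// same segment contend with each other. The "lock per key" bullet above is the
// limiting case of this, with per-key locks created and removed on demand.
```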

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

@sgup432 sgup432 added the enhancement and untriaged labels Jun 5, 2024
@peternied
Member

[Triage - attendees 1 2 3 4 5 6 7]
@sgup432 Thanks for creating this issue with great detail, could you create a pull request to address it?

@sgup432
Contributor Author

sgup432 commented Jun 6, 2024

This will also help with the performance - #14016

@sgup432 sgup432 self-assigned this Jun 6, 2024
@sgup432 sgup432 moved this to In Progress in Performance Roadmap Jun 6, 2024
@kkhatua kkhatua added the v2.18.0 label Oct 7, 2024
@github-project-automation github-project-automation bot moved this from Now(This Quarter) to ✅ Done in Search Project Board Oct 8, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Performance Roadmap Oct 8, 2024