Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(parser): cache regex predicates with rc using router attribute #251

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nowNick
Copy link

@nowNick nowNick commented Sep 19, 2024

Description

This PR is one out of 3 proposed approaches how to optimize memory consumption for specific edge case with ATC Router. The scenario that is being considered is a Router that's defined with the same regular expression in different predicates. Currently Router does not have any ability to remember the regex passed resulting in a lot of copies of the same regex.

Approach in this PR

This PR proposes adding a special attribute to Router struct called regex_cache. It is passed down to the parser and allows the parser to either retrieve the Value::Regex from cache or create a new one and store it there. This approach does not use any singleton pattern. The upside of this solution is easy to track state - no global state. The downside is the requirement to change a lot of functions to "drill" down the regex_cache property to the place where it needs to be used.

Benchmarks

The benchmarking method was to use the commit in this PR: #253 on top of each of these PRs. The memory benchmark was done using dhat crate and performance was measured with criterion crate.

Memory consumption:

Baseline (main) Router attribute (this) Thread local (#250) Memoize (#252)
Total (human-readable) 4.54 GB 🔴 242.5 MB 🏆 242.5 MB 🏆 299.6 MB
Total (exact) 4,544,275,926 bytes in 6,762,329 blocks 242,508,176 bytes in 1,922,359 blocks 242,508,176 bytes in 1,922,359 blocks 299,608,085 bytes in 1,962,821 blocks
At t-gmax 194,867,606 bytes in 382,216 blocks 4,713,936 bytes in 41,863 blocks 4,713,936 bytes in 41,863 block 60,538,606 bytes in 72,258 blocks
At t-end 0 bytes in 0 blocks 0 bytes in 0 blocks 108,034 bytes in 168 blocks 35,671,000 bytes in 42 blocks

Performance:

Baseline (main) Router attribute (this) Thread local (#250) Memoize (#252)
660.68 ms 🔴 78.834 ms 🏆 80.695 ms 86.169 ms

Other PRs Links:

Issue reference:

KAG-3182

@nowNick nowNick force-pushed the chore/cache-regex-predicates-with-rc---simple branch from 3e66902 to 2e13e8f Compare September 19, 2024 12:45
@nowNick nowNick changed the title perf(*): cache regex predicates with rc using router attribute perf(parser): cache regex predicates with rc using router attribute Sep 19, 2024
@nowNick nowNick marked this pull request as ready for review September 19, 2024 15:17
@nowNick nowNick requested review from samugi and dndx September 19, 2024 15:19
Copy link
Contributor

@ADD-SP ADD-SP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this solution:

  1. LRU cache makes the performance less stable.
  2. The thread_local solution should save more memory for more than one router instance, this is not the typical usage of the router, and I personally don't like keeping a global state in a standalone library.

I think this approach is fine, I'd like to discuss one thing. The current implementation leaks the caching logical during the parsing, It works, but it is not as elegant as it could be for the type designing as AST should just describe the syntax. Do you have any idea to make it better?

Maybe we can just store the regex string in AST and build that later.

@bungle
Copy link
Member

bungle commented Sep 20, 2024

Do we need to take into account things like this:
https://github.com/openresty/lua-nginx-module?tab=readme-ov-file#lua_regex_cache_max_entries

Aka in Lua PCRE regex caches we have a upper bound.

@ADD-SP
Copy link
Contributor

ADD-SP commented Sep 23, 2024

Do we need to take into account things like this: https://github.com/openresty/lua-nginx-module?tab=readme-ov-file#lua_regex_cache_max_entries

Aka in Lua PCRE regex caches we have a upper bound.

@bungle

Let's say the customer configured 10 regexes, and our upper bound is 5, which means that we have to build regex on matching; it slows down the router-matching.

I think the reason OpenResty sets an upper bound is that OpenResty doesn't know the lifetime of each regex, but it doesn't want to build a regex each time, so it has to set an upper bound.

For atc-router, we know the lifetime of each regex, so the upper bound is not required.

It makes sense to build N regexes in memory if the customer configured N regexes, it is an acceptable cost and makes the matching speed more stable.

src/router.rs Show resolved Hide resolved
src/parser.rs Outdated Show resolved Hide resolved
src/parser.rs Outdated Show resolved Hide resolved
@nowNick nowNick force-pushed the chore/cache-regex-predicates-with-rc---simple branch from 2e13e8f to e78ee56 Compare September 26, 2024 17:11
@nowNick nowNick requested a review from dndx September 27, 2024 06:48
@nowNick
Copy link
Author

nowNick commented Sep 27, 2024

Hey @ADD-SP :) !
I was thinking if there's a way to refactor it but unfortunately this is the only solution that comes to my mind. We have to find the first place that can hold an object that will be shared across add_matcher calls and that's the Router. We cannot store it in ast since it creates a new instance on each add_matcher call. Actually when you think of it regex_cache shares some similar logic to fields. They both store general information about the matchers that are shared across add_matcher calls. The first one remembers regexes the other one predicates.

The only other approaches I saw were in the other PRs linked in the description at the top.

src/router.rs Outdated Show resolved Hide resolved
@nowNick nowNick requested a review from ADD-SP October 8, 2024 09:11
@dndx
Copy link
Member

dndx commented Oct 16, 2024

@nowNick Could you rebase this PR?

@ADD-SP
Copy link
Contributor

ADD-SP commented Oct 31, 2024

@nowNick Could you resolve merge conflicts?

@nowNick
Copy link
Author

nowNick commented Oct 31, 2024

@ADD-SP I need a little bit more time to resolve it. The changes with CPU locality optimizations are a little bit tricky to incorporate in this PR.

@nowNick nowNick force-pushed the chore/cache-regex-predicates-with-rc---simple branch from c6957bb to 5d6cf68 Compare November 21, 2024 16:13
}

fn release_cache(cir: &CirProgram, router: &mut Router) {
cir.instructions.iter().for_each(|instruction| {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since instructions are now just in an array I can simply iterate through it and release regexes when I encounter them. Previously I had to traverse the tree of AST.

@nowNick nowNick force-pushed the chore/cache-regex-predicates-with-rc---simple branch from 5d6cf68 to de4143b Compare November 21, 2024 16:17
@nowNick
Copy link
Author

nowNick commented Nov 21, 2024

I've rebased the branch to use CIR program. From what I can see the memory improvents remained the same ~19times less memory for this benchmark but what is more we achieved further CPU optimizations. From 80ms to 45ms! (46% better)

@dndx
Copy link
Member

dndx commented Nov 22, 2024

@xianghai2 @ADD-SP Could you help @nowNick review this once you have some chance? Not a release blocker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants