perf(parser): cache regex predicates with rc using router attribute #251
base: main
Force-pushed from 3e66902 to 2e13e8f.
I prefer this solution:
- An LRU cache makes performance less stable.
- The thread_local solution should save more memory when there is more than one router instance, but that is not the typical usage of the router, and I personally don't like keeping global state in a standalone library.
I think this approach is fine, but I'd like to discuss one thing. The current implementation leaks the caching logic into parsing. It works, but the type design is not as elegant as it could be, since the AST should just describe the syntax. Do you have any ideas for making it better?
Maybe we can just store the regex string in AST and build that later.
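A minimal sketch of that suggestion, assuming a separate lowering step after parsing. The names `AstValue`, `LoweredValue`, `lower`, and `CompiledRegex` are hypothetical and not taken from atc-router; `CompiledRegex` is only a stand-in for a real compiled regex type, so that the example needs nothing beyond the standard library.

```rust
/// AST value: purely syntactic, so a regex is kept as its pattern text only.
#[derive(Debug, Clone, PartialEq)]
enum AstValue {
    Str(String),
    Regex(String),
}

/// Stand-in for a compiled regex (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
struct CompiledRegex {
    source: String,
}

/// Value produced after parsing, where regexes are actually built.
#[derive(Debug, PartialEq)]
enum LoweredValue {
    Str(String),
    Regex(CompiledRegex),
}

/// Build the regex in a later step, so the parser never touches a cache.
fn lower(value: &AstValue) -> LoweredValue {
    match value {
        AstValue::Str(s) => LoweredValue::Str(s.clone()),
        AstValue::Regex(pattern) => LoweredValue::Regex(CompiledRegex {
            source: pattern.clone(),
        }),
    }
}
```

With this split, any caching could live entirely in the lowering step, and the AST types would stay free of cache handles.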
Do we need to take things like this into account? In the Lua PCRE regex caches we have an upper bound.
Let's say the customer configured 10 regexes and our upper bound is 5. That means we have to build regexes during matching, which slows down router matching. I think the reason OpenResty sets an upper bound is that OpenResty doesn't know the lifetime of each regex but doesn't want to build a regex each time, so it has to set an upper bound. For atc-router, we know the lifetime of each regex, so the upper bound is not required. It makes sense to build N regexes in memory if the customer configured N regexes; it is an acceptable cost and makes the matching speed more stable.
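To make the "no upper bound" argument concrete, here is a rough sketch of a cache with one entry per distinct pattern, shared through `Rc` and dropped once nothing else references it. This is an assumption about how such a cache could look, not the code in this PR: `RegexCache`, `get_or_compile`, `release`, and `CompiledRegex` are hypothetical names, and `CompiledRegex` stands in for a real compiled regex.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Stand-in for a compiled regex (hypothetical, for illustration only).
#[derive(Debug)]
struct CompiledRegex {
    source: String,
}

/// Unbounded cache: exactly one entry per distinct pattern in use.
#[derive(Default)]
struct RegexCache {
    entries: HashMap<String, Rc<CompiledRegex>>,
}

impl RegexCache {
    /// Return the shared regex for `source`, building it on first use.
    fn get_or_compile(&mut self, source: &str) -> Rc<CompiledRegex> {
        if let Some(hit) = self.entries.get(source) {
            return Rc::clone(hit);
        }
        let compiled = Rc::new(CompiledRegex {
            source: source.to_string(),
        });
        self.entries.insert(source.to_string(), Rc::clone(&compiled));
        compiled
    }

    /// Drop the entry once the cache is the only remaining owner.
    fn release(&mut self, source: &str) {
        let unused = self
            .entries
            .get(source)
            .map_or(false, |rc| Rc::strong_count(rc) == 1);
        if unused {
            self.entries.remove(source);
        }
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Because an entry lives exactly as long as some matcher uses it, the cache size follows the number of distinct configured patterns and nothing is ever rebuilt at match time.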
Force-pushed from 2e13e8f to e78ee56.
Hey @ADD-SP :) ! The only other approaches I saw were in the other PRs linked in the description at the top.
@nowNick Could you rebase this PR?
@nowNick Could you resolve merge conflicts?
@ADD-SP I need a little bit more time to resolve it. The changes with CPU locality optimizations are a little bit tricky to incorporate in this PR.
Force-pushed from c6957bb to 5d6cf68.
}

fn release_cache(cir: &CirProgram, router: &mut Router) {
    cir.instructions.iter().for_each(|instruction| {
Since instructions are now just in an array, I can simply iterate through it and release regexes when I encounter them. Previously I had to traverse the AST tree.
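The idea can be sketched roughly as follows. This is not the actual `release_cache` body from the diff: the `Instruction`, `Value`, `Program`, and `CompiledRegex` types and the reference-count rule are assumptions made up for illustration, with a plain `HashMap` standing in for the router's cache.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Stand-in for a compiled regex (hypothetical).
struct CompiledRegex {
    source: String,
}

/// Hypothetical value and instruction shapes, not the real CIR types.
enum Value {
    Str(String),
    Regex(Rc<CompiledRegex>),
}

enum Instruction {
    And,
    Or,
    Predicate { field: String, value: Value },
}

struct Program {
    instructions: Vec<Instruction>,
}

/// Walk the flat instruction list once and evict every regex that only
/// this program and the cache still reference (strong count of 2).
fn release_regexes(program: &Program, cache: &mut HashMap<String, Rc<CompiledRegex>>) {
    for instruction in &program.instructions {
        if let Instruction::Predicate { value: Value::Regex(rc), .. } = instruction {
            if Rc::strong_count(rc) <= 2 {
                cache.remove(&rc.source);
            }
        }
    }
}
```

A single loop over the `Vec` replaces the recursive descent that a tree-shaped AST would need, which is the simplification described above.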
Force-pushed from 5d6cf68 to de4143b.
I've rebased the branch to use the CIR program. From what I can see, the memory improvements remained the same (~19 times less memory for this benchmark), and on top of that we achieved further CPU optimizations: from 80ms to 45ms! (46% better)
@xianghai2 @ADD-SP Could you help @nowNick review this when you get a chance? Not a release blocker.
Description
This PR is one of 3 proposed approaches to optimizing memory consumption for a specific edge case with ATC Router. The scenario being considered is a Router defined with the same regular expression in different predicates. Currently, Router has no way to remember a regex it has already been given, resulting in many copies of the same regex.
Approach in this PR
This PR proposes adding a special attribute to the Router struct called regex_cache. It is passed down to the parser and allows the parser to either retrieve the Value::Regex from the cache or create a new one and store it there. This approach does not use any singleton pattern. The upside of this solution is that state is easy to track: there is no global state. The downside is the need to change a lot of functions to "drill" the regex_cache property down to the place where it needs to be used.
Benchmarks
The benchmarking method was to apply the commit from PR #253 on top of each of these PRs. Memory was measured with the dhat crate and performance with the criterion crate.
Memory consumption:
Performance:
Other PRs Links:
Issue reference:
KAG-3182