Optimization: removed level of indirection for SubRuleContext. #173

Merged
merged 3 commits into from
Oct 15, 2024

Conversation

svladykin
Contributor

@svladykin svladykin commented Aug 9, 2024

Description of changes:

Use SubRuleContext directly everywhere instead of a Double id. The SubRuleContext id is now only used internally, in equals and hashCode, for backward compatibility (not sure if this is actually needed, but the tests check it). The benchmarks are not stable (they need JMH with forking, proper warmup, and multiple longer iterations for individual benchmarks), but overall I see a 3-8% throughput improvement on some benchmarks with this easy fix.
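To make the removed indirection concrete, here is a minimal sketch, not the actual event-ruler code. `IndirectionSketch`, `Context`, `resolveViaId` and `resolveDirect` are hypothetical names invented for illustration; only the idea (a boxed Double id resolved through a map versus a context object that already carries the rule name, with the id kept for equals and hashCode) comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

final class IndirectionSketch {

    /** Hypothetical stand-in for SubRuleContext: it carries the rule name itself. */
    static final class Context {
        private final long id;          // kept only for equals/hashCode
        private final Object ruleName;

        Context(long id, Object ruleName) {
            this.id = id;
            this.ruleName = ruleName;
        }

        Object getRuleName() {
            return ruleName;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Context && ((Context) o).id == id;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(id);
        }
    }

    /** Before: a boxed Double id has to be resolved through a map lookup. */
    static Object resolveViaId(Map<Double, Object> idToName, Double id) {
        return idToName.get(id);
    }

    /** After: the context object is passed around and already holds the name. */
    static Object resolveDirect(Context context) {
        return context.getRuleName();
    }

    public static void main(String[] args) {
        Map<Double, Object> idToName = new HashMap<>();
        idToName.put(1.0, "rule-1");
        Context context = new Context(1L, "rule-1");

        System.out.println(resolveViaId(idToName, 1.0)); // one hash lookup per resolution
        System.out.println(resolveDirect(context));      // plain field read
    }
}
```

Both calls resolve to the same rule name; the second one skips the boxing and the hash lookup, which is the kind of saving the PR is after.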

Using long as the id instead of double because long is typically faster and can actually produce more unique ids: every 64-bit pattern is a distinct long value, while for double this is not true (many bit patterns are NaN).
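The NaN point can be checked with the standard library alone. The sketch below is illustrative only: the class name `NanBitPatterns` and the two sample bit patterns are my own choices, not something from the patch. It relies on `Double.doubleToLongBits` collapsing every NaN to one canonical pattern, which is also what `Double.equals` and `Double.hashCode` are based on.

```java
final class NanBitPatterns {
    // Two different 64-bit patterns that both decode to NaN
    // (exponent bits all ones, non-zero mantissa).
    static final long NAN_A = 0x7ff0000000000001L;
    static final long NAN_B = 0x7ff8000000000000L;

    /** Round-trips raw bits through a double using the canonicalizing conversion. */
    static long roundTrip(long bits) {
        return Double.doubleToLongBits(Double.longBitsToDouble(bits));
    }

    public static void main(String[] args) {
        Double a = Double.longBitsToDouble(NAN_A);
        Double b = Double.longBitsToDouble(NAN_B);

        System.out.println(a.isNaN() && b.isNaN());              // true
        System.out.println(a.equals(b));                         // true: the ids collide as map keys
        System.out.println(Long.valueOf(NAN_A).equals(NAN_B));   // false: the longs stay distinct
        System.out.println(Long.toHexString(roundTrip(NAN_A)));  // 7ff8000000000000
    }
}
```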

Changed only the types and did not rename variables all over the place, because that would make the patch harder to review; renaming can be done later as needed.

Benchmark / Performance (for source code changes):

Read 213068 events
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 11459
Events/sec: 18593.9
 Rules/sec: 130157.6
Reading citylots2
Read 213068 events
EXACT events/sec: 274218.8
WILDCARD events/sec: 176526.9
PREFIX events/sec: 271078.9
PREFIX_EQUALS_IGNORE_CASE_RULES events/sec: 272465.5
SUFFIX events/sec: 260474.3
SUFFIX_EQUALS_IGNORE_CASE_RULES events/sec: 266002.5
EQUALS_IGNORE_CASE events/sec: 233883.6
NUMERIC events/sec: 145937.0
ANYTHING-BUT events/sec: 129919.5
ANYTHING-BUT-IGNORE-CASE events/sec: 142902.7
ANYTHING-BUT-PREFIX events/sec: 153507.2
ANYTHING-BUT-SUFFIX events/sec: 150365.6
ANYTHING-BUT-WILDCARD events/sec: 160322.0
COMPLEX_ARRAYS events/sec: 32078.9
PARTIAL_COMBO events/sec: 49037.5
COMBO events/sec: 20434.3

Overall the benchmark results are not conclusive across different JVM versions: the same benchmark can be noticeably better or worse, and from what I see in the main branch history they are quite flaky. Will try to improve the benchmarks in a separate PR.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@baldawar
Collaborator

hey @svladykin is this ready to review? It's been in draft state, which has me confused about the state of the PR.

- private final Map<Object, Set<Double>> nameToIds = new ConcurrentHashMap<>();
- private final Map<Double, Object> idToName = new ConcurrentHashMap<>();
+ private final Map<Object, Set<SubRuleContext>> nameToContext = new ConcurrentHashMap<>();
+ private long nextId;
Collaborator

Should this be AtomicLong ?

Contributor Author

From what I see the only place where it is called is GenericMachine.addPatternRule() under synchronized(this), so AtomicLong does not make much sense.
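As a rough illustration of that argument (a sketch under assumptions, not the project's code): when every read and write of the counter happens while holding the same monitor, a plain long is already safe and AtomicLong adds nothing. `GuardedIdCounter`, `addRule` and `runWorkers` are hypothetical names; `addRule` merely stands in for a method like addPatternRule().

```java
final class GuardedIdCounter {
    // Plain long: it is only read or written while holding the monitor of `this`.
    private long nextId;

    /** Hypothetical stand-in for a method like addPatternRule(). */
    long addRule() {
        synchronized (this) {
            return nextId++;
        }
    }

    synchronized long issued() {
        return nextId;
    }

    /** Hammers one counter from several threads and returns how many ids were issued. */
    static long runWorkers(int threads, int perThread) {
        GuardedIdCounter counter = new GuardedIdCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    counter.addRule();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.issued();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000)); // 40000: no increments are lost
    }
}
```

The guarantee only holds as long as no caller touches the counter outside the lock, which is the assumption the comment above makes about the current call sites.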

@@ -175,12 +175,11 @@ boolean isEmpty() {
* @param pattern The pattern used by the sub-rule to transition to this NameState.
* @param isTerminal True indicates that the sub-rule is using pattern to match on the final event field.
*/
- void addSubRule(final Object rule, final double subRuleId, final Patterns pattern, final boolean isTerminal) {
+ void addSubRule(final Object rule, final SubRuleContext subRuleId, final Patterns pattern, final boolean isTerminal) {
Collaborator

Wondering if, here and elsewhere in the class, rule can be fetched from subRuleId instead.

Contributor Author

This is possible, but it looks like the patch will be much larger if we do that everywhere, and I don't see much improvement from that change. Either option is fine with me.

Collaborator

I think that's an acceptable next step for this PR, though digging into JMH variances would be a better place to focus right now.

@svladykin
Contributor Author

@baldawar I did not like the benchmark numbers for this PR, so I submitted a second one with JMH benchmarks, assuming that I will be able to get more meaningful performance numbers here once the second PR is merged. This is why it is still in draft status.

@svladykin svladykin force-pushed the id-self branch 2 times, most recently from af1fb23 to 4801b92 Compare September 17, 2024 04:49
@svladykin svladykin marked this pull request as ready for review September 17, 2024 04:50
@svladykin
Contributor Author

Rebased on top of main to check JMH throughput numbers.

@baldawar
Collaborator

baldawar commented Sep 19, 2024

Odd, the perf tests are a mixed bag when compared to https://github.com/aws/event-ruler/actions/runs/10805542552/job/30018080464 . Is it because most of the time is spent in other parts of the code? (My recent tests showed ByteMachine.getTransitionOn and the code tied to the exists matcher to be quite expensive.)

@svladykin
Contributor Author

Yes, I don't see any throughput improvement either. We can either consider this patch to be just a code cleanup or close it and stop spending time on it. I'm good with either option.

I also noticed that json parsing takes around 60% of the time for simple rules, which makes me think that using custom binary format instead of json could help a lot.

@timbray
Collaborator

timbray commented Sep 19, 2024

FWIW, Quamina has a custom hand crafted JSON parser for events, and the benefits of that were huge. But didn't bother for rule parsing.

@baldawar
Collaborator

baldawar commented Sep 19, 2024

useful bookmark for the custom JSON parser :

Ruler's tests make parsing look like the worst offender because we keep adding / removing JSON entries to set up the tests. In the wild, most of the time ruler is limited by the time spent doing array consistency checks and parsing numbers. There's a fair amount of usage of exists and anything-but matchers, which need cleanup (to follow how the wildcard matcher was implemented). Optimizing JSON parsing will still be meaningful, but so far I've found most folks to be content with the current speed.

For this change, I think it's worth merging if we can add a test that shows SubRuleContext is now faster. A benchmark would be overkill; a unit test with a bunch of for-loops, similar to ComparableNumberTest.java, is good enough IMO.

@baldawar
Collaborator

baldawar commented Oct 1, 2024

@svladykin let me know if you're still working on this.

@svladykin
Contributor Author

Yes, was a bit busy lately, will catch up this week.

@svladykin
Contributor Author

Added a simple performance test which takes ~2500ms on the old code and ~1400ms on the new code.

@baldawar baldawar enabled auto-merge (squash) October 14, 2024 23:29
@baldawar baldawar merged commit d098560 into aws:main Oct 15, 2024
4 checks passed
@baldawar
Collaborator

Thanks @svladykin

3 participants