Use base64 radix to support 5e11 to -5e11 #171

baldawar · 2024-07-26T18:12:03Z

Issue #, if available: #42

Description of changes:

This change moves from hex to base64 for numbers, which lets us support the numeric range 5e11 to -5e11. We still use hex for CIDR, so we selectively switch back to the older hex-based digit system when necessary.
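
To illustrate why a larger radix helps, here is a minimal sketch of a fixed-width, order-preserving base-64 encoding. It is not the code in this PR: the class name `Radix64Sketch`, the alphabet, and the width of 10 are assumptions chosen for illustration. The alphabet is listed in ascending ASCII order so that comparing encoded strings gives the same result as comparing the numbers.

```java
final class Radix64Sketch {
    // Hypothetical alphabet: 64 ASCII characters in ascending byte order,
    // so lexicographic order of the encoded strings matches numeric order.
    private static final String DIGITS =
            "+/0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Assumed width: 64^10 = 2^60, which is larger than 1e18.
    private static final int WIDTH = 10;

    // Encodes a non-negative value below 2^60 as exactly WIDTH digits.
    static String encode(long value) {
        if (value < 0 || value >= (1L << 60)) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        char[] out = new char[WIDTH];
        for (int i = WIDTH - 1; i >= 0; i--) {
            out[i] = DIGITS.charAt((int) (value & 63)); // lowest 6 bits
            value >>>= 6;                               // move to the next digit
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String low = encode(63L);
        String high = encode(64L);
        // Fixed width plus a sorted alphabet keeps string order equal to numeric order.
        System.out.println(low + " < " + high + " : " + (low.compareTo(high) < 0));
    }
}
```

Ten base-64 digits cover 2^60 values, where hex would need fifteen digits for the same span, so a wider numeric range fits without longer strings.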

In case it matters, 5e11 was picked intentionally: 5e11 * 1e6 is the most we can support with a Long primitive. Beyond that we trigger integer overflows. As a solution, we would either have to find a different normalization method or switch to BigDecimal. I expect BigDecimal math to carry a performance hit. I chose not to tackle these in the interest of time.
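
The overflow boundary can be checked with a small sketch. It assumes the normalization shifts a value by the bound (to make it non-negative) and then scales by 1e6. The class `LongRangeSketch` and its method are hypothetical and only illustrate the arithmetic, not the actual implementation.

```java
final class LongRangeSketch {
    // Scale factor for six decimal digits, taken from the 1e6 mentioned above.
    private static final long SCALE = 1_000_000L;

    // Assumed normalization: (bound + bound) * SCALE is the largest value produced.
    // Returns true when that value still fits in a signed 64-bit long.
    static boolean fitsInLong(long bound) {
        try {
            Math.multiplyExact(Math.addExact(bound, bound), SCALE);
            return true;
        } catch (ArithmeticException overflow) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 5e11: (5e11 + 5e11) * 1e6 = 1e18, below Long.MAX_VALUE (about 9.22e18).
        System.out.println(fitsInLong(500_000_000_000L));
        // 5e12: (5e12 + 5e12) * 1e6 = 1e19, which overflows a long.
        System.out.println(fitsInLong(5_000_000_000_000L));
    }
}
```

Under that assumption, 5e11 is the last power-of-ten step that stays inside a long, which matches the reasoning above for stopping there rather than moving to BigDecimal.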

Apologies for any silly errors in the diff. This change was a bit rushed.

Benchmark / Performance (for source code changes):

Running software.amazon.event.ruler.Benchmarks
Reading citylots2
Read 213068 events
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 11511
Events/sec: 18509.9
 Rules/sec: 129569.6
Reading citylots2
Read 213068 events
EXACT events/sec: 274926.5
WILDCARD events/sec: 187559.9
PREFIX events/sec: 273866.3
PREFIX_EQUALS_IGNORE_CASE_RULES events/sec: 265340.0
SUFFIX events/sec: 265670.8
SUFFIX_EQUALS_IGNORE_CASE_RULES events/sec: 269706.3
EQUALS_IGNORE_CASE events/sec: 238331.1
NUMERIC events/sec: 148066.7
ANYTHING-BUT events/sec: 144747.3
ANYTHING-BUT-IGNORE-CASE events/sec: 146037.0
ANYTHING-BUT-PREFIX events/sec: 155410.6
ANYTHING-BUT-SUFFIX events/sec: 153617.9
ANYTHING-BUT-WILDCARD events/sec: 157014.0
COMPLEX_ARRAYS events/sec: 32549.3
PARTIAL_COMBO events/sec: 49933.9
COMBO events/sec: 20397.1

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

baldawar · 2024-07-26T18:57:37Z

Previous benchmarks from #170 are roughly in the same ballpark. I don't see anything pointing to higher memory utilization either, but I will run the memory-specific benchmarks once again.

Running software.amazon.event.ruler.Benchmarks
Reading citylots2
Read 213068 events
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 11288
Events/sec: 18875.6
 Rules/sec: 132129.3
Reading citylots2
Read 213068 events
EXACT events/sec: 268010.1
WILDCARD events/sec: 180872.7
PREFIX events/sec: 264352.4
PREFIX_EQUALS_IGNORE_CASE_RULES events/sec: 268686.0
SUFFIX events/sec: 263046.9
SUFFIX_EQUALS_IGNORE_CASE_RULES events/sec: 274218.8
EQUALS_IGNORE_CASE events/sec: 236217.3
NUMERIC events/sec: 145141.7
ANYTHING-BUT events/sec: 141385.5
ANYTHING-BUT-IGNORE-CASE events/sec: 145339.7
ANYTHING-BUT-PREFIX events/sec: 154509.1
ANYTHING-BUT-SUFFIX events/sec: 152956.2
ANYTHING-BUT-WILDCARD events/sec: 160927.5
COMPLEX_ARRAYS events/sec: 32011.4
PARTIAL_COMBO events/sec: 49957.3
COMBO events/sec: 19639.4

timbray · 2024-07-31T00:08:14Z

src/main/software/amazon/event/ruler/ByteMachine.java

@@ -325,7 +325,7 @@ private void deleteRangePattern(Range range) {

        // when bottom byte on forkOffset position < top byte in same position, there must be matches existing


You demonstrate courage by wading into this code

It's been the best source for learning a whole lot more of the neat tricks within ruler.

timbray · 2024-07-31T00:10:48Z

src/main/software/amazon/event/ruler/ComparableNumber.java

 */
 class ComparableNumber {
+    // Use scientific notation to define the double number directly to avoid losing Precision by calculation


nice useful comment

baldawar added 5 commits July 25, 2024 22:22

support for 1e11. backing up data

f4bb22b

Adding 5e11 support. pending docs and code cleanup

e28b94d

code cleanup for digit sequence

1c2db8a

Readme update

655483a

version bump

0aa0fcf

Add more testcases for WHEN_CompareIsPassedComparableNumbers_THEN_ItO…

f7cfc23

…rdersThemCorrectly

baldawar marked this pull request as ready for review July 26, 2024 22:49

baldawar changed the title ~~[WIP] Use base64 radix to support 5e11~~ Use base64 radix to support 5e11 to -5e11 Jul 26, 2024

timbray approved these changes Jul 31, 2024

baldawar merged commit d04e3f0 into main Aug 6, 2024
3 checks passed