#5140: Tags merge optimization #4959

mstyura · 2024-04-10T21:11:33Z

I've observed an unexpectedly large share of CPU time spent in the Tags merge function in one of the production services that rely on Micrometer.

I've slightly refactored the internals of the Tags class to take advantage of the fact that a tag set is already stored as a sorted array, which can then be combined with another sorted set in linear time.
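To illustrate the idea, here is a minimal sketch of a linear-time merge of two key-sorted, duplicate-free arrays. It is not the code from this PR: the `SortedTagMergeSketch` class, its nested `Tag` type, and the `merge` method are hypothetical names, and the rule that the right-hand value wins on a key collision is an assumption of the sketch.

```java
import java.util.Arrays;

class SortedTagMergeSketch {

    // Hypothetical stand-in for a tag: an immutable key/value pair.
    static final class Tag {
        final String key;
        final String value;

        Tag(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + "=" + value;
        }
    }

    // Merges two arrays that are each sorted by key and contain no duplicate keys.
    // Every element is visited once, so the cost is O(left.length + right.length).
    // Assumption of this sketch: on equal keys the entry from `right` is kept.
    static Tag[] merge(Tag[] left, Tag[] right) {
        Tag[] out = new Tag[left.length + right.length];
        int i = 0;
        int j = 0;
        int n = 0;
        while (i < left.length && j < right.length) {
            int cmp = left[i].key.compareTo(right[j].key);
            if (cmp < 0) {
                out[n++] = left[i++];
            } else if (cmp > 0) {
                out[n++] = right[j++];
            } else {
                // Same key on both sides: keep the right entry, skip the left one.
                out[n++] = right[j++];
                i++;
            }
        }
        // Copy whatever remains on either side; at most one loop does any work.
        while (i < left.length) {
            out[n++] = left[i++];
        }
        while (j < right.length) {
            out[n++] = right[j++];
        }
        // Trim the unused tail only if duplicates were dropped.
        return n == out.length ? out : Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        Tag[] a = { new Tag("app", "demo"), new Tag("region", "eu") };
        Tag[] b = { new Tag("env", "prod"), new Tag("region", "us") };
        System.out.println(Arrays.toString(merge(a, b)));
    }
}
```

Compared with concatenating both arrays and then re-sorting and deduplicating the result, this approach needs no sort step when both inputs are already in order.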

I've also added a microbenchmark for the tags concat operation to see the performance gain before I can test this change in production.

Below are the results from an Apple M2 Pro machine with JDK 21:

Updated benchmark results:
Baseline (b410174 + ac33ae9):

Benchmark                               Mode  Cnt   Score   Error  Units
TagsBenchmark.dotAnd                    avgt    2  98.417          ns/op
TagsBenchmark.of                        avgt    2  45.250          ns/op
TagsBenchmark.tagsOfOrderedTagsSet10    avgt    2  69.997          ns/op
TagsBenchmark.tagsOfOrderedTagsSet2     avgt    2  22.632          ns/op
TagsBenchmark.tagsOfOrderedTagsSet4     avgt    2  32.953          ns/op
TagsBenchmark.tagsOfUnorderedTagsSet10  avgt    2  90.041          ns/op
TagsBenchmark.tagsOfUnorderedTagsSet2   avgt    2  23.264          ns/op
TagsBenchmark.tagsOfUnorderedTagsSet4   avgt    2  47.629          ns/op

Full JMH output:

> Task :micrometer-benchmarks-core:io.micrometer.benchmark.core.TagsBenchmark.main()
# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.dotAnd

# Run progress: 0.00% complete, ETA 00:05:20
# Fork: 1 of 1
# Warmup Iteration   1: 90.438 ns/op
# Warmup Iteration   2: 87.639 ns/op
Iteration   1: 87.696 ns/op
Iteration   2: 109.139 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.dotAnd":
  98.417 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.of

# Run progress: 12.50% complete, ETA 00:04:41
# Fork: 1 of 1
# Warmup Iteration   1: 46.801 ns/op
# Warmup Iteration   2: 46.672 ns/op
Iteration   1: 45.042 ns/op
Iteration   2: 45.459 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.of":
  45.250 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet10

# Run progress: 25.00% complete, ETA 00:04:01
# Fork: 1 of 1
# Warmup Iteration   1: 70.377 ns/op
# Warmup Iteration   2: 70.155 ns/op
Iteration   1: 70.001 ns/op
Iteration   2: 69.993 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet10":
  69.997 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet2

# Run progress: 37.50% complete, ETA 00:03:21
# Fork: 1 of 1
# Warmup Iteration   1: 22.845 ns/op
# Warmup Iteration   2: 22.739 ns/op
Iteration   1: 22.661 ns/op
Iteration   2: 22.604 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet2":
  22.632 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet4

# Run progress: 50.00% complete, ETA 00:02:40
# Fork: 1 of 1
# Warmup Iteration   1: 34.573 ns/op
# Warmup Iteration   2: 32.922 ns/op
Iteration   1: 32.926 ns/op
Iteration   2: 32.979 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet4":
  32.953 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet10

# Run progress: 62.50% complete, ETA 00:02:00
# Fork: 1 of 1
# Warmup Iteration   1: 96.586 ns/op
# Warmup Iteration   2: 90.373 ns/op
Iteration   1: 90.258 ns/op
Iteration   2: 89.825 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet10":
  90.041 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet2

# Run progress: 75.00% complete, ETA 00:01:20
# Fork: 1 of 1
# Warmup Iteration   1: 21.841 ns/op
# Warmup Iteration   2: 23.070 ns/op
Iteration   1: 23.304 ns/op
Iteration   2: 23.225 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet2":
  23.264 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet4

# Run progress: 87.50% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration   1: 48.443 ns/op
# Warmup Iteration   2: 47.487 ns/op
Iteration   1: 47.721 ns/op
Iteration   2: 47.536 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet4":
  47.629 ns/op


# Run complete. Total time: 00:05:21

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

NOTE: Current JVM experimentally supports Compiler Blackholes, and they are in use. Please exercise
extra caution when trusting the results, look into the generated code to check the benchmark still
works, and factor in a small probability of new VM bugs. Additionally, while comparisons between
different JVMs are already problematic, the performance difference caused by different Blackhole
modes can be very significant. Please make sure you use the consistent Blackhole mode for comparisons.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

Benchmark                     Mode  Cnt    Score   Error  Units
TagsMergeBenchmark.mergeTags  avgt    2  275.544          ns/op

Full JMH output:

> Task :micrometer-benchmarks-core:io.micrometer.benchmark.core.TagsMergeBenchmark.main()
# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsMergeBenchmark.mergeTags

# Run progress: 0.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration   1: 276.906 ns/op
# Warmup Iteration   2: 275.613 ns/op
Iteration   1: 275.369 ns/op
Iteration   2: 275.719 ns/op


Result "io.micrometer.benchmark.core.TagsMergeBenchmark.mergeTags":
  275.544 ns/op


# Run complete. Total time: 00:00:40

With optimisations (05512c2):

Benchmark                               Mode  Cnt   Score   Error  Units
TagsBenchmark.dotAnd                    avgt    2  40.889          ns/op
TagsBenchmark.of                        avgt    2  36.411          ns/op
TagsBenchmark.tagsOfOrderedTagsSet10    avgt    2  31.092          ns/op
TagsBenchmark.tagsOfOrderedTagsSet2     avgt    2   3.885          ns/op
TagsBenchmark.tagsOfOrderedTagsSet4     avgt    2   9.395          ns/op
TagsBenchmark.tagsOfUnorderedTagsSet10  avgt    2  31.206          ns/op
TagsBenchmark.tagsOfUnorderedTagsSet2   avgt    2   3.872          ns/op
TagsBenchmark.tagsOfUnorderedTagsSet4   avgt    2   9.486          ns/op

Full JMH output:

> Task :micrometer-benchmarks-core:io.micrometer.benchmark.core.TagsBenchmark.main()
# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.dotAnd

# Run progress: 0.00% complete, ETA 00:05:20
# Fork: 1 of 1
# Warmup Iteration   1: 41.712 ns/op
# Warmup Iteration   2: 41.486 ns/op
Iteration   1: 40.870 ns/op
Iteration   2: 40.907 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.dotAnd":
  40.889 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.of

# Run progress: 12.50% complete, ETA 00:04:41
# Fork: 1 of 1
# Warmup Iteration   1: 31.779 ns/op
# Warmup Iteration   2: 31.810 ns/op
Iteration   1: 31.638 ns/op
Iteration   2: 41.185 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.of":
  36.411 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet10

# Run progress: 25.00% complete, ETA 00:04:01
# Fork: 1 of 1
# Warmup Iteration   1: 31.355 ns/op
# Warmup Iteration   2: 31.382 ns/op
Iteration   1: 31.102 ns/op
Iteration   2: 31.083 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet10":
  31.092 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet2

# Run progress: 37.50% complete, ETA 00:03:21
# Fork: 1 of 1
# Warmup Iteration   1: 3.997 ns/op
# Warmup Iteration   2: 4.002 ns/op
Iteration   1: 3.895 ns/op
Iteration   2: 3.875 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet2":
  3.885 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet4

# Run progress: 50.00% complete, ETA 00:02:40
# Fork: 1 of 1
# Warmup Iteration   1: 10.041 ns/op
# Warmup Iteration   2: 9.914 ns/op
Iteration   1: 9.432 ns/op
Iteration   2: 9.357 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfOrderedTagsSet4":
  9.395 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet10

# Run progress: 62.50% complete, ETA 00:02:00
# Fork: 1 of 1
# Warmup Iteration   1: 31.330 ns/op
# Warmup Iteration   2: 31.318 ns/op
Iteration   1: 31.111 ns/op
Iteration   2: 31.302 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet10":
  31.206 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet2

# Run progress: 75.00% complete, ETA 00:01:20
# Fork: 1 of 1
# Warmup Iteration   1: 4.043 ns/op
# Warmup Iteration   2: 4.025 ns/op
Iteration   1: 3.880 ns/op
Iteration   2: 3.865 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet2":
  3.872 ns/op


# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet4

# Run progress: 87.50% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration   1: 9.849 ns/op
# Warmup Iteration   2: 9.861 ns/op
Iteration   1: 9.545 ns/op
Iteration   2: 9.426 ns/op


Result "io.micrometer.benchmark.core.TagsBenchmark.tagsOfUnorderedTagsSet4":
  9.486 ns/op


# Run complete. Total time: 00:05:21

Benchmark                     Mode  Cnt   Score   Error  Units
TagsMergeBenchmark.mergeTags  avgt    2  77.791          ns/op

Full JMH output:

> Task :micrometer-benchmarks-core:io.micrometer.benchmark.core.TagsMergeBenchmark.main()
# JMH version: 1.37
# VM version: JDK 21.0.1, OpenJDK 64-Bit Server VM, 21.0.1+12-LTS
# VM invoker: /Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java
# VM options: -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 2 iterations, 10 s each
# Measurement: 2 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: io.micrometer.benchmark.core.TagsMergeBenchmark.mergeTags

# Run progress: 0.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration   1: 78.302 ns/op
# Warmup Iteration   2: 77.982 ns/op
Iteration   1: 77.720 ns/op
Iteration   2: 77.862 ns/op


Result "io.micrometer.benchmark.core.TagsMergeBenchmark.mergeTags":
  77.791 ns/op


# Run complete. Total time: 00:00:40

Part of #5140.

mstyura · 2024-04-10T21:21:37Z

Hello @jonatan-ivanov! Could you please review this MR? Thanks a lot in advance!

mstyura · 2024-04-16T04:39:32Z

Hello @shakuzen, could you please review this merge request when you have a chance? Thank you in advance!

shakuzen · 2024-04-22T03:39:56Z

Thanks for the pull request. It's on our list of things to review.

mstyura · 2024-04-23T05:55:34Z

Thanks! Is there anything I can do to facilitate the review? For example, I could split this MR into two smaller independent MRs, one with the benchmark and a follow-up with the refactoring, if that would make it easier to review.

mstyura · 2024-05-15T20:26:26Z

Hello there! May I kindly ask what the status of the review of this PR is? Is there anything I can do to help it happen soon? Thanks a lot in advance!

shakuzen

Thank you for all the work on this. It's very appreciated. We just haven't had a chance to give it the attention it deserves yet. I'll be busy traveling for a conference through the end of next week, but I should be able to properly review and test this after that. I've left some initial quick feedback.

micrometer-core/src/main/java/io/micrometer/core/instrument/Tags.java

mstyura · 2024-09-12T14:40:53Z

Updated the benchmark results in the PR description to reflect the latest master (plus an extra benchmark) and the proposed changes.

mstyura · 2024-09-30T21:20:48Z

@shakuzen could you please take another look at this MR once you have time? Thanks a lot in advance!

shakuzen

Thanks for the improvement and added benchmarks. Sorry it took so long to get through review. We'll want to apply the same changes to KeyValues, which is essentially a copy of Tags for use with the Observation-related API. I can take care of doing this so we don't need to further block getting this merged.

mstyura · 2024-10-03T08:47:46Z

@shakuzen thank you very much for reviewing and accepting the PR!

See gh-4959. Resolves gh-5140

mstyura force-pushed the optimize-tags-merge branch from 32ced6c to dfbdde3 Compare May 23, 2024 07:19

mstyura mentioned this pull request May 23, 2024

Improve performance of merging two Tags/KeyValues instances #5140

Closed

mstyura force-pushed the optimize-tags-merge branch 4 times, most recently from 6b96b4d to e7a42c8 Compare May 23, 2024 13:45

shakuzen reviewed May 23, 2024

micrometer-core/src/main/java/io/micrometer/core/instrument/Tags.java (resolved)

micrometer-core/src/main/java/io/micrometer/core/instrument/Tags.java (outdated, resolved)

mstyura force-pushed the optimize-tags-merge branch 2 times, most recently from ad6ba40 to 18708d3 Compare September 12, 2024 13:23

mstyura requested a review from shakuzen September 12, 2024 13:33

mstyura added 2 commits September 12, 2024 16:12

Added benchmark to measure Tags.and(Tags) operation.

ac33ae9

Optimised Tags merge.

05512c2

mstyura force-pushed the optimize-tags-merge branch from 18708d3 to 05512c2 Compare September 12, 2024 14:13

mstyura changed the title ~~Tags merge optimization~~ https://github.com/micrometer-metrics/micrometer/issues/5140 Tags merge optimization Oct 1, 2024

mstyura changed the title ~~https://github.com/micrometer-metrics/micrometer/issues/5140 Tags merge optimization~~ #5140: Tags merge optimization Oct 1, 2024

shakuzen approved these changes Oct 3, 2024

shakuzen merged commit 1482cdf into micrometer-metrics:main Oct 3, 2024
6 checks passed

shakuzen added a commit that referenced this pull request Oct 4, 2024

Apply performance improvement from Tags to KeyValues

1d498f6

See gh-4959. Resolves gh-5140

chemicL mentioned this pull request Oct 4, 2024

Skip redundant tag deduplication reactor/reactor-core#3902

Merged

izeye added a commit to izeye/micrometer that referenced this pull request Nov 19, 2024

Polish micrometer-metricsgh-4959

0d6a98c

izeye mentioned this pull request Nov 19, 2024

Polish gh-4959 #5692

Merged

shakuzen pushed a commit that referenced this pull request Nov 20, 2024

Polish gh-4959 (#5692)

e074c11

chemicL mentioned this pull request Dec 5, 2024

Parallelization issues in getLowCardinalityKeyValues #4356

Closed
