Some benchmark cleanup #3

hassila · 2023-11-21T11:25:09Z

Added the ARC metric and did some overall cleanup to use the built-in support for inner loops.

mustiikhalil · 2023-11-21T11:29:17Z

tests/swift/benchmarks/Benchmarks/FlatbuffersBenchmarks/FlatbuffersBenchmarks.swift


-  Benchmark("structs") { benchmark in
-    let structCount = 1_000_000
-
+  Benchmark("Structs", configuration: kiloConfiguration) { benchmark in
+    let structCount = 1_000


Why is structCount 1_000 now?

Yeah, should have commented that, sorry. With 1M it would crash due to what looks like OOM:

* thread #2, queue = 'com.apple.root.default-qos.cooperative', stop reason = Swift runtime failure: Not enough bits to represent the passed value
    frame #0: 0x000000010012a47c FlatbuffersBenchmarks`Int.convertToPowerofTwo.getter [inlined] Swift runtime failure: Not enough bits to represent the passed value at <compiler-generated>:0 [opt]
  * frame #1: 0x000000010012a47c FlatbuffersBenchmarks`Int.convertToPowerofTwo.getter [inlined] generic specialization <Swift.UInt32, Swift.Int> of Swift.UnsignedInteger< where τ_0_0: Swift.FixedWidthInteger>.init<τ_0_0 where τ_1_0: Swift.BinaryInteger>(τ_1_0) -> τ_0_0 at <compiler-generated>:0 [opt]
    frame #2: 0x000000010012a47c FlatbuffersBenchmarks`Int.convertToPowerofTwo.getter(self=4294967296) at Int+extension.swift:31:13 [opt]
    frame #3: 0x00000001001202c8 FlatbuffersBenchmarks`ByteBuffer.Storage.reallocate(size=80, writerSize=2147483576, alignment=8, self=0x00000001005c4b70) at ByteBuffer.swift:88:27 [opt]
    frame #4: 0x0000000100120c98 FlatbuffersBenchmarks`closure #1 (Swift.UnsafeRawBufferPointer) -> () in FlatBuffers.ByteBuffer.push<τ_0_0 where τ_0_0: FlatBuffers.NativeStruct>(elements: Swift.Array<τ_0_0>) -> () at ByteBuffer.swift:371:16
    frame #5: 0x0000000100129720 FlatbuffersBenchmarks`partial apply for closure #1 in ByteBuffer.push<A>(elements:) at <compiler-generated>:0 [opt]
    frame #6: 0x0000000190348b6c libswiftCore.dylib`Swift.Array.withUnsafeBytes<τ_0_0>((Swift.UnsafeRawBufferPointer) throws -> τ_1_0) throws -> τ_1_0 + 352
    frame #7: 0x0000000100126efc FlatbuffersBenchmarks`FlatBufferBuilder.createVector<A>(ofStructs:) [inlined] FlatBuffers.ByteBuffer.push<τ_0_0 where τ_0_0: FlatBuffers.NativeStruct>(elements=<unavailable>, self=FlatBuffers.ByteBuffer @ 0x000000016fe863e0) -> () at ByteBuffer.swift:246:14 [opt]
    frame #8: 0x0000000100126ec0 FlatbuffersBenchmarks`FlatBufferBuilder.createVector<A>(structs=<unavailable>, self=FlatBuffers.FlatBufferBuilder @ 0x000000016fe863d8) at FlatBufferBuilder.swift:626:9 [opt]
    frame #9: 0x0000000100130d80 FlatbuffersBenchmarks`closure #12 in closure #1 in variable initialization expression of benchmarks(benchmark=<unavailable>, array=5 values) at FlatbuffersBenchmarks.swift:153:25 [opt]
    frame #10: 0x00000001000a51d0 FlatbuffersBenchmarks`BenchmarkExecutor.run(_:) at Benchmark.swift:344:13 [opt]
    frame #11: 0x00000001000a51b8 FlatbuffersBenchmarks`BenchmarkExecutor.run(benchmark=0x00000001006b0d80, self=0x0000000100605480) at BenchmarkExecutor.swift:53:23 [opt]
    frame #12: 0x00000001000c6988 FlatbuffersBenchmarks`BenchmarkRunner.run(self=Benchmark.BenchmarkRunner @ 0x0000000100b8c440) at BenchmarkRunner.swift:192:49 [opt]
    frame #13: 0x00000001000c2f84 FlatbuffersBenchmarks`static BenchmarkRunnerHooks.main(self=0x00000001001ad048) at BenchmarkRunner.swift:92 [opt]
    frame #14: 0x000000010015fd58 FlatbuffersBenchmarks`specialized thunk for @escaping @convention(thin) @async () -> () at <compiler-generated>:0 [opt]
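For readers following the trace: frames #0 to #2 show an Int with the value 4294967296 (2^32) being converted to UInt32 inside Int.convertToPowerofTwo, and that conversion is what traps. A minimal standalone sketch of the same narrowing, using only the standard library and assuming a 64-bit platform, might look like this. The helper name narrowToUInt32 is hypothetical and is not part of FlatBuffers.

```swift
// Hypothetical helper (not part of FlatBuffers): a checked Int -> UInt32 narrowing.
func narrowToUInt32(_ value: Int) -> UInt32? {
    // UInt32(exactly:) returns nil when the value is out of range,
    // whereas UInt32(value) would trap at runtime.
    return UInt32(exactly: value)
}

// The value reported as self= in frame #2 of the trace (2^32).
// Assumes a 64-bit Int; the literal would not fit on a 32-bit platform.
let fromTrace = 4_294_967_296

if let narrowed = narrowToUInt32(fromTrace) {
    print("fits in UInt32: \(narrowed)")
} else {
    print("\(fromTrace) does not fit in UInt32 (max is \(UInt32.max))")
}
```

This only illustrates why the value in the trace cannot be represented; how the buffer size grew that large is a separate question about the benchmark loop.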

(Previously the inner loop was a single iteration. With kiloConfiguration we run 1K inner loops, so it gets to 1M total. But I guess the real question here is what you want to measure: we are putting 1M vectors into a single fb here, so what is the intended benchmark?)

Can we keep the million but reduce the number of iterations? Or clear the buffer after each iteration?

Clearing the buffer gave a runtime of 220 seconds (plus another 220 seconds for the single warmup iteration).

I will revert it to how it was, with a single iteration; it still gives 10+ samples and a runtime of ~223 ms.

Structs
╒════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                     │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total)             │      21 │      21 │      21 │      21 │      21 │      21 │      21 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Memory (resident peak) (M) │     155 │     155 │     155 │     155 │     156 │     156 │     156 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Releases (K)               │    6000 │    6000 │    6000 │    6000 │    6000 │    6000 │    6000 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms)      │     223 │     223 │     224 │     225 │     226 │     228 │     228 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ms)     │     224 │     225 │     225 │     226 │     226 │     229 │     229 │      13 │
╘════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

But in the code above we are only measuring the time it takes to add 5 structs into the fb, right? Or are we measuring the total time it takes to add 5 structs into a buffer a million times?

That is 5 structs a million times into a buffer; for _ in benchmark.scaledIterations { runs 1M times.

Okay then, perfect!

Some benchmark cleanup

1f6bcd3

github-actions bot added the swift label Nov 21, 2023

hassila mentioned this pull request Nov 21, 2023

[Swift] Migrating benchmarks to a newer lib. google/flatbuffers#8168

Merged

mustiikhalil approved these changes Nov 21, 2023


hassila added 2 commits November 21, 2023 12:58

Return back to 1M structs

1d826e4

Tweak Structs benchmark

c93b3fc

mustiikhalil merged commit 1310f14 into mustiikhalil:update-struct-pushing-to-buffer Nov 21, 2023
1 check passed

hassila deleted the benchmark-fixes branch November 21, 2023 12:32
