27 July 2016, 2pm
---------------------
Notes: Got a loop version working. There were some fixes to bounds checks and
to loading a value from an MMatrix. (It turns out that mutable
containers copy their entire tuple across and then index the copy, so we
now revert to a pointer-based approach for loads as well as stores.)
The second set of results has bounds checking on MMatrix re-enabled.
This seems a little silly...
A short summary comes first; detailed results follow.
=====================================
Benchmarks for 2×2 matrices
=====================================
Matrix multiplication
---------------------
Array -> 9.547811 seconds (250.00 M allocations: 16.764 GB, 11.76% gc time)
SArray -> 0.449305 seconds (5 allocations: 208 bytes)
MArray -> 2.040162 seconds (125.00 M allocations: 5.588 GB, 12.73% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 5.163249 seconds (6 allocations: 384 bytes)
MArray -> 1.104846 seconds (6 allocations: 256 bytes)
Matrix addition
---------------
Array -> 4.287471 seconds (100.00 M allocations: 6.706 GB, 10.43% gc time)
SArray -> 0.072807 seconds (5 allocations: 208 bytes)
MArray -> 0.652166 seconds (50.00 M allocations: 2.235 GB, 16.05% gc time)
Matrix addition (mutating)
--------------------------
Array -> 1.309730 seconds (6 allocations: 384 bytes)
MArray -> 0.168466 seconds (5 allocations: 208 bytes)
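
The result lines in this log share one shape (label, arrow, wall time in seconds, allocation summary), so relative speedups can be computed mechanically. The following is a minimal Python sketch, not part of the original benchmark suite: the helper names `parse_times` and `speedups` are hypothetical, and the regular expression assumes every result line looks like the ones shown here.

```python
import re

# Pattern for one result line, e.g.
# "Array -> 9.547811 seconds (250.00 M allocations: 16.764 GB, 11.76% gc time)"
# Only the label and the wall time are captured; the rest is ignored.
LINE_RE = re.compile(r"^(?P<label>.+?)\s*->\s*(?P<seconds>\d+(?:\.\d+)?) seconds")


def parse_times(block):
    """Return {label: seconds} for every result line in a block of log text."""
    times = {}
    for line in block.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            times[m.group("label")] = float(m.group("seconds"))
    return times


def speedups(times, baseline="Array"):
    """Return {label: baseline_time / time} for every non-baseline label."""
    base = times[baseline]
    return {label: base / t for label, t in times.items() if label != baseline}


# The 2x2 matrix multiplication results, copied from the log.
block = """\
Array -> 9.547811 seconds (250.00 M allocations: 16.764 GB, 11.76% gc time)
SArray -> 0.449305 seconds (5 allocations: 208 bytes)
MArray -> 2.040162 seconds (125.00 M allocations: 5.588 GB, 12.73% gc time)
"""

times = parse_times(block)
for label, factor in speedups(times).items():
    print(f"{label}: {factor:.1f}x faster than Array")
```

Dividing by the `Array` time is only one possible baseline; passing `baseline="MArray"` would compare the static types against each other instead.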
=====================================
Benchmarks for 3×3 matrices
=====================================
Matrix multiplication
---------------------
Array -> 3.973188 seconds (74.07 M allocations: 6.623 GB, 12.92% gc time)
SArray -> 0.326989 seconds (5 allocations: 240 bytes)
MArray -> 2.248258 seconds (37.04 M allocations: 2.759 GB, 14.06% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 2.237091 seconds (6 allocations: 480 bytes)
MArray -> 0.795372 seconds (6 allocations: 320 bytes)
Matrix addition
---------------
Array -> 2.610709 seconds (44.44 M allocations: 3.974 GB, 11.81% gc time)
SArray -> 0.073024 seconds (5 allocations: 240 bytes)
MArray -> 0.896849 seconds (22.22 M allocations: 1.656 GB, 21.33% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.872791 seconds (6 allocations: 480 bytes)
MArray -> 0.145895 seconds (5 allocations: 240 bytes)
=====================================
Benchmarks for 4×4 matrices
=====================================
Matrix multiplication
---------------------
Array -> 6.526125 seconds (31.25 M allocations: 3.492 GB, 6.61% gc time)
SArray -> 0.369290 seconds (5 allocations: 304 bytes)
MArray -> 1.964021 seconds (15.63 M allocations: 2.095 GB, 12.05% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 4.540372 seconds (6 allocations: 576 bytes)
MArray -> 0.748238 seconds (6 allocations: 448 bytes)
Matrix addition
---------------
Array -> 2.260800 seconds (25.00 M allocations: 2.794 GB, 15.11% gc time)
SArray -> 0.065871 seconds (5 allocations: 304 bytes)
MArray -> 0.875674 seconds (12.50 M allocations: 1.676 GB, 21.51% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.720545 seconds (6 allocations: 576 bytes)
MArray -> 0.139145 seconds (5 allocations: 304 bytes)
=====================================
Benchmarks for 5×5 matrices
=====================================
Matrix multiplication
---------------------
Array -> 4.506792 seconds (16.00 M allocations: 2.742 GB, 7.33% gc time)
SArray -> 0.397713 seconds (5 allocations: 368 bytes)
MArray -> 1.707654 seconds (8.00 M allocations: 1.550 GB, 10.01% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 3.176690 seconds (6 allocations: 832 bytes)
MArray -> 0.771324 seconds (6 allocations: 576 bytes)
Matrix addition
---------------
Array -> 2.034313 seconds (16.00 M allocations: 2.742 GB, 16.13% gc time)
SArray -> 0.092189 seconds (5 allocations: 368 bytes)
MArray -> 0.821592 seconds (8.00 M allocations: 1.550 GB, 20.84% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.649601 seconds (6 allocations: 832 bytes)
MArray -> 0.138822 seconds (5 allocations: 368 bytes)
=====================================
Benchmarks for 6×6 matrices
=====================================
Matrix multiplication
---------------------
Array -> 2.943930 seconds (9.26 M allocations: 1.863 GB, 7.59% gc time)
SArray -> 0.398034 seconds (5 allocations: 496 bytes)
MArray -> 1.647219 seconds (4.63 M allocations: 1.449 GB, 9.93% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 2.186647 seconds (6 allocations: 960 bytes)
MArray -> 0.762931 seconds (6 allocations: 832 bytes)
Matrix addition
---------------
Array -> 1.654780 seconds (11.11 M allocations: 2.235 GB, 15.89% gc time)
SArray -> 0.105666 seconds (5 allocations: 496 bytes)
MArray -> 0.874112 seconds (5.56 M allocations: 1.738 GB, 22.38% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.610698 seconds (6 allocations: 960 bytes)
MArray -> 0.134774 seconds (5 allocations: 496 bytes)
=====================================
Benchmarks for 7×7 matrices
=====================================
Matrix multiplication
---------------------
Array -> 2.410168 seconds (5.83 M allocations: 1.564 GB, 7.75% gc time)
SArray -> 0.406037 seconds (5 allocations: 608 bytes)
MArray -> 1.544988 seconds (2.92 M allocations: 1.216 GB, 8.89% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 1.834131 seconds (6 allocations: 1.219 KB)
MArray -> 0.743859 seconds (6 allocations: 1.031 KB)
Matrix addition
---------------
Array -> 1.534938 seconds (8.16 M allocations: 2.190 GB, 16.60% gc time)
SArray -> 0.112465 seconds (5 allocations: 608 bytes)
MArray -> 0.861025 seconds (4.08 M allocations: 1.703 GB, 22.17% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.586710 seconds (6 allocations: 1.219 KB)
MArray -> 0.136711 seconds (5 allocations: 608 bytes)
=====================================
Benchmarks for 8×8 matrices
=====================================
Matrix multiplication
---------------------
Array -> 1.539716 seconds (3.91 M allocations: 1.193 GB, 9.33% gc time)
SArray -> 0.407405 seconds (5 allocations: 704 bytes)
MArray -> 0.904102 seconds (1.95 M allocations: 1013.279 MB, 12.16% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 1.108063 seconds (6 allocations: 1.375 KB)
MArray -> 0.637545 seconds (6 allocations: 1.219 KB)
Matrix addition
---------------
Array -> 1.371784 seconds (6.25 M allocations: 1.909 GB, 16.29% gc time)
SArray -> 0.115900 seconds (5 allocations: 704 bytes)
MArray -> 0.801574 seconds (3.13 M allocations: 1.583 GB, 21.91% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.572474 seconds (6 allocations: 1.375 KB)
MArray -> 0.133947 seconds (5 allocations: 704 bytes)
=====================================
Benchmarks for 9×9 matrices
=====================================
Matrix multiplication
---------------------
Array -> 1.367863 seconds (2.74 M allocations: 1004.694 MB, 8.65% gc time)
SArray -> 0.469426 seconds (5 allocations: 832 bytes)
MArray -> 0.843368 seconds (1.37 M allocations: 879.107 MB, 11.73% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 1.071786 seconds (6 allocations: 1.594 KB)
MArray -> 0.611493 seconds (6 allocations: 1.469 KB)
Matrix addition
---------------
Array -> 1.303038 seconds (4.94 M allocations: 1.766 GB, 15.66% gc time)
SArray -> 0.120040 seconds (5 allocations: 832 bytes)
MArray -> 0.794289 seconds (2.47 M allocations: 1.545 GB, 22.13% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.536772 seconds (6 allocations: 1.594 KB)
MArray -> 0.133377 seconds (5 allocations: 832 bytes)
=====================================
Benchmarks for 10×10 matrices
=====================================
Matrix multiplication
---------------------
Array -> 1.146869 seconds (2.00 M allocations: 885.010 MB, 8.96% gc time)
SArray -> 0.440042 seconds (5 allocations: 1.031 KB)
MArray -> 0.810863 seconds (1.00 M allocations: 854.492 MB, 11.55% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.899671 seconds (6 allocations: 1.906 KB)
MArray -> 0.599173 seconds (6 allocations: 1.906 KB)
Matrix addition
---------------
Array -> 1.259030 seconds (4.00 M allocations: 1.729 GB, 15.95% gc time)
SArray -> 0.121259 seconds (5 allocations: 1.031 KB)
MArray -> 0.837926 seconds (2.00 M allocations: 1.669 GB, 22.49% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.520041 seconds (6 allocations: 1.906 KB)
MArray -> 0.132315 seconds (5 allocations: 1.031 KB)
=====================================
Benchmarks for 11×11 matrices
=====================================
Matrix multiplication
---------------------
Array -> 0.929304 seconds (1.50 M allocations: 802.490 MB, 11.67% gc time)
SArray -> 0.448736 seconds (5 allocations: 1.141 KB)
MArray -> 0.532826 seconds (751.32 k allocations: 722.241 MB, 2.86% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.876624 seconds (6 allocations: 2.281 KB)
MArray -> 0.569864 seconds (6 allocations: 2.125 KB)
Matrix addition
---------------
Array -> 0.655690 seconds (3.31 M allocations: 1.724 GB, 7.36% gc time)
SArray -> 0.122948 seconds (5 allocations: 1.141 KB)
MArray -> 0.343905 seconds (1.65 M allocations: 1.552 GB, 9.69% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.509762 seconds (6 allocations: 2.281 KB)
MArray -> 0.132118 seconds (5 allocations: 1.141 KB)
=====================================
Benchmarks for 12×12 matrices
=====================================
Matrix multiplication
---------------------
Array -> 0.857118 seconds (1.16 M allocations: 706.425 MB, 9.62% gc time)
SArray -> 0.436621 seconds (5 allocations: 1.297 KB)
MArray -> 0.716204 seconds (578.71 k allocations: 644.613 MB, 9.25% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.681968 seconds (6 allocations: 2.594 KB)
MArray -> 0.587816 seconds (6 allocations: 2.438 KB)
Matrix addition
---------------
Array -> 1.181404 seconds (2.78 M allocations: 1.656 GB, 15.78% gc time)
SArray -> 0.124253 seconds (5 allocations: 1.297 KB)
MArray -> 0.754099 seconds (1.39 M allocations: 1.511 GB, 21.02% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.501721 seconds (6 allocations: 2.594 KB)
MArray -> 0.132368 seconds (5 allocations: 1.297 KB)
=====================================
Benchmarks for 13×13 matrices
=====================================
Matrix multiplication
---------------------
Array -> 0.846211 seconds (910.34 k allocations: 659.802 MB, 9.18% gc time)
SArray -> 0.472782 seconds (5 allocations: 1.484 KB)
MArray -> 0.722831 seconds (455.17 k allocations: 590.349 MB, 8.73% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.681942 seconds (6 allocations: 3.063 KB)
MArray -> 0.603141 seconds (6 allocations: 2.813 KB)
Matrix addition
---------------
Array -> 1.162698 seconds (2.37 M allocations: 1.675 GB, 15.92% gc time)
SArray -> 0.125623 seconds (5 allocations: 1.484 KB)
MArray -> 0.762534 seconds (1.18 M allocations: 1.499 GB, 21.35% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.496573 seconds (6 allocations: 3.063 KB)
MArray -> 0.132130 seconds (5 allocations: 1.484 KB)
=====================================
Benchmarks for 14×14 matrices
=====================================
Matrix multiplication
---------------------
Array -> 0.771841 seconds (728.87 k allocations: 639.489 MB, 9.75% gc time)
SArray -> 0.483155 seconds (5 allocations: 1.750 KB)
MArray -> 0.838476 seconds (364.44 k allocations: 567.199 MB, 10.69% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.612230 seconds (6 allocations: 3.688 KB)
MArray -> 0.613999 seconds (6 allocations: 3.344 KB)
Matrix addition
---------------
Array -> 0.601864 seconds (2.04 M allocations: 1.749 GB, 7.12% gc time)
SArray -> 0.126051 seconds (5 allocations: 1.750 KB)
MArray -> 0.350896 seconds (1.02 M allocations: 1.551 GB, 9.67% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.491759 seconds (6 allocations: 3.688 KB)
MArray -> 0.131698 seconds (5 allocations: 1.750 KB)
=====================================
Benchmarks for 15×15 matrices
=====================================
Matrix multiplication
---------------------
Array -> 0.607219 seconds (592.60 k allocations: 583.224 MB, 4.67% gc time)
SArray -> 0.656509 seconds (5 allocations: 1.922 KB)
MArray -> 0.661923 seconds (296.30 k allocations: 510.887 MB, 1.77% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.617122 seconds (6 allocations: 4.125 KB)
MArray -> 7.320878 seconds (208.89 M allocations: 4.040 GB, 8.28% gc time)
Matrix addition
---------------
Array -> 0.732605 seconds (1.78 M allocations: 1.709 GB, 18.32% gc time)
SArray -> 0.126683 seconds (5 allocations: 1.922 KB)
MArray -> 0.329896 seconds (888.89 k allocations: 1.497 GB, 10.03% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.485586 seconds (6 allocations: 4.125 KB)
MArray -> 0.131635 seconds (5 allocations: 1.922 KB)
=====================================
Benchmarks for 16×16 matrices
=====================================
Matrix multiplication
---------------------
Array -> 0.510547 seconds (488.28 k allocations: 514.089 MB, 7.06% gc time)
SArray -> 0.667007 seconds (5 allocations: 2.219 KB)
MArray -> 0.620571 seconds (244.14 k allocations: 491.737 MB, 4.68% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.485333 seconds (6 allocations: 4.406 KB)
MArray -> 6.213549 seconds (195.31 M allocations: 3.842 GB, 4.98% gc time)
Matrix addition
---------------
Array -> 0.816417 seconds (1.56 M allocations: 1.607 GB, 12.33% gc time)
SArray -> 0.127461 seconds (5 allocations: 2.219 KB)
MArray -> 0.598246 seconds (781.25 k allocations: 1.537 GB, 16.05% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.481684 seconds (6 allocations: 4.406 KB)
MArray -> 0.132054 seconds (5 allocations: 2.219 KB)
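
In the summary results, the mutating MArray multiplication reports 6 allocations for every size up to 14×14, then 208.89 M allocations at 15×15 and 195.31 M at 16×16, with run time rising from about 0.6 seconds to 7.3 and 6.2 seconds. The log does not say why. The Python sketch below shows one way such jumps could be flagged automatically. It is an illustration only: the function names and the threshold of 1000 allocations are assumptions, not part of the benchmark code.

```python
import re

# Matches a result line and captures the allocation count with its optional
# "k" or "M" suffix, e.g. "(6 allocations: ..." or "(208.89 M allocations: ...".
RESULT_RE = re.compile(
    r"^(?P<label>.+?)\s*->\s*(?P<seconds>[\d.]+) seconds "
    r"\((?P<count>[\d.]+)(?: (?P<suffix>[kM]))? allocations"
)
SUFFIX = {None: 1, "k": 1_000, "M": 1_000_000}

# Hypothetical cut-off: a mutating benchmark is expected to allocate almost nothing.
ALLOCATION_THRESHOLD = 1_000


def allocation_count(line):
    """Return (label, number of allocations) for a result line, or None."""
    m = RESULT_RE.match(line.strip())
    if m is None:
        return None
    return m.group("label"), float(m.group("count")) * SUFFIX[m.group("suffix")]


def flag_allocating(results, threshold=ALLOCATION_THRESHOLD):
    """Return the matrix sizes whose result line exceeds the allocation threshold."""
    flagged = []
    for size, line in sorted(results.items()):
        parsed = allocation_count(line)
        if parsed is not None and parsed[1] > threshold:
            flagged.append(size)
    return flagged


# Mutating MArray multiplication lines for N = 14, 15, 16, copied from the log.
MUTATING_MARRAY_MUL = {
    14: "MArray -> 0.613999 seconds (6 allocations: 3.344 KB)",
    15: "MArray -> 7.320878 seconds (208.89 M allocations: 4.040 GB, 8.28% gc time)",
    16: "MArray -> 6.213549 seconds (195.31 M allocations: 3.842 GB, 4.98% gc time)",
}

flagged = flag_allocating(MUTATING_MARRAY_MUL)
print("Sizes with unexpected allocations:", flagged)
```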
==========================================================================
==========================================================================
==========================================================================
==========================================================================
=====================================
Benchmarks for 2×2 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.507703 seconds (234.15 k allocations: 9.595 MB)
SMatrix * SMatrix compilation time (chunks): 0.254606 seconds (105.14 k allocations: 4.431 MB)
SMatrix * SMatrix compilation time (loop): 0.207108 seconds (76.04 k allocations: 3.132 MB)
MMatrix * MMatrix compilation time (unrolled): 0.235399 seconds (126.85 k allocations: 5.010 MB)
MMatrix * MMatrix compilation time (chunks): 0.068589 seconds (15.42 k allocations: 678.168 KB)
MMatrix * MMatrix compilation time (loop): 0.041889 seconds (16.92 k allocations: 734.504 KB, 38.18% gc time)
Mat * Mat compilation time: 0.601159 seconds (416.09 k allocations: 17.799 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 0.018166 seconds (8.96 k allocations: 363.437 KB)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.155252 seconds (58.35 k allocations: 2.436 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.079163 seconds (14.33 k allocations: 635.079 KB)
Matrix multiplication
---------------------
Array -> 9.486381 seconds (250.00 M allocations: 16.764 GB, 13.37% gc time)
Mat -> 1.510038 seconds (5 allocations: 208 bytes)
SArray -> 0.449355 seconds (5 allocations: 208 bytes)
MArray -> 2.149813 seconds (125.00 M allocations: 5.588 GB, 15.35% gc time)
SArray (unrolled) -> 0.449064 seconds (5 allocations: 208 bytes)
SArray (chunks) -> 2.122054 seconds (5 allocations: 208 bytes)
SArray (loop) -> 0.489965 seconds (5 allocations: 208 bytes)
MArray (unrolled) -> 2.139714 seconds (125.00 M allocations: 5.588 GB, 15.22% gc time)
MArray (chunks) -> 2.236246 seconds (125.00 M allocations: 5.588 GB, 14.63% gc time)
MArray (loop) -> 1.977764 seconds (125.00 M allocations: 5.588 GB, 16.58% gc time)
MArray (via SArray) -> 1.697460 seconds (125.00 M allocations: 5.588 GB, 19.34% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 5.163106 seconds (6 allocations: 384 bytes)
MArray -> 1.101876 seconds (6 allocations: 256 bytes)
MArray (unrolled) -> 1.101895 seconds (6 allocations: 256 bytes)
MArray (chunks) -> 1.658395 seconds (6 allocations: 256 bytes)
MArray (BLAS gemm!) -> 18.880402 seconds (6 allocations: 256 bytes)
Matrix addition
---------------
Array -> 4.391376 seconds (100.00 M allocations: 6.706 GB, 11.84% gc time)
Mat -> 0.085040 seconds (5 allocations: 208 bytes)
SArray -> 0.069372 seconds (5 allocations: 208 bytes)
MArray -> 0.685436 seconds (50.00 M allocations: 2.235 GB, 19.14% gc time)
MArray (via SArray) -> 0.666855 seconds (50.00 M allocations: 2.235 GB, 20.07% gc time)
Matrix addition (mutating)
--------------------------
Array -> 1.338964 seconds (6 allocations: 384 bytes)
MArray -> 0.259047 seconds (5 allocations: 208 bytes)
=====================================
Benchmarks for 3×3 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.188167 seconds (144.48 k allocations: 5.718 MB)
SMatrix * SMatrix compilation time (chunks): 0.030126 seconds (25.78 k allocations: 1.051 MB)
SMatrix * SMatrix compilation time (loop): 0.037194 seconds (21.30 k allocations: 919.035 KB, 10.88% gc time)
MMatrix * MMatrix compilation time (unrolled): 0.208962 seconds (168.44 k allocations: 6.826 MB)
MMatrix * MMatrix compilation time (chunks): 0.030095 seconds (22.34 k allocations: 977.368 KB)
MMatrix * MMatrix compilation time (loop): 0.044735 seconds (31.69 k allocations: 1.355 MB)
Mat * Mat compilation time: 0.304094 seconds (111.42 k allocations: 4.873 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 0.032990 seconds (22.92 k allocations: 839.524 KB)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.032063 seconds (19.49 k allocations: 824.979 KB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.032908 seconds (15.68 k allocations: 672.276 KB, 19.60% gc time)
Matrix multiplication
---------------------
Array -> 5.031830 seconds (74.07 M allocations: 6.623 GB, 17.37% gc time)
Mat -> 0.713650 seconds (5 allocations: 240 bytes)
SArray -> 0.326910 seconds (5 allocations: 240 bytes)
MArray -> 2.238982 seconds (37.04 M allocations: 2.759 GB, 15.13% gc time)
SArray (unrolled) -> 0.326685 seconds (5 allocations: 240 bytes)
SArray (chunks) -> 0.749846 seconds (5 allocations: 240 bytes)
SArray (loop) -> 0.403389 seconds (5 allocations: 240 bytes)
MArray (unrolled) -> 1.556791 seconds (37.04 M allocations: 2.759 GB, 12.79% gc time)
MArray (chunks) -> 1.160646 seconds (37.04 M allocations: 2.759 GB, 9.29% gc time)
MArray (loop) -> 1.077874 seconds (37.04 M allocations: 2.759 GB, 10.00% gc time)
MArray (via SArray) -> 0.862185 seconds (37.04 M allocations: 2.759 GB, 12.71% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 2.254701 seconds (6 allocations: 480 bytes)
MArray -> 0.794907 seconds (6 allocations: 320 bytes)
MArray (unrolled) -> 0.799572 seconds (6 allocations: 320 bytes)
MArray (chunks) -> 1.125394 seconds (6 allocations: 320 bytes)
MArray (BLAS gemm!) -> 7.714614 seconds (6 allocations: 320 bytes)
Matrix addition
---------------
Array -> 2.182477 seconds (44.44 M allocations: 3.974 GB, 10.06% gc time)
Mat -> 0.072569 seconds (5 allocations: 240 bytes)
SArray -> 0.065348 seconds (5 allocations: 240 bytes)
MArray -> 0.495607 seconds (22.22 M allocations: 1.656 GB, 14.26% gc time)
MArray (via SArray) -> 0.436304 seconds (22.22 M allocations: 1.656 GB, 16.32% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.872445 seconds (6 allocations: 480 bytes)
MArray -> 0.152380 seconds (5 allocations: 240 bytes)
=====================================
Benchmarks for 4×4 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.224239 seconds (178.64 k allocations: 7.173 MB)
SMatrix * SMatrix compilation time (chunks): 0.041547 seconds (35.00 k allocations: 1.455 MB)
SMatrix * SMatrix compilation time (loop): 0.048329 seconds (34.15 k allocations: 1.446 MB)
MMatrix * MMatrix compilation time (unrolled): 0.271763 seconds (246.88 k allocations: 9.849 MB, 2.11% gc time)
MMatrix * MMatrix compilation time (chunks): 0.043038 seconds (34.40 k allocations: 1.491 MB)
MMatrix * MMatrix compilation time (loop): 0.068971 seconds (55.86 k allocations: 2.315 MB)
Mat * Mat compilation time: 0.207613 seconds (162.46 k allocations: 7.095 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 0.062473 seconds (55.96 k allocations: 1.741 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.050295 seconds (32.14 k allocations: 1.307 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.031308 seconds (18.36 k allocations: 774.540 KB)
Matrix multiplication
---------------------
Array -> 5.412238 seconds (31.25 M allocations: 3.492 GB, 4.45% gc time)
Mat -> 0.566407 seconds (5 allocations: 304 bytes)
SArray -> 0.368646 seconds (5 allocations: 304 bytes)
MArray -> 1.286629 seconds (15.63 M allocations: 2.095 GB, 4.39% gc time)
SArray (unrolled) -> 0.374016 seconds (5 allocations: 304 bytes)
SArray (chunks) -> 0.644981 seconds (5 allocations: 304 bytes)
SArray (loop) -> 0.515207 seconds (5 allocations: 304 bytes)
MArray (unrolled) -> 1.280614 seconds (15.63 M allocations: 2.095 GB, 4.36% gc time)
MArray (chunks) -> 0.973442 seconds (15.63 M allocations: 2.095 GB, 5.75% gc time)
MArray (loop) -> 0.914126 seconds (15.63 M allocations: 2.095 GB, 6.17% gc time)
MArray (via SArray) -> 0.890311 seconds (15.63 M allocations: 2.095 GB, 6.50% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 4.579211 seconds (6 allocations: 576 bytes)
MArray -> 0.745864 seconds (6 allocations: 448 bytes)
MArray (unrolled) -> 0.747835 seconds (6 allocations: 448 bytes)
MArray (chunks) -> 0.990759 seconds (6 allocations: 448 bytes)
MArray (BLAS gemm!) -> 3.296751 seconds (6 allocations: 448 bytes)
Matrix addition
---------------
Array -> 1.422700 seconds (25.00 M allocations: 2.794 GB, 9.24% gc time)
Mat -> 0.065779 seconds (5 allocations: 304 bytes)
SArray -> 0.065629 seconds (5 allocations: 304 bytes)
MArray -> 0.418997 seconds (12.50 M allocations: 1.676 GB, 10.76% gc time)
MArray (via SArray) -> 0.394248 seconds (12.50 M allocations: 1.676 GB, 11.67% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.718432 seconds (6 allocations: 576 bytes)
MArray -> 0.148090 seconds (5 allocations: 304 bytes)
=====================================
Benchmarks for 5×5 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.108983 seconds (105.32 k allocations: 4.458 MB)
SMatrix * SMatrix compilation time (chunks): 0.056842 seconds (47.49 k allocations: 1.987 MB)
SMatrix * SMatrix compilation time (loop): 0.068834 seconds (50.65 k allocations: 2.152 MB)
MMatrix * MMatrix compilation time (unrolled): 0.187439 seconds (230.97 k allocations: 8.876 MB, 2.90% gc time)
MMatrix * MMatrix compilation time (chunks): 0.061561 seconds (50.44 k allocations: 2.175 MB)
MMatrix * MMatrix compilation time (loop): 0.099553 seconds (92.96 k allocations: 3.666 MB)
Mat * Mat compilation time: 0.279467 seconds (213.78 k allocations: 8.773 MB, 2.26% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 0.109701 seconds (108.14 k allocations: 3.205 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.079903 seconds (51.05 k allocations: 1.999 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.043121 seconds (21.81 k allocations: 907.198 KB)
Matrix multiplication
---------------------
Array -> 4.273199 seconds (16.00 M allocations: 2.742 GB, 7.07% gc time)
Mat -> 1.059544 seconds (5 allocations: 368 bytes)
SArray -> 0.397279 seconds (5 allocations: 368 bytes)
MArray -> 1.439405 seconds (8.00 M allocations: 1.550 GB, 5.43% gc time)
SArray (unrolled) -> 0.397350 seconds (5 allocations: 368 bytes)
SArray (chunks) -> 0.614774 seconds (5 allocations: 368 bytes)
SArray (loop) -> 0.570072 seconds (5 allocations: 368 bytes)
MArray (unrolled) -> 1.441453 seconds (8.00 M allocations: 1.550 GB, 5.32% gc time)
MArray (chunks) -> 0.920037 seconds (8.00 M allocations: 1.550 GB, 8.44% gc time)
MArray (loop) -> 0.991645 seconds (8.00 M allocations: 1.550 GB, 7.72% gc time)
MArray (via SArray) -> 0.937700 seconds (8.00 M allocations: 1.550 GB, 8.28% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 3.209851 seconds (6 allocations: 832 bytes)
MArray -> 0.750820 seconds (6 allocations: 576 bytes)
MArray (unrolled) -> 0.750442 seconds (6 allocations: 576 bytes)
MArray (chunks) -> 0.827616 seconds (6 allocations: 576 bytes)
MArray (BLAS gemm!) -> 2.472844 seconds (6 allocations: 576 bytes)
Matrix addition
---------------
Array -> 1.880658 seconds (16.00 M allocations: 2.742 GB, 15.83% gc time)
Mat -> 0.279578 seconds (5 allocations: 368 bytes)
SArray -> 0.091593 seconds (5 allocations: 368 bytes)
MArray -> 0.496703 seconds (8.00 M allocations: 1.550 GB, 15.75% gc time)
MArray (via SArray) -> 0.475943 seconds (8.00 M allocations: 1.550 GB, 16.13% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.647675 seconds (6 allocations: 832 bytes)
MArray -> 0.135975 seconds (5 allocations: 368 bytes)
=====================================
Benchmarks for 6×6 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.183810 seconds (180.40 k allocations: 7.607 MB, 1.67% gc time)
SMatrix * SMatrix compilation time (chunks): 0.076249 seconds (62.91 k allocations: 2.645 MB)
SMatrix * SMatrix compilation time (loop): 0.094080 seconds (70.83 k allocations: 3.013 MB)
MMatrix * MMatrix compilation time (unrolled): 0.327162 seconds (417.26 k allocations: 15.859 MB, 1.93% gc time)
MMatrix * MMatrix compilation time (chunks): 0.084155 seconds (70.65 k allocations: 3.062 MB)
MMatrix * MMatrix compilation time (loop): 0.138687 seconds (138.34 k allocations: 5.326 MB)
Mat * Mat compilation time: 0.250885 seconds (195.29 k allocations: 7.922 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 0.180421 seconds (184.33 k allocations: 5.368 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.120692 seconds (76.55 k allocations: 2.874 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.061817 seconds (28.17 k allocations: 1.084 MB)
Matrix multiplication
---------------------
Array -> 2.449254 seconds (9.26 M allocations: 1.863 GB, 7.09% gc time)
Mat -> 0.783253 seconds (5 allocations: 496 bytes)
SArray -> 0.396094 seconds (5 allocations: 496 bytes)
MArray -> 1.197277 seconds (4.63 M allocations: 1.449 GB, 3.19% gc time)
SArray (unrolled) -> 0.396217 seconds (5 allocations: 496 bytes)
SArray (chunks) -> 0.540697 seconds (5 allocations: 496 bytes)
SArray (loop) -> 0.570500 seconds (5 allocations: 496 bytes)
MArray (unrolled) -> 1.188196 seconds (4.63 M allocations: 1.449 GB, 3.16% gc time)
MArray (chunks) -> 0.744241 seconds (4.63 M allocations: 1.449 GB, 5.10% gc time)
MArray (loop) -> 0.837136 seconds (4.63 M allocations: 1.449 GB, 4.55% gc time)
MArray (via SArray) -> 0.776484 seconds (4.63 M allocations: 1.449 GB, 4.91% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 2.170870 seconds (6 allocations: 960 bytes)
MArray -> 0.760663 seconds (6 allocations: 832 bytes)
MArray (unrolled) -> 0.760441 seconds (6 allocations: 832 bytes)
MArray (chunks) -> 0.799162 seconds (6 allocations: 832 bytes)
MArray (BLAS gemm!) -> 1.708944 seconds (6 allocations: 832 bytes)
Matrix addition
---------------
Array -> 0.968873 seconds (11.11 M allocations: 2.235 GB, 9.61% gc time)
Mat -> 0.327420 seconds (5 allocations: 496 bytes)
SArray -> 0.105728 seconds (5 allocations: 496 bytes)
MArray -> 0.399402 seconds (5.56 M allocations: 1.738 GB, 11.42% gc time)
MArray (via SArray) -> 0.375473 seconds (5.56 M allocations: 1.738 GB, 12.23% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.609895 seconds (6 allocations: 960 bytes)
MArray -> 0.134533 seconds (5 allocations: 496 bytes)
=====================================
Benchmarks for 7×7 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.291963 seconds (286.12 k allocations: 12.028 MB, 0.83% gc time)
SMatrix * SMatrix compilation time (chunks): 0.103585 seconds (80.71 k allocations: 3.381 MB)
SMatrix * SMatrix compilation time (loop): 0.125094 seconds (94.67 k allocations: 4.043 MB)
MMatrix * MMatrix compilation time (unrolled): 0.543905 seconds (682.80 k allocations: 25.651 MB, 2.26% gc time)
MMatrix * MMatrix compilation time (chunks): 0.110778 seconds (94.44 k allocations: 4.081 MB)
MMatrix * MMatrix compilation time (loop): 0.184499 seconds (191.92 k allocations: 7.267 MB)
Mat * Mat compilation time: 0.278040 seconds (239.29 k allocations: 9.662 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 0.280459 seconds (291.01 k allocations: 8.358 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.178844 seconds (108.03 k allocations: 3.948 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.084870 seconds (36.16 k allocations: 1.316 MB)
Matrix multiplication
---------------------
Array -> 2.083707 seconds (5.83 M allocations: 1.564 GB, 5.41% gc time)
Mat -> 0.730077 seconds (5 allocations: 608 bytes)
SArray -> 0.406089 seconds (5 allocations: 608 bytes)
MArray -> 1.543726 seconds (2.92 M allocations: 1.216 GB, 8.96% gc time)
SArray (unrolled) -> 0.405930 seconds (5 allocations: 608 bytes)
SArray (chunks) -> 0.514614 seconds (5 allocations: 608 bytes)
SArray (loop) -> 0.596168 seconds (5 allocations: 608 bytes)
MArray (unrolled) -> 1.549886 seconds (2.92 M allocations: 1.216 GB, 9.01% gc time)
MArray (chunks) -> 1.043045 seconds (2.92 M allocations: 1.216 GB, 13.35% gc time)
MArray (loop) -> 1.216366 seconds (2.92 M allocations: 1.216 GB, 11.44% gc time)
MArray (via SArray) -> 1.096689 seconds (2.92 M allocations: 1.216 GB, 12.72% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 1.833919 seconds (6 allocations: 1.219 KB)
MArray -> 0.753623 seconds (6 allocations: 1.031 KB)
MArray (unrolled) -> 0.754889 seconds (6 allocations: 1.031 KB)
MArray (chunks) -> 0.685418 seconds (6 allocations: 1.031 KB)
MArray (BLAS gemm!) -> 1.476325 seconds (6 allocations: 1.031 KB)
Matrix addition
---------------
Array -> 1.123645 seconds (8.16 M allocations: 2.190 GB, 13.45% gc time)
Mat -> 0.324092 seconds (5 allocations: 608 bytes)
SArray -> 0.112396 seconds (5 allocations: 608 bytes)
MArray -> 0.855265 seconds (4.08 M allocations: 1.703 GB, 22.72% gc time)
MArray (via SArray) -> 0.844391 seconds (4.08 M allocations: 1.703 GB, 23.16% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.586847 seconds (6 allocations: 1.219 KB)
MArray -> 0.133550 seconds (5 allocations: 608 bytes)
=====================================
Benchmarks for 8×8 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.458754 seconds (427.95 k allocations: 17.933 MB, 1.19% gc time)
SMatrix * SMatrix compilation time (chunks): 0.131881 seconds (102.03 k allocations: 4.272 MB)
SMatrix * SMatrix compilation time (loop): 0.163543 seconds (122.19 k allocations: 5.232 MB)
MMatrix * MMatrix compilation time (unrolled): 0.868310 seconds (1.04 M allocations: 38.963 MB, 1.43% gc time)
MMatrix * MMatrix compilation time (chunks): 0.144765 seconds (125.97 k allocations: 5.370 MB)
MMatrix * MMatrix compilation time (loop): 0.243939 seconds (253.85 k allocations: 9.564 MB, 2.40% gc time)
Mat * Mat compilation time: 0.332050 seconds (284.34 k allocations: 11.485 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 0.758404 seconds (1.05 M allocations: 36.623 MB, 1.76% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.216179 seconds (80.00 k allocations: 2.374 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.117190 seconds (46.23 k allocations: 1.589 MB)
Matrix multiplication
---------------------
Array -> 1.565243 seconds (3.91 M allocations: 1.193 GB, 9.89% gc time)
Mat -> 10.485373 seconds (875.00 M allocations: 13.039 GB, 17.84% gc time)
SArray -> 0.408163 seconds (5 allocations: 704 bytes)
MArray -> 0.743007 seconds (1.95 M allocations: 1013.279 MB, 8.65% gc time)
SArray (unrolled) -> 0.412884 seconds (5 allocations: 704 bytes)
SArray (chunks) -> 0.468833 seconds (5 allocations: 704 bytes)
SArray (loop) -> 0.615983 seconds (5 allocations: 704 bytes)
MArray (unrolled) -> 1.289432 seconds (1.95 M allocations: 1013.279 MB, 4.84% gc time)
MArray (chunks) -> 0.716657 seconds (1.95 M allocations: 1013.279 MB, 17.42% gc time)
MArray (loop) -> 0.843184 seconds (1.95 M allocations: 1013.279 MB, 2.89% gc time)
MArray (via SArray) -> 0.737564 seconds (1.95 M allocations: 1013.279 MB, 3.22% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 1.141281 seconds (6 allocations: 1.375 KB)
MArray -> 0.646085 seconds (6 allocations: 1.219 KB)
MArray (unrolled) -> 0.746431 seconds (6 allocations: 1.219 KB)
MArray (chunks) -> 0.652494 seconds (6 allocations: 1.219 KB)
MArray (BLAS gemm!) -> 0.867956 seconds (6 allocations: 1.219 KB)
Matrix addition
---------------
Array -> 0.759221 seconds (6.25 M allocations: 1.909 GB, 9.25% gc time)
Mat -> 0.302765 seconds (5 allocations: 704 bytes)
SArray -> 0.115573 seconds (5 allocations: 704 bytes)
MArray -> 0.372142 seconds (3.13 M allocations: 1.583 GB, 11.55% gc time)
MArray (via SArray) -> 0.335238 seconds (3.13 M allocations: 1.583 GB, 12.78% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.571963 seconds (6 allocations: 1.375 KB)
MArray -> 0.132970 seconds (5 allocations: 704 bytes)
=====================================
Benchmarks for 9×9 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.673235 seconds (611.48 k allocations: 25.545 MB, 0.70% gc time)
SMatrix * SMatrix compilation time (chunks): 0.173112 seconds (125.98 k allocations: 5.242 MB)
SMatrix * SMatrix compilation time (loop): 0.208446 seconds (153.36 k allocations: 6.584 MB, 2.88% gc time)
MMatrix * MMatrix compilation time (unrolled): 1.348455 seconds (1.50 M allocations: 56.262 MB, 1.01% gc time)
MMatrix * MMatrix compilation time (chunks): 0.191181 seconds (165.36 k allocations: 6.866 MB, 2.89% gc time)
MMatrix * MMatrix compilation time (loop): 0.303084 seconds (326.66 k allocations: 12.360 MB)
Mat * Mat compilation time: 0.460842 seconds (395.11 k allocations: 15.602 MB, 1.36% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 1.103146 seconds (1.51 M allocations: 52.942 MB, 1.34% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.310410 seconds (104.00 k allocations: 3.047 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.162483 seconds (57.74 k allocations: 1.888 MB)
Matrix multiplication
---------------------
Array -> 1.289305 seconds (2.74 M allocations: 1004.694 MB, 7.36% gc time)
Mat -> 4.837835 seconds (5 allocations: 832 bytes)
SArray -> 0.464717 seconds (5 allocations: 832 bytes)
MArray -> 0.834239 seconds (1.37 M allocations: 879.107 MB, 11.30% gc time)
SArray (unrolled) -> 0.405008 seconds (5 allocations: 832 bytes)
SArray (chunks) -> 0.464537 seconds (5 allocations: 832 bytes)
SArray (loop) -> 0.604719 seconds (5 allocations: 832 bytes)
MArray (unrolled) -> 1.467048 seconds (1.37 M allocations: 879.107 MB, 6.48% gc time)
MArray (chunks) -> 0.824633 seconds (1.37 M allocations: 879.107 MB, 11.79% gc time)
MArray (loop) -> 1.176981 seconds (1.37 M allocations: 879.107 MB, 15.51% gc time)
MArray (via SArray) -> 0.632092 seconds (1.37 M allocations: 879.107 MB, 3.52% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 1.071867 seconds (6 allocations: 1.594 KB)
MArray -> 0.627937 seconds (6 allocations: 1.469 KB)
MArray (unrolled) -> 0.737009 seconds (6 allocations: 1.469 KB)
MArray (chunks) -> 0.627941 seconds (6 allocations: 1.469 KB)
MArray (BLAS gemm!) -> 0.864848 seconds (6 allocations: 1.469 KB)
Matrix addition
---------------
Array -> 0.739531 seconds (4.94 M allocations: 1.766 GB, 8.72% gc time)
Mat -> 0.311596 seconds (5 allocations: 832 bytes)
SArray -> 0.118771 seconds (5 allocations: 832 bytes)
MArray -> 0.356388 seconds (2.47 M allocations: 1.545 GB, 12.09% gc time)
MArray (via SArray) -> 0.325803 seconds (2.47 M allocations: 1.545 GB, 13.18% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.542328 seconds (6 allocations: 1.594 KB)
MArray -> 0.132698 seconds (5 allocations: 832 bytes)
=====================================
Benchmarks for 10×10 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 0.980081 seconds (842.37 k allocations: 35.084 MB, 1.03% gc time)
SMatrix * SMatrix compilation time (chunks): 0.219328 seconds (153.71 k allocations: 6.386 MB)
SMatrix * SMatrix compilation time (loop): 0.261099 seconds (188.22 k allocations: 8.133 MB, 2.30% gc time)
MMatrix * MMatrix compilation time (unrolled): 2.160220 seconds (2.09 M allocations: 79.076 MB, 0.94% gc time)
MMatrix * MMatrix compilation time (chunks): 0.235305 seconds (210.59 k allocations: 8.655 MB, 3.26% gc time)
MMatrix * MMatrix compilation time (loop): 0.373521 seconds (408.08 k allocations: 15.445 MB)
Mat * Mat compilation time: 0.497137 seconds (367.28 k allocations: 14.088 MB, 1.57% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 1.520252 seconds (2.07 M allocations: 73.312 MB, 1.62% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.431080 seconds (131.33 k allocations: 3.794 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.215738 seconds (71.16 k allocations: 2.263 MB)
Matrix multiplication
---------------------
Array -> 1.048162 seconds (2.00 M allocations: 885.010 MB, 7.35% gc time)
Mat -> 4.668110 seconds (5 allocations: 1.031 KB)
SArray -> 0.440042 seconds (5 allocations: 1.031 KB)
MArray -> 0.718002 seconds (1.00 M allocations: 854.492 MB, 9.44% gc time)
SArray (unrolled) -> 0.403820 seconds (5 allocations: 1.031 KB)
SArray (chunks) -> 0.439154 seconds (5 allocations: 1.031 KB)
SArray (loop) -> 0.604035 seconds (5 allocations: 1.031 KB)
MArray (unrolled) -> 1.778149 seconds (1.00 M allocations: 854.492 MB, 3.81% gc time)
MArray (chunks) -> 0.712092 seconds (1.00 M allocations: 854.492 MB, 9.45% gc time)
MArray (loop) -> 1.013756 seconds (1.00 M allocations: 854.492 MB, 6.64% gc time)
MArray (via SArray) -> 0.767091 seconds (1.00 M allocations: 854.492 MB, 8.77% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.900739 seconds (6 allocations: 1.906 KB)
MArray -> 0.593009 seconds (6 allocations: 1.906 KB)
MArray (unrolled) -> 0.731870 seconds (6 allocations: 1.906 KB)
MArray (chunks) -> 0.594974 seconds (6 allocations: 1.906 KB)
MArray (BLAS gemm!) -> 0.732844 seconds (6 allocations: 1.906 KB)
Matrix addition
---------------
Array -> 1.031291 seconds (4.00 M allocations: 1.729 GB, 14.43% gc time)
Mat -> 0.313233 seconds (5 allocations: 1.031 KB)
SArray -> 0.121360 seconds (5 allocations: 1.031 KB)
MArray -> 0.664802 seconds (2.00 M allocations: 1.669 GB, 20.20% gc time)
MArray (via SArray) -> 0.637558 seconds (2.00 M allocations: 1.669 GB, 21.38% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.530702 seconds (6 allocations: 1.906 KB)
MArray -> 0.132468 seconds (5 allocations: 1.031 KB)
=====================================
Benchmarks for 11×11 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 1.392235 seconds (1.13 M allocations: 46.788 MB, 0.92% gc time)
SMatrix * SMatrix compilation time (chunks): 0.272512 seconds (184.43 k allocations: 7.632 MB)
SMatrix * SMatrix compilation time (loop): 0.311724 seconds (226.74 k allocations: 9.821 MB, 2.17% gc time)
MMatrix * MMatrix compilation time (unrolled): 3.729487 seconds (2.81 M allocations: 107.071 MB, 1.42% gc time)
MMatrix * MMatrix compilation time (chunks): 0.286259 seconds (261.26 k allocations: 10.552 MB)
MMatrix * MMatrix compilation time (loop): 0.471264 seconds (498.07 k allocations: 18.831 MB, 2.49% gc time)
Mat * Mat compilation time: 0.430023 seconds (422.13 k allocations: 16.085 MB, 1.78% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 2.098040 seconds (2.77 M allocations: 97.663 MB, 2.11% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.598654 seconds (161.61 k allocations: 4.574 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.296579 seconds (86.07 k allocations: 2.657 MB)
Matrix multiplication
---------------------
Array -> 1.129949 seconds (1.50 M allocations: 802.490 MB, 9.76% gc time)
Mat -> 4.344536 seconds (5 allocations: 1.141 KB)
SArray -> 0.448301 seconds (5 allocations: 1.141 KB)
MArray -> 0.765119 seconds (751.32 k allocations: 722.241 MB, 10.83% gc time)
SArray (unrolled) -> 0.408657 seconds (5 allocations: 1.141 KB)
SArray (chunks) -> 0.448133 seconds (5 allocations: 1.141 KB)
SArray (loop) -> 0.620216 seconds (5 allocations: 1.141 KB)
MArray (unrolled) -> 1.979061 seconds (751.32 k allocations: 722.241 MB, 4.17% gc time)
MArray (chunks) -> 0.762747 seconds (751.32 k allocations: 722.241 MB, 11.16% gc time)
MArray (loop) -> 1.061680 seconds (751.32 k allocations: 722.241 MB, 7.75% gc time)
MArray (via SArray) -> 0.798909 seconds (751.32 k allocations: 722.241 MB, 10.34% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.879396 seconds (6 allocations: 2.281 KB)
MArray -> 0.581203 seconds (6 allocations: 2.125 KB)
MArray (unrolled) -> 1.223903 seconds (6 allocations: 2.125 KB)
MArray (chunks) -> 0.583573 seconds (6 allocations: 2.125 KB)
MArray (BLAS gemm!) -> 0.736948 seconds (6 allocations: 2.125 KB)
Matrix addition
---------------
Array -> 1.267805 seconds (3.31 M allocations: 1.724 GB, 18.54% gc time)
Mat -> 0.315558 seconds (5 allocations: 1.141 KB)
SArray -> 0.122799 seconds (5 allocations: 1.141 KB)
MArray -> 0.788707 seconds (1.65 M allocations: 1.552 GB, 23.11% gc time)
MArray (via SArray) -> 0.782581 seconds (1.65 M allocations: 1.552 GB, 23.29% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.521752 seconds (6 allocations: 2.281 KB)
MArray -> 0.132127 seconds (5 allocations: 1.141 KB)
=====================================
Benchmarks for 12×12 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 1.971084 seconds (1.47 M allocations: 60.830 MB, 0.62% gc time)
SMatrix * SMatrix compilation time (chunks): 0.339555 seconds (218.46 k allocations: 8.979 MB, 1.97% gc time)
SMatrix * SMatrix compilation time (loop): 0.389020 seconds (268.92 k allocations: 11.770 MB)
MMatrix * MMatrix compilation time (unrolled): 6.200838 seconds (3.67 M allocations: 140.902 MB, 1.20% gc time)
MMatrix * MMatrix compilation time (chunks): 0.345823 seconds (317.90 k allocations: 12.753 MB)
MMatrix * MMatrix compilation time (loop): 0.587458 seconds (597.78 k allocations: 22.794 MB, 5.50% gc time)
Mat * Mat compilation time: 0.548837 seconds (483.22 k allocations: 18.312 MB, 14.22% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 2.769810 seconds (3.75 M allocations: 130.185 MB, 2.44% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 0.786729 seconds (206.68 k allocations: 5.636 MB)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.380845 seconds (102.64 k allocations: 3.086 MB)
Matrix multiplication
---------------------
Array -> 0.649478 seconds (1.16 M allocations: 706.425 MB, 5.91% gc time)
Mat -> 4.484301 seconds (5 allocations: 1.297 KB)
SArray -> 0.440806 seconds (5 allocations: 1.297 KB)
MArray -> 0.677562 seconds (578.71 k allocations: 644.613 MB, 8.54% gc time)
SArray (unrolled) -> 0.415102 seconds (5 allocations: 1.297 KB)
SArray (chunks) -> 0.442180 seconds (5 allocations: 1.297 KB)
SArray (loop) -> 0.643261 seconds (5 allocations: 1.297 KB)
MArray (unrolled) -> 1.974074 seconds (578.71 k allocations: 644.613 MB, 2.87% gc time)
MArray (chunks) -> 0.679407 seconds (578.71 k allocations: 644.613 MB, 8.67% gc time)
MArray (loop) -> 0.995913 seconds (578.71 k allocations: 644.613 MB, 5.70% gc time)
MArray (via SArray) -> 0.725085 seconds (578.71 k allocations: 644.613 MB, 8.16% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.683155 seconds (6 allocations: 2.594 KB)
MArray -> 0.594863 seconds (6 allocations: 2.438 KB)
MArray (unrolled) -> 1.331355 seconds (6 allocations: 2.438 KB)
MArray (chunks) -> 0.594744 seconds (6 allocations: 2.438 KB)
MArray (BLAS gemm!) -> 0.563744 seconds (6 allocations: 2.438 KB)
Matrix addition
---------------
Array -> 0.653210 seconds (2.78 M allocations: 1.656 GB, 11.03% gc time)
Mat -> 0.320931 seconds (5 allocations: 1.297 KB)
SArray -> 0.123755 seconds (5 allocations: 1.297 KB)
MArray -> 0.658376 seconds (1.39 M allocations: 1.511 GB, 20.96% gc time)
MArray (via SArray) -> 0.642737 seconds (1.39 M allocations: 1.511 GB, 21.53% gc time)
Matrix addition (mutating)
--------------------------
Array -> 0.504197 seconds (6 allocations: 2.594 KB)
MArray -> 0.131480 seconds (5 allocations: 1.297 KB)
=====================================
Benchmarks for 13×13 matrices
=====================================
SMatrix * SMatrix compilation time (unrolled): 2.700817 seconds (1.88 M allocations: 77.491 MB, 0.78% gc time)
SMatrix * SMatrix compilation time (chunks): 0.420520 seconds (255.94 k allocations: 10.455 MB, 2.66% gc time)
SMatrix * SMatrix compilation time (loop): 0.454477 seconds (314.77 k allocations: 13.798 MB)
MMatrix * MMatrix compilation time (unrolled): 10.346382 seconds (4.71 M allocations: 184.403 MB, 1.07% gc time)
MMatrix * MMatrix compilation time (chunks): 0.423432 seconds (380.61 k allocations: 15.284 MB)
MMatrix * MMatrix compilation time (loop): 0.668217 seconds (706.61 k allocations: 27.155 MB, 1.79% gc time)
Mat * Mat compilation time: 0.526640 seconds (549.34 k allocations: 20.713 MB, 1.27% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (unrolled): 3.698506 seconds (4.78 M allocations: 168.404 MB, 4.54% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (chunks): 1.058430 seconds (245.47 k allocations: 6.605 MB, 1.20% gc time)
A_mul_B!(MMatrix, MMatrix) compilation time (BLAS): 0.502997 seconds (120.67 k allocations: 3.617 MB)
Matrix multiplication
---------------------
Array -> 0.643510 seconds (910.34 k allocations: 659.802 MB, 4.71% gc time)
Mat -> 5.343075 seconds (5 allocations: 1.484 KB)
SArray -> 0.471099 seconds (5 allocations: 1.484 KB)
MArray -> 0.633169 seconds (455.17 k allocations: 590.349 MB, 6.57% gc time)
SArray (unrolled) -> 0.770587 seconds (5 allocations: 1.484 KB)
SArray (chunks) -> 0.471333 seconds (5 allocations: 1.484 KB)
SArray (loop) -> 0.624530 seconds (5 allocations: 1.484 KB)
MArray (unrolled) -> 2.108866 seconds (455.17 k allocations: 590.349 MB, 1.91% gc time)
MArray (chunks) -> 0.629886 seconds (455.17 k allocations: 590.349 MB, 6.42% gc time)
MArray (loop) -> 0.948281 seconds (455.17 k allocations: 590.349 MB, 4.28% gc time)
MArray (via SArray) -> 0.678120 seconds (455.17 k allocations: 590.349 MB, 5.96% gc time)
Matrix multiplication (mutating)
--------------------------------
Array -> 0.686434 seconds (6 allocations: 3.063 KB)
MArray -> 0.616371 seconds (6 allocations: 2.813 KB)