Merge pull request #550 from kroma-network/chore/update-benchmarks
chore: update benchmarks
chokobole authored Oct 24, 2024
2 parents 7bf9c56 + c0a4a48 commit 8f51ef6
Showing 39 changed files with 331 additions and 266 deletions.
115 changes: 59 additions & 56 deletions benchmark/fft/README.md
@@ -4,6 +4,7 @@

```
Run on 13th Gen Intel(R) Core(TM) i9-13900K (32 X 5500 MHz CPU s)
Compiler: clang-15
CPU Caches:
L1 Data 48 KiB (x16)
L1 Instruction 32 KiB (x16)
@@ -17,75 +18,77 @@ CPU Caches:
L2 Unified 4096 KiB (x12)
```

Note: Run with `build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"` in your .bazelrc.user

### FFT

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --check_results
```

#### On Intel i9-13900K

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | ------------ | -------- | -------- |
| 16 | **0.000958** | 0.004086 | 0.007342 | 0.003784 |
| 17 | 0.032529 | **0.003283** | 0.012624 | 0.005433 |
| 18 | 0.014067 | **0.005768** | 0.025811 | 0.009372 |
| 19 | **0.008459** | 0.011465 | 0.05208 | 0.019333 |
| 20 | **0.016166** | 0.024533 | 0.106217 | 0.042381 |
| 21 | **0.039447** | 0.069444 | 0.212414 | 0.087621 |
| 22 | **0.125954** | 0.177245 | 0.431237 | 0.188843 |
| 23 | **0.297259** | 0.391987 | 0.835686 | 0.427426 |
| 16 | **0.002058** | 0.005143 | 0.006314 | 0.002249 |
| 17 | **0.002246** | 0.00334 | 0.015646 | 0.006193 |
| 18 | **0.010154** | 0.018807 | 0.046443 | 0.007574 |
| 19 | 0.022984 | **0.014652** | 0.076281 | 0.014506 |
| 20 | **0.02** | 0.02497 | 0.100082 | 0.042877 |
| 21 | **0.044831** | 0.075563 | 0.20222 | 0.067161 |
| 22 | **0.130201** | 0.179075 | 0.402452 | 0.169194 |
| 23 | **0.281398** | 0.394068 | 0.792004 | 0.372566 |

![image](/benchmark/fft/fft_benchmark_ubuntu_i9.png)

#### On Mac M3 Pro

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | ------------ | -------- | -------- |
| 16 | **0.002735** | 0.003468 | 0.007731 | 0.006372 |
| 17 | **0.005237** | 0.006043 | 0.015891 | 0.012804 |
| 18 | **0.009494** | 0.010686 | 0.027312 | 0.02485 |
| 19 | 0.020251 | **0.020156** | 0.055652 | 0.045714 |
| 20 | **0.038186** | 0.040006 | 0.110531 | 0.096778 |
| 21 | **0.085204** | 0.087181 | 0.228044 | 0.191695 |
| 22 | **0.166863** | 0.179635 | 0.472941 | 0.386844 |
| 23 | **0.347128** | 0.378249 | 0.970552 | 0.814043 |
| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | -------- | -------- | -------- |
| 16 | **0.002526** | 0.003804 | 0.00784 | 0.005689 |
| 17 | **0.004694** | 0.005769 | 0.015577 | 0.01121 |
| 18 | **0.009246** | 0.010243 | 0.027834 | 0.022379 |
| 19 | **0.018328** | 0.020404 | 0.055661 | 0.041394 |
| 20 | **0.039683** | 0.041085 | 0.110702 | 0.086299 |
| 21 | **0.079138** | 0.087336 | 0.230857 | 0.175599 |
| 22 | **0.166646** | 0.177959 | 0.474296 | 0.352872 |
| 23 | **0.33996** | 0.363612 | 0.971581 | 0.748284 |

![image](/benchmark/fft/fft_benchmark_mac_m3.png)

### IFFT

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --run_ifft --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --run_ifft --check_results
```

#### On Intel i9-13900K

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | ------------ | -------- | ----------- |
| 16 | 0.003078 | 0.004531 | 0.007794 | **0.00297** |
| 17 | 0.011666 | **0.005005** | 0.012804 | 0.005309 |
| 18 | **0.005614** | 0.009204 | 0.025717 | 0.009741 |
| 19 | **0.007625** | 0.015332 | 0.050253 | 0.018729 |
| 20 | **0.016751** | 0.030142 | 0.111549 | 0.041873 |
| 21 | **0.039565** | 0.0715 | 0.222403 | 0.098125 |
| 22 | **0.140152** | 0.181124 | 0.415709 | 0.188011 |
| 23 | **0.317353** | 0.400472 | 0.845031 | 0.407396 |
| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | -------- | -------- | ------------ |
| 16 | **0.001392** | 0.012028 | 0.009913 | 0.002413 |
| 17 | **0.002511** | 0.00427 | 0.01418 | 0.005731 |
| 18 | 0.01762 | 0.021167 | 0.034676 | **0.010811** |
| 19 | **0.009646** | 0.01447 | 0.058714 | 0.016038 |
| 20 | **0.030303** | 0.034815 | 0.104936 | 0.05337 |
| 21 | **0.047463** | 0.072579 | 0.199788 | 0.093146 |
| 22 | **0.146697** | 0.181389 | 0.391296 | 0.19874 |
| 23 | **0.285937** | 0.403596 | 0.82276 | 0.347876 |

![image](/benchmark/fft/ifft_benchmark_ubuntu_i9.png)

#### On Mac M3 Pro

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | -------- | -------- | -------- |
| 16 | **0.002766** | 0.004274 | 0.007948 | 0.006638 |
| 17 | **0.005883** | 0.006978 | 0.016308 | 0.013121 |
| 18 | **0.010532** | 0.012815 | 0.029066 | 0.028791 |
| 19 | **0.020781** | 0.024054 | 0.059351 | 0.048824 |
| 20 | **0.041061** | 0.048806 | 0.11825 | 0.099004 |
| 21 | **0.090855** | 0.101232 | 0.236775 | 0.210805 |
| 22 | **0.170776** | 0.203109 | 0.488306 | 0.423618 |
| 23 | **0.383255** | 0.454968 | 1.03129 | 0.881795 |
| 16 | **0.002798** | 0.003867 | 0.008102 | 0.005665 |
| 17 | **0.004882** | 0.005737 | 0.015998 | 0.011672 |
| 18 | **0.010308** | 0.010962 | 0.028118 | 0.022723 |
| 19 | **0.018724** | 0.021338 | 0.056855 | 0.042554 |
| 20 | **0.037687** | 0.043237 | 0.113848 | 0.089899 |
| 21 | **0.078429** | 0.092134 | 0.234585 | 0.174939 |
| 22 | **0.162542** | 0.189442 | 0.484644 | 0.361127 |
| 23 | **0.338646** | 0.392674 | 0.989173 | 0.765592 |

![image](/benchmark/fft/ifft_benchmark_mac_m3.png)

@@ -94,41 +97,41 @@ bazel run --config opt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 1
### FFT

```shell
bazel run --config opt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --check_results
```

#### On RTX-4090

| Exponent | Tachyon CPU | Tachyon GPU |
| :------: | ----------- | ------------ |
| 16 | **0.00097** | 0.001231 |
| 17 | 0.002156 | **0.000667** |
| 18 | 0.003524 | **0.001297** |
| 19 | 0.007366 | **0.002654** |
| 20 | 0.015787 | **0.005877** |
| 21 | 0.03753 | **0.012573** |
| 22 | 0.122167 | **0.027632** |
| 23 | 0.268875 | **0.055971** |
| 16 | 0.002348 | **0.001** |
| 17 | 0.00204 | **0.001182** |
| 18 | 0.00393 | **0.002211** |
| 19 | 0.009317 | **0.004079** |
| 20 | 0.049204 | **0.008114** |
| 21 | 0.044158 | **0.01616** |
| 22 | 0.134064 | **0.032785** |
| 23 | 0.274101 | **0.066068** |

![image](/benchmark/fft/fft_benchmark_ubuntu_rtx_4090.png)

### IFFT

```shell
bazel run --config opt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --run_ifft --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --run_ifft --check_results
```

#### On RTX-4090

| Exponent | Tachyon | Tachyon GPU |
| :------: | -------- | ------------ |
| 16 | 0.000993 | **0.000833** |
| 17 | 0.001673 | **0.000643** |
| 18 | 0.003533 | **0.001305** |
| 19 | 0.007446 | **0.002701** |
| 20 | 0.016039 | **0.005882** |
| 21 | 0.03786 | **0.012817** |
| 22 | 0.126032 | **0.027767** |
| 23 | 0.32731 | **0.056064** |
| 16 | 0.002138 | **0.001341** |
| 17 | 0.00488 | **0.000933** |
| 18 | 0.003887 | **0.002502** |
| 19 | 0.00896 | **0.003806** |
| 20 | 0.017953 | **0.007745** |
| 21 | 0.043787 | **0.016268** |
| 22 | 0.132048 | **0.033012** |
| 23 | 0.291132 | **0.066022** |

![image](/benchmark/fft/ifft_benchmark_ubuntu_rtx_4090.png)
Binary file modified benchmark/fft/fft_benchmark_mac_m3.png
Binary file modified benchmark/fft/fft_benchmark_ubuntu_i9.png
Binary file modified benchmark/fft/fft_benchmark_ubuntu_rtx_4090.png
Binary file modified benchmark/fft/ifft_benchmark_mac_m3.png
Binary file modified benchmark/fft/ifft_benchmark_ubuntu_i9.png
Binary file modified benchmark/fft/ifft_benchmark_ubuntu_rtx_4090.png
84 changes: 47 additions & 37 deletions benchmark/fft_batch/README.md
@@ -4,6 +4,7 @@

```
Run on 13th Gen Intel(R) Core(TM) i9-13900K (32 X 5500 MHz CPU s)
Compiler: clang-15
CPU Caches:
L1 Data 48 KiB (x16)
L1 Instruction 32 KiB (x16)
@@ -17,70 +18,79 @@ CPU Caches:
L2 Unified 4096 KiB (x12)
```

### FFTBatch
Note: Run with `build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"` in your .bazelrc.user

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 -k 26 --vendor plonky3 -p baby_bear --check_results
```
### FFTBatch

WARNING: On Mac M3, tests beyond degree 24 are not feasible due to memory constraints.

#### On Intel i9-13900K

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 -k 26 --vendor plonky3 -p baby_bear --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | ------------ | ------------ |
| 20 | 0.117925 | **0.110098** |
| 21 | 0.222959 | **0.218505** |
| 22 | 0.459209 | **0.447758** |
| 23 | 0.97874 | **0.964644** |
| 24 | 2.09675 | **2.092210** |
| 25 | **6.20441** | 6.98453 |
| 26 | **18.6084** | 20.7476 |
| 20 | **0.092595** | 0.094762 |
| 21 | **0.191168** | 0.193567 |
| 22 | 0.406239 | **0.384377** |
| 23 | 0.892501 | **0.842694** |
| 24 | 1.91177 | **1.90586** |
| 25 | **5.82862** | 7.34128 |
| 26 | **17.1807** | 20.3968 |

![image](/benchmark/fft_batch/fft_batch_benchmark_ubuntu_i9.png)

#### On Mac M3 Pro

| Exponent | Tachyon | Plonky3 |
| :------- | --------- | ------------ |
| 20 | 0.132521 | **0.072505** |
| 21 | 0.287744 | **0.140527** |
| 22 | 0.588894 | **0.280177** |
| 23 | 1.17446 | **0.621024** |
| 24 | 3.17213 | **2.399220** |
```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 --vendor plonky3 -p baby_bear --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | -------- | ------------ |
| 20 | 0.083416 | **0.066952** |
| 21 | 0.194191 | **0.138168** |
| 22 | 0.408045 | **0.299547** |
| 23 | 0.955439 | **0.679252** |
| 24 | 11.8495 | **6.47188** |

![image](/benchmark/fft_batch/fft_batch_benchmark_mac_m3.png)

### CosetLDEBatch

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 --vendor plonky3 -p baby_bear --run_coset_lde --check_results
```

WARNING: On Mac M3, tests beyond degree 24 are not feasible due to memory constraints.
WARNING: On Intel i9-13900K, tests beyond degree 25 are not feasible due to memory constraints, and on Mac M3, tests beyond degree 24 are not feasible due to memory constraints.

#### On Intel i9-13900K

| Exponent | Tachyon | Plonky3 |
| :------- | ------------ | -------- |
| 20 | **0.414096** | 0.783275 |
| 21 | **0.828539** | 1.47701 |
| 22 | **1.784080** | 3.06198 |
| 23 | **3.673930** | 6.49181 |
| 24 | **9.325390** | 16.2383 |
| 25 | **25.66560** | 41.3335 |
```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 --vendor plonky3 -p baby_bear --run_coset_lde --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | ----------- | -------- |
| 20 | **0.46917** | 0.639744 |
| 21 | **0.92528** | 1.2923 |
| 22 | **1.87363** | 2.68427 |
| 23 | **4.06008** | 5.67987 |
| 24 | **9.6627** | 14.6164 |
| 25 | **25.7953** | 39.5498 |

![image](/benchmark/fft_batch/coset_lde_batch_benchmark_ubuntu_i9.png)

#### On Mac M3 Pro

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 --vendor plonky3 -p baby_bear --run_coset_lde --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | ------------ | ------------ |
| 18 | 0.100942 | **0.086087** |
| 19 | 0.214471 | **0.182212** |
| 20 | 0.481229 | **0.359246** |
| 21 | **0.981806** | 1.518190 |
| 22 | 3.86094 | **3.244580** |
| 23 | 7.50879 | **6.052250** |
| 20 | **0.318485** | 0.323865 |
| 21 | 0.667106 | **0.660975** |
| 22 | **1.44873** | 3.40795 |
| 23 | 8.27201 | **5.91238** |
| 24 | 39.9678 | **23.1033** |

![image](/benchmark/fft_batch/coset_lde_batch_benchmark_mac_m3.png)
Binary file modified benchmark/fft_batch/coset_lde_batch_benchmark_mac_m3.png
Binary file modified benchmark/fft_batch/coset_lde_batch_benchmark_ubuntu_i9.png
Binary file modified benchmark/fft_batch/fft_batch_benchmark_mac_m3.png
Binary file modified benchmark/fft_batch/fft_batch_benchmark_ubuntu_i9.png
4 changes: 3 additions & 1 deletion benchmark/fft_batch/fft_batch_runner.h
@@ -35,16 +35,18 @@ class FFTBatchRunner {
math::RowMajorMatrix<F> result;
std::unique_ptr<Domain> domain =
Domain::Create(static_cast<size_t>(input.rows()));
base::TimeTicks start = base::TimeTicks::Now();
base::TimeTicks start;
if (run_coset_lde) {
const size_t kAddedBits = 1;
result =
math::RowMajorMatrix<F>(input.rows() << kAddedBits, input.cols());
start = base::TimeTicks::Now();
domain->CosetLDEBatch(input, kAddedBits,
F::FromMontgomery(F::Config::kSubgroupGenerator),
result);
} else {
result = input;
start = base::TimeTicks::Now();
domain->FFTBatch(result);
}
reporter_.AddTime(vendor, base::TimeTicks::Now() - start);
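
The hunk above moves the `base::TimeTicks::Now()` call so that setup work (allocating the enlarged result matrix, or copying `input` into `result`) is no longer counted in the reported time. Below is a minimal sketch of the same pattern using only the standard library. It is an illustration under assumptions, not the project's code: `DoubleInPlace` and `TimeKernelOnly` are hypothetical names, and `std::chrono::steady_clock` stands in for `base::TimeTicks`.

```cpp
#include <chrono>
#include <vector>

// Hypothetical stand-in for the measured kernel (the role played by
// FFTBatch / CosetLDEBatch in the diff above).
void DoubleInPlace(std::vector<int>& data) {
  for (int& v : data) v *= 2;
}

// Returns the elapsed seconds of the kernel only. The copy of `input` into
// `result` is setup, so it happens before the clock is read, mirroring the
// placement of `start = base::TimeTicks::Now()` in the patched runner.
double TimeKernelOnly(const std::vector<int>& input,
                      std::vector<int>& result) {
  result = input;  // Setup: deliberately outside the timed region.
  auto start = std::chrono::steady_clock::now();
  DoubleInPlace(result);  // Measured work.
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(end - start).count();
}
```

Calling `TimeKernelOnly(input, result)` with any input vector returns a non-negative duration in seconds and leaves the doubled values in `result`. Reading the clock before the copy instead would add the copy cost to every sample, which matters most at large sizes where allocation and copying are not negligible.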
31 changes: 17 additions & 14 deletions benchmark/fri/README.md
@@ -2,8 +2,9 @@

## CPU

```
Run on 13th Gen Intel(R) Core(TM) i9-13900K (32 X 5500 MHz CPU s)
Compiler: clang-15
CPU Caches:
L1 Data 48 KiB (x16)
L1 Instruction 32 KiB (x16)
@@ -17,32 +18,34 @@ CPU Caches:
L2 Unified 4096 KiB (x12)
```

Note: Run with `build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"` in your .bazelrc.user

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fri:fri_benchmark -- -k 18 -k 19 -k 20 -k 21 -k 22 --batch_size 100 --input_num 4 --round_num 4 --log_blowup 2 --vendor plonky3 --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fri:fri_benchmark -- -k 18 -k 19 -k 20 -k 21 -k 22 --batch_size 100 --input_num 4 --round_num 4 --log_blowup 2 --vendor plonky3 --check_results
```

## On Intel i9-13900K

| Exponent | Tachyon | Plonky3 |
| :------- | ----------- | ------- |
| 18 | **2.97871** | 3.73433 |
| 19 | **5.76021** | 7.22556 |
| 20 | **11.2744** | 14.3306 |
| 21 | **22.5167** | 28.8935 |
| 22 | **47.6511** | 58.5402 |
| 18 | **1.59124** | 2.36518 |
| 19 | **2.87866** | 4.65791 |
| 20 | **6.06711** | 9.5114 |
| 21 | **12.1177** | 19.0475 |
| 22 | **24.4839** | 38.4716 |

![image](/benchmark/fri/fri_benchmark_ubuntu_i9.png)

## On Mac M3 Pro

WARNING: On Mac M3, high degree tests are not feasible due to memory constraints.

| Exponent | Tachyon | Plonky3 |
| :------- | ------- | ------------ |
| 18 | 3.68509 | **1.39107** |
| 19 | 7.37079 | **2.76483** |
| 20 | 14.9081 | **5.62375** |
| 21 | 30.3153 | **11.8295** |
| 22 | 64.8022 | **25.4490** |
| Exponent | Tachyon | Plonky3 |
| :------- | ------- | ------- |
| 18 | 3.96588 | 2.92354 |
| 19 | 7.95329 | 5.89079 |
| 20 | 15.8636 | 11.8225 |
| 21 | 46.1967 | 34.4965 |
| 22 | 182.084 | 124.7 |

![image](/benchmark/fri/fri_benchmark_mac_m3.png)
Binary file modified benchmark/fri/fri_benchmark_mac_m3.png
Binary file modified benchmark/fri/fri_benchmark_ubuntu_i9.png