You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Benchmark the top implementations.
Rust and Go on your machine. Go is using the standard libsecp256k1 library, which is also wrapped in Nim: https://github.com/status-im/nim-secp256k1 Difficulty: Very easy
Benchmark Constantine on the same computer to have a baseline. CC=clang nimble bench_ec_g1_scalarmul
Projective coordinates will need to be commented out if previous step has been skipped, endomorphism acceleration will need to be commented out. Difficulty: Very easy
Expected speedup: 80% compared to endomorphism, 2.34x compared to baseline.
Reference paper:
Efficient and Secure Algorithms for GLV-Based Scalar
Multiplication and their Implementation on GLV-GLS
Curves (Extended Version)
Armando Faz-Hernández, Patrick Longa, Ana H. Sánchez, 2013 https://eprint.iacr.org/2013/158
This will require refactoring of the core Fp type to indicate if they are generic or use a special-form. This may use either separate FpGeneric and FpPseudoMersenne or a static enum:
## called "Crandall prime" or Pseudo-Mersenne Prime
## in the litterature
## which allows fast reduction from the fact that
## 2ᵐ-c ≡ 0 (mod p)
## <=> 2ᵐ ≡ c (mod p) [1]
## <=> a2ᵐ+b ≡ ac + b (mod p)
##
## This partially reduces the input in range [0, 2¹³⁰)
Difficulty: Medium on math, Hard on refactoring
Expected speedup:
Currently, most implementation splits the field Fp over 5x52-bit limbs or 10x26-bit limbs. This predates the existence of MULX, ADOX, ADCX instructions introduced in 2015 for accelerating bigint multiplication. The issue is that cost scales with the square of number of limbs, with by about 3n² so between 4 and 5 limbs, it's 48 vs 75 operations.
Constantine already provides fast broadwell multiplication and special prime reductions are significantly faster than Montgomery reductions. 25% over baseline
Following https://forum.nim-lang.org/t/10550#70556, this outlines step to put Nim at the top of https://programming-language-benchmarks.vercel.app/problem/secp256k1
Rust and Go on your machine. Go is using the standard libsecp256k1 library, which is also wrapped in Nim: https://github.com/status-im/nim-secp256k1
Difficulty: Very easy
Here:
constantine/constantine/math/arithmetic/finite_fields.nim
Lines 443 to 456 in 34baa74
This is blocking for projective coordinates benchmarking due to this:
constantine/constantine/math/elliptic/ec_shortweierstrass_projective.nim
Line 231 in 34baa74
For secp256k1, curve equation y² = x³ + ax + b, with a=0 and b=7.
- Optimal shortest addition chains can be found here: https://wwwhomes.uni-bielefeld.de/achim/addition_chain.html
Difficulty: Very easy
CC=clang nimble bench_ec_g1_scalarmul
Projective coordinates will need to be commented out if previous step has been skipped, endomorphism acceleration will need to be commented out.
Difficulty: Very easy
The constants can be generated from https://github.com/mratsim/constantine/blob/master/sage/derive_endomorphisms.sage. Note there is no G1 or G2 for non-pairing-friendly curves like secp256k1 or Banderwagon so the Sage code needs to be modified to handle both case. The constants can then be added to https://github.com/mratsim/constantine/tree/master/constantine/math/constants.
Efficient and Secure Algorithms for GLV-Based Scalar
Multiplication and their Implementation on GLV-GLS
Curves (Extended Version)
Armando Faz-Hernández, Patrick Longa, Ana H. Sánchez, 2013
https://eprint.iacr.org/2013/158
FpGeneric
andFpPseudoMersenne
or astatic enum
:constantine/constantine/math/config/type_ff.nim
Lines 16 to 33 in 34baa74
constantine/constantine/mac/mac_poly1305.nim
Lines 37 to 46 in 34baa74
Currently, most implementation splits the field Fp over 5x52-bit limbs or 10x26-bit limbs. This predates the existence of MULX, ADOX, ADCX instructions introduced in 2015 for accelerating bigint multiplication. The issue is that cost scales with the square of number of limbs, with by about 3n² so between 4 and 5 limbs, it's 48 vs 75 operations.
Constantine already provides fast broadwell multiplication and special prime reductions are significantly faster than Montgomery reductions.
25% over baseline
Jerome Solinas
https://cacr.uwaterloo.ca/techreports/1999/corr99-39.pdf
Jean Claude Bajard and Sylvain Duquesne
https://eprint.iacr.org/2020/665
This fuses multiplication and reduction by moduli of special form
Efficient Arithmetic In (Pseudo-)Mersenne Prime Order Fields
Kaushik Nath and Palash Sarkar
https://eprint.iacr.org/2018/985
constantine/constantine/signatures/bls_signatures.nim
Lines 37 to 49 in 34baa74
constantine/constantine/ethereum_bls_signatures.nim
Lines 224 to 228 in 34baa74
The text was updated successfully, but these errors were encountered: