Improvement/field inv pornin20 #106

Tabaie · 2021-12-06T08:43:59Z

Benchmarked on a VM with an AMD EPYC 7R32 processor.

For bw6-761/fp, the speedup ranged from 71% to 77%.
For bn254/fp, the speedup ranged from 43% to 65%.

The speedup on my ARM laptop was nowhere near as dramatic.

The inconsistency in the small-field performance is due to higher variability in the number of outer loop iterations, a random effect that gets smoothed out at larger field sizes. In general, the algorithm has good asymptotic complexity and is expected to scale well.

TODOs:

Remove InverseOld and mulWRegularBf (cleanup)
Implement the "two update factors in one register trick" described in the paper (perf)
Implement some core functions in assembly (linearCombNonModular, mulWRegular, montReduceSigned) (perf)
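
For reference, the "two update factors in one register" item can be sketched roughly as follows. This is only an illustration of the general idea, not the code in this PR: the names `pack`, `unpack`, and `bias`, and the exact encoding `f + g*2^32`, are assumptions made for the example. It assumes both factors stay within [-(2^31 - 1), 2^31 - 1].

```go
package main

import "fmt"

// bias = (2^31 - 1) * (2^32 + 1). Adding it makes both 32-bit halves
// non-negative, so they can be read back with plain masks.
// (Hypothetical constant name, chosen for this sketch.)
const bias int64 = 0x7fffffff7fffffff

// pack stores two signed update factors in one word as f + g*2^32.
// Assumes both lie in [-(2^31 - 1), 2^31 - 1].
func pack(f, g int64) int64 { return f + g<<32 }

// unpack recovers (f, g) from a packed word.
func unpack(c int64) (f, g int64) {
	c += bias // may wrap around; Go defines wrap-around for signed integers
	const low32 int64 = 0xFFFFFFFF
	f = c&low32 - 0x7FFFFFFF
	g = (c>>32)&low32 - 0x7FFFFFFF
	return f, g
}

func main() {
	c0 := pack(3, -5)
	c1 := pack(-2, 7)
	// One subtraction and one doubling update both factors at once.
	f, g := unpack(2 * (c0 - c1))
	fmt.Println(f, g) // prints "10 -24"
}
```

The point of such an encoding is that additions, subtractions, and doublings on the packed word act on both factors simultaneously, as long as neither factor leaves its 32-bit range.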

CLAassistant · 2021-12-06T08:44:10Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ Tabaie
✅ gbotrel
❌ Arya Tabaie

Arya Tabaie seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

gbotrel · 2021-12-06T13:56:36Z

field/internal/templates/element/inverse_pornin20.go

+	return b
+}
+
+// Though we're defining k as a constant, this code "profoundly" assumes that the processor is 64 bit


"profoundly" assumes --> is optimized ?

Tabaie · 2021-12-06

Doesn't make a whole lot of sense. Removing.

gbotrel · 2021-12-06T14:00:38Z

field/internal/templates/element/inverse_pornin20.go

+// approximate a big number using its uppermost and lowermost bits
+func approximate(x *{{.ElementName}}, n int) uint64 {
+
+	if n <= 64 {


Question, just based on the function comment: if, say, n == 24, shouldn't this clear the other 40 bits to 0?
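
To make the question concrete, here is a minimal standalone sketch of this kind of approximation. It is not the template code under review: it works on a plain `[]uint64` of little-endian limbs instead of `{{.ElementName}}`, fixes `k = 32`, and assumes the caller passes `n` as an upper bound on the bit length of `x`. Under that assumption the bits at positions `n` and above are already zero, so returning `x[0]` unmasked when `n <= 64` loses nothing; if `n` could be smaller than the actual bit length, masking would be needed.

```go
package main

import "fmt"

const k = 32 // half of the assumed 64-bit word size

// approximate combines the low k-1 bits of x with its top k+1 bits,
// where n is assumed to be an upper bound on the bit length of x
// and x is stored as little-endian 64-bit limbs.
func approximate(x []uint64, n int) uint64 {
	if n <= 64 {
		// Everything fits in one word; bits >= n are assumed to be zero.
		return x[0]
	}
	const lowMask = (uint64(1) << (k - 1)) - 1
	lo := x[0] & lowMask

	// Extract bits [n-(k+1), n), which may straddle two limbs.
	start := n - (k + 1)
	word, off := start/64, uint(start%64)
	hi := x[word] >> off
	if off != 0 && word+1 < len(x) {
		hi |= x[word+1] << (64 - off)
	}
	hi &= (uint64(1) << (k + 1)) - 1

	return lo | hi<<(k-1)
}

func main() {
	fmt.Printf("%#x\n", approximate([]uint64{0xABCDEF, 0}, 24))
	fmt.Printf("%#x\n", approximate([]uint64{5, 0, 1 << 63}, 192))
}
```

In the second call the result keeps the low bits `0x5` and places the single top bit of the 192-bit value at position 63.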

Tabaie and others added 30 commits October 28, 2021 12:42

Ignore .DS_Store

c9c01b9

Ignore Goland project files

8bb3662

feat: Naive GCD works

6070316

feat: Naive GCD improved

b611212

perf: Field element - Word multiplication implemented

772377a

perf: Field element - Word multiplication implemented

175d545

feat: Full paper implemented, unknown bug

0583abc

feat: Full paper implemented, tests passing

280751c

feat: Test for corrective factor consistency

95f6292

perf: Batch each 2 u,v updates. Update factors correct result incorrect

1d85105

test: Consistency check on top

e245c06

fix: Optimization 3 works, but with many watches

488a908

fix: Optimization 3 works, removed debugging code, down to 1879 ns/op

bd597e3

fix: Non-const t: Precomputation gives little speedup: 1511,1463,1551

f1fba7b

style: Some commentary

a04c917

style: Some more commentary

11ceb6b

perf: field inverse optimizations

a56e55e

perf!: Towards 1 mont-red per u,v update instead of 2. mulModR dev'd

536a347

perf: Linear comb w 1 MontRed instead of 2. Slow (debug logic inline)

a0cafca

perf: Removed debug logic

6a4cd29

refactor: SOS Montgomery Reduction

5ed39c5

perf: Branch-free signed non-mont word multiplication

c746ab4

style: Hand-inlined rsh31, comments, single correction factor

f218a38

feat: Signed mont-reduce

9a4de1e

chore: signed/unsigned versions of SOS mont for comparison

e4d1197

feat: Three ways of dealing with signed numbers in montgomery reduction

89eac26

style: minor changes

916f502

perf: signed sos ftw

9346b8a

style: comments and proofs

d8ffac0

feat!: Code generated, faulty

ebc5645

Tabaie added 3 commits December 6, 2021 01:47

chore: Not demanding 64bit arch. TODO: Test correctness on one

c6663e9

perf: Replace mulWRegular with faster branched version

0051bfb

test: BenchInverse to call InverseOld

c63f97c

Tabaie requested a review from gbotrel December 6, 2021 08:44

gbotrel approved these changes Dec 6, 2021

Tabaie added 21 commits December 6, 2021 10:45

style: comments

e32238d

style: comments

636c8ec

Merge remote-tracking branch 'origin/develop' into improvement/field-inv-pornin20

5d9a7bf

Merge branch 'improvement/field-inv-pornin20' into perf/compressed-update-factors

1fb780c

fix: Update factor negation works

e4f7831

feat!: Update factor compression works. No speedup? Broke Json conv

47389e4

perf: Inlined conversion factor manipulation

df96df1

perf: Combined updates factor to be signed, next: fewer helper vars

86e63e1

perf: fewer helper variables

c837489

style: mathfmt

5363e14

perf: Four update factor vars

550eb2e

perf: partial rollback for bn254-fp

98b284f

fix: semi-compressed bn254/fp

60ffe80

chore: generify semicompressed

370c3af

style: more expressive argument name for approximate

68bdbcd

chore: Take out InverseOld

d4e4e45

chore: staticcheck, correct commented formula for outer loop iterations

d29e2a5

fix: 32bit compatible assertMatch for bn254/fp

3157553

fix: fixed bug for 64b

29bcddc

chore: generify 32bit fix

286c036

revert: remove mathfmt (for now)

9d7926b

gbotrel merged commit c772392 into develop Dec 9, 2021

gbotrel deleted the improvement/field-inv-pornin20 branch December 9, 2021 18:44

yelhousni mentioned this pull request Dec 9, 2021

perf: Final exp. on BLS12-381 #108

Merged
