WIP: Add schnorrsig batch verification #760
Conversation
Time to start rebasing on the nearly complete #558?
Force-pushed from 15753e7 to ec50168
…signatures
f431b3f valgrind_ctime_test: Add schnorrsig_sign (Jonas Nick)
16ffa9d schnorrsig: Add taproot test case (Jonas Nick)
8dfd53e schnorrsig: Add benchmark for sign and verify (Jonas Nick)
4e43520 schnorrsig: Add BIP-340 compatible signing and verification (Jonas Nick)
7332d2d schnorrsig: Add BIP-340 nonce function (Jonas Nick)
7a703fd schnorrsig: Init empty experimental module (Jonas Nick)
eabd9bc Allow initializing tagged sha256 (Jonas Nick)
6fcb5b8 extrakeys: Add keypair_xonly_tweak_add (Jonas Nick)
5825446 extrakeys: Add keypair struct with create, pub and pub_xonly (Jonas Nick)
f001034 Separate helper functions for pubkey_create and seckey_tweak_add (Jonas Nick)
910d9c2 extrakeys: Add xonly_pubkey_tweak_add & xonly_pubkey_tweak_add_test (Jonas Nick)
176bfb1 Separate helper function for ec_pubkey_tweak_add (Jonas Nick)
4cd2ee4 extrakeys: Add xonly_pubkey with serialize, parse and from_pubkey (Jonas Nick)
47e6618 extrakeys: Init empty experimental module (Jonas Nick)
3e08b02 Make the secp256k1_declassify argument constant (Jonas Nick)

Pull request description:
This PR implements signing, verification and batch verification as described in [BIP-340](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki) in an experimental module named `schnorrsig`. It includes the test vectors and a benchmarking tool. This PR also adds a module `extrakeys` that allows [BIP-341](https://github.com/bitcoin/bips/blob/master/bip-0341.mediawiki)-style key tweaking. (Adding ChaCha20 as a CSPRNG and batch verification was moved to PR #760.) In order to enable the module, run `./configure` with `--enable-experimental --enable-module-schnorrsig`. Based on apoelstra's work.

ACKs for top commit:
gmaxwell: ACK f431b3f (exactly matches the previous post-fixup version which I have already reviewed and tested)
sipa: ACK f431b3f
real-or-random: ACK f431b3f careful code review

Tree-SHA512: e15e849c7bb65cdc5d7b1d6874678e275a71e4514de9d5432ec1700de3ba92aa9f381915813f4729057af152d90eea26aabb976ed297019c5767e59cf0bbc693
Force-pushed from ec50168 to 03a61e3
rebased on master
It's a bit unfortunate that this API doesn't really lend itself to cleanly supporting combined batches of BIP340 signatures and taproot tweaks (which also need an EC multiplication). Given that this internally builds a batch object anyway, would it be reasonable to have that in the external API as well? So an idea could be that you:
Ooo, I like that construction: it allows for lazy batching and verifying only when you're ready, which is also very useful for non-bitcoin applications that verify things periodically when they have spare CPU time.
Sounds like a reasonable plan, in particular because it would be easy to add functions that manipulate the batch object for other schemes that need an EC mult at the end of verification. The current batch object only holds pointers to the elements, so care must be taken to ensure that they still exist at
I think if it can be avoided it would be best to minimize holding pointers to caller-provided objects, except in narrow cases (e.g. scratch)... lifetime management is hard for everyone. An alternative might be to have a function that takes a sigs count and pointers to arrays of pubkeys/signatures/messagehashes, then a taproot count, and arrays for those. Less generic, but it would avoid needing to copy the inputs into library-provided memory or retain pointers to caller-provided objects.
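Purely to illustrate that "counts plus arrays" shape (the function name and parameter layout below are invented for this sketch, not this PR's actual API):

/* Hypothetical prototype, for illustration only: the library reads everything
 * it needs during the call, so it neither copies the inputs into its own
 * storage nor retains pointers to caller-provided objects afterwards. */
int secp256k1_schnorrsig_verify_batch_sketch(
    const secp256k1_context *ctx,
    secp256k1_scratch_space *scratch,
    const unsigned char *const *sigs64,            /* n_sigs 64-byte signatures */
    const unsigned char *const *msgs32,            /* n_sigs 32-byte message hashes */
    const secp256k1_xonly_pubkey *const *pubkeys,  /* n_sigs public keys */
    size_t n_sigs,
    const unsigned char *const *tweaks32,          /* n_tweaks 32-byte tweaks */
    const secp256k1_xonly_pubkey *const *tweaked_pubkeys,
    const int *parities,
    const secp256k1_xonly_pubkey *const *internal_pubkeys,
    size_t n_tweaks
);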
@gmaxwell The alternative is probably that the caller is going to do the copying into some batch object on their side instead, so I don't think it's that much of a difference. I think having the batch object have its own storage is probably better. That may mean that the caller should be able to select a maximum size (and once exceeded, transparently run validation of the already-provided batch?)
Sounds fine to me, though I hope it doesn't need 2x the memory to store both the input and the intermediate work. :)
I agree, but I'm somewhat worried about how: this will probably require the caller to know the approximate size of the batch (or the number of sigs/tweaks) when starting the batch.
@elichai No, I mean the opposite! The caller shouldn't need to predict how large the batch will become - if they knew that, they wouldn't need it, as they could just choose to stop after a certain size instead. What I mean is that the caller gets to set a maximum memory usage limit, and when that limit would be exceeded, adding another entry to the batch just causes the batch validation to run on what was added so far - and remember the outcome of that.
If what it processed so far failed, all further calls can be super fast because it's just going to return a fail. :P
Taking short-circuit evaluation of && to the next level.
I like that :) It gives the caller a tradeoff between memory and CPU while not crippling them if they predicted the max size wrongly.
over1 = secp256k1_scalar_check_overflow(r1);
over2 = secp256k1_scalar_check_overflow(r2);
over_count++;
BIP-0340 says we should also repeat if r1 or r2 are zero.
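A minimal sketch of how the loop above could be extended to also reject zero (secp256k1_scalar_check_overflow and secp256k1_scalar_is_zero are existing helpers; how the retry counter is mixed into the ChaCha20 counter here is an assumption made for the sketch):

static void batch_randomizer_pair(secp256k1_scalar *r1, secp256k1_scalar *r2, const unsigned char *seed, uint64_t idx) {
    uint64_t over_count = 0;
    int over1, over2;
    do {
        /* Re-draw until both randomizers are in range and nonzero. */
        secp256k1_scalar_chacha20(r1, r2, seed, idx + over_count);
        over1 = secp256k1_scalar_check_overflow(r1);
        over2 = secp256k1_scalar_check_overflow(r2);
        over_count++;
    } while (over1 || over2 || secp256k1_scalar_is_zero(r1) || secp256k1_scalar_is_zero(r2));
}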
@@ -122,4 +122,9 @@ static SECP256K1_INLINE void secp256k1_scalar_cmov(secp256k1_scalar *r, const se
    *r = (*r & mask0) | (*a & mask1);
}

SECP256K1_INLINE static void secp256k1_scalar_chacha20(secp256k1_scalar *r1, secp256k1_scalar *r2, const unsigned char *seed, uint64_t n) {
    *r1 = (seed[0] + n) % EXHAUSTIVE_TEST_ORDER;
    *r2 = (seed[1] + n) % EXHAUSTIVE_TEST_ORDER;
Perhaps the same applies here: BIP-0340 says that 0 should be excluded.
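For the exhaustive-test stub shown above, one way to exclude 0 (a sketch, not necessarily what the PR should end up doing) is to reduce modulo EXHAUSTIVE_TEST_ORDER - 1 and shift by one:

SECP256K1_INLINE static void secp256k1_scalar_chacha20(secp256k1_scalar *r1, secp256k1_scalar *r2, const unsigned char *seed, uint64_t n) {
    /* Map the output into [1, EXHAUSTIVE_TEST_ORDER - 1] so a randomizer can never be zero. */
    *r1 = (seed[0] + n) % (EXHAUSTIVE_TEST_ORDER - 1) + 1;
    *r2 = (seed[1] + n) % (EXHAUSTIVE_TEST_ORDER - 1) + 1;
}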
How do people feel about the following API:

int secp256k1_start_batch_size(size_t ops);
secp256k1_batch* secp256k1_start_batch(const secp256k1_context* ctx, secp256k1_scratch_space* scratch);
int secp256k1_batch_add_sig(ctx, batch, sig, msg, pubkey);
int secp256k1_batch_add_xpubkey_tweak_add_check(ctx, batch, parity, tweaked_pubkey, pubkey, tweak);
int secp256k1_batch_verify(ctx, batch);

All the add functions would do something like:

if (batch.len == batch.scratch_capacity) {
    if (batch.failed) { return; }
    batch.failed = !secp256k1_batch_verify(ctx, batch);
    // clear the rest of the state
}
// add to batch
batch.len++;
return;

(all the names are subject to bikeshedding)
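If an API along those lines were adopted, caller-side usage could look roughly like this (a sketch reusing the names proposed above, which are still subject to bikeshedding; error handling is omitted and a nonzero return value is assumed to mean success):

secp256k1_scratch_space *scratch = secp256k1_scratch_space_create(ctx, secp256k1_start_batch_size(n_sigs + n_tweaks));
secp256k1_batch *batch = secp256k1_start_batch(ctx, scratch);
size_t i;
int ok = 1;
for (i = 0; i < n_sigs; i++) {
    /* may transparently verify-and-flush once the scratch capacity is reached */
    ok &= secp256k1_batch_add_sig(ctx, batch, sigs[i], msgs[i], pubkeys[i]);
}
for (i = 0; i < n_tweaks; i++) {
    ok &= secp256k1_batch_add_xpubkey_tweak_add_check(ctx, batch, parities[i], tweaked_pubkeys[i], internal_pubkeys[i], tweaks[i]);
}
ok &= secp256k1_batch_verify(ctx, batch); /* verify whatever is still pending */
secp256k1_scratch_space_destroy(ctx, scratch);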
I like this idea of batch verifying in an
I'm starting to think the We have (attempted) this nice streamable API for
Tells you the size of the scratch space required for the number of signatures/tweaks you want to batch.
Barring such an enhanced

typedef int (secp256k1_batch_verify_gi_callback)(secp256k1_scalar *gi, size_t idx, void *data);
typedef int (secp256k1_batch_verify_callback)(secp256k1_scalar *na, secp256k1_scalar *nb, secp256k1_ge *pta, secp256k1_ge *ptb, size_t idx, void *data);

/* Verifies na_i*A_i = nb_i*B_i + ng_i * G for all i < n (with high probability). */
static int secp256k1_batch_verify(ctx, scratch, secp256k1_batch_verify_gi_callback cb_gi, secp256k1_batch_verify_callback cb, void *cbdata, size_t n);

secp256k1_batch_data_gi_from_sig(secp256k1_scalar *gi, sig);
secp256k1_batch_data_from_sig(secp256k1_scalar *na, secp256k1_scalar *nb, secp256k1_ge *pta, secp256k1_ge *ptb, sig, msg, pubkey);
secp256k1_batch_data_gi_from_xpubkey_tweak(secp256k1_scalar *gi);
secp256k1_batch_data_from_xpubkey_tweak(secp256k1_scalar *na, secp256k1_scalar *nb, secp256k1_ge *pta, secp256k1_ge *ptb, parity, tweaked_pubkey, pubkey, tweak);

Edit: There are a couple of possible variations here. We could drop the
My proposal was based on the idea that
@roconnor-blockstream I don't see what the issue is with the multi-multiplication interface. The batch interface can do the aggregation of scalars before calling the multi-multiplication code. I also don't think we should be exposing a public interface for arbitrary EC operations/verifications. This library aims to provide a high-level interface for protocols.
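For reference, the aggregation described here is just the BIP-340 batch verification equation. With randomizers $a_i$ (and $a_1 = 1$), a batch of signatures $(R_i, s_i)$ with challenges $e_i$ and public keys $P_i$ is accepted exactly when

$\left(\sum_i a_i s_i\right) G \;=\; \sum_i a_i R_i \;+\; \sum_i a_i e_i P_i,$

so the batch code can accumulate the scalar $\sum_i a_i s_i$ for $G$ itself and hand the multi-multiplication only the point terms $a_i R_i$ and $a_i e_i P_i$ (moved to one side and checked against the point at infinity).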
Similar results with GCC 7.5.0 on the same hardware.
pre-safegcd, with endo, with gmp:
pre-safegcd, with endo, without gmp:
post-safegcd:
Did benchmarks on an i7-7820HQ CPU with the clock fixed at 2.6 GHz. I do indeed observe a small regression on some GCC versions (7, 8, 10), but on clang it appears to go the other way around. I don't think there is much reason for concern here - we know there are variations in performance between compiler versions, and it's to be expected that different code will affect different compilers differently:
I noticed a higher variance in my benchmark than I was used to and re-ran the experiment in a more controlled environment (gcc 10.2.0). I did not find a performance degradation post-rebase anymore. Single
I added a commit to reduce the batch verification randomizers to 128 bits. This gives up to a 9% speedup.
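Why 128-bit randomizers are still believed to give adequate security (a sketch of the standard argument, assuming uniformly random randomizers, which one of the commit messages below notes this benchmark-only commit does not yet guarantee): if some signature $j$ in the batch is invalid, then for any fixed choice of the other randomizers the batch equation is a non-trivial condition on $a_j$ that holds for at most one of its $2^{128}$ possible values, so an invalid batch is accepted with probability at most $2^{-128}$.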
I'm intending to remove the batch verification speedup graph from BIP-340 and instead place it in libsecp's doc directory. Therefore, I added a commit that allows recreating said graph (originally proposed for BIP-340). I removed the log fit from the graph and instead increased the granularity. The shape of the graph may change again once the optimal Pippenger thresholds/windows are updated to reflect the latest improvements.
Force-pushed from ebf3d64 to 7ed9847
Added two commits:
Force-pushed from 7ed9847 to 869e709
Rebased the PR to (hopefully) fix CI issues.
This is in preparation for schnorrsig_batch_verify.
Without this commit, 8192 points require 2 batches.
This is just a commit for benchmarks and should be improved if 128-bit randomizers are to be actually used:
1) it does not follow bip-schnorr batch verification
2) the randomizers are not uniformly distributed in [0, 2^128-1] for no reason
3) chacha output is thrown away
H/T roconnor-blockstream for this idea
Would it be a win to skip applying the endomorphism where the corresponding scalar is equal to zero? (Or, perhaps for Strauss, more generally only applying the endomorphism to digits that get used, which has not applying it at all for zero as a special case.)
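A rough sketch of where such a check could sit, using the existing internal helpers secp256k1_scalar_split_lambda and secp256k1_scalar_is_zero (exactly where this would go inside the Strauss code, and whether it pays off, is the open question being asked above):

static void strauss_terms_sketch(const secp256k1_scalar *k) {
    secp256k1_scalar k1, k2;
    secp256k1_scalar_split_lambda(&k1, &k2, k); /* k = k1 + lambda*k2 (mod n) */
    if (!secp256k1_scalar_is_zero(&k1)) {
        /* ...build and use the wNAF digits of k1 against the plain point table... */
    }
    if (!secp256k1_scalar_is_zero(&k2)) {
        /* ...only now compute the beta-scaled (endomorphism) table and use k2's digits,
           skipping the endomorphism work entirely when k2 is zero... */
    }
}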
If there's concern that our current
Hey! What is the status of this PR? What are the blockers to get it merged? :) BIP340 mentions BatchVerify, and I'm asking because I would like to know when I can offer this as part of my API in a Swift wrapper around libsecp256k1. I guess there is no low-hanging fruit I can help with to get this merged? One does not simply do crypto... Thanks!
Hey @Sajjon, this PR needs a significant overhaul before getting merged as discussed above. I proposed this as a project to https://www.summerofbitcoin.org/. If there are people wanting to do this project, my plan is that we will help them to get a PR ready this summer.
I am interested in this. Should I start from PR #558 to understand it?
@jonasnick should this be closed since it was basically superseded by #1134?
yes, thanks
This was part of #558 (for 20 months) to demonstrate the advantages of batch verification (see graph), but was then removed to simplify #558 because there are still ongoing discussions: