This is an overview of foundational papers relevant to zkSNARKs. It’s mostly based on presentations by Jens Groth at the IC3 Bootcamp at Cornell University in July 2018 and at the 2nd ZKProof Standards workshop in Berkeley, CA in April 2019.
Jens talks about three recurring motifs:
- Language - what statements can we prove
- Security - what are our assumptions, unconditional soundness vs. unconditional zero knowledge
- Efficiency - prover, verifier computation, interaction, setup size, succinctness
The following papers in theoretical computer science/cryptography set the boundaries of what we are working with (note that some papers deal with more than one aspect of ZKPs):
[GMR85] Goldwasser, Micali, Rackoff [The Knowledge Complexity of Interactive Proof Systems]
- The paper which introduced the notion of interactive proofs, which constitute the most basic primitive underlying all modern zero-knowledge proof and argument systems
- GMR also define zero knowledge within the context of IP, along with notions such as completeness and soundness, which are fundamental for the theory of IP
- As a first use case, the authors present zero-knowledge interactive proofs for quadratic (non)residuosity
[GMW91] Goldreich, Micali, Wigderson [Proofs that Yield Nothing But their Validity or All Languages in NP have Zero-Knowledge Proofs]
- look at the graph 3-coloring problem
- prove that all languages in NP have zero-knowledge proof systems
- one-way functions suffice for computational zero-knowledge
[BC86] Brassard, Crépeau [All-or-Nothing Disclosure of Secrets]
- you can get perfect (unconditional) zero-knowledge as opposed to computational zero-knowledge
[BCC88] Brassard, Chaum, Crépeau [Minimum Disclosure Proofs of Knowledge]
- look at zero-knowledge against unbounded adversaries
[BFM88] Blum, Feldman, Micali [Non-Interactive Zero-Knowledge and Its Applications]
- CRS (common reference string) for non-interactive ZKPs
[Kilian92] Kilian [A note on efficient zero-knowledge proofs and arguments]
- how to get succinct interactive proofs
Later on we have some ideas from pairing-based cryptography:
[BGN05] Dan Boneh, Eu-Jin Goh, Kobbi Nissim [Evaluating 2-DNF Formulas on Ciphertexts]
- pairing-based double-homomorphic encryption scheme
From there, we can ask, how to get perfect non-interactive zero-knowledge?
[GOS06] Jens Groth, Rafail Ostrovsky, Amit Sahai [Perfect Non-Interactive Zero Knowledge for NP]
- efficient, uses circuit-SAT
[GS08] Jens Groth, Amit Sahai [Efficient Non-Interactive Proof Systems for Bilinear Groups]
- efficient, uses a practical language (equations over bilinear groups, e.g. pairing-product equations)
[Gro10] Jens Groth [Short pairing-based non-interactive zero-knowledge arguments]
- preprocessing zk-SNARKs
- power knowledge of exponent (KoE) assumption
- universal CRS, works for any circuit
And then, how to get succinctness?
[GW11] Gentry, Wichs [Separating Succinct Non-Interactive Arguments From All Falsifiable Assumptions]
- SNARG = Succinct non-interactive argument
- justifies need for KoE assumption
[BCCT12] Bitansky, Canetti, Chiesa, Tromer [From extractable collision resistance to succinct non-Interactive arguments of knowledge, and back again]
- zk-SNARK
- verifier time polynomial in statement size (ease of verification)
[GGPR13] Gennaro, Gentry, Parno, Raykova [Quadratic Span Programs and Succinct NIZKs without PCPs]
- quadratic span programs, quadratic arithmetic programs
- specialized CRS, tailored to specific circuit
A graph of the various constructions can be found in this video at 34:00.
And at this point we begin to see implementations:
[PHGR13] Parno, Gentry, Howell, Raykova [Pinocchio: Nearly Practical Verifiable Computation]
Relating back to the three motifs, Jens points to the following areas for improvement with regard to each of them:
- Language: pairing-friendly languages, interoperability between languages
- Security: setup - multi-CRS, MPC-generated, updatable; formal verification
- Efficiency: asymmetric pairings, commit-and-prove
- Introduction to Modern Cryptography by Katz, Lindell
- Algebra by M. Artin
- Abstract Algebra by Dummit and Foote
- Introduction To Commutative Algebra, by M. Atiyah and I. Macdonald
- Arithmetic of Elliptic Curves by J. Silverman
The motivating question in defensive ZKP compiler pipeline design is:
- If most general-purpose applications of ZKPs will require general-purpose programmable privacy, what compiler stack can allow developers to safely write programs and also produce efficient outputs?
This is the motivating question simply because the things that are many are the programs. Eventually, we’ll have a few proof systems and maybe a few different compiler stacks and those might have some bugs and we’ll need to iron out those bugs and formally verify those parts - but the proof systems and compiler stacks will be relatively small in number compared to the actual programs.
- Consider comparable examples, such as the Ethereum Virtual Machine (EVM) and the Solidity compiler. Early on in the history of the EVM and Solidity, the EVM and the Solidity compiler both had a lot of bugs, and often, bugs in programs were if not equally likely then at least non-trivially likely to be due to a bug in the compiler or in the EVM as opposed to a bug in the program caused by a developer not thinking about their model or writing the Solidity code correctly.
- But, over time, as the EVM and Solidity “solidified” and became better understood and more formally modeled in different ways, the frequency of those bugs has declined - in the case of the EVM perhaps to zero, and in the case of Solidity to a very low number of mostly minor ones. Nowadays, if you go on Rekt, most bugs are not bugs in the Solidity compiler or bugs in the EVM but rather bugs in code written by developers - sometimes mistakes caused by the model of the program itself being unclear or wrong (with respect to what the developer probably intended), sometimes mistakes in the use of the Solidity language to express that model, but in both cases the result of an incongruity at the developer level. If you’re interested in preventing bugs, you have to focus on this layer, because this is where most of them are introduced.
This holds doubly so for programs involving zero-knowledge proof systems (ZKPs) and privacy, because most developers aren’t going to be intricately familiar with the implementation details of zero-knowledge proof systems, and if the compilation stack requires that they understand these details in order to write programs safely, their programs are going to have a lot of bugs. It’s really hard to understand cryptography, especially if you don’t have a deep background in it and especially if you’re writing a complex application at the same time.
Of course, if you have a very safe compiler stack but it’s not fast enough and the performance costs are too high for actual applications, no one will be any safer because no one will use it. So, the question can be rephrased a wee bit:
- What compiler stacks can provide the most protection to developers, while also being efficient enough that they will actually be used?
There are some different constraints in circuit / ZKP compiler design than in the design of ordinary compilers.
- For one, there are some very specific “threshold latency constraints”, where execution times under a few seconds will be fine but execution times over that will render many applications impossible.
- In most cases, at least if you want privacy, users have to be creating the proofs, and when users are creating the proofs there’s a really big user experience difference between 3 seconds latency and 9 seconds latency, even though this is only a constant factor of 3.
- With normal compilers, you might be able to tolerate such small factor differences in the interest of making it easier to write programs (see: Javascript) - if you’re running a spreadsheet tax calculation program, for example, a difference of a factor of 3 may not actually matter that much - the person would just do something else while the program is running as long as they aren’t running it too frequently.
- With normal compilers you also tend to care about user interaction latency, which has deep implications in compiler design. For example, it means that languages with automatic memory management tend to do incremental garbage collection, which limits garbage collector latency, as opposed to simpler garbage collection algorithms which can cause unpredictable latency spikes.
- This doesn’t apply when writing circuits to be used in ZKP systems, because all of the interactions are non-interactive with respect to one individual program - the compiler should instead optimize for making the total prover execution time as low as possible.
Another class of parties who could potentially write bugs are the circuit compiler developers - us! - so it behooves the ecosystem to come up with at least some reusable components of the compiler stack that can be shared by multiple parties and jointly verified. To that end, the subsequent parts of this post craft a high-level taxonomy of current approaches to circuit compilation and describe their trade-offs.
This taxonomy categorizes compilation approaches in three broad classes, which are defined and analyzed in order:
- DSL approach
- ISA/VM approach
- Direct approach
The first approach and the approach that quite logically the ecosystem started with is herein termed the “DSL approach”.
DSL stands for “domain-specific language”, and the DSL approach is to take something like R1CS or PLONK or AIR and build a language on top of it, crafting abstractions iteratively in order to simplify the job of the programmer.
DSL instructions in this sort of a system typically act like constraint-writing macros - some variables can be tracked, there’s some sort of simplified compilation going on, but there’s no intermediate instruction set or abstract machine, and instructions in the DSL translate pretty directly to constraints. Examples of this approach include:
While these languages are quite different, written with different syntaxes and targeting different proof systems, they are all following the same kind of general approach of starting with a low-level constraint system and iteratively adding abstractions on top. With the DSL approach, each program is compiled to its own circuit.
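To make the "constraint-writing macro" idea concrete, here is a minimal toy sketch in Python (the names and structure are made up for illustration; this is not modeled on any particular DSL) of how a single DSL-style instruction can emit a single R1CS constraint:

```python
# Toy sketch of the "instructions as constraint-writing macros" idea.
# Each R1CS constraint is a triple of sparse linear combinations (A, B, C)
# over the witness vector, enforcing <A,w> * <B,w> = <C,w>.
class ToyCircuit:
    def __init__(self):
        self.num_wires = 1           # wire 0 is the constant 1
        self.constraints = []        # list of (A, B, C), each a {wire: coeff} dict

    def new_wire(self):
        self.num_wires += 1
        return self.num_wires - 1

    def mul(self, x, y):
        # DSL instruction `z = mul(x, y)` emits exactly one rank-1 constraint: x * y = z
        z = self.new_wire()
        self.constraints.append(({x: 1}, {y: 1}, {z: 1}))
        return z

    def add(self, x, y):
        # `z = add(x, y)` uses the constant wire: (x + y) * 1 = z
        z = self.new_wire()
        self.constraints.append(({x: 1, y: 1}, {0: 1}, {z: 1}))
        return z

# "Program": out = (a * b) + a  -> compiles directly to 2 constraints, 5 wires
c = ToyCircuit()
a, b = c.new_wire(), c.new_wire()
t = c.mul(a, b)
out = c.add(t, a)
print(len(c.constraints), c.num_wires)  # 2 constraints, 5 wires
```

Real DSLs add many conveniences on top (field-element types, symbolic witness generation, gadget libraries), but this one-instruction-to-a-few-constraints correspondence is the defining feature of the approach.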
Advantages of the DSL approach:
- It’s relatively easy to write a DSL because you can iteratively build abstractions on top of constraints - you don’t need to have a whole separately-defined instruction set or architecture
- Because you have this low-level control, you can use a DSL as an easier way of writing hand-rolled circuit optimisations - it’s easy to reason about performance when the developer has a good understanding of exactly what constraints each instruction is translated into
Disadvantages of the DSL approach:
- DSLs aren’t as capable of high-level abstraction, at least not without more complex compiler passes and semantics more removed from the underlying constraint systems
- If there are a lot of different DSLs for different proof systems, developers are always going to need the semantics of some proof-system-specific language in order to write circuits, and the semantics of these DSLs are quite different than the semantics of the languages most developers already know
- If developers are writing programs that include some circuit components and some non-circuit components, those are two different semantics and languages, and the conversions in-between are a likely location for the introduction of bugs
- Most DSLs don’t build in much in the way of a type system for reasoning about the behavior of programs, so developers who wish to verify that their programs satisfy certain correctness criteria will need to use external tools in order to accomplish this
The next compilation approach is termed ISA (instruction set architecture) / VM (virtual machine) compilation. This approach is more involved.
- ISA/VM compilation requires clearly defining an intermediate virtual machine and instruction set architecture which then intermediates in the compilation pipeline between the high-level languages in which developers are writing programs and the low-level zero-knowledge proof systems with which proofs are being made.
- Although it is possible to write VM instructions directly, in this approach it is generally assumed that developers won’t (and these VMs are not optimized for developer ergonomics).
This can be done with an existing instruction set.
- For example, the Risc0 team has done this with RISC-V - an instruction set originally designed for CPUs with no thought given to ZKP systems - by writing a STARK prover and verifier which is compatible with the existing specification of the RISC-V ISA.
This can also be done with a custom-built VM and instruction set designed specifically to be efficient for a particular proof system, and potentially for recursive verification in the VM or some other specific application you might want to support efficiently.
All of these approaches use one circuit for all programs, and support many higher-level languages - the higher-level languages just need to compile to the specified instruction set - and all of these examples are von Neumann architecture-style machines - so there’s a stack, operations on the stack, some form of memory, storage, I/O, and maybe some co-processors or built-in operations.
Advantages of the ISA/VM approach:
- Developers can use higher-level languages which they are either already familiar with or are easier to learn than low-level circuit DSLs
- In the case of using an existing microarchitecture, such as RISC-V, you can reuse tons of existing tooling - anything which compiles to RISC-V, which is anything which compiles to LLVM, which is almost everything you would care about - can be used with the Risc0 circuit, prover, and verifier
- Even in the case of a new instruction set architecture, such as with Miden VM and Triton VM, compilation techniques to these kinds of stack-based VMs for both imperative and functional languages are pretty well-known already, and compiler authors can reuse a lot of existing work that is not specific to ZKPs or circuits
Disadvantages of the ISA/VM approach:
- Although it’s unclear exactly what this cost is, there is probably going to be some cost of abstraction when compiling high-level programs to some intermediate instruction set - e.g. you’re doing stack operations, which can probably be checked more directly in a purpose-built circuit for any given program. There’s reason to expect that this will be especially true and that there will be a higher performance penalty if the instruction set architecture wasn’t built specifically for ZKPs (such as RISC-V).
- Arguably, this approach offers more power than applications actually need, and is going to be correspondingly harder to verify correctness of because it’s more powerful. There’s a lot of noise about Turing-completeness, but no one really wants to be able to run ZKP computations that don’t terminate because it would be impossible to create a proof in practice. This is different than in the interactive world - most interactive programs, say a spreadsheet, never terminate because they take in user input in a loop and take action according to the users’ input, then wait for more input until the user (or OS) tells the program to exit. Circuit execution, on the other hand, always terminates.
- If you want to verify the correctness of the compiler in the ISA/VM approach, you have to verify both the correctness of the translation from the higher-level language semantics into the intermediate instructions, and the correctness of the circuit which implements the intermediate ISA VM - although if you’re using existing or more powerful higher-level languages, there’s probably a lot of existing work that can be reused here.
The third approach, for lack of a better word, is just referred to here as direct. The basic idea here is to somehow directly compile higher-level programs to circuits.
- These higher-level programs could be functional, based on the lambda calculus, or some kind of imperative language, but the direct approach involves transforming this high-level language directly to a circuit without an intermediate instruction set.
- The best candidate for this sort of translation, at least for functional languages, is based on a paper called “Compiling to Categories” by Conal Elliott, which works by taking the kind of simple lambda calculus core that you can translate a functional language to, translating the semantics of lambda calculus operations to the semantics of a category, translating the categorical semantics into the category of polynomial operations, then translating the polynomial operations into a specific circuit.
- You can find a more detailed description in the GEB project repository here. Similar to the DSL approach, with the direct approach each program generates a unique circuit. Other kinds of direct compilation may be possible - arguably, there’s a sort of continuum of DSLs where they approach higher and higher-level semantics and get more and more like this kind of direct compilation.
Advantages of direct compilation:
- Developers can use a very high-level language and at least in principle you can get very high performance because you don’t lose any information when you compile this way (e.g. you don’t have to represent things in terms of intermediate stack operations)
- This approach is similar to some existing work in hardware circuit synthesis done outside of the context of ZKP systems but in the context of the end of Moore’s law and compiler pipelines that want to produce physical circuits and extremely parallel programs
- At least with “Compiling to Categories”, compilation has a very clear mathematical model that is easy to reason about and verify
Disadvantages of direct compilation:
- More complexity in the compiler pipeline, at least compared to the simple DSL approach
- Most existing tooling cannot be reused, since the core and compilation techniques are new and non-standard
- As in general with any approach that involves a higher degree of abstraction, it may be more difficult for developers to reason about the performance of their programs
So many compiler options! It’s all a little exhausting - and prompts the question: what might be expected to happen, and where, accordingly, can efforts best be spent right now?
- The three approaches outlined here have different trade-offs, particularly in terms of potential performance, low-level control, degree of abstraction, and difficulty of creating appropriate tooling.
- Due to these trade-offs, we’re likely to see a hybridization of approaches, where critical functions and gadgets (e.g. hash functions, Merkle proof verification, signature checks, recursive verification, other cryptography) are written directly in lower-level DSLs (1) or even in raw constraints, but these will be provided in standard libraries and then tied together with “business logic” written by developers in a higher-level language compiled using either the ISA/VM (2) or direct compilation (3) approaches.
- Hybridization is likely because it gives you the best of both worlds - you can spend your “optimisation time” (the time of people writing hand-optimized circuits) on the critical “hot spots” of your code which really need to be optimized and whose performance mostly determines the overall performance of your program, and use a less efficient but more general compilation pipeline for custom per-program “business logic” written by regular developers.
This is also similar to the structure of cryptographic systems today - if you look at how people implement cryptographic protocols, there’s usually hardware support (after standardization) for specific optimized primitives (hash functions, encryption algorithms), which are tied together in protocols themselves written in higher-level languages which are easier to develop in and reason about.
Finally, a minor exhortation: agreement on and convergence to specific standard primitives and intermediate interfaces can help facilitate deduplication of work across the ecosystem and an easier learning environment for developers.
- A good example of a moderately successful standardization effort in the space already is the IBC protocol, which is now being adopted by many different chains and ecosystems. Anoma is trying to help a little bit in this direction with VampIR, a proof-system-agnostic intermediate polynomial representation, and we’d be very happy to collaborate or learn about other standards efforts.
- If you are developing other standards efforts, please consider adding to this section or let us know at research@delendum.xyz
Name | Host Language | Features | GitHub |
---|---|---|---|
Libsnark | C++ | General-purpose proof systems, gadget libraries | https://github.com/scipr-lab/libsnark |
Bulletproofs | Rust | Single-party proofs, online multi-party computation, R1CS | https://github.com/dalek-cryptography/bulletproofs |
Bellman | Rust | Circuit traits, primitive structures, basic gadget implementations | https://github.com/zkcrypto/bellman |
gnark | Go | High level API with frontend and backend to design circuits | https://github.com/ConsenSys/gnark |
Arkworks | Rust | R1CS, finite fields, curves, Groth16 | https://github.com/arkworks-rs |
Circomlib | Javascript | Circom templates | https://github.com/iden3/circomlib |
libSTARK | C++ | ZK-STARK library | https://github.com/elibensasson/libSTARK |
plonky2 | Rust | SNARK implementation based on techniques from PLONK and FRI | https://github.com/mir-protocol/plonky2 |
plonk | Rust | Pure Rust implementation of the PLONK ZKProof System | https://github.com/dusk-network/plonk |
Spartan | Rust | High-speed zkSNARKs without trusted setup | https://github.com/microsoft/Spartan |
DIZK | Java | Distributed polynomial interpolation, Lagrange polynomials, multi-scalar multiplication | https://github.com/scipr-lab/dizk |
wasmsnark | Javascript | Generate zkSnark proofs and verify from web browser | https://github.com/iden3/wasmsnark |
jellyfish | Rust | Rust Implementation of the PLONK ZKP System and Extensions | https://github.com/EspressoSystems/jellyfish |
libiop | C++ | IOP-based zkSNARKs | https://github.com/scipr-lab/libiop |
Nova | Rust | Recursive SNARKs without trusted setup | https://github.com/microsoft/Nova |
- Proof generation is time-consuming, high-latency, and not efficient enough on traditional CPUs
- As L2 rollups and other applications of ZKP grow, we need to consider hardware as a potential breakthrough of performance improvement
- zk-mining: Plug-n-play FPGA accelerator cards (revenue through hardware sale)
- zk-Rollup: public, private and on-premise cloud (revenue through hourly premium)
- Software auxiliary: SDK for dApps and API (license to program FPGAs in a developer-friendly way)
- Permissioned networks, e.g. StarkNet
- Permissionless networks where provers compete on cost, e.g. Mina's Snarketplace
- Permissionless networks where provers compete on latency, e.g. Polygon Hermes & Zero
- Performance: latency, throughput, and power consumption
- Total cost of ownership (capital expenditures and maintenance costs)
- NRE cost (non recurring engineering: onboarding difficulties)
- Long-term comparative advantage: competitor performance may double in a year with the same MSRP (e.g. Intel, Nvidia GPU)
- The current popular structure is to combine CPU and FPGA/GPU to perform proof generation. The CPU tackles single-threaded pieces at higher frequency and deals with non-determinism
- MSM and FFT can be parallelized, but arithmetization and commitments may be single threaded (different from the “embarrassingly parallel” PoW mining structure)
- Most FPGA design tools are proprietary: FPGAs have higher barrier to enter than GPU accelerations
- Full node is resource-draining: downloading and verifying the whole chain of blocks takes time and resources
- Not retail friendly to be on mobile and perform quick actions like sending transactions and verifying balances
- Trust-minimized interaction with full nodes; light clients do not interact directly with the blockchain
- Much lighter on resources and storage but require higher network bandwidth
- Linear time based on chain length: verify from the root to the target height (may be from a trusted height, therefore constant length)
- Process overview:
- retrieve a chain of headers instead of full block
- verify the signatures of the intermediary signed headers and resolve disputes through checking with full nodes iteratively
- Superlight client: requires downloading only a logarithmic number of block headers while storing only a single block header between executions
Name | Mechanism | Prover Complexity | Verifier Complexity | Trusted Setup | Reference |
---|---|---|---|---|---|
Recursive SNARKs | Cycle of elliptic curves | High | Low | Yes | https://minaprotocol.com/wp-content/uploads/technicalWhitepaper.pdf |
KVC (Key-Value Commitment) | Constant size key-value map | Med-high | Medium | No | https://eprint.iacr.org/2020/1161.pdf |
AMT (authenticated multipoint evaluation tree) | authenticate a polynomial multipoint evaluation at the first n roots of unity | Medium | Medium | No | https://people.csail.mit.edu/devadas/pubs/scalable_thresh.pdf |
KZG Polynomial Commitment | Elliptic curve pairings with BLS12-381 | High | Low | Yes | https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf |
Verkle Tree | q-ary tree built from vector commitments | High | Low | No | https://math.mit.edu/research/highschool/primes/materials/2018/Kuszmaul.pdf |
Vector Commitment | RSA and computational Diffie-Hellman over bilinear groups | High | Low | Yes | https://eprint.iacr.org/2011/495.pdf |
One-to-many prover VSS | Prover batching with sumcheck and Fiat-Shamir | Medium | Low | No | https://www.usenix.org/system/files/sec22summer_zhang-jiaheng.pdf |
- Proof systems encode computation as a set of polynomial equations defined over a field. Operating on a different field will lead to gigantic circuits.
- For pairing based SNARKs, there is a limited selection of pairing-friendly curves such as BN254 and BLS12-381.
- FFTs require factors of two for the curves in order to make the polynomial computations practical.
- Proof recursions require encoding arithmetic statements of a field into equations over a different field.
- Bridge operators and roll-ups need interoperability with non-pairing friendly and not highly 2-adic curves, such as Bitcoin and Ethereum’s ECDSA over secp256k1. (BN254 precompiled contract)
- Choosing the field is a tradeoff of speed and security.
- Brute force: decompose elements into limbs and operate on the limbs; Reduce and recompose the elements only when necessary
- Very expensive: may represent up to 1000X the original constraints in the circuit
- xJsnark framework
- gnark implementation
- 2-chains and cycles: use matching base fields to implement one curve inside of the other (assume pairing based schemes)
- Non-pairing-based schemes can use non-pairing-friendly or hybrid cycles at a cost, such as using a PCS (polynomial commitment scheme) or the Pasta curves with linear time.
- Pasta Curves
- 2-chains of elliptic curves
- Lower the requirements on the field: Polynomial Commitment Scheme (PCS) may not involve elliptic curves at all to be instantiated on arbitrary fields
- A survey of elliptic curves for proof systems
- ECFFT: Fast Polynomial Algorithms over all Finite Fields
- Signature verification is a problem that arises in many ZK related projects, from Zcash to zk-rollups
- Different types of signatures depend on different field or curve arithmetic and need different speed-up methods
- Here we will discuss some different options for signature schemes, and strategies for efficient implementations
- Given a ZKP system, a simple signature scheme can be constructed as follows. The signer generates a random secret key sk, and hashes it to get the associated public key pk. They then prove, in zero knowledge, that they know some sk such that hash(sk) = pk. Picnic is a variant of this idea, using a block cipher rather than a hash function
- Since the statement being proved involves just a single evaluation of a hash or block cipher, these schemes can be highly efficient, particularly if using arithmetic-friendly primitives such as LowMC or Poseidon
- The caveat here is that the proof must be generated by the user who owns the secret key. If the signature is part of a larger circuit, such as Zcash’s spend circuit, this may be undesirable, as a user cannot outsource proving to another entity without revealing their secret key to that entity
- If we wish to support delegated proving, it may be best to use a conventional signature scheme such as ECDSA, and include its verification steps in our circuit. This way the user can generate just a signature, and a separate prover can prove that they witnessed a valid signature
- To make this efficient, we can choose an elliptic curve such that the base field of the curve matches the “native” field of our argument system. An example of this is the Baby Jubjub curve, whose base field matches the scalar field of alt_bn128. If we use an argument system such as Groth16 with alt_bn128 as our curve, Baby Jubjub arithmetic can be done “natively”
- However, using a curve like Baby Jubjub isn’t always an option. Sometimes we need compatibility with existing tools which support conventional curves. Polygon Zero, for example, must support Ethereum’s existing signature scheme, which is ECDSA over secp256k1. In these cases, we must simulate arithmetic in another field using bignum techniques
- For example, if we use standard multiplication algorithms for ED25519 field multiplication and range check each of the 5x5 limb products, we can reduce the cost by:
- Since 2^255 = 19 mod p_ed25519, you could combine limb products with the same weight mod 255 (in bits). The combined products will have a few more bits but it should be cheaper overall
- If we split each intermediate product into bits and use binary adders, we can minimize the number of splits by adding intermediate products with the same weight "natively" first. For example, if two intermediate products have a weight of 2^51 (the base), we can split the sum into 52 bits, which is just over half the cost of doing two 51-bit splits. When we have an intermediate product with a weight of 2^255 or more, we can change its weight to 1 and multiply it by 19, since 2^255 = 19 mod p_ed25519. That could help to further reduce the number of splits
- Alternatively, there's the Aztec approach of having the prover give a purported product, then checking the identity mod p_native (almost free) and then mod, say, 2^306 (can ignore limb products with weights of 306+)
- We're going to publish a couple other methods soon, which should be cheaper than the ones above
- These methods can be generalized to arbitrary low-degree formulas rather than simple multiplications. Checking a whole elliptic curve formula directly can be much more efficient
- Finally, we would expect the cost to come down significantly if you used a PLONK or STARK implementation that supported lookup-based range checks, instead of range checking via binary decomposition
- We can use incomplete formulae if we’re careful
- In the scalar multiplication code, we would suggest looking into windowing
- If we’re checking an ECDSA or EdDSA signature, we can leverage the fact that one of the multiplications has a fixed base, enabling various preprocessing tricks
- Curve-specific: secp256k1 supports the GLV endomorphism, for example, which allows us to split a curve multiplication into two smaller ones, opening the door to multi-scalar multiplication methods such as Yao’s
- Use “ZK native” signatures if possible (i.e. Picnic)
- Use “native” curves if possible (i.e. JubJub)
- In non-native curve arithmetic, for example ED25519, combine the limb products with the same weight mod 255 (in bits)
- Minimize the number of splits by adding intermediate products with the same weight first. For example, when we have 2 intermediate products of weight 2^51, we can split the sum into 52 bits
- If we have an intermediate product with weight of 2^255 or more, we can change its weight to 1 and multiply by 19, since 2^255 = 19 mod p_ed25519 (a small sketch of this weight-combination bookkeeping follows after this list)
- Use CRT related method like Aztec’s approach to have the prover give a purported product and check the identity mod p_native to ignore large limb products
- Using a PLONK or STARK implementation that supports lookup-based range checks can cut down the cost significantly
- Check windowing in the scalar multiplication code
- Leverage the constant generator in one of the multiplications in EdDSA
- GLV Endomorphism can split a curve multiplication into two smaller ones in curves like secp256k1
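As a rough illustration of the limb bookkeeping recapped above, here is a plain-Python sketch (not circuit code; the 5x51-bit limb split and the fold 2^255 = 19 mod p come from the ED25519 discussion) checking that combining limb products by weight, and folding weights of 2^255 or more by multiplying by 19, reproduces a*b mod p:

```python
# Toy sketch of non-native field multiplication bookkeeping for p = 2^255 - 19,
# using 5 limbs of 51 bits. This only checks the arithmetic identity; a real
# circuit would additionally range-check limbs and handle carries/splits.
import random

P = 2**255 - 19
BASE = 51  # bits per limb

def to_limbs(x, n=5, base=BASE):
    return [(x >> (base * i)) & ((1 << base) - 1) for i in range(n)]

def mul_with_weight_combination(a, b):
    al, bl = to_limbs(a), to_limbs(b)
    buckets = {}  # weight (in limbs) -> accumulated sum of limb products
    for i in range(5):
        for j in range(5):
            w, prod = i + j, al[i] * bl[j]
            if w >= 5:          # weight >= 2^255: fold back using 2^255 = 19 (mod p)
                w, prod = w - 5, prod * 19
            buckets[w] = buckets.get(w, 0) + prod
    # Recombining the buckets gives a*b modulo p
    return sum(v << (BASE * w) for w, v in buckets.items()) % P

a, b = random.randrange(P), random.randrange(P)
assert mul_with_weight_combination(a, b) == (a * b) % P
```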
When comparing zk-SNARKs, we are usually primarily interested in three different metrics: prover time, verifier time, and proof length - some may also want to add quantum security and setup trust assumptions to that list.
Within the context of blockchains, there is one further criterion which I would argue to be as important as those three, and we'll give it the clunky name aggregation-friendliness.
A zk-SNARK with universal setup can be said to be aggregation-friendly if (informally) there exists some m so that it is easy to merge m proofs $$P_{1},...,P_{m}$$ into a single aggregated proof $$P_{1,...,m}$$, in the sense that:
- The aggregation time $$|\mathfrak{A}(P_{1},...,P_{m})|$$ is (quasi-)linear in m
- The proof size $$|P_{1,...,m}|$$ is sublinear in m and quasilinear in $$\max_{i}(|P_{i}|)$$
- The verification time $$|\mathfrak{V}_{\mathfrak{A}}(P_{1,...,m})|$$ is quasilinear in $$|\mathfrak{V}(\mathrm{argmax}_{i}(|P_{i}|))|$$, where $$\mathfrak{V}$$ is the verifier of the given zk-SNARK
So why is aggregation-friendliness so important to us? In blockchains, especially for scalability purposes, SNARKs are used to prove the validity of a bunch of statements at once. Now, if we want to prove a block of n transactions of, say for simplicity, the same size m, with a single zk-SNARK, the prover time will typically be quasilinear in the total statement size nm.
On the other hand, imagine if the same proof system were aggregation-friendly, and let us for now ignore network limitations. You could have n different provers, each proving one transaction in time quasilinear in m, after which the n proofs are merged into one at a comparatively small additional cost.
To hammer the point home, assume we have two schemes, $$S_{1}:=(\mathcal{P}_{1}, \mathcal{V}_{1})$$ and $$S_{2}:=(\mathcal{P}_{2}, \mathcal{V}_{2})$$. $$\mathcal{P}_{1}$$ is more efficient than $$\mathcal{P}_{2}$$, with the respective proving times for a statement of length m being $$t_{1} = 5m\log_{2}m$$ and $$t_{2} = 10m\log^{2}_{2}m$$. However, while $$S_{2}$$ is aggregation-friendly, $$S_{1}$$ is not. Proving n statements of length m thus costs $$5nm\log_{2}(nm)$$ with $$S_{1}$$ (one monolithic proof over everything), versus $$10m\log^{2}_{2}m\cdot 4\log^{2}_{2}n = 40m\log^{2}_{2}m\log^{2}_{2}n$$ with $$S_{2}$$ (n parallel proofs followed by aggregation).
Let's look at how the two totals, which we again call $$t_{1}$$ and $$t_{2}$$, compare:
- For n = 2, m = 2, we have $$t_{1}=40, t_{2}=80$$
- For n = 8, m = 4, we have $$t_{1} = 800, t_{2} = 5760$$
- For n = 4096, m = 16, we have $$t_{1} = 5242880, t_{2} = 1474560 \simeq 0.28\cdot t_{1}$$
Here we see that, although $$\mathcal{P}_{1}$$ performs significantly better than $$\mathcal{P}_{2}+\mathcal{A}$$ for small n, it's a very different story for big n, which should make it clear why aggregation-friendliness is important.
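The numbers above can be reproduced directly from the two toy cost models assumed in this example (arbitrary constants, not measurements of any real proof system):

```python
# Toy cost models from the example above:
#   t1 = 5*(n*m)*log2(n*m)             (one monolithic proof)
#   t2 = 10*m*log2(m)^2 * 4*log2(n)^2  (n parallel proofs + aggregation)
from math import log2

def t1(n, m): return 5 * n * m * log2(n * m)
def t2(n, m): return 40 * m * log2(m) ** 2 * log2(n) ** 2

for n, m in [(2, 2), (8, 4), (4096, 16)]:
    print(n, m, t1(n, m), t2(n, m), round(t2(n, m) / t1(n, m), 2))
```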
This post will focus on some different approaches to zk-SNARK proof aggregation. At a high level, there are three axes which divide them into categories:
- Recursive/Batching: Recursive proofs are proofs of proofs. We refer to non-recursive aggregation methods as "batching"
- Sequential/Parallel: Can the scheme be executed in parallel?
- Algebraic/Arithmetization-based: This distinction will become clear once we look at some examples.
Halo is:
- Recursive
- Sequential
- Algebraic
As so much of cryptography, zero-knowledge and otherwise, Halo is based on elliptic curves. Specifically, it uses a polynomial commitment scheme (PCS) which was previously used in another well-known zk-SNARK scheme, namely Bulletproofs: Given an elliptic curve E (where we write the group operation additively) and a tuple P of random curve points, the prover commits to the tuple a of coefficients of his polynomial by taking the "inner product" $$\langle a, P\rangle$$, a Pedersen-style commitment.
The "inner product argument" (IPA) is a protocol between the prover and the verifier where the prover tries to convince the verifier that he knows the polynomial corresponding to his commitment. Roughly speaking, he does this by iteratively chopping his tuples a and P in two equally long tuples and then putting them together so as to get two new tuples of half the length. Eventually, after a logarithmic number of rounds, the prover is left with a single equation, where, almost if and only if he has acted honestly throughout will he be able to satisfy the verifier's final challenge.
The problem is that, although the number of rounds is nice and small, the verifier still needs to compute two "inner products" of "vectors" of the original length. One may think that this ruins the protocol: the verifier's work is now linear rather than logarithmic. However, this is where the recursive proof composition (aggregation) comes in:
The aggregation technique largely relies on an ad hoc polynomial $$g_{u_{1},...,u_{k}}(x)$$, which is defined in terms of the verifier's challenges $$u_{1},...,u_{k}$$ from the rounds of the argument.
To recap, the verifier previously had to do a linear-time operation for each statement. By invoking a certain polynomial with some nice properties, the prover can defer the final opening proof until he has computed and committed to arbitrarily many such polynomials. Halo people often talk about the "inner" and the "outer" circuit to reflect the distinction between the point of view of the prover (who finishes most of each proof before computing an aggregated proof for all together) and the verifier (for whom the proof really boils down to the linear-time computation she has to perform).
For specifics, we refer to the original Halo paper. The Halo approach is clearly algebraic in its nature: It depends on the nice homomorphic
properties of Pedersen commitments, as well as a particular polynomial.
Moreover, it is sequential: The prover iteratively adds commitments to the accumulator, one proof at a time.
Halo 2 is used by several protocols, and its main selling point is the aggregation property. Notably, Mina protocol uses Halo to create a constant-sized blockchain, in the sense that the validity of the current state is proven simply by verifying the validity of the proof of the previous state together with the transactions in the last block. Finally, a proof is computed, representing everything that has happened up to that state.
Electric Coin Company, the developer of the scheme, recently incorporated Halo 2 as part of their upgrade to the Orchard shielded pools.
Halo 2 is also used by the zkEVM rollup project Scroll, in order to aggregate multiple blocks into one final proof and thereby amortize the L1 verification costs between them.
Plonky2 is:
- Recursive
- Batching
- Arithmetization-based
Plonky2 emphasizes the modular nature of zk-SNARKs: We need some arithmetization scheme, a PCS, a zero test, and a consistency check between the arithmetization polynomials and the constraint polynomial.
The goal of the scheme is to make recursive proving extremely fast, i.e. for two (or more) Plonky2 proofs, proving the validity of them with a single proof should be a small computation.
As an aside, why not simply use PLONK? While an elliptic curve E defined over some finite field $$\mathbb{F}_{q}$$ can have q points over $$\mathbb{F}_{q}$$, those curve/field combinations are not appropriate within the setting of cryptography. Instead, say $$E(\mathbb{F}_{q})$$ has r elements for some prime r. Then the arithmetic of the group is defined over $$\mathbb{F}_{r}$$, a different field altogether. This means that for a proving scheme for $$\mathbb{F}_{q}$$-arithmetic satisfiability, the verifier has to perform arithmetic over $$\mathbb{F}_{r}$$. This means that the recursive proof must use a circuit defined over $$\mathbb{F}_{r}$$, the recursive proof of that one a circuit over yet another field, and so on. Halo in fact heavily relies on the existence of 2-cycles of curves, namely tuples (E,q), (E',r) such that q and r are primes and $$\#E(\mathbb{F}_{q})=r, \#E'(\mathbb{F}_{r})=q$$.
Back to Plonky2. Because of the aforementioned issues with elliptic curves, the scheme essentially takes advantage of the flexibility of the so-called Plonkish arithmetization, while using a PCS which doesn't rely on elliptic curves. Specifically, Plonky2 replaces the KZG-based proof with FRI. We will only briefly describe FRI here:
For any univariate polynomial f(x), we can split it into even and odd parts and write $$f(x) = f_{e}(x^{2}) + x\cdot f_{o}(x^{2})$$, where $$f_{e}$$ and $$f_{o}$$ have roughly half the degree of f. Roughly speaking, we can iteratively commit to polynomials obtained by folding $$f_{e}$$ and $$f_{o}$$ together with a random challenge, halving the degree each round until only a constant is left, while the verifier checks consistency between the successive Merkle-committed layers at random points.
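A minimal sketch of the even/odd folding arithmetic just described, over a toy prime field and with no Merkle commitments or query phase, might look as follows:

```python
# Toy sketch of one FRI-style folding step (algebra only, no commitments).
P = 97  # small prime field, for illustration only

def split_even_odd(coeffs):
    # f(x) = f_e(x^2) + x * f_o(x^2)
    return coeffs[0::2], coeffs[1::2]

def fold(coeffs, beta):
    fe, fo = split_even_odd(coeffs)
    fo = fo + [0] * (len(fe) - len(fo))
    # f'(y) = f_e(y) + beta * f_o(y): degree is (roughly) halved each round
    return [(e + beta * o) % P for e, o in zip(fe, fo)]

def evaluate(coeffs, x):
    return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

f = [3, 1, 4, 1, 5, 9, 2, 6]  # a degree-7 polynomial
beta, x = 11, 5

# Consistency relation the verifier spot-checks at a point x:
#   f'(x^2) == f_e(x^2) + beta * f_o(x^2),
# with f_e(x^2) = (f(x)+f(-x))/2 and f_o(x^2) = (f(x)-f(-x))/(2x).
lhs = evaluate(fold(f, beta), pow(x, 2, P))
inv2, invx = pow(2, -1, P), pow(x, -1, P)
fe_val = (evaluate(f, x) + evaluate(f, (-x) % P)) * inv2 % P
fo_val = (evaluate(f, x) - evaluate(f, (-x) % P)) * inv2 * invx % P
assert lhs == (fe_val + beta * fo_val) % P
```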
For the verifier, and hence the construction of recursive proofs, the main computational task lies in computing hashes to verify Merkle paths. The high-level idea is that while for vanilla PLONK complex arithmetic expressions (such as hash functions) lead to deep circuits, it's possible to implement custom arithmetic gates which themselves compute more complex expressions than addition and multiplication.
Plonky2 implements custom gates which ensure that the verifier circuit is shallow, as well as some other engineering optimizations, in order to make recursive proving as fast as possible. Plonky2 was created by the formerly Mir Protocol, now Polygon Zero team.
Nova is:
- Recursive
- Sequential
- Arithmetization-based + Algebraic
Unlike the two preceding schemes, Nova is not universal; it can only be used to incrementally verify computations consisting of repeated applications of the same function f, i.e. statements of the form $$y = f(f(\cdots f(x)))$$ (incrementally verifiable computation, or IVC).
Nova uses a modified R1CS arithmetization which allows the prover to "fold" two instances (one representing the recursive "proof", one representing the current step of the computation) into one. The prover ultimately proves the validity of the final folded R1CS instance with a zk-SNARK, which, thanks to the properties of the IVC scheme, proves the validity of all n invocations of f.
Although not appropriate for most applications we're interested in within a blockchain context, we have included Nova because the idea of IVC is nevertheless interesting: Take some half-baked proofs, which by themselves are neither zero-knowledge nor succinct, then aggregate and construct a single proof at the end.
SnarkPack is:
- Batching
- Parallel
- Algebraic
The schemes we have seen up to this point have been zk-SNARKs optimized for aggregation, i.e. the aggregation techniques are essentially baked into the respective protocols. SnarkPack takes a different approach: it is constructed as an addition to an already existing zk-SNARK, namely Groth16. In this sense, SnarkPack is the first (and only) true aggregation scheme covered here.
Groth16 is a pairing-based SNARK, and the verification of a proof includes checking a pairing equation; written additively in the target group, a simplified toy version looks like $$e(A,B) + e(C,D) = Y,$$ where A and C come from the proof while B, D and Y are fixed by the verification key and public input. Suppose for the toy example that Y = 0 and that B and D are the same for all proofs. One way of verifying this equation for two proofs $$(A_{1},C_{1})$$ and $$(A_{2},C_{2})$$ at once is to pick a random r and check the single equation $$e(A_{1}+rA_{2},B) + e(C_{1}+rC_{2},D) = 0$$. Now, bilinearity of e implies that the left side of the equation can be rewritten as $$\big(e(A_{1},B)+e(C_{1},D)\big) + r\cdot\big(e(A_{2},B)+e(C_{2},D)\big),$$ so for random r the combined check holds (except with negligible probability) only if both original equations hold, while computing two pairings instead of four.
For m proofs we can imagine the same approach yielding even greater benefits. Now, the above toy example does not fully apply to Groth16; in particular Y cannot be assumed to be 0, and B does not remain constant for different i.
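The toy batching trick above can be simulated with ordinary modular arithmetic standing in for the pairing (illustrative only; this is not SnarkPack itself):

```python
# Toy illustration of batching m bilinear checks a_i*b + c_i*d = 0 into one
# randomized check, with multiplication mod p standing in for a pairing.
import random

P = 2**61 - 1  # prime modulus standing in for the target group order
m = 8
b, d = random.randrange(1, P), random.randrange(1, P)
a = [random.randrange(P) for _ in range(m)]
c = [(-ai * b) * pow(d, -1, P) % P for ai in a]   # make each statement true

r = random.randrange(1, P)  # verifier's random challenge
A = sum(pow(r, i, P) * a[i] for i in range(m)) % P
C = sum(pow(r, i, P) * c[i] for i in range(m)) % P
assert (A * b + C * d) % P == 0                    # one combined check instead of m

# A single false statement is caught except with negligible probability over r:
c[3] = (c[3] + 1) % P
C_bad = sum(pow(r, i, P) * c[i] for i in range(m)) % P
assert (A * b + C_bad * d) % P != 0
```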
For m Groth16 proofs $$\pi_{i} = (A_{i}, B_{i}, C_{i})$$, denote by r a random verifier challenge. The aggregated equation to be checked is of the form $$\sum_{i}r^{i}\cdot e(A_{i},B_{i}) = \sum_{i}r^{i}\cdot Y_{i} + \sum_{i}r^{i}\cdot e(C_{i},D),$$ where the $$Y_{i}$$ and D are determined by the verification key and the public inputs.
This looks very linear, and therefore not all that nice, however, the prover can use the IPA technique also used in Halo, in order to ensure that the proof size as well as the verifier time are logarithmic in m.
- Batching
- Parallel
- Algebraic
Since writing this section, we have come across the brand new paper aPlonk, which is essentially SnarkPack for PLONK. Nevertheless, the idea here is slightly different, and so it should still be worth exploring.
SnarkPack is Groth16-based, and therefore not relevant for applications relying on universal SNARKs, such as blockchain execution layers. There are some problems with adopting that kind of approach to KZG-based universal zk-SNARKs: For one, most of the ones we have are IOPs, in practice meaning that they rely on random challenges from the verifier. Each proof will have different challenges contained within it, and this is a problem for aggregation schemes relying on structural properties of e.g. the elliptic curve pairing in KZG.
That said, we can think of aggregation within different contexts:
- (1) In SnarkPack, one can simply take n proofs and aggregate them. Note: for SnarkPack, specifically, the circuit has to be the same for all because Groth16 is non-universal. We are focused on schemes for aggregating proofs of arbitrary circuits into one.
- (2) One might imagine a synchronous scheme where n provers, with n statements to prove, build an aggregated proof together. After each prover has completed a given round, the verifier sends the same random challenge to all of them.
(1) is the SnarkPack approach, and it is ideal in the sense that a scheme like this offers the most flexibility and universality.
(2) will generally entail significant communication costs, however we can imagine each prover simply as a core on a GPU. In this case, the approach may well make sense (this is the approach taken by the aPlonk authors as well). This is also very much relevant to today's problems, as all current zk-rollups have one prover per block, i.e. there's not a decentralized network of provers each proving the validity of individual transactions, and there likely won't be for a long time.
For an approach based on homomorphism similar to SnarkPack, then, it's natural to look at PLONK, which within our framework can be seen as a universal counterpart to Groth16 - the main innovation of PLONK arguably lies in its arithmetization, which we consider to be irrelevant for this kind of aggregation scheme.
As we have seen, in PLONK, the goal is to prove the validity of some equation of the form $$q_{L}(x)a(x) + q_{R}(x)b(x) + q_{O}(x)c(x) + q_{M}(x)a(x)b(x) + q_{C}(x) = 0,$$ where the selector polynomials encode the circuit and the wire polynomials encode the prover's witness. The natural homomorphic idea is to check a random linear combination $$\sum_{i}\alpha_{i}\cdot(\text{equation for proof } i)$$ of the corresponding equations for several proofs at once, where the problem is that we have different powers of each proof's witness polynomial $$a_{i}$$ appearing in the combined equation.
This does raise another issue, namely that the set $$\{a_{i}^{k}\}_{i,k}$$ cannot be assumed to be linearly independent. Setting $$\alpha_{i}:=L_{i}(y)$$ for the Lagrange polynomials $$L_{i}$$ and a random evaluation point y is one natural way to choose the combination coefficients.
We are fairly certain that this approach will work with some tweaking. Of course, PLONK is not as simple as proving this one equation directly, and one does encounter some issues down the road.
- Recursive
- Parallel
- Arithmetization-based
We observe that multilinear polynomials will become increasingly
popular for arithmetization, in particular with the recent release of
HyperPlonk, which has some exciting properties such as the possibility for high-degree custom gates. The main benefit of multilinear polynomials is the O(n) interpolation, as
opposed to O(n$$\log$$n) for univariate polynomials. We interpolate over
the Boolean hypercube $$\{0,1\}^{t}$$, writing the interpolated multilinear polynomial as $$\widetilde{f}(x) = \sum_{b\in\{0,1\}^{t}}f(b)\cdot Eq(b,x),$$ where for two length-t vectors x, y: $$Eq(x,y) = \prod_{i=1}^{t}\big(x_{i}y_{i} + (1-x_{i})(1-y_{i})\big)$$
We see that if x and y are binary, Eq(x,y) = 1 iff x = y, 0 otherwise, hence the name.
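A small sketch of this multilinear interpolation over a toy prime field (helper names are illustrative):

```python
# Multilinear extension over the Boolean hypercube, using the Eq polynomial above.
from itertools import product

P = 97  # small prime field, for illustration only

def eq(x, y):
    out = 1
    for xi, yi in zip(x, y):
        out = out * (xi * yi + (1 - xi) * (1 - yi)) % P
    return out

def mle_eval(table, point):
    # table: values f(b) for b in {0,1}^t, listed in lexicographic order
    t = len(point)
    return sum(table[i] * eq(b, point) for i, b in enumerate(product([0, 1], repeat=t))) % P

t = 3
table = [3, 1, 4, 1, 5, 9, 2, 6]
# On hypercube points the extension agrees with the table (Eq acts as an indicator):
for i, b in enumerate(product([0, 1], repeat=t)):
    assert mle_eval(table, b) == table[i] % P
print(mle_eval(table, (7, 20, 13)))  # evaluation at a non-Boolean point
```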
There is an isomorphism between the additive group of all univariate polynomials of max degree $$2^{t}-1$$ and that of all multilinear polynomials in t variables, given by the "power-to-subscript" map which sends the monomial $$x^{k}$$ to $$x_{1}^{b_{1}}\cdots x_{t}^{b_{t}}$$, where $$(b_{1},...,b_{t})$$ is the binary representation of k.
Multilinear to univariate interpolation
This raises one (not directly related to aggregation) question: Is it possible to do O(n)
univariate interpolation by first doing multilinear interpolation over
the Boolean hypercube and then choosing a fitting map to turn the result into a
univariate polynomial? At this point, we may think that the answer is
obviously a positive one, since we have the power-to-subscript map and
its inverse, however that does not work; evaluations on the hypercube are not mapped to the corresponding evaluations of the univariate polynomial.
Sadly, there is reason to think this doesn't exist. First of all, someone would have probably already thought of this, since O(n) univariate interpolation would be a massive deal.
More concretely, the missing piece to our puzzle is the so-called pullback along the power-to-subscript map. The problem, then, is to find a closed form of this map.
Better recursive proofs
In HyperPlonk, the authors use KZG commitments. For a Plonky-like approach, we would like something hash-based such as FRI. There does in fact exist a FRI analogue for multivariate polynomials. The problem occurs where we need to check the consistency of the witness with the final prover message of the sumcheck protocol.
Specifically, the prover uses sumcheck in order to check that an equation of the kind $$\sum_{x\in\{0,1\}^{t}}C(\omega(x)) = 0$$ holds, where $$\omega$$ is the multilinear witness polynomial and C expresses the constraints. The issue here is that a Merkle commitment corresponding to the values of $$\omega$$ on the Boolean hypercube does not, by itself, let the verifier check the claimed evaluation $$\omega(r_{1},...,r_{t})$$ at the random point produced by the sumcheck protocol.
This leaves us with a few open questions:
- Is there an efficient way for the prover to prove that he has computed $$\omega(r_{1},...,r_{t})$$ correctly, for example using the formula mentioned above?
- If not, is there some other recursion-friendly PCS which could be used?
As it turns out, we have likely found a solution to (1) (see this video).
This seems to be a promising direction, due to the linear proving time and the increased flexibility with respect to custom gates. It also ought to be mentioned that the use of the sumcheck protocol adds a logarithmic term to the proof, which is more annoying for KZG-based than FRI-based proofs, as the latter are already polylogarithmic (bigger than the sumcheck part) whereas the former are constant-sized (much smaller) by default.
- Proofs for Inner Pairing Products and Applications
- SnarkPack: Practical SNARK Aggregation
- Recursive Proof Composition without a Trusted Setup
- Fractal: Post-Quantum and Transparent Recursive Proofs from Holography
- Fast Recursive Arguments with PLONK and FRI
Zero-knowledge proof systems require the following properties to be defined as a zero-knowledge proof:
- Completeness: If a statement is true, an honest verifier will be convinced of this by an honest prover
- Soundness: If a statement is false, then there is a negligible probability that a cheating prover can prove the validity of the statement to an honest verifier
- Zero-knowledge property: If a statement is true, then the verifier learns nothing about the statement other than the fact that the statement is true
As such, if one of the above properties is broken, we no longer have a valid zero-knowledge proof system
This section organizes known vulnerabilities in implementations of zero-knowledge proof systems by the above properties and an additional section mainly pertaining to more general cryptographic vulnerabilities
- Correctness of G_k, generated in the ownership proof, is not enforced in the balance proof in Lelantus
- Trail of Bits is publicly disclosing critical vulnerabilities that break the soundness of multiple implementations of zero-knowledge proof systems, including PlonK and Bulletproofs
- These vulnerabilities are caused by insecure implementations of the Fiat-Shamir transformation that allow malicious users to forge proofs for random statements
- The vulnerabilities in one of these proof systems, Bulletproofs, stem from a mistake in the original academic paper, in which the authors recommend an insecure Fiat-Shamir generation
- Addendum: Challenges should also be generated in such a way they are independently random.
- The following repositories were affected:
- The Fiat-Shamir hash computation must include all public values from the zero-knowledge proof statement and all public values computed in the proof (i.e., all random “commitment” values) - see the transcript sketch after this list
- Serving up zero-knowledge proofs
- The Frozen Heart vulnerability in PlonK
- The Frozen Heart vulnerability in Bulletproofs
- The Frozen Heart vulnerability in Girault’s proof of knowledge
- Coordinated disclosure of vulnerabilities affecting Girault, Bulletproofs, and PlonK
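As a sketch of the transcript discipline recommended above (the Transcript class, labels, and parameters here are illustrative, not any specific library's API), the challenge is derived only after absorbing every public value from the statement and the prover's commitments:

```python
# Toy Fiat-Shamir transcript: absorb *all* public values before drawing a challenge.
import hashlib

class Transcript:
    def __init__(self, label: bytes):
        self.state = hashlib.sha256(label)

    def absorb(self, label: bytes, value: bytes):
        # length-prefix both fields to avoid ambiguity between concatenated messages
        self.state.update(len(label).to_bytes(8, "little") + label)
        self.state.update(len(value).to_bytes(8, "little") + value)

    def challenge(self, label: bytes, modulus: int) -> int:
        h = self.state.copy()
        h.update(label)
        return int.from_bytes(h.digest(), "little") % modulus

# Usage: absorb the full public statement and every commitment, then challenge.
tr = Transcript(b"example-proof-v1")
tr.absorb(b"statement", b"...public inputs and circuit/verification-key identifier...")
tr.absorb(b"commitment", b"...prover's first-round commitment...")
alpha = tr.challenge(b"alpha", 2**255 - 19)
```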
- In certain ZK-SNARK protocols, a trusted setup ceremony is devised in order to produce parameters for use in the proof generation of a particular statement
- However, there are extra parameters, deemed toxic waste, that are meant to be destroyed after the ceremony has been performed. If the toxic waste is not properly disposed of, a cheating prover can generate fake proofs and mislead an honest verifier
- Honest verifier zero-knowledge proofs (HVZKP) assume an honest verifier. This means that in the presence of malicious verifiers, non-interactive protocols should always be used
- These also exchange fewer messages between prover and verifier. A malicious verifier can employ different attacks depending on the proof system
- Using HVZKP in the wrong context
- UC non-interactive, proactive, threshold ECDSA with identifiable aborts (2020)
License | Permissions | Conditions | Projects |
---|---|---|---|
MIT | Commercial use, distribution, modification, private use | License and copyright notice | Scroll, Libsnark, Plonky2 |
GNU GPLv3 | Commercial use, distribution, modification, patent use, private use | Disclose source, license and copyright notice, state changes | Aleo, Tornado Cash, Aztec |
Mozilla Public License | Commercial use, distribution, modification, patent use, private use | Disclose source, license and copyright notice | / |
Apache License | Commercial use, distribution, modification, patent use, private use | License and copyright notice, state changes | O(1) Labs, StarkEx, Halo2, zkSync, Plonky2 |
BSD License | Commercial use, distribution, modification, patent use, private use | Disclose source, license and copyright notice | / |
BSL | Non-production use, distribution, modification, private use | Disclose source, license and copyright notice | Uniswap v3, Aave |
BOSL (Bootstrap Open Source License) | Commercial use, distribution, modification, private use | Open-source the improvements, improvements available under BOSL after 12 months, disclose source, license and copyright notice | Zcash (halo2’s initial launch) |
Polaris Prover License | Non-commercial use | No transfer of rights, state changes | StarkWare Prover |
- randomness is hard to generate on chain due to non-determinism, but we still want to be able to verify the randomness
- fairness of leader election is hard to ensure as the attacker may manipulate the randomness in election
- Anyone has to compute sequential steps to produce the output; no one can have a speed advantage through parallelism
- The result has to be efficiently verifiable in a short time (typically in log(t))
- injective rational maps (First attempt in original VDF paper): “weak VDF” requires large parallel processing
- Finite group of unknown order (Pietrzak and Wesolowski): use a trapdoor or a Rivest-Shamir-Wagner time-lock puzzle (a toy sketch of the time-lock idea appears at the end of this section)
- Chia blockchain: use VDF for consensus algorithms
- Protocol Labs + Ethereum Foundation: co-funding grants for research of viability of building optimized ASICs for running a VDF
- https://vdfresearch.org/
- https://blog.trailofbits.com/2018/10/12/introduction-to-verifiable-delay-functions-vdfs/
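For intuition, here is a toy sketch of the repeated-squaring time-lock idea behind the unknown-order-group constructions mentioned above. The parameters are far too small for real use, and the Pietrzak/Wesolowski proof of correct evaluation is omitted; this only shows why evaluation is sequential while a trapdoor (the factorization of N) is fast:

```python
# Toy RSW-style time-lock: y = x^(2^T) mod N.
p, q = 2**31 - 1, 2**61 - 1        # secret Mersenne primes (the trapdoor)
N = p * q
phi = (p - 1) * (q - 1)
T = 100_000                        # the "delay": number of sequential squarings
x = 5

y_slow = x                         # public evaluation: inherently sequential
for _ in range(T):
    y_slow = y_slow * y_slow % N

y_fast = pow(x, pow(2, T, phi), N) # trapdoor evaluation: reduce exponent mod phi(N)
assert y_slow == y_fast
```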
How do we formally verify that a set of constraints used by a zero knowledge proof system has the desired characteristics, such as soundness, completeness, functional correctness, and zero knowledge?
Zero knowledge proof systems often use a mathematical constraint system such as R1CS or AIR to encode a computation. The zero knowledge proof is a probabilistic cryptographic proof that the computation was done correctly
Formal verification of a constraint system used in a zero knowledge proof requires
- (1) a formal specification of the computation that the constraint system is intended to encode
- (2) a formal model of the semantics of the constraint system
- (3) the specific set of constraints representing the computation
- (4) the theorem prover, and
- (5) the mechanized proof of a theorem relating (1) and (3)
To prove a general statement with certainty, it must be a purely mathematical statement which can be known to be true by reasoning alone, without needing to perform any tests. The reasoning must start from consistent assumptions (axioms) and proceed by truth-preserving inferences
- Example: given that a bachelor is defined as an unmarried man, and a married man by definition has a spouse, it can be proven with certainty that no bachelor has a spouse.
A proof theory codifies a consistent set of rules for performing truth-preserving inference. Proof theories can be used to construct proofs manually, with pencil and paper
- Example: sequent calculus for classical logic
Computer programs called proof assistants reduce labor and mitigate the risk of human error by helping the user to construct machine-checkable proofs
To use a proof assistant to prove a statement about a program, there are two main approaches:
- (1) Write the program in the proof assistant language and apply the proving facilities to the program directly
- (2) Use a transpiler to turn a program written in another language into an object which the proof assistant can reason about
Of these approaches, (1) seems preferable for the greater confidence provided by the proof being about exactly the (source) program being executed, as opposed to output of a transpiler which is assumed to have the same meaning as the source program
What motivates approach (2) is when (for whatever reason) the proof assistant language is not suitable as a language for developing the application in
-
There are also ways to prove a statement about a program without (directly) using a proof assistant:
- Use a verified compiler (e.g. CompCert), which turns a source program into an object program which provably has certain properties by virtue of (proven) facts about the verified compiler
- Note that there is a distinction between a verified compiler and a verifying compiler. CompCert itself is proved to generate binary that will always be semantically equivalent to the C input. A verifying compiler generates a binary along with a proof of correctness that the binary is semantically equivalent to the source.
- When a program is compiled by a verifying or verified compiler, we say that the compiler output is correct by construction (provided that the input is correct).
- Use an automatic proof search algorithm, which takes as input statements to be proven and outputs proofs of those statements if those statements are true and the proof search algorithm finds proofs
- Use a static analyzer, which takes as input a program and automatically checks for various kinds of issues using predetermined algorithms
-
- Each of these approaches has limitations:
-
A verified compiler is limited in what statements it can prove about the resulting program: typically, just that the resulting program has the same meaning or behavior as the source program
-
An automatic proof search algorithm is limited in what statements it can prove by the sophistication of the algorithm and the computational power applied to it. Also, due to Gödel's incompleteness theorem, there cannot exist a proof search algorithm which would find a proof of any given true statement
-
A static analyzer is generally not capable of reasoning about the meaning of a program to see if it’s correct; it is only able to recognize patterns which always or often indicate some kind of issue
-
-
Formal verification for probabilistic proof systems, inclusive of ZK proof systems, encompasses two main problem spaces:
- (1) Proving the intended properties of a general-purpose proving system, such as soundness, completeness, and zero knowledge (e.g., Bulletproofs, Halo 2, etc.)
- (2) Proving the intended properties of a circuit, namely that it correctly encodes the intended relation
-
Problem (1) is generally considered a very difficult problem and has not been done for any significant proving system
-
- Problem (2) can be done with a formal specification of the relation and a model of the circuit semantics. Usually it requires proving functional correctness of functions defined within the relation, as well as the existence of witness variables for every argument to the function
-
Denotational design provides a helpful way of thinking about both problem spaces (general and application-specific). The circuit denotes a set: namely, the set of public inputs for which the circuit is satisfiable
-
The goal of application specific circuit verification is to prove that the circuit denotes the intended relation
-
The goal of general purpose proving system verification is to prove that it has the intended properties with respect to the denotational semantics of circuits:
- Soundness means that if the verifier accepts a proof, then with high probability, the public input used to generate the proof (i.e., the statement being proven) is in the set denoted by the circuit (i.e., the statement is true)
- Completeness means that if a public input (i.e., a statement) is in the set denoted by the circuit (i.e., the statement is true), then the proving algorithm successfully outputs a proof which the verifier accepts
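- In symbols (an informal sketch; the notation below is ours, not taken from a specific paper), a circuit $$C$$ denotes the set $$Sat(C) = \{ x \mid \exists w : C(x, w) = 1 \}$$ of public inputs for which some witness satisfies it
  - Soundness: for any adversarially produced proof $$\pi$$, $$\Pr[\mathsf{Verify}(x, \pi) = 1 \ \wedge\ x \notin Sat(C)] \le \varepsilon$$
  - Completeness: $$x \in Sat(C) \implies \Pr[\mathsf{Verify}(x, \mathsf{Prove}(x, w)) = 1] = 1$$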
-
Example: consider a ZK proving system which one would use to show that one has a solution to a Sudoku puzzle, without revealing the solution
- The statements to be proven are of the form "X (a Sudoku puzzle) is solvable"
- The relation we intend is the set of solvable Sudoku puzzles
- Soundness means that if the verifier accepts a proof that X is solvable, then with high probability, X is solvable
- Completeness means that if X is solvable, then the prover creates a proof that X is solvable which the verifier accepts
- Zero knowledge means that having a proof that X is solvable does not reduce the computational difficulty of finding a solution to X
- To see this example worked out more formally, see the OSL whitepaper.
-
If you know that your circuit denotes the relation you intend, and you know that your general purpose proof system is sound and complete in the above senses, then you know that your application-specific proving system (i.e., the circuit plus the general proof system) has the intended soundness and completeness properties for that application
-
This suggests that, given a formally verified general-purpose proving system, and a verifying compiler from statements to circuits, one can solve the problem of proving correctness of application-specific proving systems without application-specific correctness proofs
-
Suppose that
- one can write the statement to be expressed by a circuit in a machine-readable, human-readable notation, where it is self-evident that the statement being written has the intended meaning or denotation
- one has a verifying compiler which turns that statement into a circuit which provably has the same denotation as the source statement
- the circuit can be executed on a formally verified general-purpose probabilistic proving system
-
Then one can generate formally verified application-specific probabilistic proving systems without any additional proof writing for an additional application. This seems like a promising way forward towards a sustainable and cost effective approach to formal verification for ZK circuits
-
Here is a summary of some of the ways in which the ecosystem supports efficient execution of verified code:
- The proof assistants Coq and Agda do not provide for compilation of programs written in those languages to an efficiently executable form
- The language ATS provides proof facilities and purports to allow for programming with the efficiency of C and C++
- There are various means for transpiling code written in a mainstream language such as C or Haskell into a proof assistant, which allows for theorems to be proven about the extracted model of the source program
- You can synthesize an efficient binary program using Coq (e.g., using Fiat)
- The proof assistant ACL2 defines a subset of Common Lisp with a full formal logic. When a definition is executable, it can be compiled into efficient code, and because the language is a formal logic, you can define and prove theorems about the code
- There is a verifying compiler project, ATC, from ACL2 to C
- Imandra defines a subset of OCaml with a full formal logic and a theorem prover
- In the context of modern computing, most computationally intensive tasks deal with vector math and other embarrassingly parallel problems which are done most efficiently on specialized hardware such as GPUs, FPGAs, and ASICs
- This is generally true of the problem of constructing proofs in probabilistic proof systems. Provers for these proof systems would be most efficient if implemented on specialized hardware, but in practice, they are usually implemented on CPUs, due to the greater ease of programming on CPUs and the greater availability of those skill sets in the labor market
- For creating a formally verified implementation of a probabilistic proof system which executes efficiently, it seems that the right goal is not to optimize for speed of execution on a CPU, but to target specialized hardware such as FPGAs, GPUs, or ASICs
- Unfortunately, tools for compiling formally verified programs to run on FPGAs, GPUs, or ASICs are more or less nonexistent as far as we know
- Decades of research exists on formal verification strategies for arithmetic circuits in the context of hardware verification
- See, e.g., Drechsler et al (2022) and Yu et al (2016)
- This work has seen limited industrial application, e.g., the AAMP5 (see Kern and Greenstreet (1997), page 43)
- This line of research is not directly applicable to formal verification of arithmetic circuits for zk-SNARKs, because arithmetic circuits in hardware and arithmetic circuits in zk-SNARKs are not quite the same things
- Orbis Labs is working on:
- A verifying Halo 2 circuit compiler for Σ¹₁ formulas
- Expected to be working in Q4 2022 or Q1 2023
- Orbis Specification Language (OSL), which provides a high level spec language which we can compile to Σ¹₁ formulas
- A toolchain (Miya) for developing formally verified, hardware accelerated probabilistic proof systems
- A theory of interaction combinator arithmetization, towards compiling formally verified code into circuits
- No timeline on this; still in basic research
- Kestrel Institute is a research lab that has proved functional correctness of a number of R1CS gadgets using the ACL2 theorem prover. (Kestrel also does a lot of other FV work, including on Yul, Solidity, Ethereum, and Bitcoin)
- They formalized and proved the functional correctness of parts of the Ethereum Semaphore R1CS
- They formalized and proved the functional correctness of parts of the Zcash Sapling protocol. An overview of the process:
- Used the ACL2 proof assistant to formalize specs of parts of the Zcash Sapling protocol
- Formalized rank 1 constraint systems (R1CS) in ACL2
- Used an extraction tool to fetch the R1CS gadgets from the Bellman implementation, and then used the Axe toolkit to lift the R1CS into logic
- Proved in ACL2 that those R1CS gadgets are semantically equivalent to their specs, implying soundness and completeness
- Aleo is developing programming languages such as Leo that compile to constraint systems such as R1CS
- Aleo aims to create a verifying compiler for Leo, with theorems of correct compilation generated and checked using the ACL2 theorem prover
- Aleo has also done post-hoc verification of R1CS gadgets using Kestrel Institute's Axe toolkit
- Nomadic Labs is a consulting firm that does a lot of work on Tezos and they built the Zcash Sapling protocol for shielded transactions into the Tezos blockchain as of the Edo upgrade. Kestrel Institute formally verified some of the R1CSes used in that protocol. (Nomadic Labs also does a lot of other FV work)
- Anoma team is working on the Juvix language as a first step toward creating more robust and reliable alternatives for formally verified smart contracts than existing languages
- Andrew Miller and Bolton Bailey are working on a formal verification of a variety of SNARK proof systems, using the Lean Theorem Prover, in the Algebraic Group Model
- Alex Ozdemir from Stanford is working on adding a finite field solver in cvc5 SMT Solver
- Lucas Clemente Vella and Leonardo Alt are working on SMT solver of polynomial equations over finite fields
- Veridise is working on:
- Ecne is a special-purpose automatic proof search tool which can prove that an R1CS constraint system defines a function (total or partial)
- In other words, it proves that for any input values on which the system is satisfiable, there is a unique combination of output values on which the system is satisfied
- This proves, for an R1CS which is intended to be satisfiable on all possible inputs (denoting a function as opposed to a partial function), that there are enough constraints, in the sense that adding constraints could not change the denotation of the circuit if the denotation remains a partial function
- This does not imply soundness
- This approach has been proven to be useful in flushing out bugs in circuits
- Starkware is writing Lean proofs to check that circuits expressed as Cairo programs conform to their specs
- You could analyze each research effort / development project in terms of
- the language used for specification
- the language used to model the constraint system
- which theorem prover they are using
- what sorts of theorems they are proving, such as soundness, completeness, functional correctness, and zero knowledge
- Other interesting attributes of project concerning FV for constraint systems are:
- whether the tooling and the formal verification are open source. It is hard to have confidence in a theorem when the components that went into proving that theorem are not available for inspection
- what is the trusted core, i.e., what software in the stack is not formally verified, and what are the possible consequences if it has bugs
- A lot of work needs to be done. There is not enough emphasis placed on formal verification in the security industry
- Based on the observations and arguments presented in this blog post, we think the following will be some interesting directions for future research and development:
- Build foundations for formally verifying zero knowledge proof systems:
- Generalizable proof techniques for proving the desired properties formally
- Reusable verified abstractions for proof systems, e.g., a polynomial commitment scheme library
- Can we formally verify systems that are designed to automatically make ZK circuits more efficient?
- For example: systems that choose a different circuit so that the setup MPC is more parallelizable or that allow a prover who learns part of the witness in advance to partially evaluate a circuit and use this information to compute proofs faster
- Improved specification languages and verified translators between specification languages
- Understand how to create formally verified programs to run on vectorized hardware, e.g., FPGAs, GPUs, and/or ASICs
- There are different ways to axiomatize a problem to prove it. Some categories are denotational semantics, axiomatic semantics, and operational semantics. Operational semantics is particularly useful for proving things about programs
- If you write a specification of a computation in a high-level formal language and compile it to a constraint system using a verified or verifying compiler, that is called correct by construction. If you take an existing constraint system and you try to prove properties about it (up to and including soundness and completeness) with respect to a specification in a high-level formal language, that is called post-hoc verification
- Allows one party to commit to some data by publishing a commitment
- Later they can reveal the data and convince anyone it is the same as what they committed to.
- Ideally, a commitment scheme should be:
- Hiding - It does not reveal the input that is committed
- Binding - It is impossible (or impractically difficult) to open a commitment to a different value than the one it was created with
- Let $$G$$ and $$H$$ be two public generators for a large group where the discrete log is hard.
- For an input $$x$$ and a hidden random value $$r$$, the Pedersen commitment is $$comm(x) = xG + rH$$
- The commitment is opened by revealing $$x$$ and $$r$$
- These have some cool properties compared with hash commitments (a toy sketch follows the resources below):
  - Additively homomorphic: $$comm(a) + comm(b) = comm(a+b)$$
  - Batchable: $$x_1G_1 + x_2G_2 + ... + rH = \vec{x}\vec{G} + rH$$
- Low-level tech: discrete log group
- Assumptions: discrete log
- Resources:
- Lecture by Dan Boneh: https://youtu.be/IkNZWJFcfcU
- Overview by Bill Buchanan: https://youtu.be/J9SOk9dIOCk
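The toy sketch referenced above: a Pedersen commitment written multiplicatively over a small prime-order subgroup of $$\mathbb{Z}_p^*$$ standing in for the elliptic-curve group. Parameters and helper names are ours and deliberately insecure.

```python
# Minimal Pedersen commitment sketch in a prime-order subgroup of Z_p^*,
# written multiplicatively: comm(x, r) = g^x * h^r mod p. The group is a
# stand-in for the elliptic-curve group used in practice; parameters are toy.

import hashlib
import secrets

p, q = 2039, 1019                 # safe prime p = 2q + 1; the subgroup of squares has prime order q
g = 4                             # 2^2 generates the order-q subgroup

def hash_to_subgroup(label: bytes) -> int:
    """Toy 'nothing-up-my-sleeve' generator: square a hash output so nobody knows its dlog base g."""
    ctr = 0
    while True:
        k = int.from_bytes(hashlib.sha256(label + bytes([ctr])).digest(), "big")
        e = pow(k, 2, p)
        if e not in (0, 1):
            return e
        ctr += 1

h = hash_to_subgroup(b"pedersen-h")

def commit(x: int, r: int) -> int:
    return (pow(g, x % q, p) * pow(h, r % q, p)) % p

# Hiding: the commitment is randomized by r. Binding: opening to a different x
# would require knowing the discrete log of h with respect to g.
x, r = 42, secrets.randbelow(q)
c = commit(x, r)
assert c == commit(x, r)          # opening = revealing (x, r) and recomputing

# Additively homomorphic: comm(a, ra) * comm(b, rb) = comm(a + b, ra + rb)
a, ra, b, rb = 10, secrets.randbelow(q), 20, secrets.randbelow(q)
assert (commit(a, ra) * commit(b, rb)) % p == commit(a + b, ra + rb)
```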
- A type of polynomial commitment scheme that employs a structured reference string (SRS) and requires a trusted setup, thus producing toxic waste
- The verifier asks "what is the value of the polynomial at $$z$$?"
- The prover responds with $$y$$ and a proof $$h(s)$$, using:
  - the polynomial $$f()$$
  - the commitment $$f(s)$$, where $$s$$ is a secret point (toxic waste)
  - the proof $$h(s)$$
  - the coordinates $$z$$, $$y$$
- The verifier is able to convince themselves this is true even if they don't know $$f()$$ (a worked form of the pairing check appears after this section)
- Requires pairing-friendly curves to multiply elliptic curve points
- Low-level tech: pairing group
- Assumptions: discrete log + secure bilinear pairing
- Used for:
- zk-SNARKs
- Data Availability Sampling
- Libraries:
- https://github.com/proxima-one/kzg (Rust)
- https://github.com/sifraitech/kzg (Rust)
- https://github.com/protolambda/go-kzg (Go)
- https://github.com/benjaminion/c-kzg (C)
- https://github.com/mratsim/constantine/tree/master/research/kzg_poly_commit (Nim)
- https://github.com/Nashatyrev/jc-kzg (Java)
- Resources:
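For reference, the pairing check mentioned above can be written as follows. This is an informal sketch; the notation is ours, with $$[a]_1 = aG_1$$ and $$[a]_2 = aG_2$$ published by the trusted setup.

- The prover computes the quotient $$h(X) = \frac{f(X) - y}{X - z}$$, which is a polynomial exactly when $$f(z) = y$$
- The verifier checks $$e(C - [y]_1, [1]_2) = e(\pi, [s]_2 - [z]_2)$$, where $$C = [f(s)]_1$$ and $$\pi = [h(s)]_1$$
- Intuitively, the check passes only if $$f(X) - y$$ is divisible by $$X - z$$, which (except with negligible probability over the secret $$s$$) forces $$f(z) = y$$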
- A modification of Pedersen commitments to allow polynomial commitments
- Does not require a trusted setup or pairing-friendly curves
- Larger proof size compared with KZG
- Low-level tech: discrete log group
- Assumptions: discrete log
- Used for:
- Halo2
- Kimchi
- Libraries:
- Resources:
- General overview: https://dankradfeist.de/ethereum/2021/07/27/inner-product-arguments.html
- Talk: https://youtu.be/dD_0Vn4BhmI
- Based around Reed-Solomon erasure coding
- Only requires hash-based (symmetric) cryptography, which is quantum-resistant
- No need for a trusted setup
- Much larger proof sizes
- Used for:
- STARKs
- Plonky2
- Resources:
- General overview: https://aszepieniec.github.io/stark-anatomy/fri.html
- Vitalik's deep dive: https://vitalik.ca/general/2017/11/22/starks_part_2.html
- Assumptions: collision-resistant hash function
- Based on unknown order groups
- Requires trusted setup if using RSA groups, not if using Class Groups
- Much slower verification time compared with others (O(n))
- Resources:
Scheme | Pedersen | KZG | IPA | FRI | DARK |
---|---|---|---|---|---|
Type | Scalar | Polynomial | Polynomial | Polynomial | Polynomial |
Low-level tech | Discrete log group | Pairing group | Discrete log group | Hash function | Unknown order group |
Setup | Transparent | Trusted | Transparent | Transparent | Trusted (RSA groups) or transparent (class groups) |
Commitment | | | | | |
Proof size | | | | | |
Verify time | | | | | |
Prove time | | | | | |
- Short NIZK that requires no trusted setup.
- Low-level tech: Pedersen commitments.
- Uses its own constraint system, which is easily convertible to R1CS.
- Assumptions: discrete log.
- Prover time: $$O(N * log(N))$$
- Verifier time: $$O(N)$$
- Proof size: $$O(log(N))$$
- Works best with the `dalek` curve library and its Ristretto group (a compressed group of Ed25519 points).
- Range proofs (take only 600 bytes)
- Inner product proofs
- Verifiable secret shuffle
- Intermediary checks in MPC protocols
- Aggregated and distributed (with many private inputs) proofs
- Confidential TX for Bitcoin:
- Monero
- Stellar Shielded Tx (Cloak)
- https://github.com/dalek-cryptography/bulletproofs (Rust)
- https://github.com/adjoint-io/bulletproofs (Haskell)
- https://github.com/bbuenz/BulletProofLib (Java)
- https://github.com/lovesh/bulletproofs-r1cs-gadgets
- https://github.com/MarcKloter/bulletproofs_gadgets
- Short proof that needs no trusted setup.
- Dlog-based variant of the interactive Sigma protocols.
- Require a constant number (3) of public-key operations.
- Can be made non-interactive using the Fiat-Shamir transform (a toy Schnorr example follows the resources below).
- Generalize Schnorr proofs to arbitrary linear functions.
- Multiple Sigma proofs can be combined to form a new proof (composition):
- if you have two Sigma protocols for proving statements `A` and `B`, you can compose them into a protocol for proving `A or B`, `A and B`, `eq(A,B)`, `all(n,A)`, `eq-all(n,A)`; `n` must be known at compile time.
- Assumptions:
- discrete log,
- honest verifier or Fiat-Shamir.
- Discrete log (dlog) proofs, prove knowledge of x such that g^x = y for publicly known values g, y.
- One-of-many dlogs, discrete log equality (dleq).
- Range proofs.
- Knowledge of message and randomness in a Pedersen commitment, equality of message in 2 Pedersen commitments.
- Pedersen commitment and ciphertext (twisted ElGamal) holds the same message.
- Verifiable Random Functions
- Signal's algebraic MACs for group chats
- DLEQ proofs in adaptor signatures
- ElGamal encryption in "Cryptography for #MeToo"
- https://github.com/LLFourn/secp256kfun/tree/master/sigma_fun (Rust)
- https://docs.rs/zkp/0.7.0/zkp (Rust)
- https://github.com/xlab-si/emmy (Go)
- https://pkg.go.dev/go.dedis.ch/kyber/v4/proof (Go)
- https://github.com/cryptobiu/libscapi (C++)
- https://github.com/spring-epfl/zksk (Python)
- Overview: https://medium.com/@loveshharchandani/zero-knowledge-proofs-with-sigma-protocols-91e94858a1fb
- Blog post about Maurer generalization of dlog-based Sigma protocols: https://cronokirby.com/posts/2022/08/the-paper-that-keeps-showing-up/
- Math: https://www.win.tue.nl/~berry/CryptographicProtocols/LectureNotes.pdf (pp. 47-61)
- Forum discussion: https://community.zkproof.org/t/standardizing-sigma-protocols/471
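The toy Schnorr example referenced above, in the same illustrative group as the Pedersen sketch earlier. This is a hedged sketch of the simplest Sigma protocol made non-interactive with Fiat-Shamir, not a production implementation.

```python
# Toy non-interactive Schnorr proof (the simplest Sigma protocol): prove
# knowledge of x such that y = g^x, in a small prime-order subgroup of Z_p^*,
# made non-interactive with the Fiat-Shamir transform. Toy parameters only.

import hashlib
import secrets

p, q, g = 2039, 1019, 4            # toy prime-order subgroup of Z_p^*

def fiat_shamir(*elements: int) -> int:
    """Derive the challenge by hashing the transcript instead of asking a live verifier."""
    data = b"|".join(str(e).encode() for e in elements)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x: int, y: int) -> tuple:
    k = secrets.randbelow(q)        # fresh commitment randomness
    t = pow(g, k, p)                # first Sigma message: commitment
    c = fiat_shamir(g, y, t)        # second message: challenge, now hash-derived
    s = (k + c * x) % q             # third message: response
    return t, s

def verify(y: int, t: int, s: int) -> bool:
    c = fiat_shamir(g, y, t)
    return pow(g, s, p) == (t * pow(y, c, p)) % p   # check g^s == t * y^c

x = secrets.randbelow(q)            # witness: the discrete log
y = pow(g, x, p)                    # public statement: y = g^x
assert verify(y, *prove(x, y))
```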
- Combines an efficient accumulation scheme with PLONKish arithmetization and needs no trusted setup.
- Superchargeable with pre-computed lookup tables and custom gates.
- Flourishing developer ecosystem and actively developed tooling.
- Low-level tech: IPA commitments.
- Assumptions: discrete log.
- Allows choosing the codeword rate:
- fewer rows => fast prover,
- fewer columns and constraints => fast verifier.
- Prover time: $$O(N * log(N))$$
- Verifier time: $$O(1)$$ > PLONK+KZG > Groth'16
- Proof size: $$O(log(N))$$
- Arbitrary verifiable computation;
- Recursive proof composition;
- Circuit-optimized hashing based on lookup-based Sinsemilla hash function.
- https://github.com/zcash/halo2 - original (IPA)
- privacy-scaling-explorations/halo2wrong - replaces IPA with KZG
- https://github.com/Orbis-Tertius/halo2 - replaces IPA with FRI (WIP)
- https://github.com/nikkolasg/halo2-circuits
- https://github.com/icemelon/halo2-tutorial
- https://github.com/weijiekoh/halo2_circuits
- https://github.com/akinovak/halo2-semaphore
- https://github.com/timoth-y/halo2-encryption
- General overview: https://electriccoin.co/blog/explaining-halo-2/
- Talk: https://youtu.be/KdkVTEHUxgo?t=399
- Math: https://vitalik.ca/general/2021/11/05/halo.html
- Awesome: https://github.com/adria0/awesome-halo2
- Ecosystem showcase: https://youtu.be/JJi2TT2Ahp0
- Combines FRI with PLONKish arithmetization and needs no trusted setup
- Extensively optimized for modern processors (both x86-64 and ARM64) using:
- The Goldilocks field, whose 64-bit modulus allows field arithmetic to be carried out on native 64-bit words
- SIMD operations for efficient sponge hashing
- Leverages custom gates to significantly optimize non-native arithmetic (e.g. field division)
- Uses Poseidon sponge as the underlying hash function. Supports GMiMC, which will result in more efficiency, but might be less secure
- Assumptions: collision-resistant hash function
- Prover time: $$O(log^2(N))$$ - 300ms/20s (depending on the codeword rate)
- Verifier time: $$O(log^2(N))$$
- Proof size: $$O(N*log^2(N))$$ - 500kb/43kb (depending on the codeword rate)
- Recursive proof composition
- Circuit optimization using TurboPLONK's custom gates
- https://github.com/qope/plonky2-examples
- https://github.com/recmo/proto-ecdsa-plonky2
- https://github.com/mir-protocol/plonky2-semaphore
- https://github.com/timoth-y/plonky2-encryption
- https://github.com/mir-protocol/plonky2/blob/main/plonky2/plonky2.pdf
- https://proxima-one.github.io/maru/background-knowledge/plonky2.html
- MPC enables parties to carry out distributed computation on their private inputs.
- each party produces a computation transcript (its own view).
- Important observation #1: if the final result of the MPC is wrong, then there is an inconsistent view somewhere.
- Secret Sharing allows splitting some secret among multiple parties, such that all of them need to cooperate for using it.
- Important observation #2: if only a subset of shares is revealed, then the secret remains unknown.
- The prover simulates an MPC protocol "in their head", which computes $$f(x)$$ on a private input $$x$$ shared among $$n$$ virtual parties.
- The verifier can then ask to reveal a subset of views, which isn't enough to recover private inputs, but gives some confidence that the probability of not seeing the inconsistency is small.
- That probability is not small enough on its own, so we need to repeat this interaction multiple times (~300; see the soundness estimate at the end of this section).
- This can then be made non-interactive by doing this in parallel and applying Fiat-Shamir transform.
- The underlying MPC protocol used for proof generation isn't important, only the views are;
- This allows treating the simulated MPC as a black box and skipping the actual computation, thus avoiding a lot of overhead.
- Can be efficiently applied for boolean circuits, unlike most other NIZKs that are only feasible with arithmetic circuits.
- Can be extended with RAM - shared memory with private addresses (paper)
- The above innovations make it easy to compile an arbitrary program into a verifiable-computation circuit.
- Prover time: $$O(n)$$ - much less than with SNARKs
- Verifier time: $$O(n)$$
- Proof size: $$O(n)$$
- The overhead of doing computation verifiably is much lower ($$1/(n_{reps} \cdot n_{parties})$$ => ~1/600) than that of SNARKs (~1/millions).
- With pre-processing it's possible to reduce size and verifier complexity at the prover's expense.
- Complex programs that are poorly translatable to arithmetic circuits, e.g. SHA hashes.
- ML and other types of approximate calculations, that do floating point arithmetic (can be done bit-by-bit in boolean circuits).
- Delegated proving:
- You want to create a SNARK of some statement, but don't want to compute it yourself nor to reveal the witness to a third party.
- One thing you could do is create an MPCitH proof, and then the delegated party would create a new proof which verifies this one.
- Optimized confidential transactions using delegated proving
- The idea being that you would minimize latency for the transaction creator, where they create the proof as fast as possible, and then another entity could compress the proof with other techniques.
- https://github.com/trailofbits/reverie/tree/stacked-ikos (Rust)
- https://github.com/cronokirby/boo-hoo (Rust)
- https://github.com/cronokirby/Rem-Boo (Rust)
- Podcast episodes:
- Great intro paper (ZKBoo): https://eprint.iacr.org/2016/163.pdf
- Security analysis: https://eprint.iacr.org/2021/437.pdf
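A rough note on the repetition count mentioned above (our estimate, not a figure taken from a specific paper):

- In a 3-party ZKBoo-style protocol where 2 of the 3 views are opened, a cheating prover escapes detection in one repetition with probability at most $$2/3$$, so after $$r$$ independent repetitions the soundness error is at most $$(2/3)^r$$
- Reaching a target of $$2^{-\lambda}$$ therefore needs roughly $$r \ge \lambda / \log_2(3/2) \approx 1.71\lambda$$ repetitions, i.e., about 137 for 80-bit and about 219 for 128-bit soundness, the same order of magnitude as the ~300 quoted above (the exact count depends on the protocol and its parameters)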