In applications where membership of the Merkle tree must be proved inside a SNARK, the concrete cost of expressing the compression function in the proof system (in terms of gates/R1CS constraints etc.) affects performance of the prover massively.
This has lead to the study/creation of so-called SNARK-friendly hash functions: hash functions with ‘’nice algebraic’’ descriptions over (large) finite fields, enabling an efficient description of the function as an arithmetic circuit over the field on which the SNARK operates. Unfortunately such compression functions are often inherently expensive to evaluate, due to their use of field operations in large finite fields – which is substantially slower than word/bit operations used in e.g. SHA256 or Blake2s.
For the in-circuit cost, using a SNARK-friendly hash function requires roughly \( \times \)100 fewer (R1CS) constraints:
- Blake2s: 21006 R1CS constraints.
- SHA256: 27534 R1CS constraints.
- Poseidon-128: 276 R1CS constraints.
- Rescue: 288 R1CS constraints.
However when evaluating the compression functions, we again see a factor \( \approx \times \) 100 difference, but with roles reversed:
The characteristics above suggest a performance trade-off when using the compression function for Merkle trees:
- Using a SNARK-Friendly Hash:
- Computing the Merkle tree is slow.
- Proving membership in zero-knowledge is very efficient.
- Using a CPU-Friendly Hash:
- Computing the Merkle tree is fast.
- Proving membership in zero-knowledge is much slower.
The observation in this post is a simple one: rather than using a single compression function, mix a CPU-friendly and SNARK-friendly hash. By hashing just the lower levels of the tree using a CPU-friendly hash function and the remaining upper levels using a SNARK-friendly hash function, most of the compression function calls during the construction of the tree will be the CPU-friendly hash function, while most of the compression calls for membership proofs (‘‘in-circuit’’) will be the SNARK-friendly hash function: meaning most of the ‘‘in-circuit’’ savings of using a SNARK-friendly hash function is preserved, while most of the ‘‘out-of-circuit’’ computational overhead is avoided.
For example: hashing just the lowest level of the tree using Blake2s rather than Poseidon (illustrated above), will almost half the cost of constructing the tree, while increasing the complexity of the membership proof by a constant 20730 constraints – independent of the number of leafs.
Below I have created a simple widget to illustrate the trade-off, naturally it is most pronounced for larger trees:
Because the number of nodes at each layer increases exponentially in the depth, the out-of-circuit performance of the hash function for the lower levels is much-much more important than for the higher levels when constructing the tree. While the cost of the in-circuit membership proof is independent of which layer is replaced.