This is a short post explaining the multivariate sum-check: a fundamental subprotocol used throughout multivariate succinct arguments (e.g. Spartan, HyperPlonk). Roughly speaking, it often serves the same role as the divisibility check does in the univariate setting.

This post assumes familiarity with finite fields, polynomials, and Lagrange interpolation/bases.

Introduction

The multivariate sum-check enables a prover to convince a verifier that a particular multivariate polynomial \( f(X_1, X_2, \ldots, X_k) \in \FF[X_1, \ldots, X_k] \) sums to a particular value \( y \) over a product of sets: \( H = S_1 \times S_2 \times \ldots \times S_k \), i.e. \[ y = \sum_{x_1 \in S_1} \sum_{x_2 \in S_2} \ldots \sum_{x_k \in S_k} f(x_1, x_2, \ldots, x_k) \] Normally \( H \) is a hypercube, i.e. \( S_1 = S_2 = \ldots = S_k \); most commonly the Boolean hypercube \( H = \bin^k \): \[ y = \sum_{x_1 \in \bin} \sum_{x_2 \in \bin} \ldots \sum_{x_k \in \bin} f(x_1, x_2, \ldots, x_k) \]
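To fix ideas, here is the claim computed by brute force in a toy setting; the prime field, the example polynomial, and all names below are illustrative choices, not part of the protocol:

```python
# Brute-force computation of the sum-check claim over {0,1}^k,
# using a toy prime field F_p (p = 2^61 - 1 is an arbitrary choice).
from itertools import product

P = 2**61 - 1

def f(x1, x2, x3):
    # An arbitrary example multilinear polynomial in F_p[X1, X2, X3].
    return (x1 * x2 + 3 * x3 + x1 * x2 * x3) % P

k = 3
# y is the sum of f over all 2^k points of the hypercube: O(|H|) work.
y = sum(f(*x) for x in product((0, 1), repeat=k)) % P
print(y)  # 15
```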

Goal

Computing \( y \) directly is expensive: it takes \( O(|H|) \) time (exponential in \( k \)), even with some clever memoization tricks. The goal of the sum-check protocol is for the verifier to outsource this computation to an untrusted prover.

Communication

In the multivariate sum-check protocol the prover sends \( \sum_i |S_i| \) field elements to the verifier and the verifier sends \( k \) field elements to the prover. Since \( |H| = \prod_i |S_i| \), this communication is logarithmic in the size of the hypercube \( H \).
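As a concrete (illustrative) example: for a multilinear \( f \) summed over \( H = \bin^{20} \), each round polynomial has degree at most \( 1 \) and is described by \( 2 \) field elements, so the prover sends only \( 2 \cdot 20 = 40 \) field elements in total, while the sum itself ranges over \( |H| = 2^{20} \approx 10^6 \) points.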

End of The Protocol

At the end of the protocol the verifier will need to evaluate the polynomial \( f(X_1, \ldots, X_k) \) at a single point \( (r_1, \ldots, r_k) \in \FF^k \) to check the prover’s claim. The verifier can either evaluate \( f(r_1, \ldots, r_k) \) themselves (which may or may not be expensive) or delegate this computation to the prover using a multivariate polynomial commitment (relying on cryptography).

Hence, with logarithmic communication and computation in the size of \( H \), the sum-check allows the verifier to reduce the problem of summing a polynomial over \( H \) to checking the value of that polynomial at a single point \( (r_1, \ldots, r_k) \).

The Sum-Check Protocol

Recall the claim we want to verify:

\[ y = \sum_{x_1 \in S_1, \ldots, x_k \in S_k} f(x_1, x_2, \ldots, x_k) \]

Define the polynomial “summing away” \( X_2, \ldots, X_k \):

\[ g_1(X_1) = \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(X_1, x_2, \ldots, x_k) \]

Obviously, if we also “sum away” the remaining variable \( X_1 \), we get \( y \):

\[ \begin{aligned} y &= \sum_{x_1 \in S_1} g_1(x_1) \\ &= \sum_{x_1 \in S_1} \left( \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(x_1, x_2, \ldots, x_k) \right) \end{aligned} \]
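Concretely, here is a minimal sketch of the prover computing \( g_1 \) by brute force, continuing the toy example from above (for a multilinear \( f \), the two evaluations \( g_1(0) \) and \( g_1(1) \) describe \( g_1 \) completely):

```python
# The prover "sums away" X2, ..., Xk to obtain g_1 (brute force).
from itertools import product

P = 2**61 - 1

def f(x1, x2, x3):
    return (x1 * x2 + 3 * x3 + x1 * x2 * x3) % P

def g1(x1):
    # g_1(X1) = sum over (x2, x3) in {0,1}^2 of f(X1, x2, x3)
    return sum(f(x1, x2, x3) for x2, x3 in product((0, 1), repeat=2)) % P

# Sanity check: summing g_1 over S_1 = {0,1} recovers the full sum y.
y = sum(f(*x) for x in product((0, 1), repeat=3)) % P
assert (g1(0) + g1(1)) % P == y
```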

At this point the sum-check protocol is straightforward:

  • The (untrusted) prover sends a polynomial claimed to be \( g_1(X_1) \)
  • The verifier checks: \( y = \sum_{x_1 \in S_1} g_1(x_1) \) (sketched in code below)
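A minimal sketch of this first check, assuming the prover sends \( g_1 \) as a vector of coefficients over the toy field (the representation and helper names are illustrative):

```python
# The verifier's round-1 check: deg(g_1) is within bound, and
# summing g_1 over S_1 reproduces the claimed value y.
P = 2**61 - 1

def eval_poly(coeffs, x):
    # Horner evaluation of a univariate polynomial (low-to-high coeffs).
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def check_round(y, coeffs, S1, degree_bound):
    # The degree check matters for soundness: a cheating prover must not
    # be allowed to send a higher-degree polynomial.
    if len(coeffs) > degree_bound + 1:
        return False
    return sum(eval_poly(coeffs, s) for s in S1) % P == y
```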

The verifier still needs to check that \( g_1(X_1) \) is indeed the correct polynomial, i.e.

\[ g_1(X_1) = \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(X_1, x_2, \ldots, x_k) \]

To do so, the verifier samples a random \( r_1 \sample \FF \) and evaluates both sides at this point:

\[ \color{green}{g_1(r_1) } = \color{red}{\sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(r_1, x_2, \ldots, x_k)} \]

The green part the verifier can compute directly, by evaluating the polynomial \( g_1(X_1) \) at \( r_1 \). The red part requires a large summation, but the dimension of the hypercube has been reduced by one (we got rid of \( X_1 \)). Rather than computing this summation themselves, the verifier recursively uses the sum-check protocol, asking the prover to prove the new claim: \[ \color{blue}{ y' = \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f'(x_2, \ldots, x_k) } \]

Where \( f'(X_2, \ldots, X_k) = f(r_1, X_2, \ldots, X_k) \) and \( y' = g_1(r_1) \).
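A minimal sketch of this reduction step, continuing the toy example (`secrets.randbelow` stands in for the verifier's random challenge; all names are illustrative):

```python
# One reduction step: sample r_1, derive the smaller claim y' about f'.
import secrets
from itertools import product

P = 2**61 - 1

def f(x1, x2, x3):
    return (x1 * x2 + 3 * x3 + x1 * x2 * x3) % P

def g1(x1):
    return sum(f(x1, x2, x3) for x2, x3 in product((0, 1), repeat=2)) % P

r1 = secrets.randbelow(P)  # verifier's random challenge

def f_next(x2, x3):        # f'(X2, X3) = f(r1, X2, X3)
    return f(r1, x2, x3)

y_next = g1(r1)            # new claimed value y' = g_1(r1)

# The new (smaller) sum-check claim indeed holds for an honest prover:
assert sum(f_next(*x) for x in product((0, 1), repeat=2)) % P == y_next
```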

End of The Protocol

After recursively applying the sum-check protocol above \( k \) times, fixing each variable one-by-one, the verifier is left with a claim of the form: \[ y^{(k)} = f(r_1, r_2, \ldots, r_k) \] where \( y^{(k)} \) is the claimed value from the final round. At this point the verifier can either evaluate the polynomial themselves (as in GKR) or delegate this computation to the prover using a polynomial commitment (as in e.g. Spartan).
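Putting the rounds together, here is a minimal end-to-end sketch for the common multilinear/Boolean-hypercube case. The interaction is collapsed into a single function for brevity, and the field, example polynomial, and all names are illustrative assumptions; this is a sketch, not a hardened implementation:

```python
# End-to-end sum-check for a multilinear f over {0,1}^k. Each round
# polynomial g_i has degree <= 1, so the prover sends g_i(0), g_i(1).
import secrets
from itertools import product

P = 2**61 - 1

def prover_round(f, k, rs, i):
    # g_i(b) = sum over the remaining k - i Boolean variables of
    #          f(r_1, ..., r_{i-1}, b, x_{i+1}, ..., x_k)
    def g(b):
        return sum(f(*rs, b, *tail) for tail in product((0, 1), repeat=k - i)) % P
    return g(0), g(1)

def sumcheck(f, k, y):
    rs, claim = [], y
    for i in range(1, k + 1):
        g0, g1 = prover_round(f, k, rs, i)  # prover -> verifier
        if (g0 + g1) % P != claim:          # verifier's round check
            return False
        r = secrets.randbelow(P)            # verifier -> prover
        claim = (g0 + r * (g1 - g0)) % P    # claim = g_i(r), since deg <= 1
        rs.append(r)
    return f(*rs) % P == claim              # final check: y^(k) = f(r_1..r_k)

# Usage: verify the sum of an example multilinear polynomial over {0,1}^3.
def f(x1, x2, x3):
    return (x1 * x2 + 3 * x3 + x1 * x2 * x3) % P

y = sum(f(*x) for x in product((0, 1), repeat=3)) % P
assert sumcheck(f, 3, y)          # honest claim passes
assert not sumcheck(f, 3, y + 1)  # a false claim fails the first check
```

Because each \( g_i \) is linear, two evaluations per round suffice, matching the \( \sum_i |S_i| = 2k \) communication count from earlier.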

Soundness

If \( g_1(X_1) \neq \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(X_1, x_2, \ldots, x_k) \) then: \[ \prob_{r_1 \sample \FF} \left[ g_1(r_1) = \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(r_1, x_2, \ldots, x_k) \right] \] is at most \( \deg(g_1(X_1)) \ / \ |\FF| \); this follows from the Schwartz-Zippel lemma. The degree of \( g_1(X_1) \) is the maximal degree of \( X_1 \) in the polynomial \( f(X_1, X_2, \ldots, X_k) \) – usually a small constant: in the most common case of multilinear polynomials it is just 1.

To get a soundness analysis for all rounds (after recursively applying the sum-check) one can apply a union bound over each recursion step and arrive at a soundness error of: \[ \frac{\sum_{i = 1}^k \deg_{X_i}(f(X_1, \ldots, X_k))}{|\FF|} \] In the multilinear case this is just \( k / |\FF| \).
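For concreteness (with illustrative numbers): a multilinear polynomial in \( k = 20 \) variables over a field of size \( |\FF| = 2^{61} - 1 \) gives a total soundness error of \( 20 / (2^{61} - 1) \approx 2^{-56.7} \).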

Addendum: Efficiency via Lagrange Basis

Having the prover compute:

\[ g_1(X_1) = \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(X_1, x_2, \ldots, x_k) \]

directly is expensive in most applications.

Instead the prover can compute smaller sums and combine them using Lagrange interpolation. Recall the Lagrange basis polynomials for \( S_1 \): for each \( x_1 \in S_1 \) we define the unique degree \( |S_1| - 1 \) polynomial \( L^{(x_1)}(X_1) \) such that:

  • It evaluates to \( 1 \) at \( x_1 \): \[ L^{(x_1)}(x_1) = 1 \]
  • It evaluates to \( 0 \) on the rest of \( S_1 \): \[ \forall x_1' \in S_1 \setminus \{ x_1 \}. \ L^{(x_1)}(x_1') = 0 \]
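Explicitly, these are given by the standard product formula:

\[ L^{(x_1)}(X_1) = \prod_{x_1' \in S_1 \setminus \{ x_1 \}} \frac{X_1 - x_1'}{x_1 - x_1'} \]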

We can write the polynomial \( f(X_1, X_2, \ldots, X_k) \) in the \( X_1 \)-Lagrange basis:

\[ \begin{aligned} f(X_1, X_2, &\ldots, X_k) = \\ &\sum_{x_1 \in S_1} L^{(x_1)}(X_1) \cdot f(x_1, X_2, \ldots, X_k) \end{aligned} \]

With this rewrite the prover can compute the polynomial \( g_1(X_1) \) as:

\[ g_1(X_1) = \sum_{x_1 \in S_1} L^{(x_1)}(X_1) \cdot y^{(x_1)} \] \[\quad \text{where} \quad y^{(x_1)} = \sum_{x_2 \in S_2, \ldots, x_k \in S_k} f(x_1, x_2, \ldots, x_k) \]

This assumes that \( |S_1| > \deg_{X_1}(f(X_1, X_2, \ldots, X_k)) \), which is usually the case in applications, e.g. in the common case of multilinear polynomials summed over the Boolean hypercube. Otherwise the prover needs to compute the partial sums over a larger set \( S_1' \supseteq S_1 \) with \( |S_1'| > \deg_{X_1}(f) \).
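To make this concrete, here is a minimal sketch of the strategy (per-\( x_1 \) partial sums combined via Lagrange interpolation) in the same toy setting; the helper names and example polynomial are illustrative:

```python
# The prover computes one partial sum y^(x1) per x1 in S_1, then gets
# g_1 anywhere (e.g. at the challenge r_1) by Lagrange interpolation.
from itertools import product

P = 2**61 - 1

def f(x1, x2, x3):
    return (x1 * x2 + 3 * x3 + x1 * x2 * x3) % P

def lagrange_eval(points, values, x):
    # Evaluate the unique low-degree polynomial through
    # (points[i], values[i]) at x, via the Lagrange product formula.
    total = 0
    for i, xi in enumerate(points):
        num, den = 1, 1
        for j, xj in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + values[i] * num * pow(den, -1, P)) % P
    return total

S1 = [0, 1]  # |S_1| = 2 > deg_{X1}(f) = 1, so interpolating over S_1 suffices
ys = [sum(f(x1, *tail) for tail in product((0, 1), repeat=2)) % P for x1 in S1]

def g1(x):
    return lagrange_eval(S1, ys, x)

# Sanity check: g_1 summed over S_1 equals the full hypercube sum.
assert (g1(0) + g1(1)) % P == sum(f(*x) for x in product((0, 1), repeat=3)) % P
```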