Zero Knowledge Proofs
Motivation
Concept of zero-knowledge (ZK)
In a deterministic world, the knowledge generated by a message can be analyzed by considering the running time and size of a turing machine which can generate that message. If there exists a constant sized turing machine which runs in polynomial time and can generate a message, then the recipient of that message receives no additional information upon receipt of the message (as it could simply run that turing machine). In a random world, we must extend our notion of the message, as the message is in fact a random variable which can be any element of some set of messages with probability that follows some distribution. Rather than discussing a specific deterministic turing machine, we instead consider a turing machine capable of generating the same distribution of messages. Further, rather than requiring the distributions of the output of the turing machine and the distribution of the message to be exactly the same, we can allow them to be computationally indistinguishable. So, by considering the runtime and size of such a turing machine, we can quantify the knowledge gained by a party on receiving a message.
Interactive Proofs
In a world where neither Alice nor Bob trust each other, then Alice may wish to prove something to Bob in such a way that Bob cannot share the proof with any other third party. To achieve this, Alice and Bob perform an interactive proof whereby they both receive the same input $x$ and Alice has a witness $w$ to their statement about $x$. They engage in several rounds of communication, at the end of which Bob should be convinced of Alice’s statement while learning nothing unnecessary about the witness, and being unable to replicate the proof for other parties.
Example: Where’s Waldo?
Imagine Alice, Bob and Carry are playing a game of Where’s Waldo. They want to not only see who finds Waldo first but second as well.
Alice: I found Waldo Bob: Oh! I dont believe you!! Carry: You need to prove without revealing the location of Waldo. Otherwise we won’t be able to judge second position.
For Alice to prove that she indeed found Waldo, she would need a Zero Knowledge Proof. ZKP is defined as a proof that reveals that the Prover has the proof, but the Verifier doesn’t learn anything about the problem from the proof.
In the above situation, Alice takes a very large sheet of paper with a cutout of Waldo. She then overlays that cutout on top of where he found Waldo. Since Bob and Carry do not know the orientation of the Where’s Waldo sheet underneath the large sheet, they still do not have any idea of where Waldo is on the sheet. With this, Alice was able to prove she knows where Waldo is, and Bob and Carry can still try to find Waldo to play for second place. Note that neither Bob nor Carry can replicate Alice’s proof to claim second place without doing the work to find Waldo themselves.
Overview
Interactive proofs
For a large number of problems, a deterministic ZKP does not exist. For such problems, an interactive Zero Knowledge Proofs can be used to prove with a very high probability that the Proof is correct. In such a problem, the Verifier and Prover play a game, the result of which can prove that the prover is incorrect, but doesn’t necessarily prove that he is correct. Repeating the same many times builds the confidence of the Verifier that the Prover does have the correct Proof. An interactive proof not only maintains zero knowledge, but adds randomness and interactivity to the proof.
Formal Definition of a Zero-Knowledge proof system
A pair of PPT algorithms are called a zero-knowledge proof system if together they are complete, sound, and satisfy the zero-knowledge property. Intuitively, the completeness property states that the verifier always accepts when interacting with the prover on the same valid inputs (and a valid corresponding witness); in other words, if the prover is not lying, then it can eventually convince the verifier.
The soundness property states that the verifier does not accept (except with negligible probability) when engaging with any (even all powerful) prover on invalid inputs; in other words, a prover cannot fake a proof.
The zero-knowledge property states that any output from a (possible cheating) verifier when interacting with the prover on some inputs could also be produced by some polynomial time simulator on the same inputs. In other words, this property states that anything the verifier learns by interacting with the prover could have been learned by the verifier anyway, in essentially the same amount of time. Note that this definition does not encompass non-uniform malicious verifiers (wherein the cheating verifier has additional inputs). Also, this definition is in relation to some specific language of “valid” inputs (so zero-knowledge proof systems are for a specific problem).
The formal definitions follow.
Completeness Two PPT algorithms $(P,V)$ satisfy the completeness property if and only if
$\forall k,\ \forall x \in L_k,\ \forall \text{ valid witnesses } w,\,\, P[V(1^k,x)\text{ accepts when interacting with }P(1^k,x,w)] = 1$ where $L$ is a language $\in NP$, and $L_k = L \cup {0,1}^{\leq k}$.
Soundness A verifier $(V)$ satisfies soundness if and only if $\forall x \notin L_k,\,\forall P^* \text{(even all powerful }P^*),\,\,Pr[V(1^k,x)\text{ accepts when interacting with }P^*(x)] = neg(k)$
Zero-Knowledge A prover satisfies zero-knowledge if and only if for all PPT verifiers $V^*$ there exists an expected time polynomial simulator $Sim$ such that for all $x \in L_k$ and witness $w$ for $x$, the view of $V^*(1^k,x)$ when interacting with $P(1^k,x,w)$ is computationally indistinguishable from the output of $Sim(1^k,x)$.
Example Problems with ZK interactive proofs
Example: 3-coloring graph
A graph $G_1$ is said to be 3-colored if all the vertices can be assigned gne of 3 colors such that no two vertices connected by an edge are of the same color. This problem is known to be NP-Complete. In a situation when someone requires to give 3-coloring of the graph without revealing the colouring itself, an interactive Zero Knowledgve Proof can be constructed as follows:
- Prover chooses a random permutation for the 3 colors of the graph
- Prover hides but commits the graph. (e.g. declares hash of vertices, coloring with a random nonce)
- The verifier chooses a random edge from the graph.
- The prover opens the coloring of the two vertices associated with the edge.
- If the colors are same, the Prover has an incorrect proof.
- If the colors are different, then the Prover might still not have the proof. If only one edge has incorrectly colored vertices, then the probability of finding it is $1/|E|$. To make the Verifier believe in the proof, all the steps are repeated several times. With the passing of each step, the confidence of the Verifier in the proof increases. After a certain threshold is achieved Verifier declares that the proof is correct.
Correctness: If the proof is correct, then after each round the confidence of the Verifier would increase, and thus he will be able to convince the verifier that he has a 3-coloring solution.
Soundness: If the proof is incorrect, then the Verifier, after enough rounds would, with a high probability, catch the defective edge in which both the associated vertices are the same color.
Zero-Knowledge: Assume that after seeing the proof, the verified can obtain some information about the problem. Consider a situation in which an adversarial Prover gets access to a time machine, using which he goes back in time each time he fails and repeats that round. Since the proof is valid, the Verifier must be able to extract some information about how to 3 color the graph, but since the Prover did not have any information about 3 coloring of the graph, all the information gained is unique to the Verifier. Thus, since the Verifier is unable to differentiate between the two situations (Honest and Adversarial), he would not gain any knowledge about the 3-coloring.
Example: Graph Isomorphism
Let $G_1$ and $G_2$ be two graphs with vertex and edge sets $(V_1,E_1)$ and $(V_2,E_2)$ respectively. The graphs are isomorphic if there exists a permutation $\phi: V_1 \rightarrow V_2$ such that $(u,v) \in E_1 \iff (\phi(u),\phi(v)) \in E_2$. We also abuse phi to denote the isomorphism on a graph, as in $\phi(G)$. Let $G_1 \sim G_2$ denote isomorphism between graph $G_1$ and $G_2$. Graph isomorphism defines a language $L:$ ${(G_1,G_2) | G_1 \sim G_2}$. Clearly, $L \in NP$ since given a $\phi$, it can certainly be checked for correctness in polynomial time. The protocol for Graph Isomorphism ZK proof proceeds as follows (note the soundness probability is only $1/2$ to begin). The prover $P$ begins with $G_1$, $G_2$ and $\phi$. The verifier $V$ begins only with $G_1$ and $G_2$.
- $P$ chooses a random permutation $H = \phi’(G_1)$, and sends this to $V$.
- $V$ chooses $b \in {1,2}$ uniformly at random and delivers this to $P$.
- $P$ will send a permutation from $G_{b}$ to $H$. If $b=1$ (or if $V$ sends anything besides what is expected) then $P$ sends $\phi’$. Else, $P$ sends $\phi’ \circ \phi^{-1}$
- $V$ accepts iff the permutation it receives, $\phi’ ‘$ is such that $\phi’ ‘(G_b) = H$.
We will now show this protocol is zero-knowledge, complete, and has soundness probability one-half. First, note the protocol is clearly polynomial. For completeness, assume the prover acts honestly, and note that the verifier accepts in all cases of $b$. This is immediate if $b=1$. If $b=2$, then note $\phi’‘(G_2) = \phi’\circ\phi^{-1}(G_2) = \phi’(G_2) = H$, and so the verifier accepts. For soundness, let $G_1\nsim G_2$, and $H$ is as defined in the protocol. Since either $G_1 \sim H$ or $G_2 \sim H$ (not both), $P$ can correctly respond only to one of $V$’s possible challenges during round 3. This holds since in one case $\phi’\circ\phi^{-1}(G_2)\neq H$ and in the other $\phi’(G_1)\neq H$. Thus, $P$ can only “fake it” successfully in response to at most one of two uniformly random options, meaning it succeeds with probability at most one half. Finally, we will show zero-knowledge. We must show that for all verifiers $V^*$, we can construct a simulator which can output the view of $V^*$. Note that simulator is allowed to use the verifier as a subroutine. Sim
- fix random coins $\omega$ for $V^*$
- repeat $k$ times:
- choose random permutation $\phi’$
- $b \leftarrow {0,1}$
- $H = \phi’(G_b)$
- $b’ \leftarrow V^*(1^k,(G_1,G_2),H; \omega)$
- If $b’ = b$ output view $(\omega,H,b,\phi’)$
- if none of the $k$ runs succeed, output $-1$
Note that the simulator controls everything about $V^*$, and so can fix its random coins ahead of time, and can run $V^*$ multiple times to obtain the view it needs. So, assume $G_1 \sim G_2$. Then $Pr[Sim\text{ outputs }-1]$ is negligible in $k$. The graph $H$ is independent from $b$, and since $\omega$ is fixed by the simulator ahead of time, the view of $V^*$ in any iteration is independent from $b$. Thus there is a $1/2$ probability on any iteration that $b = b’$, and so the probability the Sim outputs -1 is $2^{-k}$. Now, given that Sim != -1, the output of Sim is identically distributed to the view of $V^*$. First, clearly $\omega$ is identically distributed to the random coin tosses from $V^*$. $H$ is a random isomorphism of $G_b$ for some $b$, but this is identically distributed to a random isomorphism of $G_1$, as sent by $P$ in the protocol. Next, the challenge bit in the protocol is deterministic as a function of $G_b$ and $\omega$ (since $\omega$ is fixed) equal to $V^*(1^k,(G_1,G_2),H)$. Finally, $\phi’$ in the simulator is clearly a random isomorphism of, and so identically distributed to $\phi’$ in the protocol. Therefore, the ouput of Sim is identically distributed to the output of $V^*$, and so we achieve the property of zero-knowledge. Note that in this case, where the output of the simulator is identically distributed to the view of $V^*$, the proof is called a perfect ZK proof.
Now, this protocol has soundness error $1/2$, but this can be boosted by simply repeating the protocol $\ell$ times; for $\ell = \Omega(\log(k))$ the protocol remains polynomial, and now the probability for soundness error is $neg(k)$. Note that performing the protocol in parallel actually destroys the simulator, as the probability that it can guess all the $b$s becomes negligible in $\ell$. However, when the protocol is performed in sequence (incurring $\log(k)$ round complexity), the straightforward adaptation of the simulator suffices.
ZK Using Garbled Circuits - How to Prove Non-algebraic Statements Efficiently
To begin, note that zero-knowledge is a subset of secure two-party computation, such that any secure multi-party computation system can be used to achieve a zero-knowledge proof. To be precise, ZK is a secure two-party computation wherein the prover party has the input $w$, the witness party has no input, and the function being computed accepts if and only if $w$ is a valid witness for $x$. Note that since $x$ is not secret, in this formulation it is in fact part of the circuit, not an input. Prior to this work, zero-knowledge proofs were known for specific, structured (read: algebraic), languages. This paper proposed a protocol to establish ZK proofs for general languages based on Yao’s Garbled Circuits. We outline the protocol below.
Main ideas
The parties $P$ and $V$ run oblivoius transfers, with $P$ using his bits of witness as inputs, and $V$ providing the keys for the input layer of the garbled circuit. $V$ then sends to $P$ the garbled circuit, who evaluates it, and determines the output. However, rather than reveal this immediately, it instead commits to this output, and wait for the verifier to prove its honesty before delivering the output. Note the importance of the commitments made here - the prover must verify that the keys revealed by $V$ in the final step are consistent with the previously delivered keys. To avoid complicated notations, we will describe the protocol at a high level. Intuitively, the three requirements of a ZK proof system in this context are the same: correctness requires that it is possible to successfully recover the secret when evaluating an honestly generated garbled circuit using input keys corresponding to a valid witness of the shared input; soundness requires that no malicious evaluator can compute the secret unless it has access to the keys corresponding to a valid witness; finally, verifiability requires that a malicious constructor cannot construct a circuit which successfully accepts when evaluated on keys corresponding to a valid input but which can also output other values which are functions of the witness. This final property will give that a verifier cannot distinguish between different witnesses provided by the prover (intuitively satisfying the property of zero-knowleddge).
The Protocols
The paper presents a simpler scheme in the context of honest verifier, presented below. The correctness and soundness follows from correctness and privacy of Yao’s garbled circuits. Note that this protocol is not secure against an actively corrupted verifier, and the later protocol which provides this security is a major contribution of the work.
- For all witness bits $w_i$, $P$ sends $(choose,i,w_i)$ to oblivious transfer
- Oblivious transfer delivers all $(chosen,i)$ to $V$
- $V$ computes $(GC,{K_i^1,K_i^0},Z)\leftarrow Gen(1^k,f_y)$
- $V$ delivers all $(tranfer,i,K_1^i,K_0^i)$ to oblivious transfer
- Oblivious transfer delivers all messages $(transferred,i,K_i’)$ to $P$
- $V$ delivers $GC$ to $P$
- $P$ runs $Z’ \leftarrow Eval(GC,{K_i’})$ and delivers $Z’$ to $V$
- $V$ outputs accepts iff $Z = Z’$
While a detailed analysis of the zero-knowledge property is omitted, intuitively, a simulator can emulate the oblivious transfer, and so gain full access to a malicious prover’s witness. The simulator acts as a semi-honest verifier, but outputs accept iff the prover’s witness is valid. Then, soundness implies that the output is also identically distributed as a true verifier’s output. Similarly, to generate the view of a passively malicious $V^*$, a simulator emulates the oblivious transfer, and in step 7 sets $Z’ = Gen(1^k,f_y)$ rather than $Eval…$. Clearly from correctness, the two quantities are equal with high probability.
Now, a fully secure protocol against even actively malicious $V^*$ is presented. Steps 1-6 are as presented above with an oblivious choose-open-transfer (COT) protocol used instead of the oblivious transfer.
- $P$ evaluates $Z’ \leftarrow Eval(GC,{K_i’})$. If $Eval$ aborts set $Z’$ to -1.
- $P$ sends $(commit, 1, Z’)$ to commitment function, which sends $(commited, 1, |Z’|)$ to $V$.
- $V$ sends the message $(openall)$ to the COT protocol.
- The COT protocol delivers to $P$ the values $(transfer, i, K_0^i, K_1^i)$ (in essense, $V$ reveals the garbled circuit to $P$).
- $P$ runs $Verify(GC,{K_i^0,K_i^1})$. If the output is not accept, then $P$ aborts. Otherwise, it sends $(reveal,1)$ to the COT protocol.
- The COT protocol sends $(reveal,Z’,1)$ to $V$. $V$ accepts if $Z’=Z$.
The soundness and correctness of this protocol are similar to the previously presented protocol. We now prove that this protocol satisfies the zero knowledge property. If $P$ is malicious, we define a simulator that acts like a honest verifier but emulates the COT functionality. Therefore, it can gain direct access to $P$’s witness, and so the simulator accepts if the witness is valid. Clearly the view of the malicious $P$ is the same until the final step, where the simulator accepts only if the witness is valid. If the malicious prover is using a valid witness, then clearly the views are the same. Now suppose the prover is using an invalid witness but can input a $Z’ = Z$ to the commitment functionality in step 8. Note that due to properties of the commitment functionality, $Z’$ is clearly not computed using any knowledge of the revealed circuit from the $openall$ phase. Therefore, if such a prover existed, then it could break soundness of the garbled circuit.
In the case of a malicious verifier $V^*$, we define a simulator which simulates the COT functionality to derive the keys ${(K^*)_i^0,(K^*)_i^1}$. In step 6, the simulator receives the garbled circuit $GC$, and always sends the committed message to $V^*$, and waits to receive the $openall$ message. Finally, the simulator delivers to $V^*$ $Z’ \leftarrow Extract(GC,{(K^*)_i^0,(K^*)_i^1})$ if the $Verify$ accepts or $-1$ otherwise. If the verifier “cheats”, the simulator will always $commit$ to $-1$ in step 8. Note that an honest prover might commit to a different value here, but since the verifier is cheating, the verifier will see this difference in values neither in the simulated nor in the real world, as the protocol will end at step 11 when $Verify(…)$ rejects. If $V^*$ behaves honestly, then the simulator will commit to value $Z$, which is computationally close to the real world value. This follows from a property of garbled circuits which requires that the output of a garbled circuit be unique (and therefore independent of the witness) with high probability.
Conclusion
We have looked at two ways the interactive zero knowledge proofs work. Having proved that 3-coloring of a graph has a ZKP, we can derive a reduction from any problem in NP-Complete. This proves that for any NP-Complete problem there exist efficient Zero Knowledge Proofs. In the second method, we look at more practical efficient ZKPs taking the help of Garbled Circuits to preserve Zero knowledge. Compared to the first method, which gave a ZKP in $|E|^2$ time, which is theoretically efficient but not so useful in practice, the second method gives a practical implementation with complexity of the circuit depth.
References
https://eprint.iacr.org/2013/073.pdf
https://www.cs.cornell.edu/courses/cs4830/2010fa/lecnotes.pdf
http://www.cs.umd.edu/~jkatz/gradcrypto2/NOTES/lecture23.pdf
https://sites.duke.edu/compsci590_03_s2021/files/2021/03/finalZero-Knowledge-Proofs-1.pdf