
1 Introduction

Garbled circuits are one of the most widely used and promising tools for secure two-party computation. They were introduced by Yao [Yao82] and first implemented in Fairplay by Malkhi et al. [MNPS04].

The basic version of Yao’s protocol only guarantees security in the presence of passive corruptions (i.e., when the adversary follows the protocol but might try to learn more information from their view). At a very high level, since garbling schemes hide (to some extent) the circuit being garbled, a malicious party can garble a function different from the intended one without the honest party noticing, thereby breaking the security of the protocol. Over the years many approaches have been proposed to construct GC-based protocols with strong security guarantees against adversaries who deviate arbitrarily from the protocol (i.e., malicious or active corruptions). The main technique for achieving active security in the GC context is the so-called cut-and-choose approach: in a nutshell, several copies of the same circuit are garbled; afterwards, a random subset of the garbled circuits is checked for correctness, while the rest are evaluated.

There are many different instantiations of the cut-and-choose approach: in 2007, Lindell and Pinkas [LP07] proposed a method which achieves security \(2^{-\kappa }\) by garbling approximately \(3\kappa \) copies of the circuit. This was improved in 2013 in several works [Lin13, HKE13, Bra13] using the so-called forge-and-lose technique. In its most efficient instantiation [Lin13] this technique achieves security \(2^{-\kappa }\) using only \(\kappa \) garbled circuits (by adding a “small” actively secure computation).

A different approach was taken by Nielsen and Orlandi in 2009 [NO09]. Using “LEGO-style” cut-and-choose, the overhead decreases logarithmically in the size of the circuit f, i.e., it is possible to get security \(2^{-\kappa }\) using a replication factor of \(O(\kappa /\log |f|)\). While the original LEGO approach required an exponentiation for each gate in the circuit, subsequent work [FJN+13, FJNT15] removed this limitation and relies only on generic assumptions. A variant of the LEGO approach has proven particularly useful in the amortized setting, i.e., when the two parties evaluate the same circuit f multiple times (say \(\ell \)) on different inputs. In this case the amortized overhead to get security \(2^{-\kappa }\) is \(O(\kappa /\log \ell )\) [LR14, HKK+14], and experimental validation achieves “blazing fast” results [LR15].

To summarize, while advanced styles of cut-and-choose techniques have shown that one can achieve practically-efficient actively-secure two-party computation in the amortized setting, in all of the above approaches the number of garbled circuits grows linearly with the security parameter. It is natural to ask whether this is an inherent limitation or whether it is possible to achieve actively-secure two-party computation based on garbled circuits with constant overhead.

1.1 Our Contribution

Before stating our contributions, it is important to clarify the question we are asking as much as possible: it is of course possible to achieve active security using a single garbled circuit and the GMW compiler [GMW87] (i.e., proving in zero-knowledge that the circuit is well-formed). This is not a satisfactory solution since it is not black-box in the underlying garbling scheme and, therefore, does not preserve the efficiency of Yao’s protocol. Jarecki and Shmatikov [JS07] proposed an instantiation of this paradigm using a specific number-theoretic assumption (Paillier’s cryptosystem [Pai99]): thanks to the algebraic nature of the underlying cryptosystem, the extra zero-knowledge proofs add only a constant overhead. We do not consider this solution satisfactory either, since one of the strengths of Yao’s protocol (both in terms of security and efficiency) is that it requires only symmetric-key operations per gate in the circuit. Therefore, we are only interested in solutions that can be instantiated using any projective garbling scheme in a black-box way. We are now ready to ask our question:

Can we achieve actively-secure two-party computation protocols in the amortized setting with only a constant overhead over Yao’s protocol?

We answer the question positively: let \(p(\kappa )\) be an upper bound on the cost of generating, evaluating or checking a garbled gate (see Footnote 1), and let \(A(\kappa )\) be a fixed function (this term describes the cost of performing some “small” fixed actively secure computation and is independent of the circuit size and the number of executions). Then the amortized complexity of our protocol (for large enough circuits and number of executions) is bounded by:

$$\begin{aligned} O(1)\cdot |f|\cdot p(\kappa ) + A(\kappa ) \end{aligned}$$
(1)

i.e., only a constant overhead over Yao’s passive protocol and an additive factor independent of |f|.

1.2 Technical Overview

We give here a high-level description of our techniques. We have two parties, Alice and Bob, with inputs \(\{x_i^A\}_{i\in [\ell ]}\) and \(\{x_i^B\}_{i\in [\ell ]}\) respectively. At the end both parties should learn \(y_i=f(x^A_i,x^B_i)\) for all \(i\in [\ell ]\). (We will assume that both |f| and \(\ell \) are large, and that \(\ell \ge |f|\).)

Our protocol proceeds in five stages: in the first stage we let the parties commit to a key and exchange their inputs in an encrypted format (using a symmetric encryption scheme). In this way the inputs to all computations are well-defined already from this stage. We then let both parties garble \((1+\epsilon )\ell \) copies of the circuit (for some constant \(\epsilon \le 1\)) and, using cut-and-choose, verify that \(\epsilon \ell \) of them are correct: this guarantees that even if one of the two parties has been actively corrupted, there are at most \(O(\kappa )\) incorrect circuits among the unopened ones except with (negligible) probability \(2^{-\kappa }\). We then proceed to evaluate these circuits in both directions (as in the dual-execution protocol of Mohassel and Franklin [MF06]). Remember that Yao’s protocol is “almost” actively secure against corrupt evaluators, which means that at this stage the corrupt party learns the correct output of the function (together with unforgeable output labels) in all positions, while the honest party learns outputs (with corresponding output labels) in at least \(\ell -O(\kappa )\) positions (see Footnote 2). For the remaining \(O(\kappa )\) positions the honest party might receive an incorrect output or no output at all.

Therefore in the next stage, which we call the filling-in stage, we allow each party to ask for at most \(O(\kappa )\) re-computations using an actively secure protocol (without disclosing in which positions), and we enforce that the computation is performed on the same inputs (remember that the inputs are provided in encrypted format). This computation also outputs MACs on the new results. Note that the malicious party cannot gain anything in this stage, since they have already learned the correct output, and learning it once more does not leak any extra information. After this stage we are ensured that both parties have \(\ell \) candidate outputs together with unforgeable certificates of their authenticity (either the output labels from the garbled circuits or the MACs from the do-overs). But it might still be that some of the outputs received by the honest party are incorrect and, therefore, different from those received by the corrupt party.

In the final stage of the protocol we run a kind of forge-and-lose sub-protocol, where both parties input all the received outputs to an actively secure computation. The computation finds the first position in which the outputs differ and recomputes the function for that index. Since the parties cannot lie about their outputs at this point (due to the unforgeable certificates), there is at most one party with the incorrect output, and that party is the honest one. Therefore, the other party must have cheated by garbling an incorrect circuit. To “punish” the cheater, we release the secret key of the malicious party to the honest party. (Crucially, the malicious party does not learn whether they have been caught or not, since this would open the door for selective-failure attacks.) This allows the honest party to decrypt all the encrypted inputs and compute all the correct results “in the clear”.

To recap, here are the 5 stages of the protocol:

  1.

    (encrypted input) Both parties exchange inputs. This is done by running a committed OT with the bits of the symmetric keys \((\sigma _A,\sigma _B)\in \{0,1\}^{O(\kappa )}\) as choice bits and then exchanging encrypted inputs

    $$\begin{aligned}X^A_i=E(\sigma _A,x_i^A) \text { and } X^B_i=E(\sigma _B,x_i^B)\end{aligned}$$
  2.

    (cut-and-choose) Both parties garble \((1+\epsilon )\ell \) circuits which compute

    $$\begin{aligned} (\sigma _A,X_A,\sigma _B,X_B) \mapsto f(D(\sigma _A,X_A),D(\sigma _B,X_B)) \end{aligned}$$

    and do cut-and-choose by checking \(\epsilon \ell \) circuits each.

  3.

    (dual execution) The parties evaluate each of the \(\ell \) remaining circuits. Thanks to cut-and-choose, at most \(O(\kappa )\) circuits are incorrect, which in particular means there are at most \(O(\kappa )\) positions in which the honest party did not receive an output.

  4.

    (filling-in) The parties run an actively secure protocol which recomputes the function in at most \(O(\kappa )\) positions. Using Merkle-tree-based commitments we can make sure that (a) the functions are recomputed on the same inputs as before and (b) the input to this protocol (and its complexity) does not grow linearly with \(\ell \). This protocol also outputs MACs for the recomputed values;

  5.

    (forge-and-lose) At this point both parties have \(\ell \) outputs, but if one party is dishonest some of the outputs might still differ. So now the parties run an actively secure protocol which (a) finds the first position where the outputs are different and (b) recomputes the function in that position, finds out which party cheated, and reveals the secret key of the corrupt party to the honest party, who can therefore decrypt all inputs and recompute the function in the clear. Since MACs are used, a corrupt party cannot input a wrong output (which would make the honest party look corrupt).

Stage 1 can be seen as a kind of committed oblivious transfer combined with an oblivious transfer extension, where we start with a “small” committed OT functionality for \(O(\kappa )\) pairs of messages which are then used to provide very long inputs to a garbled-circuit-based computation: if \(a(\kappa )\) is the cost of evaluating a gate with an actively secure protocol and \(q_1(\kappa )\) is some function describing the complexity of the circuit computing \(O(\kappa )\) committed OTs on messages of length \(O(\kappa )\), then the complexity of this stage is \(q_1(\kappa )a(\kappa )\). If we set, e.g., \(\epsilon :=1/4\), then the total complexity of Stages 2 and 3 is bounded by \( 5\ell (|f|+q_2(\kappa ))p(\kappa )\), where \(q_2\) represents the complexity of the decryption circuit D. Then, the complexity of Stage 4 is bounded by \( \psi \kappa (|f|+q_3(\kappa ,\log \ell )) a(\kappa )\) where \(q_3\) is the complexity of verifying the Merkle-tree commitment and computing a MAC, and \(\psi := \log _{1+\epsilon }(2)\) is a constant picked to guarantee that the probability that there are more than \(\psi \kappa \) bad circuits among the unchecked ones is less than \(2^{-\kappa }\). Finally the complexity of Stage 5 is \((\ell q_4(\kappa )+|f|)a(\kappa )\) where the \(q_4\) factor represents the complexity of verifying the certificates. Since \(a(\kappa )> p(\kappa )\), the total cost of the protocol is bounded by:

$$\begin{aligned} 5\ell |f|p(\kappa )+ (|f|+\ell ) A(\kappa ) \end{aligned}$$

where \(A(\kappa )\) collects all the terms which are independent of the circuit size or the number of computations. Now, when amortizing over \(\ell \) executions, and assuming that \(\ell \) is at least as large as |f|, we achieve the desired amortized complexity stated earlier in (1).

We can actually quantify the constant overhead over Yao’s protocol even more precisely, by looking at the actual cost (in PRF calls) of garbling (or checking) vs. evaluating a gate in some of the most common garbling schemes (i.e., instead of upper bounding it with p). Let g be the number of calls to a PRF (encryptions) performed while garbling/checking a gate and e the number of calls to a PRF (decryptions) performed while evaluating a gate. Then \((g+e)\) is the exact computational cost (per gate) of the passive version of Yao’s protocol, while in our protocol the exact cost is \((2+4\epsilon )g+2e\). In Yao’s original garbling \((g,e)=(4,4)\), in point-and-permute [BMR90] \((g,e)=(4,1)\), and in the half-gate construction [ZRE15] \((g,e)=(4,2)\); hence, when using \(\epsilon =1/4\), the concrete overhead over Yao’s passive protocol is between 2.5 and 2.8 (see Footnote 3).
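For concreteness, the figures above can be reproduced by plugging the \((g,e)\) values into the two per-gate formulas; the following short Python snippet (purely illustrative, not part of the protocol) does exactly that:

```python
# The figures quoted above follow from the two per-gate formulas; this just evaluates them.

def overhead(g: int, e: int, eps: float = 0.25) -> float:
    """Ratio of our per-gate PRF calls, (2 + 4*eps)*g + 2*e, to passive Yao's, g + e."""
    return ((2 + 4 * eps) * g + 2 * e) / (g + e)

schemes = {
    "Yao's original garbling": (4, 4),      # (g, e)
    "point-and-permute [BMR90]": (4, 1),
    "half-gates [ZRE15]": (4, 2),
}

for name, (g, e) in schemes.items():
    print(f"{name}: overhead {overhead(g, e):.2f}")
# -> 2.50, 2.80 and 2.67 respectively, i.e., between 2.5 and 2.8 for eps = 1/4.
```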

We conclude by stressing that this work is of a theoretical nature and we have therefore made no attempt to optimize the concrete efficiency of any of the steps. On the contrary, since our protocol is already quite complex and involves several stages, we have chosen at each turn simplicity of presentation over (concrete) efficiency. We leave it as an interesting open direction for future work to investigate whether the approach proposed in this paper might lead to practical efficiency.

2 Preliminaries and Notation

We review here the standard tools which are used in our protocol and their syntax.

Commitments. We use a computationally binding and computationally hiding commitment scheme \(\mathsf {Com}\) with commitment key \(ck\leftarrow \mathsf {CGen}(1^\kappa )\), and we use an informative but slightly abusive notation: we write \(\langle {x}\rangle \leftarrow \mathsf {Com}_{ck}(x,\mathsf {open}({x}))\) where \(\langle {x}\rangle \) is a commitment to the value x using randomness \(\mathsf {open}({x})\). In the proof we need the commitment to be extractable i.e., we need the simulator to be able to compute \(x\leftarrow \mathsf {Ext}(\mathsf {td},\langle {x}\rangle )\) using some trapdoor \(\mathsf {td}\) associated to the commitment key ck.

Merkle-tree Commitments. We use Merkle-tree based commitments with the following interface: Given a string of elements from some alphabet \(\mathbf {x}\in \varSigma ^n\) it is possible to compute a short commitment by running \(\mathsf {root}\leftarrow \mathsf {MT.C}(1^\kappa ,\mathbf {x})\). It is possible to construct a proof for a given position \(j\in [\ell ]\) by computing \(\pi \leftarrow \mathsf {MT.P}(\mathbf {x},j)\) and the proof can be verified by running \(b \leftarrow \mathsf {MT.V}(\mathsf {root},j,x',\pi )\) with \(b\in \{\top ,\bot \}\). We want that \(b=\top \) when the prover is honest and \(x'=x_j\) (correctness), that the proof is short, i.e., \(|\pi |=O(\kappa \log \ell )\) (compactness), and that no PPT adversary can produce a tuple \((\mathsf {root},j,x,x',\pi ,\pi ')\) such that \(x\ne x'\) and \(\mathsf {MT.V}(\mathsf {root},j,x,\pi )=\mathsf {MT.V}(\mathsf {root},j,x',\pi ')=\top \) (computational binding). (We do not need these commitments to be hiding, since they are only used to reduce the input size of the ideal functionality in the filling-in stage of the protocol.)
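As an illustration, a minimal hash-tree realization of the \(\mathsf {MT.C}/\mathsf {MT.P}/\mathsf {MT.V}\) interface could look as follows (a sketch only: the function names, the use of SHA-256 and the padding rule are our own choices, and any collision-resistant hash function works):

```python
import hashlib

def _H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _leaves(xs):
    # Hash every element and pad with dummy leaves up to a power of two.
    leaves = [_H(x) for x in xs]
    while len(leaves) & (len(leaves) - 1):
        leaves.append(_H(b""))
    return leaves

def mt_commit(xs):
    """MT.C: Merkle root of the list of byte strings xs."""
    level = _leaves(xs)
    while len(level) > 1:
        level = [_H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def mt_prove(xs, j):
    """MT.P: authentication path (sibling hashes) for position j; length O(log n)."""
    level, path = _leaves(xs), []
    while len(level) > 1:
        path.append(level[j ^ 1])
        level = [_H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        j //= 2
    return path

def mt_verify(root, j, x, path):
    """MT.V: recompute the root from x and the path and compare it to root."""
    h = _H(x)
    for sib in path:
        h = _H(h + sib) if j % 2 == 0 else _H(sib + h)
        j //= 2
    return h == root

# Toy run:
xs = [b"a", b"b", b"c"]
root = mt_commit(xs)
assert mt_verify(root, 2, b"c", mt_prove(xs, 2))
```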

Symmetric Encryption. We use an IND-CPA symmetric encryption scheme \((\mathsf {SE.E},\mathsf {SE.D})\) with key \(\sigma \in \{0,1\}^{8\kappa }\). We use lower-case letters for plaintexts and upper-case letters for ciphertexts, so \(X\leftarrow \mathsf {SE.E}(\sigma ,x)\) and \(x\leftarrow \mathsf {SE.D}(\sigma ,X)\). We need the encryption scheme to be secure even if \(\kappa \) bits of the secret key leak to the adversary (to counteract standard selective-failure attacks during the OT phase). This is done in the following way: we start by generating a uniformly random \(\kappa \)-bit key \(\sigma '\), which is then encoded into an \(8\kappa \)-bit key \(\sigma \) using the (randomized) encoding scheme \((\mathsf {enc},\mathsf {dec})\) of Lindell and Pinkas [LP07], i.e., we compute \(\sigma \leftarrow \mathsf {enc}(\sigma ',r)\) with some randomness r. Now given any encryption scheme \((E',D')\) which is IND-CPA secure using a \(\kappa \)-bit key, we define \(\mathsf {SE.E},\mathsf {SE.D}\) as \(\mathsf {SE.E}(\sigma ,m)=E'(\mathsf {dec}(\sigma ),m)\) and \(\mathsf {SE.D}(\sigma ,c)=D'(\mathsf {dec}(\sigma ),c)\).

MAC Scheme. We use an unforgeable message authentication code (MAC) \((\mathsf {MAC.Tag},\mathsf {MAC.Ver})\) with key \(\tau \in \{0,1\}^\kappa \) and the following interface: one can compute a tag on a message x by computing \(t\leftarrow \mathsf {MAC.Tag}(\tau ,x)\) and the tag can be verified running \(\mathsf {MAC.Ver}(t,\tau ,x)\in \{\bot ,\top \}\).
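Any standard MAC fits this interface; for concreteness, here is a sketch instantiated with HMAC-SHA256 (our choice of primitive, not mandated by the protocol):

```python
import hashlib
import hmac
import os

def mac_tag(tau: bytes, x: bytes) -> bytes:
    """MAC.Tag: tag on message x under key tau."""
    return hmac.new(tau, x, hashlib.sha256).digest()

def mac_ver(t: bytes, tau: bytes, x: bytes) -> bool:
    """MAC.Ver: constant-time check of t against a freshly computed tag."""
    return hmac.compare_digest(t, mac_tag(tau, x))

tau = os.urandom(32)          # the protocol uses a kappa-bit key
t = mac_tag(tau, b"y_i")
assert mac_ver(t, tau, b"y_i") and not mac_ver(t, tau, b"y_j")
```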

Oblivious Transfer. We use the following notation for transforming a random OT on short, random messages (of length \(\kappa \)) into an OT on chosen messages of any length (using the same choice bits): we start with the sender knowing \(\mathsf {sen}=\{r_0,r_1\}\) (a pair of random strings in \(\{0,1\}^\kappa \)) and the receiver knowing \(\mathsf {rec}=\{\sigma ,r_{\sigma }\}\) (with \(\sigma \in \{0,1\}\)). Then we write

$$\begin{aligned} \mathsf {tra}_j\leftarrow \mathsf {OTTransfer}(\mathsf {sen},j,\{m_{0},m_{1}\}) \end{aligned}$$

for the process of encrypting the pair of messages \(\{m_{0},m_{1}\}\) using keys \(r_0,r_1\) respectively (using an IND-CPA symmetric encryption scheme) and

$$\begin{aligned} m_{\sigma }\leftarrow \mathsf {OTRetrieve}(\mathsf {rec},j,\mathsf {tra}_j) \end{aligned}$$

for the process of recovering \(m_{\sigma ,j}\) from \(\mathsf {tra}_j\). To ease the notation, we also allow a “vector” version of these OT commands i.e., if \(\mathbf {e}=\{m_{i,0},m_{i,1}\}_{i\in [n]}\) is a vector of n pairs of messages and \(\sigma \in \{0,1\}^n\) is a vector of n bits then we write \(\mathsf {rec}=\{\sigma _i,r_{i,\sigma _i}\}_{i\in [n]}\), \(\mathsf {sen}=\{r_{i,0},r_{i,1}\}_{i\in [n]}\) for the information known to the receiver and sender respectively, \(\mathsf {tra}_j\leftarrow \mathsf {OTTransfer}(\mathsf {sen},j,e)\) for the process of encrypting each pair of messages and finally \(M\leftarrow \mathsf {OTRetrieve}(\mathsf {rec},j,\mathsf {tra}_j)\) with \(M=\{m_{i,\sigma _i}\}_{i\in [n]}\). (In the proof of security we also use \(e\leftarrow \mathsf {OTRetrieve}(\mathsf {sen},j,\mathsf {tra}_j)\) to denote the process of recovering all pairs of messages using the keys known to the sender).
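A possible realization of the (vector version of the) \(\mathsf {OTTransfer}/\mathsf {OTRetrieve}\) wrapper is sketched below; the hash-based stream encryption merely stands in for an arbitrary IND-CPA symmetric encryption scheme and is for illustration only:

```python
import hashlib
import os

def _stream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Hash-based keystream standing in for any IND-CPA symmetric encryption.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def _enc(key: bytes, m: bytes) -> bytes:
    nonce = os.urandom(16)
    return nonce + bytes(a ^ b for a, b in zip(m, _stream(key, nonce, len(m))))

def _dec(key: bytes, c: bytes) -> bytes:
    nonce, body = c[:16], c[16:]
    return bytes(a ^ b for a, b in zip(body, _stream(key, nonce, len(body))))

def ot_transfer(sen, j, msgs):
    """OTTransfer: encrypt every pair (m_{i,0}, m_{i,1}) under the sender's random
    key pair (r_{i,0}, r_{i,1}); j only indexes the transfer (keys are reused,
    which is fine here because every encryption uses a fresh nonce)."""
    return [(_enc(r0, m0), _enc(r1, m1)) for (r0, r1), (m0, m1) in zip(sen, msgs)]

def ot_retrieve(rec, j, tra):
    """OTRetrieve: in every position i, decrypt the ciphertext selected by the
    receiver's choice bit sigma_i using the key r_{i,sigma_i}."""
    return [_dec(r, pair[sigma]) for (sigma, r), pair in zip(rec, tra)]

# Toy run for two wires:
sen = [(os.urandom(16), os.urandom(16)) for _ in range(2)]
sigma = [1, 0]
rec = [(b, sen[i][b]) for i, b in enumerate(sigma)]
tra = ot_transfer(sen, 0, [(b"m00", b"m01"), (b"m10", b"m11")])
assert ot_retrieve(rec, 0, tra) == [b"m01", b"m10"]
```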

Garbled Circuits. We use a generalization of the notation introduced by Bellare et al. [BHR12], already used in [JKO13, FNO15]: a garbling scheme is a tuple of algorithms

$$\begin{aligned} (\mathsf {GC.Gb},\mathsf {GC.Ev},\mathsf {GC.En},\mathsf {GC.De},\mathsf {GC.Ve}) \end{aligned}$$

where:

  • \((\hat{f},e,d)\leftarrow \mathsf {GC.Gb}(f;r)\) generates a garbled version \(\hat{f}\) of the circuit \(f : \{0,1\}^{n}\rightarrow \{0,1\}^{n}\) which has n input bits and n output bits. We make explicit the randomness r used to garble since it will be used in the verification process. The \(\mathsf {GC.Gb}\) function outputs the garbled version of the function \(\hat{f}\), the encoding tables e and the decoding tables for the output wires d;

  • \(\hat{x}\leftarrow \mathsf {GC.En}(e,x)\) outputs an encoding of x.

  • \(\hat{z}\leftarrow \mathsf {GC.Ev}(\hat{f},\hat{x})\) outputs an encoded version of the output;

  • \(z'\leftarrow \mathsf {GC.De}(d,\hat{z})\) outputs the plaintext version of an encoded value \(\hat{z}\) (or \(\bot \) for an invalid encoding);

  • \(b\leftarrow \mathsf {GC.Ve}(f,r,\hat{f},e,d)\) allows one to verify whether a given garbled circuit was garbled correctly and outputs \(b\in \{\top ,\bot \}\);

As usual we need the garbling scheme to be projective – i.e., both (e, d) are vectors of pairs of strings – to be compatible with Yao’s protocol. We need the garbling scheme to satisfy privacy and authenticity as defined in [BHR12]. We also need the garbling scheme to be verifiable in the standard sense, i.e., an adversary cannot “open” a garbling \(\hat{f}\) as any function different from f.
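For reference, the five algorithms can be summarized by the following interface sketch (the Python types are our own rendering of the notation above and carry no formal weight):

```python
from typing import List, Optional, Protocol, Tuple

Label = bytes
EncodingInfo = List[Tuple[Label, Label]]   # projective: a pair of labels per input wire
DecodingInfo = List[Tuple[Label, Label]]   # a pair of labels per output wire


class GarblingScheme(Protocol):
    def Gb(self, f: object, r: bytes) -> Tuple[object, EncodingInfo, DecodingInfo]:
        """GC.Gb: garble circuit f with explicit randomness r, returning (f_hat, e, d)."""
        ...

    def En(self, e: EncodingInfo, x: List[int]) -> List[Label]:
        """GC.En: select the label e[i][x[i]] for every input bit x[i]."""
        ...

    def Ev(self, f_hat: object, x_hat: List[Label]) -> List[Label]:
        """GC.Ev: evaluate the garbled circuit on an encoded input, giving an encoded output."""
        ...

    def De(self, d: DecodingInfo, z_hat: List[Label]) -> Optional[List[int]]:
        """GC.De: decode output labels; None stands for the symbol ⊥ (invalid encoding)."""
        ...

    def Ve(self, f: object, r: bytes, f_hat: object, e: EncodingInfo, d: DecodingInfo) -> bool:
        """GC.Ve: re-garble f with randomness r and check the result against (f_hat, e, d)."""
        ...
```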

Definition 1

(Correctness). We say that a garbling scheme enjoys correctness if for all \(n={\text {poly}}(\kappa ), f:\{0,1\}^{n}\rightarrow \{0,1\}^n\) and all inputs \(x\in \{0,1\}^n\):

$$\begin{aligned} \Pr \left( f(x)\ne \mathsf {GC.De}(d,\mathsf {GC.Ev}(\hat{f}, \mathsf {GC.En}(e,x))) : (\hat{f},e,d)\leftarrow \mathsf {GC.Gb}(1^\kappa ,f) \right) = 0 \end{aligned}$$

(the probability is taken over the random coins of all algorithms).

Definition 2

(Privacy). We say that a garbling scheme enjoys privacy if there exists a PPT simulator \(\mathcal {S} \) such that the two following distributions are computationally indistinguishable:

$$\begin{aligned} \{ (\hat{f},\mathsf {GC.En}(e,x) ,d) : (\hat{f},e,d)\leftarrow \mathsf {GC.Gb}(1^\kappa ,f) \}_x \approx \{\mathcal {S} (1^\kappa , f,f(x))\}_x \end{aligned}$$

for all f, x.

Definition 3

(Authenticity). We say that a garbling scheme enjoys authenticity if for all PPT \(\mathcal {A} \), for all \(f,x\in \{0,1\}^n\)

$$\begin{aligned} \Pr \left( \mathsf {GC.De}(d,z^*)\ne \bot \wedge z^*\ne \mathsf {GC.Ev}(\hat{f},\hat{x}) : \begin{array}{c} (\hat{f},e,d)\leftarrow \mathsf {GC.Gb}(1^\kappa ,f), \\ \hat{x}\leftarrow \mathsf {GC.En}(e,x), \\ z^* \leftarrow \mathcal {A} (\hat{f}, \hat{x}) \end{array} \right) \end{aligned}$$

is negligible in \(\kappa \).

Definition 4

(Circuit Verifiability). We say a garbling scheme enjoys circuit verifiability if for all PPT \(\mathcal {A} \):

$$\begin{aligned} \Pr \left( \mathsf {GC.Gb}(f_0;r_0)=\mathsf {GC.Gb}(f_1;r_1) : (f_0,r_0,f_1,r_1)\leftarrow \mathcal {A} (1^{\kappa }), f_0\ne f_1\right) \end{aligned}$$

is negligible in \(\kappa \).

Input Verification. We also enhance the garbling scheme with two algorithms \((\mathsf {GC.TkG},\mathsf {GC.TkV})\). The algorithm \(\mathsf {GC.TkG}\) generates some “tokens” \(\mathsf {tk}\) from the input labels e. These tokens can be used with the \(\mathsf {GC.TkV}\) algorithm to check whether an encoding of an input \(\hat{x}\) is correct without leaking any information about the input x itself. In a nutshell, we construct this from any projective garbling scheme in the following way: let \(e=(K_0,K_1)\) be the encoding information of the original garbling scheme (for simplicity we assume a single input bit). Then we flip a random bit r and let \(\mathsf {tk}=(\langle {K_r}\rangle , \langle {K_{1-r}}\rangle )\) and \(e'=((K_0,\mathsf {open}({K_0})),(K_1,\mathsf {open}({K_1})))\); that is, we extend the input labels with some randomness, compute two commitments, and permute them in a random order. Now given an encoding of an input \(\hat{x}\leftarrow \mathsf {GC.En}(e',x)\) (using the extended labels, i.e., \(\hat{x}=(K_x,\mathsf {open}({K_x}))\)) it is possible to verify whether this is a correct encoding by running \(\mathsf {GC.TkV}(\mathsf {tk},\hat{x})\in \{\top ,\bot \}\). The algorithm simply parses \(\hat{x}=(K^*,\mathsf {open}({K^*}))\), computes \(\langle {K^*}\rangle =\mathsf {Com}_{ck}(K^*,\mathsf {open}({K^*}))\) and checks if \(\langle {K^*}\rangle \in \mathsf {tk}\). These tokens satisfy the following properties: (1) adding the tokens does not break the privacy property of the garbling scheme, and (2) if a (possibly malicious) encoding of an input passes the verification against the tokens, then evaluating an (honestly generated) garbled circuit on this input encoding will give an output different from \(\bot \).

Token Generation:

Given any projective garbling scheme i.e., one where \(e=\{(K^i_0,K^i_1)\}_{i\in [n]}\), we construct a new garbling scheme with verifiable input in the following way: First we define the new encoding information \(e'\) to be

$$\begin{aligned} e'=\{(K^i_0,\mathsf {open}({K^i_0})), (K^i_1,\mathsf {open}({K^i_1})) \}_{i\in [n]} \end{aligned}$$

and then we compute tokens \(\mathsf {tk}\leftarrow \mathsf {GC.TkG}(e')\) by sampling random bits \(r_1,\ldots ,r_n\) and outputting

$$\begin{aligned} \mathsf {tk}= \{ \langle {K^i_{r_i}}\rangle ,\langle {K^i_{1-r_i}}\rangle \}_{i\in [n]} \end{aligned}$$

with \(\langle {K^i_b}\rangle =\mathsf {Com}_{ck}(K^i_b,\mathsf {open}({K^i_b}))\) for all \(b\in \{0,1\},i\in [n]\).

Input Verification:

\(b\leftarrow \mathsf {GC.TkV}(\mathsf {tk},\hat{x})\) is a deterministic algorithm that parses

$$\begin{aligned}\hat{x} = \{K^i,\mathsf {open}({K^i})\}_{i\in [n]}\end{aligned}$$

computes \(\langle {K^i}\rangle =\mathsf {Com}_{ck}(K^i,\mathsf {open}({K^i}))\) for all i, and outputs \(\bot \) if there exists an i such that \(\langle {K^i}\rangle \) is not one of the two commitments in the i-th pair of \(\mathsf {tk}\) (and \(\top \) otherwise).
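A minimal sketch of this token construction is given below, with a hash-based commitment standing in for \(\mathsf {Com}_{ck}\) (note that the protocol additionally requires the commitment to be extractable, which this toy commitment is not):

```python
import hashlib
import os
import secrets

def com(ck: bytes, K: bytes, opening: bytes) -> bytes:
    # Illustrative hash-based commitment Com_ck; the protocol also needs extractability.
    return hashlib.sha256(ck + K + opening).digest()

def tkg(ck: bytes, e_prime):
    """GC.TkG: commit to both (label, opening) pairs of every input wire and
    output the two commitments in a random order."""
    tk = []
    for (K0, o0), (K1, o1) in e_prime:
        pair = [com(ck, K0, o0), com(ck, K1, o1)]
        if secrets.randbits(1):
            pair.reverse()
        tk.append(tuple(pair))
    return tk

def tkv(ck: bytes, tk, x_hat) -> bool:
    """GC.TkV: accept iff every supplied (label, opening) opens one of the two
    commitments of its wire."""
    return all(com(ck, K, o) in tk[i] for i, (K, o) in enumerate(x_hat))

# Toy run (labels and openings would come from the extended encoding information e'):
ck = os.urandom(16)
e_prime = [((os.urandom(16), os.urandom(16)), (os.urandom(16), os.urandom(16)))
           for _ in range(3)]
tk = tkg(ck, e_prime)
x = [1, 0, 1]
x_hat = [e_prime[i][x[i]] for i in range(3)]
assert tkv(ck, tk, x_hat)
```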

We define the following properties:

Definition 5

(Token Privacy). We say that a garbling scheme enjoys token privacy if there exists a PPT simulator \(\mathcal {S} \) such that the two following distributions are computationally indistinguishable:

$$\begin{aligned} \left\{ (\hat{f},\mathsf {GC.En}(e,x) ,d, \mathsf {tk}) : \begin{array}{c} (\hat{f},e,d)\leftarrow \mathsf {GC.Gb}(1^\kappa ,f), \\ \mathsf {tk}\leftarrow \mathsf {GC.TkG}(e) \end{array} \right\} _x \approx \{\mathcal {S} (1^\kappa , f,f(x))\}_x \end{aligned}$$

Definition 6

(Input Verifiability). We say that a garbling scheme enjoys input verifiability if for all PPT \(\mathcal {A} \) the following probability

$$\begin{aligned} \Pr \left( \mathsf {GC.De}(d,\mathsf {GC.Ev}(\hat{f},\hat{x}^*)) = \bot \wedge \mathsf {GC.TkV}(\mathsf {tk},\hat{x}^*)=\top : \begin{array}{c} (\hat{f},e,d)\leftarrow \mathsf {GC.Gb}(1^\kappa ,f), \\ \mathsf {tk}\leftarrow \mathsf {GC.TkG}(e), \\ \hat{x}^*\leftarrow \mathcal {A} (1^\kappa , \hat{f},e), \\ \end{array} \right) \end{aligned}$$

is negligible in \(\kappa \).

The proof that our construction satisfies the above requirements is straightforward.

Lemma 1

Any projective garbling scheme can be enhanced to achieve token privacy and input verifiability using computationally hiding and computationally binding commitments as described above.

Proof

For token privacy, we simply run the simulator guaranteed by the privacy property of the underlying garbling scheme. In addition, our simulator needs to output \(\mathsf {tk}\), a vector of pairs of commitments. The simulator does so by parsing the encoded input \(\hat{x}\) (provided by the privacy simulator) into \(\hat{x}=\{K^i \}_{i\in [n]}\), choosing random values \(\mathsf {open}({K^i})\), and computing commitments \(\langle {K^i}\rangle =\mathsf {Com}_{ck}(K^i,\mathsf {open}({K^i}))\). The simulator also constructs n commitments \(\langle {0}\rangle \) (using independent randomness), samples random bits \(r_1,\ldots ,r_n\), and constructs \(\mathsf {tk}\) as n pairs of commitments, where the i-th pair is \((\langle {K^i}\rangle ,\langle {0}\rangle )\) if \(r_i=0\) and \((\langle {0}\rangle ,\langle {K^i}\rangle )\) otherwise. Any adversary that can distinguish between the two distributions in the token privacy property can be trivially reduced to an adversary against either the computationally hiding property of the commitment scheme or the privacy property of the underlying garbling scheme.

For the input verifiability property, let \(\hat{x}^*=\{K^i,\mathsf {open}({K^i})\}_{i\in [n]}\) be the output of the adversary and let \(e=\{(K^i_0,\mathsf {open}({K^i_0})),(K^i_1,\mathsf {open}({K^i_1}))\}_{i\in [n]}\). Now if \(\mathsf {GC.TkV}(\mathsf {tk},\hat{x}^*)=\top \), it must be the case that \(K^i=K^i_b\) for some b or the adversary can be used to break the binding property of the commitment scheme. But then, the property follows from the correctness of the underlying garbling scheme.

Sub-functionalities. In some stages of our protocol we let the parties run any actively-secure two-party computation protocol to implement a desired functionality. In the protocol description we only describe what the functionality should do and not how it is implemented (in the proof we will make use of the UC composition theorem [Can01] to replace these subprotocols with hybrid functionalities under the control of the simulator). In particular we describe the private input of both parties, the private output of each party and the computation performed by the functionality. We also describe the public input of some functionalities. These are values defined earlier in the protocol and can be thought of as values given as input by both parties (the functionality aborts if they differ).

3 Our Protocol

We are now ready to present our protocol, which we like to call cross-and-clean (see Footnote 4). As our protocol is quite complex, we split its presentation into five stages, described in Figs. 1, 2, 3, 4 and 5 respectively.

Fig. 1. Stage 1: providing inputs

Fig. 2. Stage 2: cut-and-choose

Fig. 3. Stage 3: run computation i

Fig. 4. Stage 4: actively secure filling-in

Fig. 5. Stage 5: forge-and-lose

We have already argued (in the introduction) that the efficiency of the protocol is as desired. It therefore only remains to prove that the protocol is secure:

Theorem 1

The protocol described in Figs. 1, 2, 3, 4 and 5 securely evaluates \(\ell \) copies of f in the presence of active adversaries.

Proof

Thanks to the UC composition theorem [Can01] it is sufficient to prove security of the protocol where we replace all actively secure subprotocols (the committed OT, coin-flip, filling-in and forge-and-lose subprotocols in Stages 1, 2, 4, 5 respectively) with ideal functionalities controlled by the simulator (to prove our theorem it is enough to know that protocols for these functionalities exist [CLOS02]; we have measured their complexity in terms of the size of the functionalities that they implement). Since the protocol is completely symmetric for A and B, we will assume in the proof that A is corrupt and B is honest. Note that, since we prove the security of the protocol in the standard simulation-based indistinguishability between the real world and the ideal world, we must prove correctness and privacy at the same time – not as separate properties. Note also that, for the sake of presentation, our proof neglects many of the technicalities of the UC framework [Can01] (such as delayed delivery of messages), but our simulation strategy is straight-line.

As usual we make a proof by hybrids. We describe the simulator strategy along the way, by making progressive changes from the real protocol towards the simulator strategy, arguing for indistinguishability after every change. We describe the final simulator strategy in Fig. 6.

Fig. 6. Simulator strategy

Hybrid 0. We start with a dream version of the simulation, where we assume the simulator is given the real inputs \(x^B_i\) of the honest party B. Here the simulator simulates simply by running the real protocol with the adversary controlling A and the simulator running B and the hybrid ideal functionalities. If B aborts in the protocol before receiving encrypted inputs from A, the simulator instructs the ideal functionality to abort on behalf of the corrupted A. Otherwise the simulator extracts \(\{x^A_i\leftarrow \mathsf {SE.D}(\sigma ^A,X^A_i)\}_{i\in [\ell ]}\) and inputs these values to the ideal functionality. This allows the simulator to learn all the real outputs \(\{y_i\}_{i\in [\ell ]}\). If B aborts in the protocol after receiving encrypted inputs, the simulator instructs the ideal functionality to abort on behalf of the corrupted A. Otherwise, let \(\{y_i'\}_{i\in [\ell ]}\) be the outputs of the protocol. If \(\{y_i\}_{i\in [\ell ]} \ne \{y_i'\}_{i\in [\ell ]}\), the simulator aborts the simulation. Clearly this first hybrid is perfectly indistinguishable from the real protocol execution as long as \(\{y_i\}_{i\in [\ell ]} = \{y_i'\}_{i\in [\ell ]}\). Therefore Hybrid 0 is computationally indistinguishable from the real protocol execution if the protocol is correct except with negligible probability. We argue correctness at the end of the proof.

Hybrid 1. Here we change the inner working of the committed OT functionality (Stage 1): from now on the simulator sends A a commitment to 0 instead of \(\sigma ^B\). Any adversary that can distinguish after this change can be used to break the computationally hiding property of the commitment scheme.

Hybrid 2. Here we replace the commitment sent to A in Stage 4 (filling in phase) to be a commitment to 0 instead of the MAC key \(\tau ^B\). Any adversary that can distinguish after this change can be used to break the hiding property of the commitment.

Hybrid 3. Here we replace the abort condition (Step 1.a) in the functionalities for filling-in in Stage 4. Let \(\sigma ^A\) be the values received by A from the committed OT functionality in Stage 1 and \(\sigma ^*\) be the values input by A to the filling-in functionality in Stage 4. From now on we always abort if \(\sigma ^*\ne \sigma ^A\) (even if what A inputs is a valid opening of the commitment \(\langle {\sigma ^A}\rangle \)). Any adversary that can distinguish after this change can be used to break the computationally binding property of the commitment scheme. (Note that from now on we have the guarantee that the output of filling-in to the honest party B can only be the real output \(y_i\).)

Hybrid 4. Here we replace the abort condition (Step 1.a and 1.c in Computation) of the functionality for forge-and-lose in Stage 5 in a similar way as in the previous hybrid, namely: let \(\sigma ^*,\tau ^*\) be the keys input by A, then in this hybrid we abort if \(\sigma ^A\ne \sigma ^*\) or \(\tau ^A\ne \tau ^*\) (even if A inputs proper commitment openings). An adversary distinguishing after this change can again be used to break the computationally binding property of the commitment scheme.

Hybrid 5. Here we replace the last abort condition (Step 1.c) in the functionality for filling-in in Stage 4. For each \(i\in I^A\) let \(V^*_i\) be the value input by A to this computation. From now on we always abort if \(V^*_i\ne V_i\) (even if the \(\mathsf {MT.V}\) algorithm accepts the proof \(\pi _i\)). An adversary distinguishing after this change can be used to break the computationally binding property of the Merkle-tree commitment. (Note that at this point, by definition, the output of filling-in to A cannot be different from \(y_i\)).

Hybrid 6. Here we change the distribution of \(\mathsf {tra}^B_i\) (the value sent from B to A in Step 2 of Stage 3): instead of computing \(\mathsf {tra}^B_i=\mathsf {OTTransfer}(\mathsf {sen}^B,i,e^B_i)\), we compute \(\mathsf {tra}^B_i=\mathsf {OTTransfer}(\mathsf {sen}^B,i,e^*_i)\) where \(e^*_i = \{K^i_0,K^i_1\}\) is defined as follows: let \(\hat{\sigma }^A_i\leftarrow \mathsf {GC.En}(e^B_i,\sigma ^A)\) and parse \(\hat{\sigma }^A_i = \{K^i\}\), then we set

$$\begin{aligned} K^i_{\sigma ^A[i]}=K^i \text { and }K^i_{1-\sigma ^A[i]}=0. \end{aligned}$$

That is, we set all labels not corresponding to the bits of \(\sigma ^A\) to 0. Since A only has access to the keys \(\mathsf {rec}^A\) corresponding to the bits of \(\sigma ^A\), we can use an adversary that distinguishes after this change to break the IND-CPA security of the underlying symmetric encryption scheme.

Hybrid 7. Here we change the distribution of the garbled circuits and garbled inputs sent to A during Stages 2 and 3 for all \(i\not \in {\text {CC}}\) (since the simulator is controlling the coin-flip functionality, this set is known to the simulator from the beginning), by running the simulator (which is guaranteed to exist thanks to the token privacy property of the garbling scheme) on input the function f and the output \(y_i\). (Token privacy is defined in Definition 5.) The simulator provides us with garbled versions of all inputs including \(\hat{\sigma }^A_i\), as well as \(\mathsf {tk}^B_i\), \(d^B_i\) and \(\hat{f}^B_i\), which can now replace the values sent to A in Step 3 of Stage 2 and Steps 1, 2 and 8 of Stage 3. Any adversary distinguishing after this step can be used to break the token privacy of the underlying garbling scheme. (Note that the simulator can also, by running \(\mathsf {GC.Ev}\) on the garbled circuit and the garbled inputs, compute the garbled output \(\hat{y}^A_i\) which is needed in the next steps.)

Hybrid 8. Here we replace the abort condition (Step 1.e) in the functionality forge-and-lose in Stage 5. For all i the simulator computes \(\hat{y}^*_i\leftarrow \mathsf {Ext}(\mathsf {td}^A,\langle {\hat{y}^*_i}\rangle )\) using the trapdoor \(\mathsf {td}^A\) (which the simulator learns as it controls the committed-OT sub-protocol) and the commitments \(\langle {\hat{y}^*_i}\rangle \) received during Step 6 of Stage 3. Let \(\hat{y}^{**}_i\) be the value input by A to the forge-and-lose functionality. The simulator now aborts if \(\hat{y}^*_i \ne \hat{y}^{**}_i\) even if A provides a valid commitment opening. Any adversary distinguishing after this step can be used to break the binding property of the commitment scheme.

Hybrid 9. Here we change the last aborting condition (Step 1.h) in the functionality forge-and-lose in Stage 5. Let \((y^*_i,\hat{y}^*_i,t^*_i)\) be the value input by the adversary and let \((y_i,\hat{y}^A_i,t^A_i)\) be the values computed by the simulator in the previous hybrids. From now on, instead of aborting if

$$\begin{aligned} \exists i : y^*_i \ne \mathsf {GC.De}(d^B_i,\hat{y}^*_i) \wedge \mathsf {MAC.Ver}(t^*_i,\tau ^B,y^*_i)=\bot \end{aligned}$$

the simulator aborts if

$$\begin{aligned} \exists i : (y^*_i \ne \mathsf {GC.De}(d^B_i,\hat{y}^*_i) \wedge \mathsf {MAC.Ver}(t^*_i,\tau ^B,y^*_i)=\bot ) \vee (y^*_i\ne y_i). \end{aligned}$$

Any adversary that can distinguish after this change can be used to break unforgeability of the MAC scheme (note that the simulator at this point does not need to know \(\tau ^B\) since it has been replaced by 0 in the commitment that A receives at the end of the filling-in stage, and we can therefore successfully run the reduction) or to break the authenticity property of the garbling scheme (note that we have already made sure that value \(\hat{y}^*_i\) input by A here is the same as the one he commits to in Step 7 of Stage 3 and – since the simulator can extract the value in the commitment using the trapdoor – the reduction can already break the authenticity property before having to send \(d^B_i\) or the opening of the commitment to A in Step 8 of Stage 3). Note that after this change we are ensured (by definition) that A will never receive \(\sigma ^B\) as a result of running the forge-and-lose sub-protocol.

Hybrid 10. Here we change the distribution of the commitment that the simulator sends to A in Step 6 of Stage 3, from being a commitment to \(\hat{y}^B_i\) to being a commitment to 0. An adversary that distinguishes after this change can be used to break the hiding property of the commitment scheme.

Hybrid 11. Here we let the simulator fully decrypt the transfer message \(\mathsf {tra}_i\) from A in Step 3 of Stage 3. That is, instead of running \(\hat{\sigma }^B_i\leftarrow \mathsf {OTRetrieve}(\mathsf {rec}^B,i,\mathsf {tra}^A_i)\) the simulator extracts

$$\begin{aligned} e^*_i \leftarrow \mathsf {OTRetrieve}(\mathsf {sen}^A,i,\mathsf {tra}^A_i) \end{aligned}$$

and constructs the set

$$\begin{aligned} \mathcal {L}_i \subset [8\kappa ] \times \{0,1\}\end{aligned}$$

as follows. Parse

$$\begin{aligned} \mathsf {tk}^A_i =\{ \langle {A_j}\rangle ,\langle {B_j}\rangle \}_{j\in [8\kappa ]} \end{aligned}$$

and

$$\begin{aligned} e^*_i = \{ (K_{j,0} , \mathsf {open}({K_{j,0}})), (K_{j,1} , \mathsf {open}({K_{j,1}})) \}_{j\in [8\kappa ]}. \end{aligned}$$

We add (j, b) to \(\mathcal {L}_i\) if

$$\begin{aligned} \mathsf {Com}_{ck}(K_{j,b},\mathsf {open}({K_{j,b}}))\not \in \{ \langle {A_j}\rangle ,\langle {B_j}\rangle \}. \end{aligned}$$

We then compute

$$\begin{aligned} \mathcal {L}= \cup _{i\in [\ell ]} \mathcal {L}_i. \end{aligned}$$

The set \(\mathcal {L}\) represents all the positions in all the OT transfers in which A “cheated” i.e., where A sent some value which is not consistent with the tokens \(\mathsf {tk}^A_i\). Since an honest B uses the same input bits \(\sigma ^B\) in all transfers, we only count each combination of position j and bit b once. In other words, for each index j A has three strategies:

  1.

    Input the right values (i.e., values that make \(\mathsf {GC.TkV}\) accept) for both \(b\in \{0,1\}\) (for all \(i\in [\ell ]\)): in this case \((j,b)\not \in \mathcal {L}\) for both \(b\in \{0,1\}\);

  2.

    Input the right value for a single \(b\in \{0,1\}\) (for all \(i\in [\ell ]\)) and a wrong value for \(1-b\) for at least one \(i^*\in [\ell ]\): in this case \((j,1-b)\in \mathcal {L}\);

  3.

    Input the wrong value for both \(b\in \{0,1\}\) (potentially for different \(i^*\in [\ell ]\)): in this case both \((j,0)\in \mathcal {L}\) and \((j,1)\in \mathcal {L}\);

Now we replace the abort condition in Step 4 of Stage 3 with the following: the simulator aborts with probability 1 if \(\exists \) j such that both \((j,0)\in \mathcal {L}\) and \((j,1)\in \mathcal {L}\) (this is consistent with what B would do in the real protocol, since in this case B will detect the wrong labels regardless of the value of \(\sigma ^B[j]\)). Otherwise, the simulator aborts with probability \(1-2^{-|\mathcal {L}|}\) (this is consistent with what B would do in the real protocol, since in this case B detects the wrong labels only if \(\sigma ^B[j]=b\) for some \((j,b)\in \mathcal {L}\) – note on the other hand that if \((j,b)\in \mathcal {L}\) and B does not abort then the corrupt A learns that \(\sigma ^B[j]\ne b\), and we will take care of this in a moment). At the same time we change the distribution of the encryptions \(X^B_i\) sent by B to A in Step 3 of Stage 1 to all be encryptions of 0. Any adversary that can distinguish after this change can be used to break the IND-CPA security of \((\mathsf {SE.E},\mathsf {SE.D})\). Remember that we required \((\mathsf {SE.E},\mathsf {SE.D})\) to be secure even against adversaries who learn up to \(\kappa \) bits of the secret key. We use this property here to let the reduction ask the IND-CPA challenger for the bits \(\sigma ^B[j]\) for all j such that \((j,b)\in \mathcal {L}\).
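For concreteness, the bookkeeping of this hybrid (building \(\mathcal {L}\) and deciding whether to abort) can be sketched as follows; the function names are ours and the commitment function is passed in as a parameter:

```python
import random

def cheat_set(tk_i, e_star_i, com):
    """L_i: pairs (j, b) for which the (label, opening) that A sent for bit value b
    of wire j does not open either commitment in tk_i[j] (one execution i)."""
    bad = set()
    for j, ((K0, o0), (K1, o1)) in enumerate(e_star_i):
        if com(K0, o0) not in tk_i[j]:
            bad.add((j, 0))
        if com(K1, o1) not in tk_i[j]:
            bad.add((j, 1))
    return bad

def simulator_aborts(L: set) -> bool:
    """Abort decision of Hybrid 11: certain abort if some wire is inconsistent for
    both bit values, otherwise abort with probability 1 - 2^{-|L|}."""
    if any((j, 0) in L and (j, 1) in L for (j, _b) in L):
        return True
    return random.random() > 2 ** (-len(L))
```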

Hybrid 12. After encrypting 0s instead of the real input of B, it can easily be seen that there is only one place left where we use the input of B, namely to compute the set \(I^B\), for which we need the input of B to evaluate the garbled circuits, as a bad circuit might give an output on some of B’s inputs and \(\bot \) on some other inputs. We get rid of this last use of the inputs of B by removing the restriction that \(\vert I^B \vert \le \psi \kappa \) in the functionalities for filling-in and dropping the abort condition in Step 11 of Stage 3. This change is indistinguishable, as \(\vert I^B \vert \le \psi \kappa \) except with negligible probability. To see why this is the case, let \(\mathsf {good}\) be the set of honestly generated circuits among those received by B. Thanks to the input verifiability property of our garbling scheme we know that for all \(i\in \mathsf {good}\) the values received by B as output from Stage 3 satisfy \(y^B_i\ne \bot \). Since we open \(\epsilon \ell \) of the circuits in the cut-and-choose, the probability that \(\psi \kappa \) bad circuits all survive without any being detected is less than \(\left( 1+\epsilon \right) ^{-\psi \kappa }\), and we have set \(\psi \) such that \(\left( 1+\epsilon \right) ^{-\psi \kappa } = 2^{-\kappa }\).

This concludes the description of our simulation strategy. It can be seen that (by construction) at this point the simulator does not use the input of the honest party B and we have argued for indistinguishability after each individual change. The complete description of the simulator after all the hybrids can be found in Fig. 6. The simulator is simply a compilation of all the individual changes done in the above hybrids. Therefore the distribution of Hybrid 12 is identical to the distribution of the simulation.

What remains is therefore only to argue that Hybrid 0 is indistinguishable from the real protocol. In Hybrid i define an event \(E^i\) as follows. Let \(Y = \bot \) if the ideal functionality aborts and let \(Y=\{y_i\}_{i\in [\ell ]}\) be the outputs of the ideal functionality otherwise. Let \(Y' = \bot \) if the protocol aborts and let \(Y' = \{y_i'\}_{i\in [\ell ]}\) otherwise. Let \(E^i\) be the event that \(Y \ne Y'\). To argue that Hybrid 0 is indistinguishable from the real protocol it is clearly enough to argue that \(\Pr [E^0]\) is negligible. We have that \(\Pr [E^{12}] = 0\) by construction: it has already been argued that since Hybrid 5 the outputs of the filling-in for the honest party are correct; and it has been argued that since Hybrid 8 the corrupt party can only input the correct outputs to the forge-and-lose functionality, which implies that either the outputs that B received before this sub-protocol are the correct ones or B will receive \(\sigma ^A\) as a result of forge-and-lose and compute the right outputs in the clear.

It then follows from Hybrid 0 and Hybrid 12 being indistinguishable that \(\Pr [E^0]\) is indistinguishable from 0, i.e., negligible.

4 Dealing with Long Inputs and Outputs

The protocol described and analysed in the previous section allows the parties to compute \(f:\{0,1\}^{n_I}\rightarrow \{0,1\}^{n_O}\) where \(n_I\) is the input size and \(n_O\) is the output size. In the previous sections we have, for simplicity, assumed that \(n_I=n_O=O(\kappa )\). However, in general \(n_I\) and \(n_O\) can be linear in |f|. This presents an issue in the forge-and-lose step, since the circuit implementing the functionality has size:

$$\begin{aligned} (\ell n_O + \ell n_I + |f|)\kappa = O(\ell |f|\kappa ) \end{aligned}$$

instead of \(O((\ell +|f|)\kappa )\) as desired. We describe here two optimizations which allow us to deal with this:

Dealing with Long Outputs:

It is quite easy to deal with long outputs in the following way: modify the circuit to be garbled so that, in addition to outputting \(y_i\), it also outputs \(h(y_i)\) with h a collision-resistant hash function. Now it is clear that \(|h(y_i)|=\kappa <n_O\) and therefore we can modify the protocol in the following way: instead of letting A, B input the values \(y_i\) to the forge-and-lose functionality, they input the hashes instead. The forge-and-lose functionality finds the first index where the hashes differ, recomputes the function (and the hash) on that index and determines who cheated.

Dealing with Long Inputs:

To deal with long inputs, the key is to notice that only a single pair of inputs (in encrypted format) is ever used during the forge-and-lose functionality, and the only reason for the parties to input all of the ciphertexts is to guarantee that the adversary cannot learn in which positions (if any) the function is being recomputed. In some sense we are using a very naïve private information retrieval (PIR), which can of course be replaced with a more clever one. We can therefore modify the forge-and-lose stage in the following way: instead of having A, B input all ciphertexts at the beginning, they only input the outputs (or their hashes as described above). The functionality finds the first i such that \(y^A_i\ne y^B_i\) (if any), and then runs a 2-server PIR protocol with A, B (a toy instantiation is sketched below). This allows the functionality to learn the ciphertext pair \(X^A_i,X^B_i\) necessary to determine the right value of \(y_i\) by receiving only \(\sqrt{\ell }\cdot n_I\) bits from each of A, B. Since \(n_I<|f|\) it is enough to assume that \(\ell >|f|^2\) to bound this term by \(\ell \).
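The description above leaves the choice of 2-server PIR open; one classical instantiation with answers of \(\sqrt{\ell }\cdot n_I\) bits arranges the \(\ell \) records in a \(\sqrt{\ell }\times \sqrt{\ell }\) grid and asks each server for an XOR-combination of columns. A toy sketch (our own, purely illustrative) follows:

```python
import math
import os
import secrets

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def pir_queries(ell: int, target: int):
    """Client side: a random column-selection vector for one server, and the same
    vector with the target's column flipped for the other."""
    m = math.isqrt(ell - 1) + 1                  # grid is m x m with m ~ sqrt(ell)
    q = [secrets.randbits(1) for _ in range(m)]
    q2 = q.copy()
    q2[target % m] ^= 1
    return m, q, q2

def pir_answer(db, m, q, rec_len):
    """Server side: for every row, the XOR of its selected columns
    (m records, i.e., roughly sqrt(ell) * n_I bits in total)."""
    ans = []
    for r in range(m):
        acc = bytes(rec_len)
        for c in range(m):
            idx = r * m + c
            if q[c] and idx < len(db):
                acc = _xor(acc, db[idx])
        ans.append(acc)
    return ans

def pir_decode(ans_a, ans_b, m, target):
    """Client side: the XOR of the two answers is the target's column; pick its row."""
    return _xor(ans_a[target // m], ans_b[target // m])

# Toy run: A and B hold the same list of (fixed-length) ciphertexts.
db = [os.urandom(8) for _ in range(10)]
m, qa, qb = pir_queries(len(db), target=7)
assert pir_decode(pir_answer(db, m, qa, 8), pir_answer(db, m, qb, 8), m, 7) == db[7]
```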

After these changes, the number of bits which A, B send to the forge-and-lose functionality (regardless of the input and output size of the function) is bounded by

$$\begin{aligned} (\ell \kappa + \sqrt{\ell }\,|f| + |f|) \kappa = O(\ell +|f|)\cdot {\text {poly}}(\kappa ) \end{aligned}$$

as desired.