
1 Introduction

Zero-knowledge argument systems, introduced by Goldwasser et al. [22], enable a prover to convince a verifier of the veracity of a statement while leaking no additional information. Blum et al. [6] introduced non-interactive zero-knowledge (NIZK) argument systems where the prover outputs just one message (the argument) that convinces the verifier of the truth of the statement. Unfortunately, NIZKs are impossible in the standard model [21], and thus in all such applications, one has to rely on some trust assumption like the common reference string (CRS) model, which assumes that a trusted third party has created the CRS from the correct distribution. Other, weaker, trust models include the registered public key model (RPK, [3]), where the authority is trusted to check that a party knows the secret key corresponding to her public key and then to store that public key, and the bare public key model (BPK, [9]), where the authority is only trusted to store the public key of each party. However, very few NIZKs are known in the RPK model, while black-box NIZK [38] (the simulator uses the adversary's algorithm only by giving it inputs and receiving its outputs) and even auxiliary-string non-black-box NIZK [21, 42] (the simulator may use the code of the adversary, who has access to an arbitrary auxiliary string) are impossible in the BPK model.

There has been a recent surge of research on decreasing the trust required in the CRS model, driven by the use of succinct non-interactive zero-knowledge arguments of knowledge (zk-SNARKs, [11, 18, 26, 27, 35, 36, 40]) in real-life applications like verifiable computation and cryptocurrencies. Recently, [2, 15] constructed subversion-zero-knowledge (Sub-ZK) zk-SNARKs, where the prover does not have to trust the CRS creator. According to an impossibility result of [4], this means that such SNARKs cannot have soundness when the CRS has been maliciously generated. Abdolmaleki et al. [2] proposed the following concrete recipe for constructing Sub-ZK zk-SNARKs: first, construct an efficient public CRS verification algorithm \(\mathsf {CV}\) that rejects malformed CRSs. Second, when proving Sub-ZK, use a non-falsifiable knowledge assumption [10] to obtain an extractor that recovers the CRS trapdoor \(\mathsf {td}\) from a \(\mathsf {CV}\)-accepted CRS; \(\mathsf {td}\) is then used by the simulator (that works when the CRS has been honestly generated) to simulate the argument. Based on this recipe, [2, 15] showed that the most efficient known zk-SNARK by Groth [27] is Sub-ZK. One principal weakness of zk-SNARKs is that zk-SNARKs for languages outside of \(\mathsf {BPP}\) have to rely on non-falsifiable assumptions, based on the impossibility result of [19]. However, we are not aware of any prior result indicating whether non-falsifiable assumptions are needed to obtain Sub-ZK.

Another important recent direction in the NIZK arena is that of quasi-adaptive NIZKs (QA-NIZKs, [28]). In a QA-NIZK, the CRS can depend on a language parameter \(\varrho \), where \(\varrho \) can be thought of as a properly distributed public key of some cryptosystem. One consequence of this definition is that up to now, QA-NIZKs have been only considered in the CRS model. The dependence of CRS on correctly generated \(\varrho \) means that one can construct very efficient QA-NIZKs for non-trivial languages based on standard assumptions like \(\mathrm {KerMDH}\) [39]. Importantly, very efficient pairing-based QA-NIZKs [1, 23, 28, 30,31,32] for the linear subspace language have been constructed in the CRS model. A QA-NIZK argument system for linear subspaces allows the prover to convince the verifier that a vector of group elements \([\varvec{y}]_{1}\) belongs to the column space of a fixed public matrix \(\varrho = [\varvec{M}]_{1} \in \mathbb {G}_{1}^{n\times m}\), i.e., \(\varvec{y} = \varvec{M} \varvec{x}\) for some vector \(\varvec{x} \in \mathbb {Z}_{p}^{m}\).
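The linear subspace relation itself can be illustrated with a minimal Python sketch. It is only a toy: it works directly with exponents modulo a small prime rather than with group elements, so it hides nothing and is not a QA-NIZK. The prime `P`, the matrix `M`, and the helpers `mat_vec`, `rank_mod_p`, and `in_column_space` are illustrative assumptions, not part of any cited construction.

```python
# Toy check of the relation y = M x over Z_p (exponents only, no groups).
P = 101  # small illustrative prime (assumption)

def mat_vec(M, x, p=P):
    """Return M x mod p for an n-by-m matrix M and a length-m vector x."""
    return [sum(a * b for a, b in zip(row, x)) % p for row in M]

def rank_mod_p(rows, p=P):
    """Rank of a matrix over Z_p via Gauss-Jordan elimination."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [v * inv % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def in_column_space(M, y, p=P):
    """y lies in the column space of M iff appending y does not raise the rank."""
    augmented = [row + [yi] for row, yi in zip(M, y)]
    return rank_mod_p(M, p) == rank_mod_p(augmented, p)

M = [[1, 2], [3, 4], [5, 6]]   # n = 3, m = 2
x = [7, 9]                     # witness
y = mat_vec(M, x)              # a vector in the column space of M
```

In an actual QA-NIZK the verifier sees only \([\varvec{M}]_{1}\) and \([\varvec{y}]_{1}\), so this rank computation is not available to it; the sketch only shows what statement is being proven.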

Although QA-NIZKs for other languages are known (e.g., the language of bitstrings [23] and the language of shuffles [24], both requiring a quadratic-length CRS, and a recent QA-NIZK [12] for SSP [11] that relies on a non-succinct commitment), research on QA-NIZKs has mostly concentrated on designing efficient QA-NIZKs for linear subspaces. This focus is motivated by the broad applicability of QA-NIZKs for linear subspaces in the design of various cryptographic primitives (see [28, 30,31,32] for examples and references). In addition, [14] combined SNARKs and QA-NIZKs for linear subspaces to construct an efficient pairing-based NIZK shuffle argument system. This and other recent work [8, 25, 37] that uses QA-NIZKs to construct SNARKs shows that the study of different properties of QA-NIZKs can also be beneficial in the world of SNARKs.

In particular, Campanelli et al. [8] proposed a toolbox called LegoSNARK that allows building complex zk-SNARK arguments from other zk-SNARKs given that the building blocks of the final zk-SNARK are so-called commit-and-prove SNARKs (CP-SNARKs). A linear subspace QA-NIZK plays a crucial role in the Campanelli et al. framework. First, it is used in a transformation that makes commit-carrying SNARKs (CC-SNARKs), like [27], CP-SNARKs. Second, it is used as a building block in several CP-SNARKs proposed in [8]. Thus, one interested in having Sub-ZK LegoSNARK or Sub-ZK CP-SNARKs inevitably needs a Sub-ZK QA-NIZK for linear subspaces. Importantly, in [8, 14], one uses a QA-NIZK to prove that an element belongs to the trivial full space; in this case, a QA-NIZK is sound by default. Instead, one has to prove that the stronger property of knowledge-soundness holds.

The main goal of the current paper is the study and construction of subversion-secure QA-NIZKs. According to the original security definitions of QA-NIZKs [28], one aims for soundness (alternatively, knowledge-soundness in applications like [8, 14]) and zero-knowledge in the case when both \(\varrho \) and the CRS are honestly generated. In reality, it means that in the case of QA-NIZKs, one will have one more subversion-attack vector than in the case of SNARKs: namely, one has to consider both the case of a subverted language parameter (the Sub-PAR case) and the case of a subverted CRS. The Sub-PAR case with honestly generated CRS was tackled in [29] (updated full version of [28] from September 2018) where both Sub-PAR soundness and Sub-PAR zero-knowledge were shown to be achievable for a large family of subspace languages. Since the simulator does not need access to a language parameter trapdoor \(\mathsf {td_\varrho }\), one does not have to extract \(\mathsf {td_\varrho }\) for the simulation to be possible. Moreover, in the Sub-PAR case, the CRS is still honestly generated, which means that the simulator has access to the CRS trapdoor \(\mathsf {td}\).

Translated to the language of QA-NIZKs, by the impossibility result of [4], one cannot achieve both soundness and zero-knowledge in the case both \(\varrho \) and the CRS have been subverted. Therefore, in the rest of the paper, we study the slightly more relaxed case when (knowledge-)soundness holds if only \(\varrho \) has been subverted and zero-knowledge holds when both \(\varrho \) and the CRS have been subverted. It is unclear whether one can use existing techniques to construct a Sub-ZK version of the most efficient QA-NIZKs like \(\varPi _{\mathsf {kw}}\) by Kiltz and Wee [31] in this case. First, \(\varrho \) has to be modeled separately from other inputs; no such parameter exists in the case of SNARKs. The existence of \(\varrho \) (and the dependence of the CRS on it) is the main reason why falsifiable QA-NIZKs are so efficient.

Second, known QA-NIZKs have a very different structure compared to SNARKs. For example, the most efficient known QA-NIZK for linear subspaces \(\varPi _{\mathsf {kw}}\) by Kiltz and Wee [31] has a trapdoor matrix \(\varvec{K}\), but \([\varvec{K}]_{1}\) is not explicitly given in the CRS. This means that the knowledge assumptions of [2, 15] or knowledge-of-exponent assumptions [10] (that all rely on \([\alpha ]_{\iota }\) being in the CRS for each trapdoor \(\alpha \)) cannot be directly translated to the case of (Kiltz-Wee) QA-NIZK, and thus one seems to need quite different knowledge assumptions.

Third, another significant difference is that the soundness of efficient QA-NIZKs like [1, 28, 30,31,32] is based on standard falsifiable assumptions like \(\mathrm {KerMDH}\). Thus, intuitively, the use of non-falsifiable assumptions to prove Sub-ZK of a (sound) QA-NIZK seems to be less justifiable than in the case of proving Sub-ZK of zk-SNARKs since in the case of zk-SNARKs, non-falsifiable assumptions are needed to get soundness anyhow [19]. Moreover, while Bellare et al. had a discussion motivating the use of knowledge assumptions to obtain Sub-ZK, they did not have a formal proof of their necessity. Can one base Sub-ZK QA-NIZKs on falsifiable assumptions or prove it is impossible? (Non-subversion zero-knowledge) QA-NIZKs do not always rely on falsifiable assumptions: in the applications of QA-NIZKs in [8, 14, 25, 37], one proves the “membership” in the full space that only makes sense under knowledge assumptions.

This brings us to the main questions of the current work:

(i) Are non-black-box techniques needed to prove Sub-ZK of NIZKs for languages outside of \(\mathsf {BPP}\)?

(ii) Are (knowledge-)soundness and zero-knowledge achievable in the previously described model, i.e., only \(\varrho \) has been subverted in the case of soundness, and both \(\varrho \) and the CRS are subverted in the case of zero-knowledge? From this point on, we assume Sub-ZK QA-NIZK works in this model.

(iii) Can one obtain Sub-ZK QA-NIZKs for linear subspaces without modifying the existing constructions?

Our Contributions. We answer the above main questions with yes, yes, and mostly yes. It turns out that achieving Sub-ZK for state-of-the-art QA-NIZKs is considerably more complicated than for state-of-the-art SNARKs. This follows partly from the nature of QA-NIZKs (the existence of a separate \(\varrho \) and CRS) and partly from the construction of the concrete QA-NIZK. In the most relevant case (\(k= 1\)), it turns out that the most efficient existing QA-NIZK by Kiltz and Wee [31] is Sub-ZK (in the model described above) under a (novel) knowledge assumption, given suitable algorithms that verify the correctness of both \(\varrho \) and the CRS. Hence, in this case, Sub-ZK comes almost for free: one only has to perform some additional computations that verify the correctness of the (language parameter and) CRS, and the proof of Sub-ZK relies on a non-falsifiable assumption.

First, we make a conceptually important observation that Sub-ZK in the CRS model, as defined in [2, 4, 15], is equal to no-auxiliary-string non-black-box zero knowledge [21] in the BPK model [9, 38]. In the BPK model, the verifier (but not the prover) has a public key, and the key authority executes the functionality of an immutable bulletin board by storing the received public keys. A zero-knowledge argument in the BPK model is either designated-verifier (the argument convinces only the designated verifier) when using the verifier’s own public key or transferable (the verifier can transfer the argument to other verifiers and convince them of its validity) when using the public key of a third party; the latter case is essentially equivalent to the CRS model with the public key \(\mathsf {pk}\) playing the role of the CRS. The BPK model is significantly weaker than the CRS model, being arguably the weakest public key or parameter based trust model under which complicated functionalities like zero-knowledge are known to exist.

This important positive connection between no-auxiliary-string non-black-box zero knowledge and Sub-ZK was missed in the prior work on Sub-ZK; we hope it will simplify the construction and analysis of the future Sub-ZK argument systems. Because of that connection, we will usually use the abbreviation Sub-ZK to denote no-auxiliary-string non-black-box zero knowledge, but we explicitly emphasize that we are working in the BPK model.

Since three messages are needed to achieve auxiliary-string zero knowledge in the plain model for languages outside of \(\mathsf {BPP}\) [21], it follows that in the BPK model, auxiliary-string non-black-box NIZK is possible only for languages in \(\mathsf {BPP}\). This provides a simple proof that, for languages outside of \(\mathsf {BPP}\), one can only construct no-auxiliary-string non-black-box NIZK, and thus provides an answer to the open question (i).

In Sect. 3, we define the security of QA-NIZK arguments in the BPK model; for this, we strengthen the “strong” QA-NIZK security definitions from [29] (as updated in September 2018) that consider the case of subverted \(\varrho \) but honestly generated \(\mathsf {pk}\). We allow for both \(\varrho \) and \(\mathsf {pk}\) to be subverted. We model the resulting definition of persistent zero-knowledge after the Sub-ZK definition of SNARKs in [2], allocating a special role for the language parameter \(\varrho \). More precisely, we require that for any efficient malicious creator \(\mathcal {C}\) of the language parameter and the public key, there exists an efficient extractor \(\mathsf {Ext}_\mathcal {C}\), s.t. if \(\mathcal {C}\), by using random coins r, generates a language parameter \(\varrho \) and a public key \(\mathsf {pk}\) (since there is no auxiliary input, \(\varrho \) and \(\mathsf {pk}\) have to be generated by \(\mathcal {C}\)) then \(\mathsf {Ext}_\mathcal {C}\), given r, outputs the secret key \(\mathsf {sk}\) corresponding to \(\mathsf {pk}\).

Since we allow both \(\varrho \) and \(\mathsf {pk}\) to be subverted, it is possible that the subverter sets \(\mathsf {sk}= \mathsf {td_\varrho }\), where \(\mathsf {td_\varrho }\) is a trapdoor for the parameter \(\varrho \); e.g., for the Kiltz-Wee QA-NIZK, \(\varrho = [\varvec{M}]_{1}\) and \(\mathsf {td_\varrho }= \varvec{M}\). As we show in Sect. 4, this can result in pathological QA-NIZK argument systems that are persistent zero-knowledge but not standard zero-knowledge. (This is possible since we consider an extractor that extracts the trapdoor behind \(\varrho \) and returns this as the secret key.) Hence, we say that a QA-NIZK argument system is no-auxiliary-string non-black-box zero-knowledge (i.e., Sub-ZK) iff it is both standard zero-knowledge and persistent zero-knowledge.

As the next main contribution, we study a variant \(\varPi _{\mathsf {bpk}}\) of the most efficient known QA-NIZK for linear subspaces \(\varPi _{\mathsf {kw}}\) by Kiltz and Wee [31] (denoted there as \(\varPi '_{as}\)). \(\varPi _{\mathsf {kw}}\) is known to be perfectly zero-knowledge and computationally sound in the CRS model under a suitable \(\mathrm {KerMDH}\) assumption [31] for a matrix distribution \(\mathcal {D}_{k}\), where \(k\) is a small security-assumption-related integer; \(k= 1\) in the case of asymmetric pairings. In \(\varPi _{\mathsf {kw}}\), the CRS includes a matrix \([\varvec{\bar{A}}]_{2} \in \mathbb {G}_2^{k\times k}\) (assumed to be distributed according to \(\mathcal {D}_{k}\)) and the argument consists of only \(k\) group elements (thus, smaller \(k\) results in better efficiency). In the variant of \(\varPi _{\mathsf {kw}}\) proposed in the current paper, the public key \(\mathsf {pk}\) of \(\varPi _{\mathsf {bpk}}\) includes a new component \(\mathsf {pk}^{\mathsf {pkv}}\) that helps to publicly check that even an adversarially generated \([\varvec{\bar{A}}]_{2}\) in \(\mathsf {pk}\) has full rank \(k\). In the case of many distributions \(\mathcal {D}_{k}\) that are important in practice (we will call such distributions efficiently verifiable), the latter verification can be done efficiently based only on the knowledge of \([\varvec{\bar{A}}]_{2}\) itself, and thus \(\mathsf {pk}^{\mathsf {pkv}}\) will be an empty string. Similarly to [2], we also define an efficient public-key verification algorithm that we denote by \(\mathsf {PKV}\). On top of it, we also define an efficient \(\varrho \)-verification algorithm \(\mathsf {PARV}\). We emphasize that we analyze \(\varPi _{\mathsf {kw}}\) since it is the most efficient known QA-NIZK for linear subspaces. We leave the analysis of other QA-NIZKs (which will hopefully be easier following our definitional framework and analysis of \(\varPi _{\mathsf {kw}}\)) to future work.

Since in the case of verifiable \(\mathcal {D}_{k}\), we do not modify the public-key generation and the prover (thus, essentially \(\varPi _{\mathsf {kw}}= \varPi _{\mathsf {bpk}}\)), the (non-subversion) soundness of \(\varPi _{\mathsf {bpk}}\) in the BPK model follows directly from [31]. In the non-verifiable special case \(\mathcal {D}_{k} = \mathcal {U}_{2}\), we add some extra elements to \(\mathsf {pk}\) and then prove the (non-subversion) soundness of \(\varPi _{\mathsf {bpk}}\) under the \(\mathrm {SKerMDH}\) assumption of [23]. In the subversion case, when the language parameter could have been subverted, we prove (subverted-\(\varrho \)) soundness under the \(\mathrm {KerMDH}^{\mathrm {dl}}\) or \(\mathrm {SKerMDH}^{\mathrm {dl}}\) assumption. Here, if X and Y are two assumptions, \(X^Y\) is the interactive assumption that X holds even if the adversary was given non-adaptive access to a Y oracle. See [34] for a thorough treatment of \(X^Y\)-type assumptions. Interestingly, up to now, the only non-falsifiable assumptions that have been used to construct efficient succinct NIZKs are knowledge assumptions; the use of (seemingly more standard) \(X^Y\)-type assumptions instead is possibly one of the most interesting contributions of the current paper.

As mentioned before, knowledge-sound QA-NIZKs are also interesting in the case when one uses them to prove the membership in the full space. We prove that \(\varPi _{\mathsf {bpk}}\) is knowledge-sound by modifying a similar knowledge-soundness proof from [8] that, however, was only given in the non-subversion case, and only for \(k= 1\). We use a \(\mathrm {SDL^{dl}}\) (where \(\mathrm {SDL}\) is the symmetric discrete logarithm assumption, [5]) assumption, like in the case of soundness proofs, to get knowledge-soundness even in the subversion case. We modify the proof of [8] so that it generalizes to arbitrary \(k\). Moreover, knowledge-soundness will rely on both the \(\mathrm {SDL^{dl}}\) and a hash-algebraic knowledge (HAK) assumption. In [37], Lipmaa recently defined the framework of HAK assumptions to make the algebraic group model (AGM) of Fuchsbauer et al. [16] more concrete and applicable. While in the AGM, it is assumed that every adversary is algebraic, a HAK assumption is defined with respect to a concrete input distribution of the adversary. I.e., a \(\mathcal {D}\)-HAK assumption states that if an adversary obtains an input (a vector of group elements) distributed according to a fixed distribution \(\mathcal {D}\) then she knows how the group elements that she outputs depend on the input. HAK assumptions are even weaker: they allow for the case an adversary has additionally generated high min-entropy (but not necessarily uniformly random) group elements by using say elliptic-curve hashing.

Since \(\varPi _{\mathsf {kw}}\) is perfectly zero-knowledge [31], we now only have to prove that it is also persistent zero-knowledge; from this, it follows that it is Sub-ZK in the BPK model. We prove that \(\varPi _{\mathsf {bpk}}\) is statistically persistent zero-knowledge under either one of the two new knowledge assumptions \(\mathrm {KWKE}\) (the Kiltz-Wee Knowledge of Exponent assumption) and \(\mathrm {SKWKE}\) (the strong \(\mathrm {KWKE}\) assumption), assuming that its whole \(\mathsf {pk}\) is generated by the verifier or a verifier-trusted authority, even though the property we set out to prove, Sub-ZK, is the one that interests the prover. Intuitively, (S)\(\mathrm {KWKE}\) guarantees that if an adversary \(\mathcal {A}\) has succeeded in creating a \(\mathsf {pk}\) accepted by \(\mathsf {PKV}\), then one can extract the corresponding \(\mathsf {sk}\). We prove that both assumptions hold under a hash-algebraic knowledge (HAK, [37]) assumption, see Theorem 1. (Here, \(\mathrm {SKWKE}\) also relies on a computational assumption that depends on the matrix distribution \(\mathcal {D}_{k}\) but is equal to the discrete logarithm assumption for all standard distributions \(\mathcal {D}_{k}\).)

The proof of Theorem 1 is quite intricate. More precisely, we use a HAK assumption to extract some outputs of \(\mathcal {A}\) as polynomials in indeterminates created by \(\mathcal {A}\). To extract an integer \(\mathsf {sk}\), we use the Schwartz-Zippel lemma and let the extractor output the evaluation of the polynomials at a random point. We then use the specific form of \(\mathsf {PKV}\) to argue that such an \(\mathsf {sk}\) is correct. In the case of \(\mathrm {SKWKE}\), we evaluate the polynomials at two random points and use an additional reduction to a computational assumption, see Theorem 1.
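The role of the Schwartz-Zippel lemma can be illustrated with a minimal Python sketch. It is deliberately simplified to univariate polynomials given as coefficient lists, whereas the proof works with multivariate polynomials in the adversary's indeterminates; the modulus `P` and the helpers `evaluate` and `probably_equal` are illustrative assumptions and do not reproduce the extractor of Theorem 1.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime used as an illustrative modulus (assumption)

def evaluate(coeffs, x, p=P):
    """Horner evaluation of sum(coeffs[i] * x**i) modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def probably_equal(f, g, p=P):
    """Compare two polynomials at one uniformly random point of Z_p.

    Two distinct polynomials of degree at most d agree at a random point
    with probability at most d / p, so a match is strong evidence of equality.
    """
    x = secrets.randbelow(p)
    return evaluate(f, x, p) == evaluate(g, x, p)
```

Evaluating at a random point is what turns a polynomial relation into a relation between integers while losing correctness only with probability about \(d / p\).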

Interestingly, under \(\mathrm {KWKE}\) we only get the guarantee that the part \(\mathsf {pk}^{\mathsf {zk}}\) of the public key \(\mathsf {pk}\), used either by the prover or the simulator, has been correctly computed. This, however, suffices to prove that \(\varPi _{\mathsf {bpk}}\) is Sub-ZK. (Thus, Sub-ZK can be achieved even if the correctness of the whole public key cannot be verified.) Hence, in the case \(\mathcal {D}_{k}\) is efficiently verifiable, one can get Sub-ZK essentially for free (efficiency-wise, the only added cost will be the need for a prover to verify the correctness of the public key; this can, however, be done once per public key). This is important since it means that in the case of efficiently verifiable matrix distributions, we get a stronger security property (Sub-ZK) without having to design a new, more complicated, and less efficient QA-NIZK. Arguably, in practice, one is only interested in efficiently verifiable distributions: the case \(k= 1\) is the most efficient one, and the case \(k= 2\) is only needed in some applications (e.g., when one wants to rely on a weaker assumption). However, in such cases, one can usually use an efficiently verifiable distribution like \(\mathcal {L}_2\) that corresponds to the 2-Lin assumption. This answers open questions (ii) and (iii).

We also show that under a stronger knowledge assumption \(\mathrm {SKWKE}\), one can guarantee that the whole \(\mathsf {pk}\) has been correctly computed. However, as a drawback, the \(\mathrm {SKWKE}\) assumption only holds if the language parameter \([\varvec{M}]_{1}\) comes from a suitable hard distribution. The latter is, however, often the case in QA-NIZK applications, where \([\varvec{M}]_{1}\) is a public key of a cryptographic primitive like an encryption or commitment scheme. In both cases, the soundness is guaranteed by a \(\mathrm {KerMDH}\) assumption.

2 Preliminaries

A random variable X has min-entropy k, \(H_\infty (X) = k\), if \(\max _x \Pr [X = x] = 2^{-k}\). Let PPT denote probabilistic polynomial-time. Let \(\lambda \in \mathbb {N}\) be the security parameter. All adversaries will be stateful. For an algorithm \(\mathcal {A}\), let \({\text {im}}(\mathcal {A})\) be the image of \(\mathcal {A}\) (the set of valid outputs of \(\mathcal {A}\)), let \(\mathsf {RND}_\lambda (\mathcal {A})\) denote the random tape of \(\mathcal {A}\) (assuming the given value of \(\lambda \)), and let \(r \leftarrow _{\$} \mathsf {RND}_\lambda (\mathcal {A})\) denote the random choice of the randomizer r from \(\mathsf {RND}_\lambda (\mathcal {A})\). We denote by \(\mathsf {negl}(\lambda )\) an arbitrary negligible function. We write \(a \approx _\lambda b\) if \(\left|a - b \right| \le \mathsf {negl}(\lambda )\). We follow Bellare et al. [4] by using “cryptographic” style in security definitions where all complexity (adversaries, algorithms, assumptions) is uniform, but the adversary and the security (say, soundness) is quantified over all inputs chosen by the adversary. See [4] for a discussion.
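As a small worked example of the min-entropy definition, the following Python sketch computes \(H_\infty (X)\) from an explicit list of probabilities; the helper name `min_entropy` is an illustrative assumption.

```python
import math

def min_entropy(probabilities):
    """H_inf(X) = -log2(max_x Pr[X = x]) for an explicit distribution."""
    return -math.log2(max(probabilities))

uniform_two_bits = [0.25, 0.25, 0.25, 0.25]   # uniform over four values
skewed = [0.5, 0.25, 0.25]                    # most likely value has probability 1/2
```

The skewed distribution has only one bit of min-entropy even though it has three outcomes, since min-entropy depends solely on the most likely value.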

A bilinear group generator \(\mathsf {PGen}(1^\lambda )\) returns \((p, \mathbb {G}_1, \mathbb {G}_2, \mathbb {G}_T, \hat{e}, [1]_{1}, [1]_{2})\), where \(\mathbb {G}_1\), \(\mathbb {G}_2\), and \(\mathbb {G}_T\) are additive cyclic groups of prime order \(p = 2^{\varOmega (\lambda )}\), \([1]_{1}, [1]_{2}\) are generators of \(\mathbb {G}_1\), \(\mathbb {G}_2\), resp., and \(\hat{e}: \mathbb {G}_1 \times \mathbb {G}_2 \rightarrow \mathbb {G}_T\) is a non-degenerate PPT-computable bilinear pairing. We assume the bilinear pairing to be Type-3, i.e., that there is no efficient isomorphism from \(\mathbb {G}_1\) to \(\mathbb {G}_2\) or from \(\mathbb {G}_2\) to \(\mathbb {G}_1\). We use the by now standard bracket notation, i.e., we write \([a]_{\iota }\) to denote \(a g_{\iota }\) where \(g_{\iota }\) is a fixed generator of \(\mathbb {G}_{\iota }\). We denote \(\hat{e}([a]_{1}, [b]_{2})\) as \([a]_{1} [b]_{2}\). Thus, \([a]_{1} [b]_{2} = [a b]_{T}\). We freely use the bracket notation with matrices, e.g., if \(\varvec{A} \varvec{B} = \varvec{C}\) then \(\varvec{A} [\varvec{B}]_{\iota } = [\varvec{C}]_{\iota }\) and \([\varvec{A}]_{1} [\varvec{B}]_{2} = [\varvec{C}]_{T}\).
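The bracket notation can be made concrete with a deliberately insecure toy model in Python, in which \([a]_{\iota }\) is represented by the exponent \(a\) itself together with a group label. This only illustrates the algebra (additivity and bilinearity); the class `Elem`, the function `pairing`, and the small prime `P` are illustrative assumptions and have nothing to do with a real pairing implementation.

```python
from dataclasses import dataclass

P = 101  # small illustrative prime; a real group has p = 2^Omega(lambda)

@dataclass(frozen=True)
class Elem:
    """Toy stand-in for [value]_group, where group is 1, 2, or "T"."""
    group: object
    value: int

    def __add__(self, other):
        # [a] + [b] = [a + b] within the same group
        assert self.group == other.group
        return Elem(self.group, (self.value + other.value) % P)

    def __rmul__(self, scalar):
        # scalar * [a] = [scalar * a]
        return Elem(self.group, (scalar * self.value) % P)

def pairing(a, b):
    """Toy version of [a]_1 [b]_2 = [a b]_T."""
    assert a.group == 1 and b.group == 2
    return Elem("T", (a.value * b.value) % P)

x1 = Elem(1, 7)   # [7]_1
y2 = Elem(2, 9)   # [9]_2
```

Because the exponents are stored in the clear, discrete logarithms are trivial here; the sketch is useful only for checking identities such as \((c [a]_{1}) [b]_{2} = c ([a]_{1} [b]_{2})\).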

In the Bare Public Key (BPK) model [9, 38], parties have access to a public file F, a polynomial-size collection of records \((id, \mathsf {pk}_{id})\), where id is a string identifying a party (e.g., a verifier), and \(\mathsf {pk}_{id}\) is her alleged public key. In a typical zero-knowledge protocol in the BPK model, a key-owning party \(\mathcal {P}_{id}\) works in two stages. In stage one (the key-generation stage), on input a security parameter \(1^\lambda \) and randomizer r, \(\mathcal {P}_{id}\) outputs a public key \(\mathsf {pk}_{id}\) and stores the corresponding secret key \(\mathsf {sk}_{id}\). After that, F will include \((id, \mathsf {pk}_{id})\). In stage two, each party has access to F, while \(\mathcal {P}_{id}\) has possible access to \(\mathsf {sk}_{id}\) (however, the latter is not required by us). It is commonly assumed that only the verifier of a NIZK argument system in the BPK model has a public key [38]; see also Sect. 3.
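The trust placed in the key authority can be sketched as a minimal Python data structure. The point of the sketch is what the authority does not do: it stores alleged public keys without checking them. The class `PublicFile` and its method names are illustrative assumptions, not an interface from the cited works.

```python
class PublicFile:
    """Toy bulletin board for the BPK model.

    It stores one alleged public key per identifier and never overwrites it.
    It performs no validity check and no proof-of-knowledge check on the key.
    """

    def __init__(self):
        self._records = {}

    def register(self, party_id, public_key):
        # Stage one: a party publishes her (possibly malformed) public key.
        if party_id in self._records:
            raise ValueError("a record for this identifier already exists")
        self._records[party_id] = public_key

    def lookup(self, party_id):
        # Stage two: every party can read the file.
        return self._records[party_id]
```

Since malformed keys are accepted, any check on a key has to be carried out by the party that later relies on it.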

In a zero-knowledge proof or argument system, a prover convinces the verifier of the veracity of a statement without leaking any side information except that the statement is true. Here, a proof (resp., an argument) system guarantees soundness against an unbounded (resp., a PPT) cheating prover. The zero-knowledge property is proven by constructing a simulator that can simulate the view of a cheating verifier without knowing the secret information (witness) of the prover. A non-interactive zero-knowledge proof or argument system [6] consists of just one message by the prover.

We will only deal with no-auxiliary-string non-black-box NIZK argument systems in the BPK model; to explain this choice, it is important to know that there are many possibility and impossibility results about zero knowledge in the BPK model. Goldreich and Oren [21] proved that three rounds are needed for auxiliary-string zero knowledge in the plain model. From this, it follows that there exists no auxiliary-string non-black-box NIZK argument system in the BPK model for a language \(\mathscr {L}\) outside of \(\mathsf {BPP}\), see Lemma 1.

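The Symmetric Discrete Logarithm (SDL) [5] assumption holds relative to \(\mathsf {PGen}\), if for any PPT \(\mathcal {A}\), \(\Pr [\mathsf {p}\leftarrow \mathsf {PGen}(1^\lambda ); x \leftarrow _{\$} \mathbb {Z}_{p} : \mathcal {A}(\mathsf {p}, [x]_{1}, [x]_{2}) = x] \approx _\lambda 0\).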

The Kernel Matrix Diffie-Hellman Assumption (\(\mathrm {KerMDH}\)) is a well-known assumption family formally introduced in [39]. Let \(\mathcal {D}_{\ell k}\) be a probability distribution over matrices in \(\mathbb {Z}_{p}^{\ell \times k}\), where \(\ell > k\). Next, we recall five commonly used distributions (see [13] for definitions and references): \(\mathcal {U}_{k}\) (uniform), \(\mathcal {L}_{k}\) (linear), \(\mathcal {IL}_{k}\) (incremental linear), \(\mathcal {C}_{k}\) (cascade), \(\mathcal {SC}_{k}\) (symmetric cascade).
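For reference, the shapes usually given in the literature for these five distributions, for \(\ell = k+ 1\), are recalled below. This is a sketch under assumptions: the entries \(a\), \(a_i\), \(a_{ij}\) are taken to be sampled uniformly at random, and the exact sampling domain (for example \(\mathbb {Z}_{p}\) versus \(\mathbb {Z}_{p}^{*}\)) may differ from the one intended here.

\[
\mathcal {U}_{k}:\ \varvec{A} = \begin{pmatrix} a_{11} & \dots & a_{1 k} \\ \vdots & \ddots & \vdots \\ a_{k+ 1, 1} & \dots & a_{k+ 1, k} \end{pmatrix}, \qquad
\mathcal {L}_{k}:\ \varvec{A} = \begin{pmatrix} a_1 & 0 & \dots & 0 \\ 0 & a_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a_k \\ 1 & 1 & \dots & 1 \end{pmatrix}, \qquad
\mathcal {IL}_{k}:\ \varvec{A} = \begin{pmatrix} a & 0 & \dots & 0 \\ 0 & a + 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a + k- 1 \\ 1 & 1 & \dots & 1 \end{pmatrix},
\]

\[
\mathcal {C}_{k}:\ \varvec{A} = \begin{pmatrix} a_1 & 0 & \dots & 0 \\ 1 & a_2 & \dots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & a_k \\ 0 & 0 & \dots & 1 \end{pmatrix}, \qquad
\mathcal {SC}_{k}:\ \varvec{A} = \begin{pmatrix} a & 0 & \dots & 0 \\ 1 & a & \dots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & a \\ 0 & 0 & \dots & 1 \end{pmatrix}.
\]

For \(k= 1\), both \(\mathcal {L}_{1}\) and \(\mathcal {C}_{1}\) reduce to the column vector \((a, 1)^\top \).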

Assume that \(\mathcal {D}_{\ell k}\) outputs matrices \(\varvec{A}\) where the upper \(k\times k\) submatrix \(\varvec{\bar{A}}\) is always invertible, that is, \(\mathcal {D}_{\ell k}\) is robust [28]. All the above distributions can be made robust with minimal changes. Denote the lower \((\ell - k) \times k\) submatrix of \(\varvec{A}\) as \(\underline{\varvec{A}}\). Denote \(\mathcal {D}_{k} = \mathcal {D}_{k+ 1, k}\).

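\(\mathcal {D}_{\ell k}\)-\(\mathrm {KerMDH}_{\mathbb {G}_{1}}\) [39] holds relative to \(\mathsf {PGen}\), if for any PPT \(\mathcal {A}\), \(\Pr [\mathsf {p}\leftarrow \mathsf {PGen}(1^\lambda ); \varvec{A} \leftarrow _{\$} \mathcal {D}_{\ell k}; [\varvec{c}]_{2} \leftarrow \mathcal {A}(\mathsf {p}, [\varvec{A}]_{1}) : \varvec{A}^\top \varvec{c} = \varvec{0}_{k} \wedge \varvec{c} \ne \varvec{0}_{\ell }] \approx _\lambda 0\). \(\mathcal {D}_{\ell k}\)-\(\mathrm {SKerMDH}\) [23] holds relative to \(\mathsf {PGen}\), if for any PPT \(\mathcal {A}\), \(\Pr [\mathsf {p}\leftarrow \mathsf {PGen}(1^\lambda ); \varvec{A} \leftarrow _{\$} \mathcal {D}_{\ell k}; ([\varvec{c}_1]_{1}, [\varvec{c}_2]_{2}) \leftarrow \mathcal {A}(\mathsf {p}, [\varvec{A}]_{1}, [\varvec{A}]_{2}) : \varvec{A}^\top (\varvec{c}_1 - \varvec{c}_2) = \varvec{0}_{k} \wedge \varvec{c}_1 - \varvec{c}_2 \ne \varvec{0}_{\ell }] \approx _\lambda 0\). According to Lemma 1 of [23], if \(\mathcal {D}_{\ell k}\)-\(\mathrm {KerMDH}\) holds in generic symmetric bilinear groups then \(\mathcal {D}_{\ell k}\)-\(\mathrm {SKerMDH}\) holds in generic asymmetric bilinear groups. The \(\mathrm {KerMDH}\) assumption holds also for Type-1 pairings, where \(\mathbb {G}_1 = \mathbb {G}_2\), but then one needs \(k\ge 2\), which affects efficiency.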
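The winning condition of a \(\mathrm {KerMDH}\) adversary can be illustrated with a toy Python check over exponents. The sketch assumes the usual formulation (a non-zero vector \(\varvec{c}\) with \(\varvec{A}^\top \varvec{c} = \varvec{0}\)) and a sample of the form \((a, 1)^\top \) for \(k= 1\); the prime `P` and the helper `in_kernel` are illustrative assumptions.

```python
P = 101  # small illustrative prime (assumption)

def in_kernel(A, c, p=P):
    """Check that c is non-zero and A^T c = 0 (mod p) for an l-by-k matrix A."""
    rows, cols = len(A), len(A[0])
    products = [sum(A[i][j] * c[i] for i in range(rows)) % p for j in range(cols)]
    return all(v == 0 for v in products) and any(ci % p for ci in c)

a = 17
A = [[a], [1]]        # the column vector (a, 1)^T, i.e. k = 1 and l = 2
c = [1, (-a) % P]     # easy to write down when the exponent a is known
```

The assumption states that finding such a \(\varvec{c}\) (in the other source group) is hard when only \([\varvec{A}]_{1}\) is given; with the exponents in the clear, as in this toy, it is trivial.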

Hash-Algebraic Knowledge Assumptions. The Algebraic Group Model (AGM) is a new model [16] that one can use to prove the security of a cryptographic assumption or protocol. Essentially, in the AGM one assumes that each PPT algorithm (including the adversaries) is algebraic in the following sense: if the adversary \(\mathcal {A}\)’s input includes \([\varvec{x_{\iota }}]_{\iota }\) and no other elements from the group \(\mathbb {G}_\iota \) and \(\mathcal {A}\) outputs group elements \([\varvec{y_\iota }]_{\iota }\), then \(\mathcal {A}\) knows matrices \(\varvec{N}_\iota \), such that \(\varvec{y}_\iota = \varvec{N}_\iota \varvec{x}_\iota \). Lipmaa [37] considered the AGM as a family of algebraic knowledge assumptions. He defined the AGM with hashing (AGMH), where the adversary is additionally allowed to create new group elements that have high min-entropy from the adversary’s viewpoint (and in particular, without knowing their discrete logarithms). This takes into account the existence of efficient elliptic curve hashing algorithms that can be used to generate such new group elements.

Following [37], we say that a PPT algorithm \(\mathcal {A}\) is hash-algebraic (in \(\mathsf {p}\)) if there exists an efficient extractor \(\mathsf {Ext}_\mathcal {A}\), such that for any PPT sampleable distribution \(\mathcal {D}\), \(\mathsf {Adv}^{\mathrm {{hak}}}_{\mathsf {p}, \mathcal {D}, \mathcal {A}}(\lambda ) :=\)

A bilinear group \(\mathsf {p}\) is hash-algebraic if every PPT algorithm \(\mathcal {A}\) that obtains inputs from \(\mathbb {G}_1\)/\(\mathbb {G}_2\) and outputs elements in \(\mathbb {G}_1\)/\(\mathbb {G}_2\) is hash-algebraic. Clearly, a hash-algebraic adversary is less restricted than an algebraic adversary.

The requirement that \(\mathcal {A}\) is hash-algebraic for a concrete \(\mathcal {D}\) is a specific \((\mathsf {p}, \mathcal {D}, \mathcal {A})\)-hash-algebraic knowledge (HAK) assumption stating that \(\mathsf {Adv}^{\mathrm {{hak}}}_{\mathsf {p}, \mathcal {D}, \mathcal {A}}(\lambda ) \approx _\lambda 0\). A \((\mathsf {p}, \mathcal {D})\)-HAK assumption states that \(\mathsf {Adv}^{\mathrm {{hak}}}_{\mathsf {p}, \mathcal {D}, \mathcal {A}}(\lambda ) \approx _\lambda 0\) for all PPT \(\mathcal {A}\). In the AGMH, one assumes that \((\mathsf {p}, \mathcal {D}, \mathcal {A})\)-HAK holds for all choices of \((\mathcal {D}, \mathcal {A})\); [37] calls this the \(\mathsf {p}\)-HAK assumption. When proving the security of a concrete protocol in a fixed group \(\mathsf {p}\), it is sufficient to rely on the \((\mathsf {p}, \mathcal {D})\)-HAK assumption for a single specified distribution \(\mathcal {D}\). Analogously, the \((\mathcal {D}, \mathcal {A})\)-algebraic knowledge (AK) assumption in \(\mathsf {p}\) states that \(\mathsf {Adv}^{\mathrm {{ak}}}_{\mathsf {p}, \mathcal {D}, \mathcal {A}}(\lambda ) \approx _\lambda 0\).

Lipmaa [37] demonstrated the usefulness of the HAK assumption by showing that Damgård’s original Knowledge-of-Exponent (KE, [10]) assumption holds under the DL and HAK assumptions. The opposite does not always hold: the KE assumption (and its generalizations) cannot be used to extract unless each input group element \([z]_{\iota }\) is accompanied by a “knowledge” input \([x z]_{\iota }\) for random x. Thus, protocols that rely on HAK assumptions can, in principle, be more efficient than protocols that rely on KE assumptions only.

Intuitively, a security proof under the \((\mathsf {p}, \mathcal {D})\)-HAK assumption constitutes essentially an AGMH security proof, but without one assuming that all PPT algorithms in the group \(\mathsf {p}\) are (hash-)algebraic. Finally, according to the analysis of [37], it is sufficient to assume that \([\varvec{q}_\iota ]_{1}\) has high min-entropy while the previous approach of generic model with hashing as in [2, 4, 7, 41] assumed that adversarially created group elements are uniformly random.

3 Defining QA-NIZK in the BPK Model

Quasi-Adaptive Non-Interactive Zero-Knowledge (QA-NIZK) argument systems [28] are quasi-adaptive in the sense that the CRS depends on a language parameter \(\varrho \) that has been sampled from a fixed distribution \(\mathcal {D}_{\mathsf {p}}\). QA-NIZKs are of great interest since they are succinct and based on standard assumptions. Since QA-NIZKs have many applications, they have been a subject of intensive study, [1, 23, 28, 30,31,32,33]. The main limitation of known QA-NIZKs is that efficient QA-NIZKs are only known for a restricted set of languages like the language of linear subspaces (see [12, 23, 24] for QA-NIZKs for other languages).

The original QA-NIZK security definitions [28] were given in the CRS model. Jutla and Roy strengthened the definitions in the full version of their paper, [29], allowing for the case when the language parameter is maliciously picked. We will lift the latter definitions to the weaker BPK model. Sometimes, the only difference compared to the definitions of [29] is in notation (a CRS will be replaced by a public key). The rest of the definitional changes are motivated by the definition of Sub-ZK zk-SNARKs in [2], e.g., a QA-NIZK in the BPK model will have a public-key verification algorithm \(\mathsf {PKV}\) and the zero-knowledge definition mentions a subverter and an extractor. We also define a \(\varrho \)-verification algorithm \(\mathsf {PARV}\). Since black-box [38] and even auxiliary-input non-black-box [21] (see Lemma 1) NIZK in the BPK model is impossible we will give an explicit definition of no-auxiliary-string non-black-box NIZK.

As in [4], we will implicitly assume that the system parameters \(\mathsf {p}\) are generated deterministically from \(\lambda \); in particular, the choice of \(\mathsf {p}\) cannot be subverted. A QA-NIZK argument system enables one to prove membership in a language defined by a relation \(\mathscr {R}_\varrho = \{(\mathsf {x}, \mathsf {w})\}\), which in turn is completely determined by a parameter \(\varrho \) sampled (in the honest case) from a distribution \(\mathcal {D}_{\mathsf {p}}\). We will assume implicitly that \(\varrho \) contains \(\mathsf {p}\) and thus not include \(\mathsf {p}\) as an argument to algorithms that also input \(\varrho \); recall that we assumed that \(\mathsf {p}\) cannot be subverted. A distribution \(\mathcal {D}_{\mathsf {p}}\) on \(\mathscr {L}_\varrho \) is witness-sampleable [28] if there exists a PPT algorithm \(\mathcal {D}'_\mathsf {p}\) that samples \((\varrho , \mathsf {td_\varrho }) \in \mathscr {R}_\mathsf {p}\) such that \(\varrho \) is distributed according to \(\mathcal {D}_{\mathsf {p}}\).

The zero-knowledge simulator is usually required to be a single (non-black-box) PPT algorithm that works for the whole collection of relations \(\mathscr {R}_\mathsf {p}= \{\mathscr {R}_\varrho \}_{\varrho \in {\text {im}}(\mathcal {D}_{\mathsf {p}})}\); that is, one usually requires uniform simulation (see [28] for a discussion). Following [2], we accompany the universal simulator with an adversary-dependent extractor. We assume \(\mathsf {Sim}\) also works in the case when one cannot efficiently establish whether \(\varrho \in {\text {im}}(\mathcal {D}_{\mathsf {p}})\). The simulator is not allowed to create a new \(\varrho \) or \(\mathsf {pk}\) but has to operate with the ones given to it as input.

A tuple of PPT algorithms \(\varPi = (\mathsf {PGen}, \mathsf {KGen}, \mathsf {PARV}, \mathsf {PKV}, \mathsf {P}, \mathsf {V}, \mathsf {Sim})\) is a no-auxiliary-string non-black-box zero knowledge (Sub-ZK) QA-NIZK argument system in the BPK model for a set of witness-relations \(\mathscr {R}_\mathsf {p}= \{\mathscr {R}_{\varrho }\}_{\varrho \in {\text {Supp}}\,\,\!\!\left( \mathcal {D}_{\mathsf {p}}\right) }\), if the following Items i, ii, iv and v hold. \(\varPi \) is a Sub-ZK QA-NIZK argument of knowledge if, additionally, Item iii holds. Here, \(\mathsf {PGen}\) is the parameter generation algorithm, \(\mathsf {KGen}\) is the public key generation algorithm, \(\mathsf {PARV}\) is the \(\varrho \)-verification algorithm, \(\mathsf {PKV}\) is the public-key verification algorithm, \(\mathsf {P}\) is the prover, \(\mathsf {V}\) is the verifier, and \(\mathsf {Sim}\) is the simulator.

  1. (i)

    Perfect Completeness: for any \(\lambda \), \(\mathsf {p}\in {\text {im}}(\mathsf {PGen}(1^\lambda ))\), PPT \(\mathcal {A}\),

  2. (ii)

    Computational Quasi-Adaptive Sub-PAR Soundness: for any \(\mathsf {p}\in {\text {im}}(\mathsf {PGen}(1^\lambda ))\), and stateful PPT \(\mathcal {A}\),

  3. (iii)

    Computational Quasi-Adaptive Sub-PAR Knowledge-Soundness: for every PPT stateful adversary \(\mathcal {A}\), there exists a PPT extractor \(\mathsf {Ext}_{\mathcal {A}}\), s.t. for all \(\mathsf {p}\in {\text {im}}(\mathsf {PGen}(1^\lambda ))\),

    A knowledge-sound argument system is called an argument of knowledge.

  4. (iv)

    Statistical Zero Knowledge: for any \(\lambda \), \(\mathsf {p}\in {\text {im}}(\mathsf {PGen}(1^\lambda ))\), and computationally unbounded adversary \(\mathcal {A}\), \(|\varepsilon _0^{zk} - \varepsilon _1^{zk}| \approx _\lambda 0\), where \(\varepsilon _b^{zk} :=\)

    The oracle \(\mathsf {O}_0 (\mathsf {x}, \mathsf {w})\) returns \(\bot \) (reject) if \((\mathsf {x}, \mathsf {w}) \not \in \mathscr {R}_\varrho \), and otherwise it returns . Similarly, \(\mathsf {O}_1 (\mathsf {x}, \mathsf {w})\) returns \(\bot \) (reject) if \((\mathsf {x}, \mathsf {w}) \not \in \mathscr {R}_\varrho \), and otherwise it returns .

  5. (v)

    Statistical Persistent Zero Knowledge: for any PPT subverter \(\mathcal {C}\), there exists a PPT extractor \(\mathsf {Ext}_{\mathcal {C}}\), s.t. for any \(\lambda \), \(\mathsf {p}\in {\text {im}}(\mathsf {PGen}(1^\lambda ))\), and computationally unbounded adversary \(\mathcal {A}\), \(|\varepsilon ^{zk}_0 - \varepsilon ^{zk}_1| \approx _\lambda 0\), where

    The oracle \(\mathsf {O}_0 (\mathsf {x}, \mathsf {w})\) returns \(\bot \) (reject) if \((\mathsf {x}, \mathsf {w}) \not \in \mathscr {R}_\varrho \), and otherwise it returns . Similarly, \(\mathsf {O}_1 (\mathsf {x}, \mathsf {w})\) returns \(\bot \) (reject) if \((\mathsf {x}, \mathsf {w}) \not \in \mathscr {R}_\varrho \), and otherwise it returns .

\(\varPi \) is statistically no-auxiliary-string non-black-box zero knowledge (Sub-ZK) if it is both statistically zero-knowledge and statistically persistent zero-knowledge.

Knowledge-sound QA-NIZKs are useful in situations where the witness relations \(\mathscr {R}_\varrho \) are trivial in the sense that for each \(\mathsf {x}\), there exists a \(\mathsf {w}\) such that \((\mathsf {x}, \mathsf {w}) \in \mathscr {R}_\varrho \). In such cases, one must argue that the prover knows this \(\mathsf {w}\). Knowledge-sound QA-NIZK argument systems have applications in shuffles [14] and SNARKs [8, 25, 37].

In their definition of strong soundness for strong QA-NIZK, Jutla and Roy [29] made the assumption that \(\mathcal {C}_\varrho \) also returns \(\mathsf {td_\varrho }\). This assumption is reminiscent of the AGM [16], where in the security proofs, the adversary is assumed to output a part of her secret state, but it might be stronger depending on the definition of \(\mathcal {D}_{\mathsf {p}}\). Thus, one should not make such an assumption per se but prove (say, in the AGM) that it holds. Several recent reinterpretations of the AGM [37] reword it by requiring the existence of an extractor that returns the secret state. In our Sub-PAR (knowledge-)soundness definition, we require that \(\mathsf {PARV}(\varrho ) = 1\) (thus, \(\varrho \in {\text {im}}(\mathcal {D}_{\mathsf {p}})\) and a \(\mathsf {td_\varrho }\) exists). We do not require that \(\mathsf {td_\varrho }\) can be extracted; we only require that \(\mathsf {w}\) can be extracted. In our security proof, the extractor of \(\mathsf {w}\) will first extract \(\mathsf {td_\varrho }\) by using a DL oracle; we prove knowledge-soundness under a non-falsifiable assumption (more precisely, under the \(\mathrm {SDL^{dl}}\) assumption that states that solving \(\mathrm {SDL}\) is intractable even if the adversary is given non-adaptive access to a DL oracle, see Fig. 6).

More precisely, in the case of the concrete construction of \(\varPi _{\mathsf {bpk}}\), extraction of \(\mathsf {td_\varrho }\) is needed since the \(\varPi _{\mathsf {kw}}\) argument system [31] (and thus also the \(\varPi _{\mathsf {bpk}}\) argument system in Sect. 5) is only sound if \(\mathcal {D}_{\mathsf {p}}\) is witness-sampleable. In the soundness proof in [31], one obtains \(\mathsf {td_\varrho }\) from the honest \(\varrho \)-creator. In the Sub-PAR knowledge-soundness proof in Sect. 5, we extract \(\mathsf {td_\varrho }\) from the malicious \(\varrho \)-creator \(\mathcal {A}\) and then use \(\mathsf {td_\varrho }\) to extract \(\mathsf {w}\). However, we use the DL oracle to extract \(\mathsf {td_\varrho }\) and thus do not have to rely on the witness-sampleability of \(\mathcal {D}_{\mathsf {p}}\).

We assume that a single subverter \(\mathcal {C}\) produces both \(\varrho \) and \(\mathsf {pk}\) in the case of Sub-ZK, and that the extractor gets access to the code of \(\mathcal {C}\) and to its inputs and random coins. The extractor never works with probability 1 since \(\mathcal {C}\) can randomly sample (with a non-zero but negligible probability) a well-formed \(\mathsf {pk}\). However, if it works, then in our constructions, the simulation will be perfect. For the sake of simplicity, we will not formalize this as perfect zero-knowledge. (One reason for this is that differently from [2], the secret key extracted by \(\mathsf {Ext}_\mathcal {C}\) is not unique in our case; see discussion in Sect. 5.)

The existence of \(\mathsf {PKV}\) is not needed in the CRS model, assuming the CRS creator is trusted by the prover, and thus \(\mathsf {PKV}\) was not included in the prior QA-NIZK definitions. Since soundness is proved in the case where \(\mathsf {pk}\) is chosen correctly (by the verifier or by a third party she trusts), \(\mathsf {V}\) does not need to execute \(\mathsf {PKV}\). However, \(\mathsf {PKV}\) should be run by \(\mathsf {P}\). Similarly, the existence of \(\mathsf {PARV}\) is not needed in the CRS model; the algorithm \(\mathsf {PARV}\) needs to be run both by \(\mathsf {P}\) and \(\mathsf {V}\). The simulator is only required to simulate correctly in the case \(\mathsf {PARV}\) accepts \(\varrho \) and \(\mathsf {PKV}\) accepts \(\mathsf {pk}\).

For Sub-ZK, we require that both standard zero-knowledge (with trusted \(\varrho \) and \(\mathsf {pk}\) generators) and persistent zero-knowledge (with possibly subverted \(\varrho \) and \(\mathsf {pk}\) generators) hold. The reason behind requiring both is subtle and will be explained in Sect. 4. Very briefly, since one considers a single subverter \(\mathcal {C}\) that creates both \(\varrho \) and \(\mathsf {pk}\), persistent zero-knowledge leaves one vulnerable against the subverter who just sets . While this attack is not possible in the case of all QA-NIZKs, as we show in Sect. 4, one can design a QA-NIZK argument system that is persistent zero-knowledge but not standard zero-knowledge. Intuitively, requiring that the same simulator \(\mathsf {Sim}\) also works without the knowledge of \(\mathsf {td_\varrho }\) makes it possible to avoid such pathological cases. However, it means that persistent zero-knowledge is not a strictly stronger notion than the standard zero-knowledge, and one requires both to obtain Sub-ZK.

Comparison to Earlier Sub-ZK Definitions. Subversion-security was defined by Bellare et al. [4] for the CRS model; further CRS-model subversion-security definitions were given in [2, 15]. As proven in [4], one cannot achieve Sub-SND (soundness even if the CRS was generated maliciously) and non-subversion zero knowledge at the same time. Thus, subsequent efforts have concentrated on achieving either Sub-SND and witness-indistinguishability [4], subversion knowledge-soundness and witness-indistinguishability [17], or Sub-ZK (zero knowledge in the case the CRS was generated maliciously) and soundness [2, 4, 15]. In the latter case, the CRS is trusted by the verifier \(\mathsf {V}\) while (following the definitions of [2]) the prover checks that the CRS is well-formed by using a publicly available algorithm. Thus, Sub-ZK in the CRS model is the same as zero-knowledge in the BPK model: the CRS has to be trusted by (or, even chosen by) \(\mathsf {V}\) and hence can be equal to the public key of an entity trusted by \(\mathsf {V}\) (or of \(\mathsf {V}\) herself). Since black-box NIZK [38] and even auxiliary-string non-black-box NIZK [21] in the BPK model is impossible, one has to define no-auxiliary-string non-black-box zero knowledge (Sub-ZK) as above. Bellare et al. [4] motivated not incorporating auxiliary strings to the definition of Sub-ZK by known impossibility results. We will formalize this (folklore, see [42] for discussion) impossibility result as the following straightforward lemma.

Lemma 1

Auxiliary-string non-black-box NIZK in the BPK model is only possible for languages in \(\mathsf {BPP}\).

Proof

The notions of (no-)auxiliary-string and (non-)black-box zero-knowledge were defined by Goldreich and Oren [21], who proved that auxiliary-string (even non-black-box) zero-knowledge argument systems for languages outside of \(\mathsf {BPP}\) require at least three messages in the plain model. An auxiliary-string (non-black-box) NIZK argument system in the BPK model can be interpreted as a two-message auxiliary-string (non-black-box) zero-knowledge argument system in the plain model, where the verifier creates her public key and sends it as her first message. Thus, an auxiliary-string NIZK argument system for languages outside of \(\mathsf {BPP}\) would contradict the impossibility result of [21].   \(\square \)

Auxiliary-input zero-knowledge is usually used to achieve sequential composition of interactive zero-knowledge protocols, [21]. Sub-ZK guarantees sequential security in the case of NIZK, see [2] for a proof. In particular, the main result of [2, 15], reformulated in our language, is that there exist computationally knowledge-sound Sub-ZK zk-SNARKs for \(\mathsf {NP}\) in the BPK model.

In the case of QA-NIZKs, one has to deal with two parameters, \(\varrho \) (the language parameter) and \(\mathsf {pk}\) (the public key). As shown in [29] (updated version from September 2018), one can achieve both soundness and zero-knowledge in the case when \(\varrho \) is subverted but \(\mathsf {pk}\) is honestly chosen. In the persistent zero-knowledge definition, we allow for subverted \(\mathsf {pk}\) and \(\varrho \). Due to the impossibility result of [4], we are not aiming to achieve Sub-SND. Thus, in the definition of soundness, we assume that \(\mathsf {pk}\) is honestly generated.

Language of Linear Subspaces and Kiltz-Wee QA-NIZK. An important application of QA-NIZK is in the case of the following language. Assume we need to show that \([\varvec{y}]_{1} \in {\text {colspace}}([\varvec{M}]_{1})\), where \([\varvec{M}]_{1}\) is sampled from a distribution \(\mathcal {D}_{\mathsf {p}}\) over \(\mathbb {G}_{1}^{n\times m}\). We assume, following [28], that \((n, m)\) is implicitly fixed by \(\mathcal {D}_{\mathsf {p}}\). That is, a QA-NIZK for linear subspaces handles languages

$$ \mathscr {L}_{[\varvec{M}]_{1}} = \!\!\left\{ [\varvec{y}]_{1} \in \mathbb {G}_{1}^n: \exists \varvec{w} \in \mathbb {Z}_{p}^{m} \text{ s.t. } \varvec{y} = \varvec{M} \varvec{w}\right\} \,\, . $$

The corresponding relation is defined as \(\mathscr {R}_{[\varvec{M}]_{1}} = \{([\varvec{y}]_{1}, \varvec{w}) \in \mathbb {G}_{1}^n\times \mathbb {Z}_{p}^{m}: \varvec{y} = \varvec{M} \varvec{w}\}\). This language is useful in many applications, [8, 28]. As a typical application, let be a public key of the Elgamal cryptosystem; then ciphertext \([\varvec{y}]_{1} \in \mathscr {L}_{[\varvec{M}]_{1}}\) iff it encrypts 0. Here, \([\varvec{M}]_{1}\) comes from a \(\mathrm {KerMDH}\)-hard witness-sampleable distribution \(\mathcal {D}_{\mathsf {p}}\).
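The Elgamal example can be checked numerically in a toy model. The following sketch is only an illustration under loud simplifying assumptions: a group element \([x]_{1}\) is represented directly by its exponent \(x\) modulo a prime `ORDER` (which offers no security and makes membership trivially decidable), the public key is encoded as \(\varvec{M} = (1, \mathsf {sk})^\top \) (our assumed encoding, with \(n = 2\), \(m = 1\)), and the function names are hypothetical.

```python
import random

ORDER = 2**61 - 1  # a prime; stands in for the group order p (toy model only)

def keygen(rng):
    """Sample a secret key and the assumed public-key encoding M = (1, sk)^T."""
    sk = rng.randrange(1, ORDER)
    return sk, (1, sk)

def encrypt(M, msg, r):
    """Lifted Elgamal written in the exponent: y = M * r + (0, msg)."""
    return ((M[0] * r) % ORDER, (M[1] * r + msg) % ORDER)

def in_colspace(M, y):
    """Toy membership test: is y = M * w for some scalar w?

    Works only because exponents are visible in this model; with real
    group elements this decision problem is assumed to be hard.
    """
    w = y[0]  # forced by M[0] = 1
    return (M[1] * w) % ORDER == y[1]
```

In this model a ciphertext lies in the column space of \(\varvec{M}\) exactly when the encrypted message is 0, with the encryption randomness \(r\) acting as the witness \(\varvec{w}\).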

The most efficient known QA-NIZK for linear subspaces in the CRS model was proposed by Kiltz and Wee [31]. In particular, they proposed a QA-NIZK \(\varPi _{\mathsf {kw}}\) that assumes that the parameter \(\varrho = [\varvec{M}]_{1} \in \mathbb {G}_{1}^{n\times m}\) is sampled from a witness-sampleable distribution \(\mathcal {D}_{\mathsf {p}}\). \(\varPi _{\mathsf {kw}}\) results in the argument that consists of \(k\) group elements, where \(k\) is the parameter (\(k= 1\) being usually sufficient in the case of asymmetric pairings) related to the underlying \(\mathrm {KerMDH}\) distribution. More precisely, given \(n> m\), the Kiltz-Wee QA-NIZK is computationally quasi-adaptively sound under the \(\mathcal {D}_{k}\)-\({\mathrm {KerMDH}}_{\mathbb {G}_{1}}\) assumption relative to \(\mathsf {PGen}\), [31]. Importantly, \(\varPi _{\mathsf {kw}}\) is significantly more efficient than the Groth-Sahai NIZK for the same language. For the sake of completeness, Fig. 1 describes the Kiltz-Wee QA-NIZK argument system \(\varPi _{\mathsf {kw}}\) for linear subspaces in the CRS model.

Fig. 1. Kiltz-Wee QA-NIZK argument system \(\varPi _{\mathsf {kw}}\) for \([\varvec{y}]_{1} = [\varvec{M}]_{1} \varvec{w}\)
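For intuition, the algebra behind a Kiltz-Wee style argument can be sketched in a toy model where each group element is replaced by its exponent modulo a prime, so that a pairing becomes a product of exponents. The sketch assumes the relations \(\varvec{P} = \varvec{M}^\top \varvec{K}\) and \(\varvec{C} = \varvec{K} \varvec{\bar{A}}\), as we recall them from [31]; Fig. 1 remains the authoritative description. The diagonal choice of \(\varvec{\bar{A}}\), the modulus `ORDER`, and all function names are assumptions made for illustration.

```python
import random

ORDER = 2**61 - 1  # prime standing in for the group order (toy model only)

def matmul(A, B):
    """Matrix product modulo ORDER; matrices are lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) % ORDER for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def kgen(M, k, rng):
    """Return (pk, K) with P = M^T K (for the prover) and C = K * A_bar (for the verifier)."""
    n = len(M)
    K = [[rng.randrange(ORDER) for _ in range(k)] for _ in range(n)]
    # Assumption: a diagonal A_bar with nonzero entries, which is trivially invertible.
    A_bar = [[rng.randrange(1, ORDER) if i == j else 0 for j in range(k)]
             for i in range(k)]
    pk = {"P_zk": matmul(transpose(M), K), "A_bar": A_bar, "C": matmul(K, A_bar)}
    return pk, K

def prove(pk, w):
    """pi = w^T P, a row of k exponents."""
    return matmul([w], pk["P_zk"])

def simulate(K, y):
    """pi = y^T K, computed from the secret key without any witness."""
    return matmul([y], K)

def verify(pk, y, pi):
    """Pairing check y^T C = pi * A_bar, modelled on exponents."""
    return matmul([y], pk["C"]) == matmul(pi, pk["A_bar"])
```

For \(\varvec{y} = \varvec{M} \varvec{w}\), the honest and the simulated arguments coincide in this model, since \(\varvec{w}^\top \varvec{M}^\top \varvec{K} = \varvec{y}^\top \varvec{K}\).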

Some Applications of QA-NIZK in the BPK Model. The simplest example application is that of UC-commitments from [28], where a trusted third party generates a commitment key \(\varrho \) together with a QA-NIZK public key \(\mathsf {pk}\), and \(\mathsf {P}\) opens the commitments later by disclosing a QA-NIZK argument of proper commitment under the commitment key \(\varrho \). Here, \(\varrho \) should not be generated by \(\mathsf {P}\) (who could then equivocate) or by \(\mathsf {V}\) (who could then extract the message). However, \(\mathsf {pk}\) can be generated by \(\mathsf {V}\). This allows a single securely generated \(\varrho \) to be used in many applications, from UC-commitments to identity-based encryption. In each such application, an authority trusted by \(\mathsf {V}\) (e.g., \(\mathsf {V}\) herself) can create her own \(\mathsf {pk}\) that takes the particularities of that application into account.

Another, arguably much more important application, is the use of Sub-ZK QA-NIZKs in the construction of Sub-ZK SNARKs. Several recent papers [8, 14, 25, 37] have used QA-NIZKs for subspace language to construct SNARKs. In these cases, one proves the membership in the trivial full vector space under knowledge assumption, resulting in a statement that (say) the argument belongs to the span of certain CRS elements only like in [37] or that two commitments that possibly use different commitment keys commit to the same vectors like in [14]. To obtain Sub-ZK SNARKs (under a knowledge assumption), in such cases also the QA-NIZK has to be Sub-ZK (under a knowledge assumption).

In many other applications, it is desirable that zero-knowledge holds even if both \(\varrho \) and \(\mathsf {pk}\) are chosen by \(\mathsf {V}\) (or by possibly different parties, neither of which is trusted by \(\mathsf {P}\)). The above Sub-ZK definitions cover this more realistic scenario; in addition, they do not require \(\mathsf {V}\) to trust \(\varrho \). One such application is in the LegoSNARK framework by Campanelli et al. [8]. LegoSNARK uses QA-NIZK for linear subspaces to build Commit-and-Prove (CP) SNARKs, which can be securely and efficiently combined, creating a complex proof system able to perform well even for heterogeneous instance representations. Unfortunately, most modern zk-SNARKs are not CP-SNARKs. Hence, [8] proposed a QA-NIZK-based transformation that builds them from any Commit-Carrying (CC) SNARK; the latter are much more common, e.g., the most efficient zk-SNARK for QAP by Groth [27] is a CC-SNARK. In addition, Campanelli et al. propose a number of CP-SNARKs that are QA-NIZK-based.

4 Persistent Zero-Knowledge \(\not \Rightarrow \) Zero-Knowledge

Intuitively, it seems that standard zero-knowledge follows from persistent zero-knowledge since the set of all possible PPT subverters \(\mathcal {C}\) also includes honest algorithms. However, this intuition is wrong. We will next show that one can construct pathological QA-NIZK argument systems that achieve persistent zero-knowledge, but do not satisfy the usual definition of zero-knowledge and actually leak some information about the witness.

Let us consider a slight variation of the subspace language where \(\varrho = ([\varvec{M}]_{1}, [\varvec{M}]_{2})\) and the statement is that \([\varvec{y}]_{1}\) belongs to the subspace spanned by the matrix \([\varvec{M}]_{1}\). Moreover, for simplicity let us take \(\varvec{M} \in \mathbb {Z}_p^{2 \times 1}\). Consider the QA-NIZK argument system (a leaky QA-NIZK) in Fig. 2. It has secret keys from the same set \(\mathbb {Z}_p^{2 \times 1}\), and thus, \(\varvec{M}\) can pass as a secret key. The leaky QA-NIZK does not have a public key; the argument is simply \([\pi ]_{1} = [w]_{1}\), and the verification is done by checking that \([\pi ]_{1}^\top [\varvec{M}]_{2}^\top = [\varvec{M} w]_{1}^\top [1]_{2} = [\varvec{y}]_{1}^\top [1]_{2}\). It is not standard zero-knowledge since the simulator only knows \([\varvec{M}]_{1}\), \([\varvec{M}]_{2}\), and \([\varvec{y}]_{1} = [M_1 w, M_2 w]_{1}\), and outputting \([w]_{1}\) breaks the following symmetric computational Diffie-Hellman (CDH) assumption: given input \(([1, a, b]_{1}, [1, a, b]_{2})\) for uniformly random \(a, b \in \mathbb {Z}_p\), it is difficult to compute \([a b]_{1}\). To see this, let us suppose that the symmetric CDH challenge is \([1, a, b]_{1}\), \([1, a, b]_{2}\) for uniformly random \(a, b \in \mathbb {Z}_p\). We denote \(M_1 = 1/a\), \(w = b\), \(M_2 = M_2' M_1 = M_2'/a\), where \(M_2'\) is sampled uniformly from \(\mathbb {Z}_p\). We also reset the generators of \(\mathbb {G}_1\) and \(\mathbb {G}_2\) to be \([g]_{1} = [a]_{1}\) and \([g]_{2} = [a]_{2}\). Now if such a simulator existed, we could run it with input \([M_1 g, M_2 g, M_1 w g, M_2 w g]_{1} = [1, M_2', b, M_2' b]_{1}\), \([M_1 g, M_2 g]_{2} = [1, M_2']_{2}\) and it would output \([w g]_{1} = [b a]_{1}\); this would break the CDH assumption.
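The change of generator in this reduction is easy to get wrong, so the following sketch checks the bookkeeping numerically. It is a toy check only: it works directly with exponents modulo an assumed prime `ORDER` and uses \(a\) in the clear, which a real reduction of course cannot do (the reduction would compute the same values as group elements from the challenge and from \(M_2'\)). The function names are hypothetical.

```python
ORDER = 2**61 - 1  # prime standing in for the group order (toy model only)

def to_original_base(x, g):
    """Exponent w.r.t. the original generator of an element whose
    exponent w.r.t. the new generator [g] is x."""
    return (x * g) % ORDER

def embed_cdh(a, b, m2_prime):
    """Embed a CDH challenge (a, b) into an instance of the leaky QA-NIZK.

    Returns the simulator's G_1 input and the value [w g]_1 it would have
    to output, both as exponents w.r.t. the original generator.
    """
    g = a                              # new generator [g] = [a]
    inv_a = pow(a, -1, ORDER)          # modular inverse (Python 3.8+)
    m1 = inv_a                         # M_1 = 1/a
    m2 = (m2_prime * inv_a) % ORDER    # M_2 = M_2'/a
    w = b
    sim_input = [to_original_base(v, g) for v in (m1, m2, m1 * w, m2 * w)]
    return sim_input, to_original_base(w, g)
```

Calling `embed_cdh(3, 5, 7)` gives the input exponents \((1, M_2', b, M_2' b)\) and the target \(a b\), matching the equalities stated in the text.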

Surprisingly, simulation is possible (under a knowledge assumption) if we try to prove persistent zero-knowledge. Recall that the Bilinear Diffie-Hellman Knowledge of Exponent (\(\mathrm {BDHKE}\)) [2] assumption says that if a PPT adversary \(\mathcal {A}(\mathsf {p})\) outputs \(([x]_{1}, [x]_{2})\) on random coins r, then there exists an extractor that extracts x with an overwhelming probability given the same random coins r. Thus, assuming \(\mathrm {BDHKE}\) and because \(\mathsf {Ext}_\mathcal {C}\) is given access to the random coins of \(\mathcal {C}\), \(\mathsf {Ext}_\mathcal {C}\) can extract \(\varvec{M}\) and provide it to the simulator as \(\mathsf {sk}\). The simulator then computes \([w]_{1} = M_1^{-1} [y_1]_{1}\).
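The leak is easy to see in a toy model. The sketch below represents every group element by its exponent modulo an assumed prime `ORDER` and models the pairing check as a product of exponents; it illustrates the verification equation and the simulator's computation \(w = M_1^{-1} y_1\), and is not a secure implementation. The function names are hypothetical.

```python
ORDER = 2**61 - 1  # prime standing in for the group order (toy model only)

def prove(w):
    """The leaky argument is the witness itself: [pi]_1 = [w]_1."""
    return w % ORDER

def verify(M, y, pi):
    """Check pi * M_i = y_i for every row i (pairing modelled on exponents)."""
    return all((pi * m_i) % ORDER == y_i % ORDER for m_i, y_i in zip(M, y))

def simulate(M, y):
    """Given the extracted M (with M_1 != 0), recover w = M_1^{-1} * y_1."""
    return (pow(M[0], -1, ORDER) * y[0]) % ORDER
```

With \(\varvec{M} = (3, 4)^\top \) and \(w = 10\), the statement is \(\varvec{y} = (30, 40)^\top \); the simulated argument equals the honest one, which is exactly why the argument reveals \([w]_{1}\).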

Fig. 2. A contrived leaky subspace QA-NIZK where \(\varrho = ([\varvec{M}]_{1}, [\varvec{M}]_{2})\)

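We could divide \(\mathcal {C}\) into \(\mathcal {C}_{\varrho }\), which generates \(\varrho \), and \(\mathcal {C}_{\mathsf {pk}}\), which generates \(\mathsf {pk}\), such that the extractor only gets the random coins of \(\mathcal {C}_{\mathsf {pk}}\). This would make it impossible to extract \(\varvec{M}\). However, this will not work since we cannot exclude communication between \(\mathcal {C}_\varrho \) and \(\mathcal {C}_{\mathsf {pk}}\), e.g., \(\mathcal {C}_\varrho \) can compute \(\mathsf {pk}\) herself and send it to \(\mathcal {C}_{\mathsf {pk}}\). Then \(\mathcal {C}_{\mathsf {pk}}\) outputs \(\mathsf {pk}\) without having any knowledge of \(\mathsf {sk}\), making extracting \(\mathsf {sk}\) impossible.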

Because of that, we adopted a different solution: namely, we require that a Sub-ZK QA-NIZK argument system must satisfy both standard zero-knowledge and persistent zero-knowledge with respect to the same simulator. This solution rules out the intuitively insecure arguments like the one in Fig. 2.

5 Construction of a QA-NIZK in the BPK Model

In this section, we will show that if the membership of \([\varvec{\bar{A}}]_{2}\) in \(\mathcal {D}_{k}\) can be efficiently verified, then a slight variant \(\varPi _{\mathsf {bpk}}\) of the Kiltz-Wee QA-NIZK \(\varPi _{\mathsf {kw}}\) for linear subspaces [31] is secure (including Sub-ZK) in the BPK model. More precisely, we say that the distribution \(\mathcal {D}_{k}\) is efficiently verifiable, if there exists an algorithm \(\mathsf {MATV}([\varvec{\bar{A}}]_{2})\) that outputs 1 if \(\varvec{\bar{A}}\) is invertible (recall that we assume that the matrix distribution is robust) and well-formed with respect to \(\mathcal {D}_{k}\) and otherwise outputs 0. Clearly, the distributions \(\mathcal {D}_1\), \(\mathcal {L}_{k}\), \(\mathcal {IL}_{k}\), \(\mathcal {C}_{k}\), and \(\mathcal {SC}_{k}\) (for any \(k\)) are verifiable, as can be seen in Fig. 3, while verifying whether \([\varvec{\bar{A}}]_{2}\) is invertible is intractable for the distribution \(\mathcal {U}_{k}\) if \(k> 1\). Indeed, if \(k= 2\) then in the latter case, one needs to test if \(a_{1 1} a_{2 2} - a_{1 2} a_{2 1} = 0\), given only \([\varvec{\bar{A}}]_{2}\); the case \(k> 2\) is even more complicated. Nevertheless, we show that a slightly modified version of our construction works with the distribution \(\mathcal {U}_2\).

Fig. 3. Auxiliary procedure \(\mathsf {MATV}\) for \(\mathcal {D}_{k} \in \{\mathcal {L}_{k}, \mathcal {IL}_{k}, \mathcal {C}_{k}, \mathcal {SC}_{k}\}\).

Recall that in the BPK model, the public key \(\mathsf {pk}\) (which corresponds to the CRS in \(\varPi _{\mathsf {kw}}\)) belongs either to the verifier \(\mathsf {V}\) or to a party trusted by \(\mathsf {V}\). One proves computational soundness in the setting where \(\mathsf {V}\) trusts that \(\mathsf {pk}\) is honestly generated, i.e., that the corresponding \(\mathsf {sk}\) is secret and \(\mathsf {pk}\) is well-formed. Since \(\mathsf {pk}\) is not trusted by the prover \(\mathsf {P}\), one proves Sub-ZK in the case of a maliciously generated \(\mathsf {pk}\). We assume that \([\varvec{M}]_{1}\) is sampled by a PPT subverter, and moreover, the simulator does not know the corresponding witness \(\varvec{M}\) or any function of \(\varvec{M}\) not efficiently computable from \([\varvec{M}]_{1}\).

To modify \(\varPi _{\mathsf {kw}}\) so that it would be secure in the BPK model instead of the CRS model, the most straightforward idea is to divide \(\mathsf {pk}\) into \(\mathsf {pk}^{\mathsf {zk}}= [\varvec{P}]_{1}\) (the part of \(\mathsf {pk}\) that is used by \(\mathsf {P}\) and thus intuitively needed to guarantee zero knowledge) and \(\mathsf {pk}^{\mathsf {snd}}= [\varvec{\bar{A}}, \varvec{C}]_{2}\) (the part of \(\mathsf {pk}\) that is used by \(\mathsf {V}\) and thus intuitively needed to guarantee soundness). Thus, \(\mathsf {P}\) (resp., \(\mathsf {V}\)) has to be assured that \(\mathsf {pk}^{\mathsf {zk}}\) (resp., \(\mathsf {pk}^{\mathsf {snd}}\)) is generated honestly. Hence, one could use \(\mathsf {pk}^{\mathsf {zk}}_\mathsf {P}\) from \(\mathsf {P}\)’s public key and \(\mathsf {pk}^{\mathsf {snd}}_\mathsf {V}\) from \(\mathsf {V}\)’s public key to create an argument. However, it is not clear how to do this since both \(\mathsf {pk}^{\mathsf {snd}}_\mathsf {V}\) and \(\mathsf {pk}^{\mathsf {zk}}_\mathsf {P}\) depend on the same secret \(\varvec{K}\). Moreover, in this case, both \(\mathsf {P}\) and \(\mathsf {V}\) have public keys while we want to have a situation, common in the BPK model, where only \(\mathsf {V}\) has a public key.

Instead, we assume that \(\mathsf {V}\)’s public key is equal to the whole CRS and then construct a public-key verification algorithm \(\mathsf {PKV}\). For \(\mathsf {PKV}\) to be efficient in the case \(\mathcal {D}_{k}\) is not efficiently verifiable, we need to add some new elements (collectively denoted as \(\mathsf {pk}^{\mathsf {pkv}}\)) to \(\mathsf {pk}\). Figure 4 describes the new QA-NIZK \(\varPi _{\mathsf {bpk}}\). The construction of \(\mathsf {PKV}\) will be explained in Sect. 6.

Fig. 4. Sub-ZK QA-NIZK \(\varPi _{\mathsf {bpk}}\) for \([\varvec{y}]_{1} = [\varvec{M}]_{1} \varvec{w}\) in the BPK model, where either (1) \(\mathcal {D}_{k}\) is efficiently verifiable or (2) \(\mathcal {D}_{k} = \mathcal {U}_2\).

We will prove that in the BPK model, \(\varPi _{\mathsf {bpk}}\) is statistically persistent zero-knowledge under a novel non-falsifiable assumption, computationally quasi-adaptively Sub-PAR sound under another novel non-falsifiable assumption, and (if \(\varvec{M}\) has full rank) computationally quasi-adaptively Sub-PAR knowledge-sound under two non-falsifiable assumptions, one of which is novel. Some of the new non-falsifiable assumptions do not belong to the family of knowledge assumptions, which is an interesting result by itself. We will study new assumptions in Sect. 6, before stating and proving the security of \(\varPi _{\mathsf {bpk}}\) in Sect. 7.

6 New Non-falsifiable Assumptions

We will next motivate and define the new assumptions. We will also prove the security of \(\mathrm {KWKE}\) and \(\mathrm {SKWKE}\) under the HAK assumptions.

\(\varvec{\mathrm {KWKE}}\) and \(\varvec{\mathrm {SKWKE}}\) Assumptions. In the Sub-ZK proof, we will need two different (tautological) knowledge assumptions, \(\mathrm {KWKE}\) (Kiltz-Wee Knowledge of Exponent), and \(\mathrm {SKWKE}\) (Strong Kiltz-Wee Knowledge of Exponent). Similarly to Sub-ZK SNARKs [2, 15], the knowledge assumption is needed to equip the simulator \(\mathsf {Sim}\) of \(\varPi _{\mathsf {kw}}\) with the correct secret key \(\mathsf {sk}\).

The \(\mathrm {KWKE}\) assumption guarantees that one can extract a secret key \(\mathsf {sk}\) from which one can compute \(\mathsf {pk}^{\mathsf {zk}}= [\varvec{P}]_{1}\) but not necessarily \(\mathsf {pk}^{\mathsf {snd}}\). Since \(\mathsf {pk}^{\mathsf {zk}}\) does not fix \(\varvec{K}\) uniquely, \(\mathrm {KWKE}\) extracts one possible \(\varvec{K}\). Since for achieving Sub-ZK, it is not needed that \(\mathsf {pk}^{\mathsf {snd}}\) can be computed from \(\mathsf {sk}\), \(\mathrm {KWKE}\) is sufficient. To argue that \(\mathrm {KWKE}\) is a reasonable knowledge assumption, we prove that it holds under a hash-algebraic knowledge assumption.

We also introduce a stronger knowledge assumption \(\mathrm {SKWKE}\) that allows extracting the unique secret key \(\varvec{K}\) that was used to generate the whole public key \(\mathsf {pk}\). We prove that \(\mathrm {SKWKE}\) holds under a HAK and a \(\mathrm {WKerMDH}\) assumption, given that \(\mathcal {D}_{\mathsf {p}}\) is a \(\mathrm {WKerMDH}\)-hard distribution. (Here, \(\mathrm {WKerMDH}\) is a weaker variant of the well-known \(\mathrm {KerMDH}\) assumption.) The assumption of \(\mathrm {WKerMDH}\)-hardness often holds in practice, e.g., when \(\varrho \) corresponds to a randomly chosen public key of a cryptosystem or a commitment scheme (see Sect. 3 for an example). After that, we will prove that \(\varPi _{\mathsf {bpk}}\) is Sub-ZK under either \(\mathrm {KWKE}\) or \(\mathrm {SKWKE}\); in the latter case, we additionally get a guarantee that the public key is correctly formed.

We will now define the new knowledge assumptions needed in the Sub-ZK proof. In \(\mathrm {KWKE}\), we assume that if \(\mathcal {A}\) outputs a \(\varrho \) accepted by \(\mathsf {PARV}\) and a \(\mathsf {pk}\) accepted by \(\mathsf {PKV}\), then there exists an extractor \(\mathsf {Ext}_\mathcal {A}\) who, knowing the secret coins of \(\mathcal {A}\), returns a secret key \(\varvec{K}\) that could have been used to compute \(\mathsf {pk}^{\mathsf {zk}}\). \(\mathrm {SKWKE}\) will additionally guarantee that the same \(\varvec{K}\) was used to compute \(\mathsf {pk}^{\mathsf {snd}}\).
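The difference between the two assumptions can be illustrated in a toy model with \(n = 2\) and \(m = k = 1\), where group elements are replaced by exponents modulo an assumed prime `ORDER`. The sketch assumes the Kiltz-Wee relations \(\varvec{P} = \varvec{M}^\top \varvec{K}\) and \(\varvec{C} = \varvec{K} \varvec{\bar{A}}\) as we recall them from [31]; the concrete numbers and function names are hypothetical.

```python
ORDER = 2**61 - 1  # prime standing in for the group order (toy model only)

def pk_zk(M, K):
    """P = M^T K for a 2x1 matrix M and a 2x1 key K (a single exponent)."""
    return sum(m * k for m, k in zip(M, K)) % ORDER

def pk_snd(K, a_bar):
    """C = K * A_bar for a 1x1 matrix A_bar."""
    return tuple((k * a_bar) % ORDER for k in K)

def shift_key(M, K, t):
    """Add t times the kernel vector (M_2, -M_1) of M^T to K."""
    kernel = (M[1], -M[0])
    return tuple((k + t * v) % ORDER for k, v in zip(K, kernel))

M = (3, 4)
K = (10, 20)
K_alt = shift_key(M, K, 7)  # a different key, indistinguishable from K via pk_zk
```

Both `K` and `K_alt` yield the same \(\mathsf {pk}^{\mathsf {zk}}\), so a \(\mathrm {KWKE}\) extractor may return either, whereas they yield different \(\mathsf {pk}^{\mathsf {snd}}\) for any nonzero \(\varvec{\bar{A}}\), which is the extra information \(\mathrm {SKWKE}\) uses to pin down \(\varvec{K}\).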

Definition 1

Fix \(k\ge 1\), \(n> m\ge 1\), and a distribution \(\mathcal {D}_{k}\). Let \(\mathsf {PKV}\) be as in Fig. 4. Then \((\mathcal {D}_{\mathsf {p}}, k, \mathcal {D}_{k})\)-\(\mathrm {KWKE}_{\mathbb {G}_{1}}\) (resp., \((\mathcal {D}_{\mathsf {p}}, k, \mathcal {D}_{k})\)-\(\mathrm {SKWKE}_{\mathbb {G}_{1}}\)) holds relative to \(\mathsf {PGen}\) if for any \(\mathsf {p}\in {\text {im}}(\mathsf {PGen}(1^\lambda ))\) and PPT adversary \(\mathcal {A}\), there exists a PPT extractor \(\mathsf {Ext}_\mathcal {A}\), s.t. 

Here, the part is only present in the definition of \(\mathrm {SKWKE}\).

In Theorem 1, we also need the following “weak \(\mathrm {KerMDH}\)” assumption.

Definition 2

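\(\mathcal {D}_{\ell k}\)-\(\mathrm {WKerMDH}_{\mathbb {G}_{1}}\) holds relative to \(\mathsf {PGen}\), if for any PPT \(\mathcal {A}\),

$$\begin{aligned} \Pr \left[ \mathsf {p}\leftarrow \mathsf {PGen}(1^\lambda ),\ \varvec{A} \leftarrow _{\$} \mathcal {D}_{\ell k},\ \varvec{c} \leftarrow \mathcal {A}(\mathsf {p}, [\varvec{A}]_{1}) : \varvec{A}^\top \varvec{c} = \varvec{0}_{k} \wedge \varvec{c} \ne \varvec{0}_{\ell } \right] \approx _\lambda 0 \,\, . \end{aligned}$$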

Clearly, \(\mathcal {D}_{\ell k}\)-\(\mathrm {WKerMDH}_{\mathbb {G}_{1}}\) is not stronger and it is ostensibly weaker than \(\mathcal {D}_{\ell k}\)-\(\mathrm {KerMDH}_{\mathbb {G}_{1}}\) since computing \(\varvec{c}\) may be more complicated than computing \([\varvec{c}]_{2}\). (However, it is easy to show that \(\mathcal {D}_{k}\)-\(\mathrm {KerMDH}\) follows from \(\mathcal {D}_{k}\)-HAK and \(\mathcal {D}_{k}\)-\(\mathrm {WKerMDH}\).) The Discrete Logarithm (DL) assumption is a classical example of \(\mathrm {WKerMDH}\) (consider matrices for ). In the case of, say, \(\mathcal {SC}_{k}\), the non-trivial co-kernel element \(\varvec{c}\) has to satisfy \(c_2 = - a c_1\), which allows one to recover \(a\); thus, \(\mathcal {SC}_{k}\)-\(\mathrm {WKerMDH}\) is secure under the DL assumption. Similarly, in the case of \(\mathcal {C}_{k}\), \(c_2 = - a_1 c_1\).
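As a worked instance of the relation \(c_2 = - a c_1\), assume the \(k= 1\) matrix \(\varvec{A} = (a, 1)^\top \) with \(a \leftarrow _{\$} \mathbb {Z}_p\); this concrete matrix is our illustrative assumption, chosen to match the relation stated above. A \(\mathrm {WKerMDH}\) adversary that is given \([\varvec{A}]_{1} = ([a]_{1}, [1]_{1})^\top \) must output a non-zero \(\varvec{c} = (c_1, c_2)^\top \in \mathbb {Z}_p^2\) with \(\varvec{A}^\top \varvec{c} = 0\), that is,

$$\begin{aligned} a c_1 + c_2 = 0 \quad \Longleftrightarrow \quad c_2 = - a c_1 \,\, . \end{aligned}$$

If \(c_1 = 0\) then also \(c_2 = 0\), contradicting \(\varvec{c} \ne \varvec{0}\). Hence \(c_1 \ne 0\) and \(a = - c_2 / c_1\), so any successful adversary in this toy case computes the discrete logarithm of \([a]_{1}\).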

Next, we will prove that \(\mathrm {KWKE}\) (resp., \(\mathrm {SKWKE}\)) holds under the \(\mathcal {D}_{k}\)-HAK (resp., \(\mathcal {D}_{k}\)-HAK and \(\mathcal {D}_{\mathsf {p}}\)-\(\mathrm {WKerMDH}\)) assumption. Note that the use of \(\mathrm {WKerMDH}\), and thus of \(\mathrm {SKWKE}\), is questionable if \(\mathcal {C}_{\varrho }\) is malicious; nevertheless, we consider this case for the sake of completeness.

Theorem 1

(Security of \(\mathrm {KWKE}\) and \(\mathrm {SKWKE}\)). Assume that either \(\mathcal {D}_{k}\) is efficiently verifiable or \(\mathcal {D}_{k} = \mathcal {U}_2\). Assume \(k/p \approx _\lambda 0\). Then

  1. (i)

    \((\mathcal {D}_{\mathsf {p}}, k, \mathcal {D}_{k})\)-\(\mathrm {KWKE}_{\mathbb {G}_1}\) holds under the \(\mathcal {D}_{k}\)-HAK assumption.

  2. (ii)

    assuming that \(\mathcal {D}_{k}\)-HAK and \(\mathcal {D}_{\mathsf {p}}\)-\(\mathrm {WKerMDH}_{\mathbb {G}_1}\) hold (thus, \(\varrho = [\varvec{M}]_{1}\) comes from a \(\mathrm {WKerMDH}_{\mathbb {G}_1}\)-hard distribution), \((\mathcal {D}_{\mathsf {p}}, k, \mathcal {D}_{k})\)-\(\mathrm {SKWKE}_{\mathbb {G}_1}\) holds.

Proof

Assume \(\mathcal {A}\) is a \(\mathrm {KWKE}\) or \(\mathrm {SKWKE}\) adversary, s.t.: given public parameters \(\mathsf {p}\) and randomness \(r\), \(\mathcal {A}(\mathsf {p}; r)\) outputs with probability \(\varepsilon _\mathcal {A}\) a language parameter \(\varrho = [\varvec{M}]_{1}\) and a public key \(\mathsf {pk}\), such that \(\mathsf {PKV}\) accepts (in particular, \(\det \varvec{\bar{A}} \ne 0\) and \(\varvec{M}^\top \varvec{C} = \varvec{P} \varvec{\bar{A}}\)).

(i: security of \(\mathrm {KWKE}\)): Assume \(\mathcal {A}\) is a \(\mathrm {KWKE}\) adversary. Let \(\mathsf {Ext}^{hak}_{\mathcal {A}}\) be the extractor whose existence is guaranteed by the \(\mathcal {D}_{k}\)-HAK assumption. Figure 5 depicts a candidate \(\mathrm {KWKE}\)-extractor \(\mathsf {Ext}_\mathcal {A}\), where \([q_{\iota i}]_{\iota }\) for \(i > 0\) are group elements created by \(\mathcal {A}\) (for which she does not know the discrete logarithm) in \(\mathbb {G}_\iota \), and \(q_{\iota 0} = 1\). Due to the \(\mathcal {D}_{k}\)-HAK assumption, \(\mathsf {Ext}^{hak}_\mathcal {A}\) can extract \(\varvec{N}_\iota \) and \([\varvec{q}_\iota ]_{\iota }\), such that \( \left[ {\begin{matrix}{\text {vect}}(\varvec{M}) \\ {\text {vect}}(\varvec{P})\end{matrix}}\right] _{1} = \varvec{N}_1 \left[ {\begin{matrix} 1 \\ \varvec{q}_1 \end{matrix}}\right] _1 \in \mathbb {G}_1^{mn+ mk} \) and \( \left[ {\begin{matrix}{\text {vect}}(\varvec{\bar{A}}) \\ {\text {vect}}(\varvec{C}) \end{matrix}}\right] _{2} = \varvec{N}_2 [{\begin{matrix}1 \\ \varvec{q}_2\end{matrix}}]_{2} \in \mathbb {G}_2^{k^2 + nk}\). Here, \({\text {vect}}(\varvec{B})\) denotes the vectorization of a matrix \(\varvec{B}\). Thus, e.g., \(\bar{A}_{i j} = \sum _{t \ge 0} (\varvec{N}_2)_{k(i - 1) + j, t} q_{2 t}\) and \(C_{i j} = \sum _{t \ge 0} (\varvec{N}_2)_{k(i - 1) + j + k^2, t} q_{2 t}\).
Given \(\varvec{N}_1\) and \(\varvec{N}_2\), one can efficiently compute matrices \(\varvec{M} [j]\), \(\varvec{P} [j]\), \(\varvec{\bar{A}} [i]\) and \(\varvec{C} [i]\), such that the polynomials \(\varvec{M} (\varvec{Q_1}) := \sum _{j \ge 0} \varvec{M} [j] Q_{1 j}\), \(\varvec{P} (\varvec{Q_1}) := \sum _{j \ge 0} \varvec{P} [j] Q_{1 j}\), \(\varvec{\bar{A}} (\varvec{Q}_2) := \sum _{i \ge 0} \varvec{\bar{A}} [i] Q_{2 i}\), and \(\varvec{C} (\varvec{Q}_2) := \sum _{i \ge 0} \varvec{C} [i] Q_{2 i}\) satisfy \([\varvec{M}]_{1} = [\varvec{M} (\varvec{q}_1)]_{1}\), \([\varvec{P}]_{1} = [\varvec{P} (\varvec{q}_1)]_{1}\), \([\varvec{\bar{A}}]_{2} = [\varvec{\bar{A}} (\varvec{q}_2)]_{2}\), and \([\varvec{C}]_{2} = [\varvec{C} (\varvec{q}_2)]_{2}\).

Fig. 5. Extractors \(\mathsf {Ext}_\mathcal {A}(\mathsf {p}; r)\) and \(\mathsf {Ext}^2_\mathcal {A}(\mathsf {p}; r)\) in the proof of Theorem 1

We will now show that \(\mathsf {Ext}_\mathcal {A}\) satisfies the requirements of the extractor in the definition of \(\mathrm {KWKE}\). Assume that \(\mathcal {A}\) was successful with inputs \((\mathsf {p}; r)\). We execute \(\mathsf {Ext}_\mathcal {A}(\mathsf {p}; r)\) and obtain either \(\varvec{K}\) or \(\bot \). From (*) in \(\mathsf {PKV}\) (i.e., \(\varvec{M}^\top \varvec{C} = \varvec{P} \varvec{\bar{A}}\)), \( V(\varvec{Q}_1, \varvec{Q}_2) := (\sum _{j \ge 0} \varvec{M} [j] Q_{1 j})^\top \cdot (\sum _{i \ge 0} \varvec{C} [i] Q_{2 i}) - (\sum _{j \ge 0} \varvec{P} [j] Q_{1 j}) \cdot (\sum _{i \ge 0} \varvec{\bar{A}} [i] Q_{2 i}) \) satisfies \(V(\varvec{q}_1, \varvec{q}_2) = \varvec{0}\). We now consider the following two cases, \(V(\varvec{Q}_1, \varvec{Q}_2) = \varvec{0}\) as a polynomial and \(V(\varvec{Q}_1, \varvec{Q}_2) \ne \varvec{0}\) but \(V(\varvec{q}_1, \varvec{q}_2) = \varvec{0}\).

Case 1: \(V(\varvec{Q}_1, \varvec{Q}_2) = \varvec{0}_{m\times k}\) as a polynomial. Since \(Q_{1 j}\) and \(Q_{2 i}\) are indeterminates for all \(i, j > 0\), the coefficients \(V_{i j}\) of \(Q_{1 j} Q_{2 i}\) of \(V(\varvec{Q}_1, \varvec{Q}_2) = \sum _{i \ge 0, j \ge 0} V_{i j} Q_{1 j} Q_{2 i}\) must be equal to \(\varvec{0}_{m\times k}\) for all \(i, j \ge 0\). In particular,

$$\begin{aligned} \varvec{P} [j] \cdot \varvec{\bar{A}} [i] = \varvec{M} [j]^\top \varvec{C} [i]\,\, , \quad i \ge 0, j \ge 0\,\, . \end{aligned}$$
(1)

Let \(\varvec{\bar{A}} (\varvec{Q}_2) = \sum \varvec{\bar{A}} [i] Q_{2 i} \in \mathbb {Z}_p^{k\times k} [\varvec{Q}_2]\) be an affine multivariate matrix polynomial and let the polynomial \(d (\varvec{Q}_2) := \det (\varvec{\bar{A}} (\varvec{Q}_2)) \in \mathbb {Z}_p [\varvec{Q}_2]\) be its determinant. Clearly, \(\deg (d (\varvec{Q}_2)) \le k\), and \(\varvec{\bar{A}} (\varvec{Q}_2)\) is invertible iff \(d (\varvec{Q}_2) \ne 0\) as a polynomial. Since \(d (\varvec{q}_2) = \det \varvec{\bar{A}} \ne 0\), \(d (\varvec{Q}_2) \ne 0\) and thus \(\varvec{\bar{A}} (\varvec{Q}_2)\) is invertible. This holds by definition for efficiently verifiable \(\mathcal {D}_{k}\). If \(\mathcal {D}_{k} = \mathcal {U}_2\), then \([a_{1 s}]_{1} [1]_{2} = [1]_{1} [a_{1 s}]_{2}\), for \(s \in \{1, 2\}\), and \([a_{1 1}]_{1} [a_{2 2}]_{2} \ne [a_{1 2}]_{1} [a_{2 1}]_{2}\) guarantee that \(d (\varvec{Q}_2) \ne 0\).

By the Schwartz-Zippel lemma, \(d (\varvec{y}) = 0\) for a uniformly sampled \(\varvec{y}\) (and thus \(\mathsf {Ext}_\mathcal {A}\) aborts in step \((\sharp )\)) with probability at most \(k/ p\). Thus, \(\varvec{\bar{A}} (\varvec{y})\) is invertible with probability at least \(\varepsilon _\mathcal {A}- k/ p\).
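The Schwartz-Zippel bound used here can be checked exhaustively on a toy instance. The following is a minimal sketch, not part of the construction: the prime `P_MOD = 101`, the dimension `K = 2`, and the coefficient matrices in `A_COEFFS` are arbitrary illustrative choices. It counts the points \(\varvec{y}\) at which the determinant of an affine matrix polynomial vanishes and compares the count with the bound \(k\cdot p^{t - 1}\) for \(t\) indeterminates.

```python
from itertools import product

P_MOD = 101  # toy prime (assumption); real group orders are far larger
K = 2        # matrix dimension k

# Hypothetical coefficients of A(Q) = A[0] + A[1]*Q_1 + A[2]*Q_2.
# A[0] is the identity, so det(A(0, 0)) = 1 and the determinant
# polynomial is guaranteed to be non-zero.
A_COEFFS = [
    [[1, 0], [0, 1]],
    [[3, 5], [7, 2]],
    [[4, 1], [6, 9]],
]

def eval_matrix(coeffs, y):
    """Evaluate the affine matrix polynomial at the point y (mod P_MOD)."""
    point = (1,) + tuple(y)
    return [[sum(c[i][j] * t for c, t in zip(coeffs, point)) % P_MOD
             for j in range(K)] for i in range(K)]

def det2(m):
    """Determinant of a 2x2 matrix modulo P_MOD."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % P_MOD

def count_singular_points(coeffs):
    """Count all y in Z_p^t for which A(y) is singular."""
    num_vars = len(coeffs) - 1
    return sum(1 for y in product(range(P_MOD), repeat=num_vars)
               if det2(eval_matrix(coeffs, y)) == 0)

NUM_VARS = len(A_COEFFS) - 1
ZEROS = count_singular_points(A_COEFFS)
# Schwartz-Zippel: a non-zero polynomial of total degree <= K has at most
# K * p^(t-1) zeros in Z_p^t, i.e. a zero probability of at most K / p.
BOUND = K * P_MOD ** (NUM_VARS - 1)
```

Dividing `ZEROS` by `P_MOD ** NUM_VARS` gives the exact abort probability of the toy extractor, which by the lemma cannot exceed `K / P_MOD`.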

Assume now that \(\varvec{\bar{A}} (\varvec{y})\) is invertible. Define \( \varvec{K} (\varvec{Q}_2) := \varvec{C} (\varvec{Q}_2) \varvec{\bar{A}}^{-1} (\varvec{Q}_2) = (\sum _{i \ge 0} \varvec{C} [i] Q_{2 i}) (\sum _{i \ge 0} \varvec{\bar{A}} [i] Q_{2 i})^{-1} \in \mathbb {Z}_{p}^{n\times k} (\varvec{Q}_2)\). Let \(\varvec{K} := \varvec{K} (\varvec{y})\). Since \(\varvec{\bar{A}} (\varvec{y})\) is invertible then from Eq. (1), \( \varvec{P} [j] \cdot \varvec{\bar{A}} (\varvec{y}) = \varvec{P} [j] \cdot \left( \sum _i \varvec{\bar{A}} [i] y_i\right) = \varvec{M} [j]^\top \left( \sum _i \varvec{C} [i] y_i\right) = \varvec{M} [j]^\top \varvec{C} (\varvec{y})\). Thus, \(\varvec{P} [j] = \varvec{M} [j]^\top \varvec{K}\), and \(\varvec{P} (\varvec{Q}_1) = \varvec{M} (\varvec{Q}_1)^\top \varvec{K}\). Hence, with probability \(\varepsilon _{\mathsf {Ext}_\mathcal {A}} \ge \varepsilon _\mathcal {A}- k/ p\), \( \varvec{P} (\varvec{Q}_1) = \sum _{j \ge 0} \varvec{P} [j] Q_{1 j} = \varvec{M} (\varvec{Q}_1)^\top \varvec{K} \). Thus, \(|\varepsilon _{\mathsf {Ext}_\mathcal {A}} - \varepsilon _\mathcal {A}| \le k/ p\) and the claim follows.

Case 2: \(V(\varvec{Q}_1, \varvec{Q}_2) \ne \varvec{0}\) but \(V(\varvec{q}_1, \varvec{q}_2) = \varvec{0}\). Following [37], we consider separately the “non-hashing” case (the adversary creates no random elements \([q_\iota ]_{\iota }\)) and the “hashing” case (the adversary creates at least one random element that has high min-entropy).

In the non-hashing case, the verification polynomial is equal to the integer matrix \( V:= \varvec{M} [0]^\top \varvec{C} [0] - \varvec{P} [0] \cdot \varvec{\bar{A}} [0] \). Recall that \(V(\varvec{Q}_1, \varvec{Q}_2) \ne \varvec{0}\) but \(V(\varvec{q}_1, \varvec{q}_2) = \varvec{0}\). Since we are in the non-hashing case, there are no created group elements. Thus, the adversary cannot succeed in the non-hashing case: the polynomial \(V\) is constant, and we would need \(V= 0\) and \(V\ne 0\) at the same time.

Consider now the “hashing” case when \(\mathcal {A}\) has created at least one random group element \(q_{k}\) (say, in \(\mathbb {G}_1\)). Clearly, \(V(\varvec{Q}_1, \varvec{Q}_2)\) is a degree-1 polynomial in any indeterminate \(Q_{k}\). Thus, by the Schwartz-Zippel lemma and since \(H_\infty ([q_{\iota s}]_{\iota }) = \omega (\log \lambda )\), the probability \(1 / 2^{\sum _{\iota , s} H_\infty ([q_{\iota s}]_{\iota })}\) that \(V(\varvec{q}_1, \varvec{q}_2) = 0\) is negligible. Hence, the probability that an adversary, who created at least one (high min-entropy) group element \([q_{k}]_{1}\), can make the verifier accept is negligible.

(ii: security of \(\mathrm {SKWKE}\)): Let \(\mathcal {A}\) be an \(\mathrm {SKWKE}\) adversary that works in time \(\tau (\lambda )\) and outputs a language parameter and a public key accepted by \(\mathsf {PKV}\) with probability \(\varepsilon _\mathcal {A}\). To prove that \(\mathrm {SKWKE}\) is secure, we need to additionally show that \(\varvec{C} = \varvec{K} \varvec{\bar{A}}\). In the process, we need to assume that \(\mathcal {D}_{\mathsf {p}}\)-\(\mathrm {WKerMDH}\) is hard against \(\tau (\lambda )\)-time adversaries. The general proof works exactly as in the \(\mathrm {KWKE}\) case, except for one change that we discuss below. (In particular, Case 2 is exactly the same.) We omit other details of the proof.

The main idea is that in the proof step (i) we already established that \(\varvec{C} (\varvec{Q}_2) = \varvec{K} (\varvec{Q}_2) \varvec{\bar{A}} (\varvec{Q}_2)\) as polynomials. In the current step, we need to show that \(\varvec{C} (\varvec{Q}_2) = \varvec{K} \varvec{\bar{A}} (\varvec{Q}_2)\) holds, that is, \(\varvec{K} (\varvec{Q}_2)\) is a constant function. To guarantee the latter, we check the value of the rational function \(\varvec{K} (\varvec{Q}_2)\) at two positions. If the two values are different, we can break \(\mathcal {D}_{\mathsf {p}}\)-\(\mathrm {WKerMDH}\). Otherwise, w.h.p., \(\varvec{K} (\varvec{Q}_2)\) is a constant function.

More precisely, consider the extractor \(\mathsf {Ext}^2_\mathcal {A}\) in Fig. 5. Here, \(\varvec{K} = \varvec{K} (\varvec{y})\) and \(\varvec{K}' = \varvec{K} (\varvec{y}')\). Let \(\varepsilon _\mathcal {A}\) be the success probability of \(\mathcal {A}\). Analogously to the security proof of \(\mathrm {KWKE}\), with probability \(\varepsilon _\mathcal {A}- 2 k/ p\), both \(\varvec{\bar{A}} (\varvec{y})\) and \(\varvec{\bar{A}} (\varvec{y}')\) are invertible and thus \(\mathsf {Ext}^2_\mathcal {A}\) does not return \(\bot \).

Assume now that \(\mathsf {Ext}^2_\mathcal {A}\) does not return \(\bot \). By following similar analysis as in the case (i), \(\varvec{P} (\varvec{Q}_1) = \varvec{M} (\varvec{Q}_1)^\top \varvec{K}\) and \(\varvec{P} (\varvec{Q}_1) = \varvec{M} (\varvec{Q}_1)^\top \varvec{K}'\) which means that \( \varvec{M} (\varvec{Q}_1)^\top (\varvec{K} - \varvec{K}') = \varvec{0}_{m\times k}\). If \(\varvec{K} \ne \varvec{K}'\) then \(\mathsf {Ext}_\mathcal {A}\) has computed a non-zero element \(\varvec{K} - \varvec{K}'\) in the cokernel of \([\varvec{M}]_{1}\) and thus broken \(\mathcal {D}_{\mathsf {p}}\)-\(\mathrm {WKerMDH}_{\mathbb {G}_{1}}\). Since breaking \(\mathcal {D}_{\mathsf {p}}\)-\(\mathrm {WKerMDH}\) is hard within \(\tau (\lambda )\) steps, the probability \(\varepsilon _{\mathrm {WKerMDH}}\) that \(\mathsf {Ext}_\mathcal {A}\) returns \(\varvec{K} - \varvec{K}'\) is negligible unless \(\mathcal {A}\) has computational complexity \(\omega (\tau (\lambda ))\). Otherwise, \(\varvec{K} = \varvec{K} (\varvec{y}) = \varvec{K} (\varvec{y}')\), which means \(\varvec{f} (\varvec{y}) = \varvec{f} (\varvec{y}') = \varvec{0}\), where \( \varvec{f} (\varvec{Q}_2) := \varvec{C} (\varvec{Q}_2) \varvec{\bar{A}}^{-1} (\varvec{Q}_2) - \varvec{K}\). Denote the (ij)th coefficient of the matrix \(\varvec{f} (\varvec{Q}_2)\) by \( f_{i j} (\varvec{Q}_2) = \sum _{s} C_{i s} (\varvec{Q}_2) \bar{A}_{s j}^{-1} (\varvec{Q}_2) - K_{i j} \). Note that \(f_{i j} (\varvec{Q}_2) = f'_{i j} (\varvec{Q}_2) / \det (\varvec{\bar{A}} (\varvec{Q}_2))\), where \(f'_{i j} (\varvec{Q}_2)\) is some polynomial of degree \(\le k\).

At this point, we know that \(\det (\varvec{\bar{A}} (\varvec{Q}_2)) \ne 0\). Thus, \(\varvec{f} (\varvec{Q}_2) \ne \varvec{0}\) iff \(\varvec{C} (\varvec{Q}_2) - \varvec{K} \varvec{\bar{A}} (\varvec{Q}_2) \ne \varvec{0}\). From this and the Schwartz-Zippel lemma it follows that if \(f_{i j} (\varvec{Q}_2) \ne 0\) then \(\Pr _{\varvec{y}} [f_{i j} (\varvec{y}) = 0] \le k/ p\). If \(\varvec{f} (\varvec{Q}_2) \ne \varvec{0}\) then there exists at least one \((i_0, j_0)\), s.t. \(f_{i_0, j_0} (\varvec{Q}_2) \ne 0\) and thus \(\Pr _{\varvec{y}} [f_{i_0, j_0} (\varvec{y}) = 0] \le k/ p\). Thus, if \(\varvec{f} (\varvec{Q}_2) \ne \varvec{0}\) then \(\Pr _{\varvec{y}} [\varvec{f} (\varvec{y}) = \varvec{0}] \le k/ p\).

Hence, with probability \(\varepsilon _{\mathsf {Ext}^2_\mathcal {A}} \ge \varepsilon _\mathcal {A}- 3 k/ p - \varepsilon _{\mathrm {WKerMDH}}\), \(\varvec{C} (\varvec{Q}_2) = \varvec{K} \varvec{\bar{A}} (\varvec{Q}_2)\) and thus \(\varvec{P} (\varvec{Q}_1) = \varvec{M} (\varvec{Q}_1)^\top \varvec{K}\) and \(\varvec{C} = \varvec{K} \varvec{\bar{A}}\). Thus, \(|\varepsilon _{\mathsf {Ext}^2_\mathcal {A}} - \varepsilon _\mathcal {A}| \le 3 k/ p + \varepsilon _{\mathrm {WKerMDH}}\) and the security of \(\mathrm {SKWKE}\) follows.   \(\square \)
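The two-point consistency check at the heart of part (ii) can be illustrated with a toy computation over a small field. The sketch below is only an illustration under stated assumptions: the prime `P_MOD`, the fixed evaluation points, and the helper names (`k_at`, `consistent_k`) are hypothetical, the matrix \(\varvec{M}\) is omitted, and all values are plain field elements rather than group elements. It evaluates \(\varvec{K} (\varvec{Q}_2) = \varvec{C} (\varvec{Q}_2) \varvec{\bar{A}}^{-1} (\varvec{Q}_2)\) at two points and accepts only when both evaluations agree.

```python
P_MOD = 101  # toy prime (assumption)

def mat_mul(a, b):
    """Matrix product modulo P_MOD."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) % P_MOD
             for j in range(len(b[0]))] for i in range(len(a))]

def eval_affine(coeffs, y):
    """Evaluate coeffs[0] + sum_i coeffs[i] * y_i modulo P_MOD."""
    point = [1] + list(y)
    rows, cols = len(coeffs[0]), len(coeffs[0][0])
    return [[sum(s * m[i][j] for m, s in zip(coeffs, point)) % P_MOD
             for j in range(cols)] for i in range(rows)]

def inv2(m):
    """Inverse of a 2x2 matrix modulo P_MOD, or None if singular."""
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % P_MOD
    if det == 0:
        return None
    d = pow(det, -1, P_MOD)  # modular inverse (Python 3.8+)
    return [[m[1][1] * d % P_MOD, -m[0][1] * d % P_MOD],
            [-m[1][0] * d % P_MOD, m[0][0] * d % P_MOD]]

def k_at(c_coeffs, a_coeffs, y):
    """Return K(y) = C(y) * A(y)^{-1}, or None if A(y) is singular."""
    a_inv = inv2(eval_affine(a_coeffs, y))
    if a_inv is None:
        return None
    return mat_mul(eval_affine(c_coeffs, y), a_inv)

def consistent_k(c_coeffs, a_coeffs, y, y_prime):
    """Two-point check: return K only if K(y) and K(y') exist and agree."""
    k1 = k_at(c_coeffs, a_coeffs, y)
    k2 = k_at(c_coeffs, a_coeffs, y_prime)
    return k1 if k1 is not None and k1 == k2 else None

# Hypothetical toy data with n = 3, k = 2 and one indeterminate.
A_COEFFS = [[[1, 0], [0, 1]], [[1, 0], [0, 0]]]
K_TRUE = [[3, 8], [5, 1], [9, 4]]
K_OTHER = [[4, 8], [5, 1], [9, 4]]
# Well-formed key: every coefficient of C uses the same K.
C_HONEST = [mat_mul(K_TRUE, a) for a in A_COEFFS]
# Malformed key: the two coefficients of C use different K's.
C_MIXED = [mat_mul(K_TRUE, A_COEFFS[0]), mat_mul(K_OTHER, A_COEFFS[1])]
```

In the proof the two points are sampled uniformly; fixed points such as `[5]` and `[17]` are used here only to keep the example reproducible. For `C_HONEST` the check returns `K_TRUE`, while for `C_MIXED` the rational function is not constant and the two evaluations differ.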

In the case of \(\mathrm {SKWKE}\), we extract the unique \(\varvec{K}\) used to compute the CRS. Following a proof idea from [2], it is easy to show that \(\varPi _{\mathsf {bpk}}\) is Sub-ZK under the \(\mathrm {KWKE}\) assumption (and thus also under the \(\mathrm {SKWKE}\) assumption).

New Interactive Assumptions \(\varvec{\mathrm {KerMDH}^{\mathrm {dl}}}\) and \(\varvec{\mathrm {SKerMDH}^{\mathrm {dl}}}\). Since in the case of efficiently verifiable \(\mathcal {D}_{k}\), we essentially do not modify \(\varPi _{\mathsf {bpk}}\) (we only define \(\mathsf {PKV}\)), its Sub-PAR soundness almost follows from that of \(\varPi _{\mathsf {kw}}\) [31]. The main difference is that, due to considering the subverted language parameter, we need to change how one extracts \(\varvec{M}\). Namely, in [31], the \(\mathrm {KerMDH}\) adversary \(\mathcal {B}\) defined in the soundness reduction obtains \(([\varvec{M}]_{1}, \varvec{M})\) sampled from \(\mathcal {D}'_\mathsf {p}\) (this relies on the witness-sampleability). In our proof of Sub-PAR soundness (Theorem 2 in Sect. 7), \(\mathcal {B}\) obtains only \([\varvec{M}]_{1}\) and then uses a non-adaptive DL oracle to extract \(\varvec{M}\). This means that we prove Sub-PAR soundness under a new interactive non-falsifiable \(\mathrm {KerMDH}^{\mathrm {dl}}\) assumption; however, importantly, we do not require witness-sampleability.

Since in some applications (e.g., in the setting of symmetric pairings), one uses \(\mathcal {D}_2 = \mathcal {U}_2\), we prove that if \(k= 2\) and \(\mathcal {D}_{k} = \mathcal {U}_{k}\), then \(\varPi _{\mathsf {bpk}}\) is sound under another new interactive non-falsifiable \(\mathrm {SKerMDH}^{\mathrm {dl}}\) assumption. Intuitively, in this case, \(\mathsf {pk}^{\mathsf {pkv}}\) contains additional elements, needed to efficiently check that \([\varvec{\bar{A}}]_{2}\) has full rank. If \(\mathcal {D}_{k}\) is efficiently verifiable then by definition, \(\mathsf {pk}^{\mathsf {pkv}}= \varepsilon \) (empty string) is sufficient. Since for efficiency reasons, one is interested in only small values of \(k\), we will not consider the case of non-verifiable \(\mathcal {D}_{k}\) with \(k> 2\).

In addition, we are interested in applying the QA-NIZK in the case \(\varvec{M}\) has rank n (i.e., the image of \(\varvec{M}\) is the full space). Since then soundness holds trivially, one must prove knowledge-soundness. We show that in this case, \(\varPi _{\mathsf {bpk}}\) is Sub-PAR knowledge-sound under two non-falsifiable assumptions: a HAK knowledge assumption and the new interactive \(\mathrm {SDL^{dl}}\) assumption. The \(\mathrm {KerMDH}^{\mathrm {dl}}\), \(\mathrm {SKerMDH}^{\mathrm {dl}}\), and \(\mathrm {SDL^{dl}}\) assumptions are \(X^Y\)-type interactive assumptions as used in [20, 34], where the assumption X is assumed to hold even if the adversary is given non-adaptive access (i.e., before the X challenge is chosen) to an oracle that solves the assumption Y.

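The \(\mathrm {SDL^{dl}}\) assumption holds relative to \(\mathsf {PGen}\), if for any PPT \(\mathcal {A}\) (with state \(\mathsf {st}\) passed from the query phase to the challenge phase),

$$\begin{aligned} \Pr \left[ \mathsf {p}\leftarrow \mathsf {PGen}(1^\lambda ),\ \mathsf {st} \leftarrow \mathcal {A}^{\mathrm {dl} (\cdot )} (\mathsf {p}),\ x\leftarrow _{\$} \mathbb {Z}_p : \mathcal {A}(\mathsf {st}, [x]_{1}, [x]_{2}) = x\right] \approx _\lambda 0 \,\, . \end{aligned}$$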

Here, the oracle \(\mathrm {dl} ([y]_{1})\) returns the discrete logarithm y of \([y]_{1}\).

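The \(\mathcal {D}_{\ell k}\)-\(\mathrm {KerMDH}^{\mathrm {dl}}_{\mathbb {G}_{1}}\) assumption holds relative to \(\mathsf {PGen}\), if for any PPT \(\mathcal {A}\) (with state \(\mathsf {st}\) passed from the query phase to the challenge phase),

$$\begin{aligned} \Pr \left[ \mathsf {p}\leftarrow \mathsf {PGen}(1^\lambda ),\ \mathsf {st} \leftarrow \mathcal {A}^{\mathrm {dl} (\cdot )} (\mathsf {p}),\ \varvec{A} \leftarrow _{\$} \mathcal {D}_{\ell k},\ [\varvec{c}]_{2} \leftarrow \mathcal {A}(\mathsf {st}, [\varvec{A}]_{1}) : \varvec{A}^\top \varvec{c} = \varvec{0}_{k} \wedge \varvec{c} \ne \varvec{0}_{\ell } \right] \approx _\lambda 0 \,\, . \end{aligned}$$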

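The \(\mathcal {D}_{\ell k}\)-\(\mathrm {SKerMDH}^{\mathrm {dl}}\) assumption holds relative to \(\mathsf {PGen}\), if for any PPT \(\mathcal {A}\) (with state \(\mathsf {st}\) passed from the query phase to the challenge phase),

$$\begin{aligned} \Pr \left[ \begin{aligned}&\mathsf {p}\leftarrow \mathsf {PGen}(1^\lambda ),\ \mathsf {st} \leftarrow \mathcal {A}^{\mathrm {dl} (\cdot )} (\mathsf {p}),\ \varvec{A} \leftarrow _{\$} \mathcal {D}_{\ell k},\\&([\varvec{s}_1]_{1}, [\varvec{s}_2]_{2}) \leftarrow \mathcal {A}(\mathsf {st}, [\varvec{A}]_{1}, [\varvec{A}]_{2}) : \varvec{A}^\top (\varvec{s}_1 - \varvec{s}_2) = \varvec{0}_{k} \wedge \varvec{s}_1 \ne \varvec{s}_2 \end{aligned} \right] \approx _\lambda 0 \,\, . \end{aligned}$$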

Generic-model security proofs of \(\mathrm {SDL^{dl}}\) and \(\mathrm {SKerMDH}^{\mathrm {dl}}\) are very similar to those of \(\mathrm {SDL}\) and \(\mathrm {KerMDH}\): the field elements returned by the DL oracle are independent of the challenge and thus do not influence the rest of the proof.

One could use an AK assumption instead of the \(\mathrm {SDL^{dl}}\) assumption. However, the AK assumption explicitly does not allow \(\mathcal {A}\) to create new group elements by using elliptic-curve hashing. The \(\mathrm {SDL^{dl}}\) assumption allows the adversary to create such group elements, but gives access to a non-adaptive DL oracle to extract their discrete logarithms. It is also not an expanding assumption, unlike many knowledge assumptions (e.g., the PKE assumption [26] that underlies many pairing-based SNARKs) that allow one to extract a long “plaintext” from a short “ciphertext”. Hence, the \(\mathrm {SDL^{dl}}\) assumption, while still non-falsifiable, seems to be somewhat more realistic than an AK assumption. On the other hand, we need to extract \(\varvec{y}\) and \(\varvec{\pi }\) from \(\mathcal {A}\)’s output after the challenge is known, that is, adaptively. In this case, a knowledge assumption (HAK) is more realistic than an adaptive DL oracle, which one could simply use to break SDL directly.

7 Security of \(\varPi _{\mathsf {bpk}}\)

Theorem 2

Let \(\varPi _{\mathsf {bpk}}\) be the QA-NIZK argument system for linear subspaces from Fig. 4. The following statements hold in the BPK model. Assume that \(\mathcal {D}_{\mathsf {p}}\) is such that \(\mathsf {PARV}\) is efficient.

  1. (i)

    \(\varPi _{\mathsf {bpk}}\) is perfectly complete and perfectly zero-knowledge.

  2. (ii)

    If \((\mathcal {D}_{\mathsf {p}}, k, \mathcal {D}_{k})\)-\(\mathrm {KWKE}_{\mathbb {G}_{1}}\) holds relative to \(\mathsf {PGen}\) then \(\varPi _{\mathsf {bpk}}\) is statistically persistent zero-knowledge.

  3. (iii)

    Assume \(\mathcal {D}_{k}\) is efficiently verifiable (resp., \(\mathcal {D}_{k} = \mathcal {U}_2\)). If \(\mathcal {D}_{k}\)-\(\mathrm {KerMDH}^{\mathrm {dl}}\) (resp., \(\mathcal {D}_{k}\)-\(\mathrm {SKerMDH}^{\mathrm {dl}}\)) holds relative to \(\mathsf {PGen}\) then \(\varPi _{\mathsf {bpk}}\) is computationally quasi-adaptively Sub-PAR sound.

  4. (iv)

    Assume \(\varvec{M}\) has rank n (\(\varvec{y} = \varvec{M} \varvec{w}\) always has a solution), and that \(\mathcal {D}_{k}\) is robust. If \(\mathrm {SDL^{dl}}\) and \(\mathsf {KGen}([\varvec{M}]_{1})\)-HAK, for arbitrary efficiently computable \([\varvec{M}]_{1}\), hold relative to \(\mathsf {PGen}\) then \(\varPi _{\mathsf {bpk}}\) is computationally quasi-adaptively Sub-PAR knowledge-sound.

Proof

(i: perfect completeness/perfect zero-knowledge): obvious.

(ii: persistent zero-knowledge): Let \(\mathcal {C}\) be a subverter that computes a language parameter and a public key so as to break the Sub-ZK property. That is, the output of \(\mathcal {C}(\mathsf {p}; r_\mathcal {C})\) includes \(\varrho = [\varvec{M}]_{1}\) and \(\mathsf {pk}\). Let \(\mathcal {B}\) be the adversary from Fig. 6. Note that \(\mathsf {RND}_\lambda (\mathcal {B}) = \mathsf {RND}_\lambda (\mathcal {C})\). Under the \((\mathcal {D}_{\mathsf {p}}, k, \mathcal {D}_{k})\)-\(\mathrm {KWKE}\) assumption, there exists an extractor \(\mathsf {Ext}^2_\mathcal {B}\), such that if \(\mathsf {PARV}([\varvec{M}]_{1}) = 1\) and \(\mathsf {PKV}\) accepts \(\mathsf {pk}\) then \(\mathsf {Ext}^2_\mathcal {B}(\mathsf {p}; r_\mathcal {C})\) outputs \(\varvec{K}\), such that \(\varvec{P} = \varvec{M}^\top \varvec{K}\). We construct a trivial extractor \(\mathsf {Ext}_\mathcal {C}(\mathsf {p}; r_\mathcal {C})\) for \(\mathcal {C}\), as depicted in Fig. 6. Clearly, \(\mathsf {Ext}_\mathcal {C}\) returns \(\varvec{K}\), such that \(\varvec{P} = \varvec{M}^\top \varvec{K}\).

Fig. 6. The extractor and the constructed adversary \(\mathcal {B}\) from the persistent zero-knowledge proof of Theorem 2.

Fix concrete values of \(\lambda \), \(\mathsf {p}\in {\text {im}}(\mathsf {PGen}(1^\lambda ))\) and \(r_\mathcal {C}\in \mathsf {RND}_\lambda (\mathcal {C})\). Let \([\varvec{M}]_{1}\) and \(\mathsf {pk}\) be as output by \(\mathcal {C}(\mathsf {p}; r_\mathcal {C})\), and run \(\mathsf {Ext}_\mathcal {C}(\mathsf {p}; r_\mathcal {C})\) to obtain \(\varvec{K}\). Fix \(([\varvec{y}]_{1}, \varvec{w}) \in \mathscr {R}_{[\varvec{M}]_{1}}\). It clearly suffices to show that if \(\mathsf {PARV}([\varvec{M}]_{1}) = 1\), \(\mathsf {PKV}\) accepts \(\mathsf {pk}\), and \(([\varvec{y}]_{1}, \varvec{w}) \in \mathscr {R}_{[\varvec{M}]_{1}}\) then \(\mathsf {O}_0 ([\varvec{y}]_{1}, \varvec{w})\) and \(\mathsf {O}_1 ([\varvec{y}]_{1}, \varvec{w})\) have the same distribution. This holds since from \(\mathsf {PKV}\) accepting \(\mathsf {pk}\) it follows that \(\varvec{P} = \varvec{M}^\top \varvec{K}\) and from \(([\varvec{y}]_{1}, \varvec{w}) \in \mathscr {R}_{[\varvec{M}]_{1}}\) it follows that \(\varvec{y} = \varvec{M} \varvec{w}\). Thus, \( \mathsf {O}_0 ([\varvec{y}]_{1}, \varvec{w}) = [\varvec{P}]_{1}^\top \varvec{w} = [\varvec{K}^\top \varvec{M} \varvec{w}]_{1} = \varvec{K}^\top [\varvec{y}]_{1} = \mathsf {O}_1 ([\varvec{y}]_{1}, \varvec{w}) \). Hence, \(\mathsf {O}_0\) and \(\mathsf {O}_1\) have the same distribution, and thus, \(\varPi _{\mathsf {bpk}}\) is persistent zero-knowledge under \(\mathrm {KWKE}\).
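The identity \([\varvec{P}]_{1}^\top \varvec{w} = \varvec{K}^\top [\varvec{y}]_{1}\) behind this argument can be checked numerically. The following is a minimal sketch under explicit assumptions: it works with discrete logarithms in \(\mathbb {Z}_p\) for a toy prime instead of actual group elements, and the matrices, the witness, and the function names `honest_proof` and `simulated_proof` are hypothetical illustrations of the two oracles.

```python
P_MOD = 101  # toy prime (assumption); all values are exponents in Z_p

def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    """Matrix product modulo P_MOD."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) % P_MOD
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(m, v):
    """Matrix-vector product modulo P_MOD."""
    return [sum(row[j] * v[j] for j in range(len(v))) % P_MOD for row in m]

# Hypothetical toy instance with n = 3, m = 2, k = 1.
M = [[2, 7], [5, 1], [9, 4]]        # language parameter (n x m)
K = [[3], [8], [6]]                 # extracted secret key (n x k)
P_MAT = mat_mul(transpose(M), K)    # P = M^T K (m x k)

def honest_proof(p_mat, w):
    """Prover side: pi = P^T w, computed from the witness w."""
    return mat_vec(transpose(p_mat), w)

def simulated_proof(k_mat, y):
    """Simulator side: pi = K^T y, computed from the statement y."""
    return mat_vec(transpose(k_mat), y)

W = [11, 23]        # witness
Y = mat_vec(M, W)   # statement y = M w
```

Both functions return the same vector for any witness `W`, because \(\varvec{P}^\top \varvec{w} = \varvec{K}^\top \varvec{M} \varvec{w} = \varvec{K}^\top \varvec{y}\); this is exactly why the extracted \(\varvec{K}\) suffices for simulation.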

(iii: \(\mathcal {D}_{k}\) is efficiently verifiable, Sub-PAR soundness under \(\mathrm {KerMDH}^{\mathrm {dl}}\)): follows directly from the soundness proof of \(\varPi _{\mathsf {kw}}\) in [31]. There is only one difference: If \([\varvec{M}]_{1}\) is not subverted (like in [31]), then one can use the witness-sampleability of \(\mathcal {D}_{\mathsf {p}}\) to extract \(\varvec{M}\), and get a reduction to the falsifiable \(\mathrm {KerMDH}\) assumption. In the case of Sub-PAR soundness, since the language parameter can be subverted (and thus one cannot rely on witness-sampleability), we let \(\mathcal {B}\) use the DL oracle to obtain \(\varvec{M}\) from \([\varvec{M}]_{1}\) and then use it in the soundness proof of [31] to get a reduction to the non-falsifiable \(\mathrm {KerMDH}^{\mathrm {dl}}\) assumption. Importantly, in this case, witness-sampleability is not needed.

(iii: \(\mathcal {D}_{k} = \mathcal {U}_2\), Sub-PAR soundness under \(\mathrm {SKerMDH}^{\mathrm {dl}}\)): In the case \(\mathcal {D}_{k} = \mathcal {U}_2\), the proof is similar to the soundness proof of \(\varPi _{\mathsf {kw}}\) in [31]. However, since we added \([a_{1 1}, a_{1 2}]_{1}\) to the public key, we reduce instead to the \(\mathrm {SKerMDH}^{\mathrm {dl}}\) assumption; this complicates the proof.

Assume that \(\mathcal {A}\) breaks the soundness of \(\varPi _{\mathsf {bpk}}\) with probability \(\varepsilon \). We will build an adversary \(\mathcal {B}\), see Fig. 7, that breaks \(\mathrm {SKerMDH}^{\mathrm {dl}}\) with probability \(\ge \varepsilon - 1 / p\). First, \(\mathcal {B}\) uses the DL oracle to obtain \(\varvec{M}\) from \([\varvec{M}]_{1}\); this is needed since \([\varvec{M}]_{1}\) could be subverted. Here, witness-sampleability is not needed. As above, when the language parameter is generated honestly, the DL oracle is not needed, and one instead relies on the witness-sampleability of \(\mathcal {D}_{\mathsf {p}}\) to obtain a reduction to the falsifiable \(\mathrm {SKerMDH}\) assumption.

Fig. 7. Adversary \(\mathcal {B}\) in the soundness proof of Theorem 2 (reduction to \(\mathrm {SKerMDH}^{\mathrm {dl}}\))

Note that in Fig. 7, \([\varvec{\bar{A}}']_{2} = [\varvec{\bar{A}}]_{2} \in \mathbb {G}_2^{k\times k}\). Define implicitly (since we do not know this value) \(\varvec{K} := \varvec{K}' + \varvec{M}^{\bot } \varvec{\underline{A}}' \varvec{\bar{A}}^{-1}\). Thus, \( [\varvec{C}]_{2} = (\varvec{K}' || \varvec{M}^{\bot }) [\varvec{A}']_{2} = [\varvec{K}'\varvec{\bar{A}}' + \varvec{M}^{\bot } \varvec{\underline{A}}']_{2} = [(\varvec{K}' + \varvec{M}^{\bot } \varvec{\underline{A}}' \varvec{\bar{A}}^{-1}) \varvec{\bar{A}}]_{2} = [\varvec{K}\varvec{\bar{A}}]_{2}\) and \( [\varvec{P}]_{1} = [\varvec{M}^\top \varvec{K}']_{1} = [\varvec{M}^\top (\varvec{K} - \varvec{M}^{\bot } \varvec{\underline{A}}' \varvec{\bar{A}}^{-1})]_{1} = [\varvec{M}^\top \varvec{K}]_{1}\). Thus, \(\mathsf {pk}\) has the same distribution as the real public key.

With probability \(\varepsilon \), \(\mathcal {A}\) is successful, that is,

  1. 1.

    \(\varvec{y}^\top \varvec{M}^{\bot } \ne \varvec{0}_{1 \times (n- m)}\) (that is, \(\varvec{y} \not \in {\text {colspace}}(\varvec{M})\)) and thus also \(\varvec{c} = ((\varvec{\pi }^\top - \varvec{y}^\top \varvec{K}') || -\varvec{y}^\top \varvec{M}^{\bot }) \ne \varvec{0}_{n- m+ k}\);

  2. 2.

    \(\varvec{y}^\top \varvec{C} = \varvec{\pi }^\top \varvec{\bar{A}}\) (\(\mathsf {V}\) accepts). Thus, \(\varvec{0}_{1 \times k} = \varvec{\pi }^\top \varvec{\bar{A}} - \varvec{y}^\top \varvec{C} = \left( \varvec{\pi }^\top || \varvec{0}^\top _{n- m}\right) \varvec{A}' - \varvec{y}^\top \left( \varvec{K}' || \varvec{M}^{\bot }\right) \varvec{A}' = \left( (\varvec{\pi }^\top - \varvec{y}^\top \varvec{K}') || -\varvec{y}^\top \varvec{M}^{\bot }\right) \varvec{A}' = \varvec{c}^\top \varvec{A}'\).

By definition, \(\varvec{s}_{1} - \varvec{s}_2 = \varvec{c}_1 + \varvec{R}^\top \varvec{c}_2\) and thus \( (\varvec{s}_{1}^\top - \varvec{s}_2^\top ) \varvec{A} = (\varvec{c}^\top _1 + \varvec{c}^\top _2 \varvec{R}) \varvec{A} = \varvec{c}^\top \varvec{A}' = \varvec{0}_{1 \times k} \). Since \(\varvec{c} \ne \varvec{0}_{n- m+ k}\) and \(\varvec{R}\) leaks only through \(\varvec{A}'\) (in the definition of \([\varvec{C}]_{2}\)) as \(\varvec{R}\varvec{A}\), \( \Pr [\varvec{c}_1 + \varvec{R}^\top \varvec{c}_2 = \varvec{0} \mid \varvec{R}\varvec{A}] \le 1 / p \), where the probability is over the choice of \(\varvec{R}\).

(Item iv: Sub-PAR knowledge-soundness): Our proof strategy is inspired by that of [8, App. F]. However, their proof is given for honestly generated language parameter \(\varrho = [\varvec{M}]_{1}\) and \(\varvec{M}\) is obtained by using witness-sampleability; we modify the proof by extracting \(\varvec{M}\) from \(\varrho \) by using a DL oracle. Thus, we need to use two different types of non-falsifiable assumptions: (1) the non-adaptive \(\mathrm {SDL^{dl}}\) assumption to extract \(\varvec{M}\) from \([\varvec{M}]_{1}\), and (2) knowledge (HAK) assumptions to extract \(\varvec{y}\) and \(\varvec{\pi }\) from \([\varvec{y}]_{1}\) and \([\varvec{\pi }]_{1}\); we use the fact that the verification equation holds to be able to apply HAK. Moreover, we modify the proof of [8] to work for an arbitrary \(k\).

We construct the following \(\mathrm {SDL^{dl}}\) adversary \(\mathcal {B}\) that is given access to a non-adaptive DL oracle in the query phase and then, given a challenge \(([x]_{1}, [x]_{2})\), returns \(x\). First, \(\mathcal {B}\) samples \(r\) and calls \(\mathcal {A}(\mathsf {p}; r)\), obtaining \([\varvec{M}]_{1}\). \(\mathcal {B}\) uses the non-adaptive DL oracle \(nm\) times, extracting the matrix \(\varvec{M} \in \mathbb {Z}_p^{n \times m}\).

In the challenge phase, \(\mathcal {B}\) obtains \(([x]_{1}, [x]_{2})\) from the challenger. After that, \(\mathcal {B}\) samples random \(\varvec{K}_1, \varvec{K}_2 \in \mathbb {Z}_p^{n \times k}\) and implicitly sets \(\varvec{K} := x\varvec{K}_1 + \varvec{K}_2\). \(\mathcal {B}\) then generates \(\mathsf {pk}\) as in the honest key generation for this \(\varvec{K}\), using \(\varvec{M}\), \(\varvec{K}_1\), \(\varvec{K}_2\), and the challenge. Denote \(\varvec{P}' = {\text {vect}}(\varvec{P}) \in \mathbb {Z}_p^{m k}\). \(\mathcal {B}\) sends \(\mathsf {pk}\) to \(\mathcal {A}\) who returns \([\varvec{y}, \varvec{\pi }]_{1}\).

According to the \(\mathsf {KGen}([\varvec{M}]_{1})\)-HAK assumption for arbitrary efficiently computable \([\varvec{M}]_{1}\), given \(\mathcal {A}\) who on input \(\mathsf {pk}\) outputs \([\varvec{y}]_{1} \in \mathbb {G}_1^n\) and \([\varvec{\pi }]_{1} \in \mathbb {G}_1^k\), we can extract \([\varvec{q}]_{1} \in \mathbb {G}_1^{n_q}\), \((\varvec{y}_1, \varvec{y}_2, \varvec{y}_3)\) and \((\varvec{\pi }_1, \varvec{\pi }_2, \varvec{\pi }_3)\), such that

$$\begin{aligned} \begin{aligned}{}[\varvec{y}]_{1} =&\varvec{y}_1 [1]_{1} + \varvec{y}_2 [\varvec{P}']_{1} + \varvec{y}_3 [\varvec{q}]_{1}\,\, , \\ [\varvec{\pi }]_{1} =&\varvec{\pi }_1 [1]_{1} + \varvec{\pi }_2 [\varvec{P}']_{1} + \varvec{\pi }_3 [\varvec{q}]_{1}\,\, , \end{aligned} \end{aligned}$$
(2)

Note that \(\varvec{y}_2 \in \mathbb {Z}_p^{n \times m k}\), \(\varvec{\pi }_2 \in \mathbb {Z}_p^{k\times m k}\), \(\varvec{y_3} \in \mathbb {Z}_p^{n \times n_q}\), and \(\varvec{\pi _3} \in \mathbb {Z}_p^{k \times n_q}\).

We will now write \(\varvec{K}' = {\text {vect}}(\varvec{K})\), \(\varvec{K}'_1 = {\text {vect}}(\varvec{K}_1)\), \(\varvec{K}'_2 = {\text {vect}}(\varvec{K}_2)\), \(\varvec{P}_1 = \varvec{M}^\top \varvec{K}_1\), \(\varvec{P}_2 = \varvec{M}^\top \varvec{K}_2\), \(\varvec{P}'_1 = {\text {vect}}(\varvec{P}_1)\) and \(\varvec{P}'_2 = {\text {vect}}(\varvec{P}_2)\). Thus, \(\varvec{P} = \varvec{M}^\top \varvec{K} = \varvec{M}^\top (x\varvec{K}_1 + \varvec{K}_2) = x\varvec{P}_1 + \varvec{P}_2\) and \(\varvec{P}' = x\varvec{P}'_1 + \varvec{P}'_2\). Recall \(\varvec{M} \in \mathbb {Z}_p^{n \times m}\), \(\varvec{K} \in \mathbb {Z}_p^{n \times k}\), and \(\varvec{P} \in \mathbb {Z}_p^{m \times k}\).

The verification equation states that \([\varvec{y}]_{1}^\top [\varvec{C}]_{2} = [\varvec{\pi }]_{1}^\top [\varvec{\bar{A}}]_{2}\). Since \(\varvec{C} = \varvec{K} \varvec{\bar{A}}\), assuming \(\varvec{\bar{A}}\) is invertible, \([\varvec{\pi }]_{1} = [\varvec{K}^\top \varvec{y}]_{1}\). From this and Eq. (2), \(\varvec{\pi }_1 [1]_{1} + \varvec{\pi }_2 [\varvec{P}']_{1} + \varvec{\pi }_3 [\varvec{q}]_{1} = [\varvec{K}]_{1}^\top \varvec{y}_1 + [\varvec{K}^\top \varvec{y}_2 \varvec{P}']_{1} + [\varvec{K}^\top \varvec{y}_3 \varvec{q}]_{1}\), and thus

$$\begin{aligned} \varvec{\pi }_1 [1]_{1} +&\varvec{\pi }_2 [x\varvec{P}'_1 + \varvec{P}'_2]_{1} + \varvec{\pi }_3 [\varvec{q}]_{1} \\ =&[x\varvec{K}_1 + \varvec{K}_2]_{1}^\top \varvec{y}_1 + [(x\varvec{K}_1 + \varvec{K}_2)^\top \varvec{y}_2 (x\varvec{P}'_1 + \varvec{P}'_2)]_{1} + [(x\varvec{K}_1 + \varvec{K}_2)^\top \varvec{y}_3 \varvec{q}]_{1} \,\, . \end{aligned}$$

Collecting the powers of \(X\), we get that the verification equation states that \(V(x, \varvec{q}) = \varvec{0}_{k}\), where \(V(X, \varvec{Q}) := \varvec{a} X^2 + \varvec{b} (\varvec{Q}) X+ \varvec{c} (\varvec{Q})\) for

$$\begin{aligned} \varvec{a} =&\varvec{K}_1^\top \varvec{y}_2 \varvec{P}'_1\,\, , \\ \varvec{b} (\varvec{Q}) =&\varvec{K}_1^\top \left( \varvec{y}_1 + \varvec{y}_2 \varvec{P}'_2\right) + \left( \varvec{K}_2^\top \varvec{y}_2 - \varvec{\pi }_2\right) \varvec{P}'_1 + \varvec{K}_1^\top \varvec{y}_3 \varvec{Q} \,\, , \\ \varvec{c} (\varvec{Q}) =&\varvec{K}_2^\top (\varvec{y}_1 + \varvec{y}_2 \varvec{P}'_2) - \left( \varvec{\pi }_1 + \varvec{\pi }_2 \varvec{P}'_2\right) + (\varvec{K}_2^\top \varvec{y}_3 - \varvec{\pi }_3) \varvec{Q} \,\, . \end{aligned}$$
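
As a sanity check on the coefficients \(\varvec{a}\), \(\varvec{b} (\varvec{Q})\), and \(\varvec{c} (\varvec{Q})\), the following minimal Python sketch samples random values over a small stand-in field and compares \(\varvec{K}^\top \varvec{y} - \varvec{\pi }\) (computed directly) with \(\varvec{a} x^2 + \varvec{b} (\varvec{q}) x + \varvec{c} (\varvec{q})\). The modulus `P_MOD`, the helper names, the toy dimensions, and the column-major choice for `vect` are assumptions made only for this illustration; the sketch works with exponents in the clear and does not model group elements or the adversary.

```python
import random

P_MOD = 2**61 - 1  # assumption: a prime standing in for the group order p

def rand_vec(n, rng):
    return [rng.randrange(P_MOD) for _ in range(n)]

def rand_mat(r, c, rng):
    return [rand_vec(c, rng) for _ in range(r)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_vec(A, v):
    return [sum(a * b for a, b in zip(row, v)) % P_MOD for row in A]

def mat_mul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) % P_MOD for col in Bt] for row in A]

def lin(*terms):
    """Sum of (scalar, vector) terms modulo P_MOD."""
    n = len(terms[0][1])
    return [sum(s * v[i] for s, v in terms) % P_MOD for i in range(n)]

def vect(A):
    # Column-major flattening (assumption; the identity does not depend on it).
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

def check_identity(n=3, m=2, k=2, n_q=2, seed=0):
    rng = random.Random(seed)
    x = rng.randrange(P_MOD)
    M = rand_mat(n, m, rng)
    K1, K2 = rand_mat(n, k, rng), rand_mat(n, k, rng)
    P1v = vect(mat_mul(transpose(M), K1))  # P'_1 = vect(M^T K_1)
    P2v = vect(mat_mul(transpose(M), K2))  # P'_2 = vect(M^T K_2)
    q = rand_vec(n_q, rng)
    y1, y2, y3 = rand_vec(n, rng), rand_mat(n, m * k, rng), rand_mat(n, n_q, rng)
    pi1, pi2, pi3 = rand_vec(k, rng), rand_mat(k, m * k, rng), rand_mat(k, n_q, rng)
    K1t, K2t = transpose(K1), transpose(K2)

    # Direct evaluation of K^T y - pi with K = x K_1 + K_2, P' = x P'_1 + P'_2.
    Pv = lin((x, P1v), (1, P2v))
    y = lin((1, y1), (1, mat_vec(y2, Pv)), (1, mat_vec(y3, q)))
    pi = lin((1, pi1), (1, mat_vec(pi2, Pv)), (1, mat_vec(pi3, q)))
    direct = lin((x, mat_vec(K1t, y)), (1, mat_vec(K2t, y)), (-1, pi))

    # Coefficients a, b(q), c(q) as in the displayed formulas.
    d = mat_vec(y2, P1v)                      # y_2 P'_1
    u = lin((1, y1), (1, mat_vec(y2, P2v)))   # y_1 + y_2 P'_2
    y3q = mat_vec(y3, q)
    a = mat_vec(K1t, d)
    b = lin((1, mat_vec(K1t, u)), (1, mat_vec(K2t, d)),
            (-1, mat_vec(pi2, P1v)), (1, mat_vec(K1t, y3q)))
    c = lin((1, mat_vec(K2t, u)), (-1, pi1), (-1, mat_vec(pi2, P2v)),
            (1, mat_vec(K2t, y3q)), (-1, mat_vec(pi3, q)))
    return direct == lin((x * x, a), (x, b), (1, c))
```

Calling `check_identity(seed=s)` for a few seeds should return `True` each time, since the identity is a polynomial identity in all the sampled values.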

Since each \(q_i\) has min-entropy \(\varOmega (\log \lambda )\) from the adversary’s viewpoint and \(V(X, \varvec{Q})\) is a linear polynomial in each \(Q_i\), from \(V(x, \varvec{q}) = \varvec{0}_k\) it follows (by the Schwartz-Zippel lemma) with overwhelming probability \(1 - \varepsilon _q\) that \(V(x, \varvec{Q}) = \varvec{0}_k\) as a polynomial in \(\varvec{Q}\), and thus also \(V(x, \varvec{0}) = \varvec{a} x^2 + \varvec{b} x+ \varvec{c} = \varvec{0}_k\), where \(\varvec{b} := \varvec{b} (\varvec{0})\) and \(\varvec{c} := \varvec{c} (\varvec{0})\). In particular, in what follows, we can assume \(\varvec{y}_3 = \varvec{0}\) and \(\varvec{\pi }_3 = \varvec{0}\).

Next, let \(\varvec{w}\) be any solution to \(\varvec{y} = \varvec{M} \varvec{w}\); a solution exists and can be efficiently found since \(\varvec{M}\) has rank \(n\). We already extracted \(\varvec{M}\) by using the DL oracle. Moreover, \(\varvec{y} = \varvec{y}_1 + x\varvec{d} + \varvec{y}_2 \varvec{P}'_2\), where \(\varvec{d} := \varvec{y}_2 \varvec{P}'_1 \in \mathbb {Z}_p^n\), depends on the unknown \(x\) only through the term \(x\varvec{d}\), and hence \(\varvec{y}\) can be extracted if \(\varvec{d} = \varvec{0}_n\). Thus, if \(\varvec{d} = \varvec{0}_n\) then we can extract and return \(\varvec{w}\).
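
The text does not say how a solution \(\varvec{w}\) is computed; one standard possibility is Gaussian elimination over \(\mathbb {Z}_p\), sketched below in Python. The function name `solve_mod_p` is hypothetical, the free variables are arbitrarily set to zero, and the three-argument `pow` with exponent `-1` assumes Python 3.8 or later.

```python
def solve_mod_p(M, y, p):
    """Return some w with M w = y (mod p) for a prime p, or None if no solution exists."""
    n, m = len(M), len(M[0])
    # Augmented matrix [M | y], reduced modulo p.
    A = [[v % p for v in row] + [yi % p] for row, yi in zip(M, y)]
    pivots = []
    r = 0
    for col in range(m):
        piv = next((i for i in range(r, n) if A[i][col]), None)
        if piv is None:
            continue  # no pivot here: this column corresponds to a free variable
        A[r], A[piv] = A[piv], A[r]
        inv = pow(A[r][col], -1, p)
        A[r] = [v * inv % p for v in A[r]]
        for i in range(n):
            if i != r and A[i][col]:
                f = A[i][col]
                A[i] = [(vi - f * vr) % p for vi, vr in zip(A[i], A[r])]
        pivots.append(col)
        r += 1
        if r == n:
            break
    # Rows below the last pivot are zero on the left; a non-zero right side is a contradiction.
    if any(A[i][m] for i in range(r, n)):
        return None
    w = [0] * m  # free variables set to 0 (an arbitrary choice)
    for row, col in enumerate(pivots):
        w[col] = A[row][m]
    return w
```

When \(\varvec{M}\) has rank \(n\), as assumed in the text, the inconsistency branch is never taken and the function returns a valid \(\varvec{w}\) for every \(\varvec{y}\).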

To show that, w.h.p., \(\varvec{d} = \varvec{0}_n\), consider the opposite case \(\varvec{d} \ne \varvec{0}_n\). If \(\varvec{a} \ne \varvec{0}_{k}\) (this can only happen if \(\varvec{d} \ne \varvec{0}_n\)) then we have a quadratic equation \(\varvec{a} [x^2]_{1} + \varvec{b} [x]_{1} + \varvec{c} [1]_{1} = 0\), with \(\varvec{a} \ne 0\), that \(\mathcal {B}\) can solve for \(x\), and thus return \(x\).
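
How \(\mathcal {B}\) solves the quadratic equation is not spelled out in the text. A minimal sketch of one standard approach follows, assuming \(\mathcal {B}\) knows the coefficient vectors as elements of \(\mathbb {Z}_p\) and that \(p\) is an odd prime: pick a coordinate where \(\varvec{a}\) is non-zero, apply the quadratic formula with a Tonelli-Shanks square root, and keep the roots that satisfy every coordinate. All function names are hypothetical, and `pow(..., -1, p)` assumes Python 3.8 or later.

```python
def sqrt_mod(n, p):
    """One square root of n modulo an odd prime p (Tonelli-Shanks), or None if none exists."""
    n %= p
    if n == 0:
        return 0
    if pow(n, (p - 1) // 2, p) != 1:
        return None  # n is a quadratic non-residue
    q, s = p - 1, 0
    while q % 2 == 0:  # write p - 1 = q * 2^s with q odd
        q //= 2
        s += 1
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:  # find a non-residue z
        z += 1
    m, c, t, r = s, pow(z, q, p), pow(n, q, p), pow(n, (q + 1) // 2, p)
    while t != 1:
        i, t2 = 0, t
        while t2 != 1:  # least i with t^(2^i) = 1
            t2 = t2 * t2 % p
            i += 1
        b = pow(c, 1 << (m - i - 1), p)
        m, c = i, b * b % p
        t, r = t * c % p, r * b % p
    return r

def solve_quadratic(a, b, c, p):
    """All roots of a*X^2 + b*X + c = 0 over Z_p, for a != 0 mod p."""
    a, b, c = a % p, b % p, c % p
    s = sqrt_mod((b * b - 4 * a * c) % p, p)
    if s is None:
        return []
    inv = pow(2 * a, -1, p)
    return sorted({(-b + s) * inv % p, (-b - s) * inv % p})

def candidates_for_x(a_vec, b_vec, c_vec, p):
    """Roots of the vector equation a*X^2 + b*X + c = 0_k, given some a_i != 0."""
    i = next(i for i, ai in enumerate(a_vec) if ai % p)
    roots = solve_quadratic(a_vec[i], b_vec[i], c_vec[i], p)
    return [r for r in roots
            if all((ai * r * r + bi * r + ci) % p == 0
                   for ai, bi, ci in zip(a_vec, b_vec, c_vec))]
```

At most two candidates remain; \(\mathcal {B}\) can identify the correct one by comparing each candidate against \([x]_{1}\).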

Assume \(\varvec{a} = \varvec{0}_{k}\) but \(\varvec{d} \ne \varvec{0}_n\). This means \(\varvec{d} \in \mathbb {Z}_p^n\) is a non-zero element in the kernel of \(\varvec{K}_1^\top \in \mathbb {Z}_p^{k\times n}\). Since for \(\mathcal {A}\), \(\varvec{K}_1\) looks uniformly random from \(\mathbb {Z}_p^{n\times k}\), the question is now what is the maximum probability that for any \(\varvec{d} \ne \varvec{0}_{n}\) picked by \(\mathcal {A}\), \(\varvec{K}_1^\top \varvec{d} = \varvec{0}_k\). Obviously, unless \(\varvec{d} = \varvec{0}_{n}\), this probability is equal to \(p^{-k}\).
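
The probability in question can be computed explicitly. The following short derivation assumes that \(\varvec{K}_1\) is uniform over \(\mathbb {Z}_p^{n \times k}\) and independent of \(\varvec{d}\); the index \(j\) with \(d_j \ne 0\) is introduced only for this illustration. For each column \(i \in \{1, \ldots , k\}\),

$$\begin{aligned} (\varvec{K}_1^\top \varvec{d})_i = (\varvec{K}_1)_{j i}\, d_j + \sum _{\ell \ne j} (\varvec{K}_1)_{\ell i}\, d_\ell \,\, , \end{aligned}$$

which is uniform over \(\mathbb {Z}_p\) because \((\varvec{K}_1)_{j i}\) is uniform, \(d_j\) is invertible, and the remaining sum is independent of \((\varvec{K}_1)_{j i}\). Since different coordinates involve disjoint columns of \(\varvec{K}_1\), they are independent, and thus

$$\begin{aligned} \Pr \left[ \varvec{K}_1^\top \varvec{d} = \varvec{0}_k\right] = \prod _{i = 1}^{k} \Pr \left[ (\varvec{K}_1^\top \varvec{d})_i = 0\right] = p^{-k} \,\, . \end{aligned}$$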

Hence, the probability of success \(\varepsilon _\mathcal {B}\) of \(\mathcal {B}\) is at least \(\varepsilon _\mathsf {w}- \varepsilon _q - p^{-k}\), where \(\varepsilon _\mathsf {w}\) is the probability of extracting \(\mathsf {w}\).   \(\square \)

If the language parameter has been honestly generated, then one does not need the DL oracle to extract \(\varvec{M}\). Instead, as in [31], one relies on the witness-sampleability of \(\mathcal {D}_{\mathsf {p}}\) to extract \(\varvec{M}\) and then finish the proof of Sub-PAR (knowledge-)soundness. Importantly, in the subverted case, we do not have to assume witness-sampleability.

We note that \(\mathrm {SKerMDH}\) is not secure when \(k= 1\) [23].