Obfuscation-based Non-black-box Simulation and Four Message Concurrent Zero Knowledge for NP


Omkant Pandey∗ Manoj Prabhakaran† Amit Sahai‡

Abstract

As recent studies show, the notions of program obfuscation and zero knowledge are intimately connected. In this work, we explore this connection further, and prove the following general result. If there exists differing input obfuscation (diO) for the class of all polynomial time Turing machines, then there exists a four message, fully concurrent zero-knowledge proof system for all languages in NP with negligible soundness error. This result is constructive: given diO, our reduction yields an explicit protocol along with an explicit simulator that is “straight line” and runs in strict polynomial time.

Our reduction relies on a new non-black-box simulation technique which does not use the PCP theorem. In addition to assuming diO, our reduction also assumes (standard and polynomial time) cryptographic assumptions such as collision-resistant hash functions.

The round complexity of our protocol also sheds new light on the exact round complexity of concurrent zero-knowledge. It shows, for the first time, that in the realm of non-black-box simulation, concurrent zero-knowledge may not necessarily require more rounds than stand alone zero-knowledge!

1 Introduction

Zero-knowledge and program obfuscation. Zero-knowledge proofs, introduced by Goldwasser, Micali and Rackoff [GMR85], are the classical example of the simulation paradigm. They allow a prover to convince a verifier that a mathematical statement $x \in L$ is true while giving no additional knowledge to the verifier. Prior to 2001, all known zero-knowledge simulators used the (cheating) verifier $V^*$ as a black box to produce their output (called the simulated view). Barak [Bar01] demonstrated how to take advantage of the verifier’s program to build more powerful non-black-box simulation techniques. Constructing and analyzing non-black-box simulators is a significantly more challenging task.

The reason why taking advantage of the verifier’s code is difficult is the intriguing possibility of program obfuscation. Roughly speaking, program obfuscation is a method to transform a computer program (say, described as a Boolean circuit) into a form that is executable but otherwise completely “unintelligible.” In its strongest form, an obfuscated program leaks no information about the program beyond its “functionality,” that is, its input-output behavior. Therefore, access to the obfuscated program is no better than having black-box access to it. This property, as formalized by Barak, Goldreich, Impagliazzo, Rudich, Sahai, Vadhan, and Yang [BGI+01], is called virtual black box (VBB) security. It was shown in [BGI+01] that VBB-secure obfuscation is impossible in general. In hindsight, this negative result is also the fundamental reason why non-black-box (NBB) simulation techniques prove to be more powerful than black-box techniques.

∗ University of Illinois at Urbana-Champaign. Email: [email protected]

† University of Illinois at Urbana-Champaign. Email: [email protected]

‡


Zero-knowledge, in particular non-black-box simulation, is intimately connected to program obfuscation. This connection has been explicitly studied in the works of Hada [Had00], and Bitansky and Paneth [BP12b, BP12a, BP13a], and alluded to in several other works (e.g., [HT99, Bar01]). In this work, we explore this line of research further, particularly in light of the recent breakthrough work on indistinguishability obfuscation (iO) [GGH+13].

Indistinguishability obfuscation. Garg, Gentry, Halevi, Raykova, Sahai, and Waters [GGH+13] present a candidate construction for a weaker notion of obfuscation called indistinguishability obfuscation [BGI+01]. Roughly speaking, iO guarantees that if two (same size) programs $C_0, C_1$ are functionally equivalent, then their obfuscations are computationally indistinguishable. A closely related notion is that of differing input obfuscation (diO) [BGI+01] which, roughly speaking, guarantees that the obfuscations of $C_0$ and $C_1$ are computationally indistinguishable provided that it is hard to find an input $x$ such that $C_0(x) \neq C_1(x)$.

Garg et al. [GGH+13] present a candidate construction of iO for the class of all polynomial size circuits. Candidate constructions of diO for the class of all polynomial time Turing machines were recently given by Ananth et al. [ABG+13], and Boyle, Chung, and Pass [BCP14].

Our results. In this work we show how to use program obfuscation to build a new non-black-box simulation strategy that works for fully concurrent zero-knowledge. More specifically, we show that:

• If differing-input obfuscation (diO) exists for the class of all polynomial time Turing machines, then there exists a constant round, fully concurrent zero knowledge protocol for NP with negligible soundness error. The protocol has an explicit simulator;¹ the simulator is “straight line” and runs in strict polynomial time.

• We also show how to implement the core ideas of the above protocol in only four rounds. That is, our new protocol requires sending only four messages between the prover and the verifier.

Our protocol can be instantiated using the diO construction of Ananth et al. [ABG+13], which obfuscates polynomial time Turing machines that can accept inputs of variable length (at most polynomial in the security parameter).² We stress that we are able to obtain an explicit simulator for our protocol irrespective of the computational assumptions underlying the above mentioned diO. This is because we use the security (i.e., the indistinguishability property) of obfuscation only in proving the soundness of our protocol. The simulator only depends on the correctness, or the functionality, of the obfuscated program, and hence can be described explicitly. As is usually the case with most cryptographic applications of obfuscation, we also require that obfuscation is “secure” w.r.t. auxiliary information. In our case the auxiliary information will consist of the transcripts of Barak’s preamble (see theorems 5.1 and 6.1 for a precise statement).

Other than (auxiliary input) diO, our reduction only assumes standard (polynomial time hardness) assumptions, namely injective one-way functions and collision-resistant hash functions. Interestingly, our reduction does not explicitly depend on CS-proofs/universal-arguments [Kil92, Mic94, Kil95, BG02]; in particular, if we instantiate the construction of [ABG+13] using the “SNARKs” of Bitansky et al. [BCCT13] (which do not rely on the PCP theorem), we obtain an instantiation of our protocol that also does not rely on the PCP theorem.

1 In some protocols, specifically those based on knowledge-type assumptions [HT99], by virtue of the assumption that there exists an “extractor,” we only obtain an existential result that a simulator exists; however, the actual program of the simulator is not explicitly given in the security proofs.

2


The round complexity of our final protocol also sheds new light on the exact (as opposed to asymptotic) round complexity of concurrent zero-knowledge. Even in the simpler case of stand alone zero knowledge, the best known constructions require at least four rounds [FS89], and historically, concurrent zero-knowledge has always required more rounds than stand alone zero-knowledge.³ Our four round protocol, for the first time, closes the gap between the best known upper bounds on round complexities of concurrent versus standalone zero-knowledge protocols (whose simulators can be explicitly described).

In retrospect, the fact that obfuscation actually helps non-black-box simulation can be perplexing. Indeed, in all prior works along this line [Had00, BP12b, BP13a], the core ideas for simulation are of the opposite nature: it is the inability to obfuscate the “unobfuscatable functions” that helps the simulator. In our case, similar to [BP12a], it is the ability to obfuscate programs that allows polynomial time simulation.

1.1 Technical Overview: Non-black-box Simulation via Program Obfuscation

Let us start by considering the simplest approach to zero-knowledge from (the possibility of) program obfuscation. For now, let us restrict ourselves to the case of stand alone zero-knowledge for NP-languages. Let $x \in L$ be the statement and $R$ be the witness-relation.

One simple approach is to have the verifier send an obfuscation of the following program $M_{x,s}$, which contains a secret string $s \in \{0,1\}^n$: $M_{x,s}(a) = s$ if and only if $R(x, a) = 1$, and $M_{x,s}(a) = 0^n$ otherwise. Let $\widetilde{M}_{x,s}$ denote the iO-secure obfuscation of $M_{x,s}$. The real prover can recover $s$ by using a witness $w$ to $x$. Further, if $x$ is false, $M_{x,s}$ is functionally identical to $M_{x,0^n}$ and therefore its obfuscation must hide $s$, ensuring soundness.⁴ This gives us a two-message, honest verifier ZK proof. However, this idea does not help the simulation against malicious verifiers.
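To make the functionality of $M_{x,s}$ concrete, the following minimal Python sketch models it as an ordinary, unobfuscated closure. The relation `R` (nontrivial factorization), the parameter `N`, and the helper `make_M` are illustrative assumptions and not part of the protocol; in particular, a Python closure hides nothing, whereas the protocol would send an obfuscation of this program.

```python
import secrets

N = 16  # toy security parameter in bytes (assumption)


def R(x, w):
    """Toy NP relation (assumption): w = (p, q) is a nontrivial factorization of x."""
    p, q = w
    return p > 1 and q > 1 and p * q == x


def make_M(x, s):
    """Return the program M_{x,s}: output s on a valid witness, 0^n otherwise."""
    zero = bytes(len(s))

    def M(a):
        return s if R(x, a) else zero

    return M


s = secrets.token_bytes(N)
M = make_M(15, s)          # 15 = 3 * 5 is a "true" statement under R
recovered = M((3, 5))      # the honest prover recovers s with its witness
```

For a false statement (here, a prime such as 13) no input satisfies `R`, so `make_M(13, s)` and `make_M(13, bytes(N))` compute the same function. This functional equivalence is exactly the situation in which iO applies.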

To fix this, let us try to use Barak’s preamble (called GenStat [Bar01]), which has the following three rounds: first, the verifier sends a collision-resistant hash function $h : \{0,1\}^* \to \{0,1\}^n$, then the prover sends a commitment $c$ to $0^n$ (using a perfectly binding scheme Com), and then the verifier sends a string $r \in \{0,1\}^n$. The transcript defines a “fake statement” $\lambda = \langle h, c, r \rangle$. A “fake witness” $\omega$ for the statement $\lambda$ consists of a pair $(\Pi, u)$ such that $c = \mathsf{Com}(h(\Pi); u)$ and $\Pi$ is a program of length $\mathrm{poly}(n)$ which outputs the string $r$ on input the string $c$ (say, in $n^{\log\log n}$ steps). If $h$ is a good collision-resistant hash function, then, as shown in [Bar01, BG02], no efficient prover $P^*$ can output a satisfying witness $\omega$ to the statement $\lambda$ (sampled in an interaction with the honest verifier). However, a simulator can commit to $h(V^*)$ (instead of $0^n$) so that it will have a valid witness to the resulting transcript $\lambda$.

Coming back to our protocol, we use this idea as follows. We modify our first idea, and require the verifier to send the obfuscation of a new program $M_{\lambda,s}$ (instead of $M_{x,s}$), where $\lambda = \langle h, c, r \rangle$ is the transcript of GenStat. The new program $M_{\lambda,s}$ outputs $s$ if and only if it receives a valid witness $\omega$ to the statement $\lambda$ (as described earlier), and $0^n$ on all other inputs. The statement $x$ will then be proven by proving the knowledge of either a witness $w$ to $x$ or the secret $s$ (using an ordinary witness-indistinguishable proof of knowledge (WIPOK)). A simulator can “succeed” in the simulation as before: it commits to the verifier’s program in $c$ to obtain (an indistinguishable statement) $\lambda$, then uses the fake witness $\omega$ (which it now has) to execute the program $\widetilde{M}_{\lambda,s}(\omega)$, learn $s$, and complete the WIPOK using $s$.
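The fake-witness check performed by $M_{\lambda,s}$ can be sketched as follows. This is only a toy model under loud assumptions: SHA-256 stands in for the hash function $h$, a hash-based commitment stands in for the perfectly binding Com, programs are Python source strings that define a function `predict`, and the $n^{\log\log n}$ step bound is not enforced. All names (`com`, `run_program`, `is_fake_witness`, `V_STAR`) are hypothetical.

```python
import hashlib
import secrets

N = 32  # toy security parameter in bytes (assumption)


def h(data: bytes) -> bytes:
    """Stand-in for the collision-resistant hash h (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


def com(m: bytes, u: bytes) -> bytes:
    """Toy commitment Com(m; u); hash-based, so NOT perfectly binding."""
    return hashlib.sha256(b"com" + u + m).digest()


def run_program(src: str, c: bytes) -> bytes:
    """Run the program text Pi on input c (no step bound in this sketch)."""
    env = {}
    exec(src, env)  # only ever used on the trusted toy program below
    return env["predict"](c)


def is_fake_witness(c: bytes, r: bytes, prog: str, u: bytes) -> bool:
    """Check omega = (Pi, u) against lambda = <h, c, r>."""
    return com(h(prog.encode()), u) == c and run_program(prog, c) == r


def make_M(c: bytes, r: bytes, s: bytes):
    """The program M_{lambda,s}: s on a valid fake witness, 0^n otherwise."""
    return lambda prog, u: s if is_fake_witness(c, r, prog, u) else bytes(len(s))


# A toy deterministic verifier V*: its next message r is a function of c.
V_STAR = (
    "import hashlib\n"
    "def predict(c):\n"
    "    return hashlib.sha256(b'verifier-coins' + c).digest()\n"
)

s = secrets.token_bytes(N)
u = secrets.token_bytes(N)

# Simulator: commit to h(V*) instead of 0^n, then use (V*, u) as the witness.
c_sim = com(h(V_STAR.encode()), u)
r_sim = run_program(V_STAR, c_sim)  # the reply of V* to c_sim
sim_output = make_M(c_sim, r_sim, s)(V_STAR, u)

# Honest prover: commits to 0^n, so the same pair (V*, u) is not a witness.
c_hon = com(bytes(N), u)
r_hon = run_program(V_STAR, c_hon)
honest_output = make_M(c_hon, r_hon, s)(V_STAR, u)
```

In this sketch the simulator's run returns `s`, while the honest transcript yields the all-zero string, mirroring the asymmetry that the simulation relies on.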

We now draw attention to some important points arising due to the use of $\lambda$ in the obfuscation (instead of $x$). First, the length of the fake witness $\omega$ that the simulator has depends on the length of the program of $V^*$. Since the protocol needs to take into account $V^*$ of every polynomial length, the obfuscated program $\widetilde{M}_{\lambda,s}$ must accept inputs $\omega$ of arbitrary, a-priori unknown, polynomial length. In other words, $\widetilde{M}_{\lambda,s}$ must be a Turing machine. Therefore, we will have to use program obfuscation for Turing machines.

3 Barak’s (bounded-concurrent ZK) protocol [Bar01] and the recent construction of Chung, Lin, and Pass [CLP13b] require at least six rounds even after optimizations; the recent protocol of Gupta and Sahai [GS12b] requires five rounds.

4 By security of iO, $\widetilde{M}_{x,s} \stackrel{c}{\approx} \widetilde{M}_{x,0^n}$.

Second, the statement $\lambda = \langle h, c, r \rangle$ is not a “false” statement, since an all-powerful prover can always find collisions in $h$ and obtain a satisfying input to $M_{\lambda,s}$. The only guarantee we have is that if $\lambda$ is sampled as above, then it would be hard for any efficient prover (even those with a valid witness to $x$) to find a satisfying input for $M_{\lambda,s}$. Therefore, unlike before (when $x$ was used instead of $\lambda$), the obfuscations $\widetilde{M}_{\lambda,s}$ and $\widetilde{M}_{\lambda,0^n}$ are not guaranteed to be indistinguishable if we use an iO-secure obfuscation; this is because the Turing machines $M_{\lambda,s}$ and $M_{\lambda,0^n}$ are not functionally equivalent. Therefore, we will have to use diO-secure obfuscation (since finding a differing input is still hard for these programs). As a matter of fact, we will need to assume auxiliary input diO, as discussed later.

By putting these ideas together, we actually get a standalone ZK protocol for NP (summarized below). The protocol needs to use some kind of reference to $s$ other than the obfuscated program. This is done by using $f(s)$, where $f$ is a one-way function. This protocol has a “straight line” simulator. Further, unlike Barak’s protocol, this protocol does not use universal arguments (and hence does not use the PCP theorem).

Standalone Zero-Knowledge using Obfuscation. The protocol has three stages.

1. Stage 1 is the 3-round preamble GenStat: $V$ sends a CRHF $h$, $P$ sends a commitment $c = \mathsf{Com}(0^n; u)$, and $V$ sends a random $r \leftarrow \{0,1\}^n$.

2. In stage 2, $V$ sends $(f, \tilde{s}, \widetilde{M}_{\lambda,s})$, where $f$ is a one-way function, $\tilde{s} = f(s)$, $\widetilde{M}_{\lambda,s}$ is the obfuscation of the Turing machine $M_{\lambda,s}$ described earlier, and $\lambda = \langle h, c, r \rangle$ is the transcript of stage 1. $V$ also proves that $(f, \tilde{s}, \widetilde{M}_{\lambda,s})$ are correctly constructed (using a standard ZK proof).

3. In stage 3, $P$ proves, using a standard WIPOK, the knowledge of “either a witness $w$ to $x$ or a secret $s$ such that $\tilde{s} = f(s)$.”

Standalone ZK of this protocol can be proven by following Barak’s simulator, which commits to the code of $V^*$ and therefore has an $\omega$ for the simulated statement $\lambda$ such that $\widetilde{M}_{\lambda,s}(\omega) = s$ within a polynomial number of steps; the simulator computes $s$ and uses it in the WIPOK. The soundness of the protocol relies on the diO-security of obfuscation. Indeed, following [Bar01], for a properly sampled $\lambda$, it is hard to find $\omega$ such that $M_{\lambda,s}(\omega) \neq M_{\lambda,0^n}(\omega)$, and therefore it is hard to distinguish $\widetilde{M}_{\lambda,s}$ from $\widetilde{M}_{\lambda,0^n}$ by diO-security of obfuscation. Now, soundness is argued using three hybrid experiments: first use the simulator of the ZK protocol in stage 2, then replace $\widetilde{M}_{\lambda,s}$ by $\widetilde{M}_{\lambda,0^n}$, and finally extract $s$ from the WIPOK in stage 3 and violate the hardness of the one-way function $f$ (since $x$ is false, extraction must yield $s$).

The issue of auxiliary information. An important point we wish to highlight here is that of auxiliary input. A cheating prover $P^*$ in the protocol above will have access to the statement $\lambda$ in addition to the obfuscated program $\widetilde{M}_{\lambda,s}$. Therefore, $\lambda$ is auxiliary information that the receiver of the obfuscated program already has! Therefore, we must require the obfuscation to satisfy the (stronger) notion of auxiliary input diO [ABG+13, BCP14] w.r.t. the transcripts of GenStat (i.e., Barak’s preamble).

1.2 Towards Constant Round Concurrent Zero-knowledge


In the context of our protocol, this schedule will have $n$ sessions interleaved recursively as follows: session $n$ does not “contain” any messages of any other session, and all messages of session $i$ are contained between messages $c_{i-1}$ and $r_{i-1}$ of session $i-1$ for every $i$, starting from $i = n$. For completeness, this scheduling is shown in figure 1 (towards the end of the paper) with respect to 3 sessions. The double-headed arrows marked by $\pi_i$ represent the rest of the messages of the $i$-th session. Roughly speaking, the simulation fails because of the following: in order to simulate session $i$, the simulator needs to extract the secret $s_i$ by running the program $\widetilde{M}_{\lambda_i,s_i}$; however, the execution of $\widetilde{M}_{\lambda_i,s_i}$ contains an execution of $\widetilde{M}_{\lambda_{i+1},s_{i+1}}$, and due to this recursion, the simulator’s total running time in session 1 is exponential in $n$.

More formally, consider the scheduling given in figure 1. Let $t_3 \geq 1$ be the time taken by the verifier in computing $r_3$ on input the string $c_3$. Then clearly, the time taken by the simulator in running the obfuscated machine $\widetilde{M}_{\lambda_3,s_3}$ is $T_3 \geq t_3$. Then, if $t_2$ denotes the time taken by the simulator to obtain string $r_2$, we have that $t_2 \geq t_3 + T_3 \geq 2t_3$. Clearly, the time taken by the simulator to extract $s_2$ by running the program $\widetilde{M}_{\lambda_2,s_2}$ will be at least $T_2 \geq t_2 \geq 2t_3$. By repeating this argument for session 1, we have that $T_1 \geq t_1 \geq t_2 + T_2 \geq 2t_2 \geq 2^2 t_3$. Repeating this argument for $n$ sessions in the DNS schedule, the total time taken by the simulator will be $\geq 2^{n-1}$.
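The recurrence behind this bound can be checked mechanically. The sketch below is an illustration only: it iterates $t_i \geq t_{i+1} + T_{i+1}$ and $T_i \geq t_i$ with equality, starting from the innermost session, and the function name and the default $t_n = 1$ are assumptions made here.

```python
def simulator_time_lower_bound(n: int, t_n: int = 1) -> int:
    """Lower bound on T_1 for n recursively nested sessions.

    Uses t_i >= t_{i+1} + T_{i+1} and T_i >= t_i, taken with equality.
    """
    t = t_n  # time for the verifier to compute r_n in the innermost session
    T = t    # T_n >= t_n
    for _ in range(n - 1):  # sessions n-1, n-2, ..., 1
        t = t + T           # t_i >= t_{i+1} + T_{i+1}
        T = t               # T_i >= t_i
    return T


bound_for_three_sessions = simulator_time_lower_bound(3)  # the case of figure 1
```

With $t_n = 1$ the returned value equals $2^{n-1}$, matching the three-session computation $T_1 \geq 2^2 t_3$ in the text.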

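Figure 1: DNS scheduling for our protocol, shown for three sessions with verifiers $V_1, V_2, V_3$. Session $i$ sends $h_i$ and $c_i$, then $(r_i, \widetilde{M}_{\lambda_i,s_i})$, followed by its remaining messages $\pi_i$; session $i+1$ is nested between $c_i$ and $r_i$.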


of) a program which recursively runs such a program for every interleaved session between $c_i$ and $r_i$. That is, the program $\widetilde{M}_{\lambda_i,s_i}$ ends up recomputing all of the secrets of the interleaved sessions even though they have already been computed.

We can avoid this recomputation as follows. Let $I$ be an oracle which takes as input queries of the form $(f, \tilde{s})$, where $f$ is an injective one-way function and $\tilde{s}$ is in the range of $f$, and returns the unique value $s$ such that $f(s) = \tilde{s}$.⁵ Now consider an arbitrary program $\Pi^{I}$ which has access to the inversion oracle $I$. Clearly, if $r$ is chosen randomly, then for any (fixed) program $\Pi^{I}$ and any fixed input $a$, the probability that $\Pi^{I}(a) = r$ is at most $2^{-n}$. This is because once the description of the oracle program $\Pi^{\langle\cdot\rangle}$ is fixed, the output of $\Pi^{I}(a)$ is deterministically fixed (for any fixed input $a$ chosen prior to seeing $r$), and $r$ hits this value with probability at most $2^{-n}$.
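Spelled out, the bound is a one-line calculation. In the sketch below, $v$ is a name introduced here for the fixed output of $\Pi^{I}(a)$, assuming the program is deterministic once its code, oracle, and input are fixed.

```latex
\Pr_{r \leftarrow \{0,1\}^n}\bigl[\Pi^{I}(a) = r\bigr]
  \;=\; \Pr_{r \leftarrow \{0,1\}^n}[\, r = v \,]
  \;=\;
  \begin{cases}
    2^{-n} & \text{if } v \in \{0,1\}^n,\\
    0      & \text{otherwise,}
  \end{cases}
  \qquad\text{hence}\qquad
  \Pr_{r \leftarrow \{0,1\}^n}\bigl[\Pi^{I}(a) = r\bigr] \;\le\; 2^{-n}.
```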

Our main point here is that it is hard to come up with a satisfying “fake witness” $\omega$ to the transcript $\lambda = \langle h, c, r \rangle$ even if the program committed in $c$ is given access to the inversion oracle $I$. On the other hand, the simulator can still predict $r$ as before. However, more importantly, by means of the oracle $I$ we can avoid the recursive re-computation of the secrets in the concurrent setting as follows.

Consider an alternative simulator $S^{\langle\cdot\rangle}$ which will be given access to the oracle $I$. This simulator will have access to both the program of the verifier $V^*$ and its own program, given as explicit inputs, collectively denoted as $\Pi^{\langle\cdot\rangle}_{S,V^*}$. The simulator, on input a session index $i$, will work by initiating an execution of $V^*$. It will commit to the program $\Pi^{\langle\cdot\rangle}_{S,V^*}(j)$ in session $j$ (ignoring for the moment the fact that the simulator needs fresh randomness); finally, this simulator does not run any obfuscated program to compute the secrets. Instead, it queries the oracle $I$ on “well formed” $(f_j, \tilde{s}_j)$ for every session $j \neq i$; when $j = i$ it simply returns the string $r_i$. Then, if all goes well, observe that the program $\Pi^{\langle\cdot\rangle}(i)$ predicts the string $r_i$ in polynomial time (given $I$), and this holds for every session $i$. In particular, there is no recursive recomputation of the secrets, since they can be fed to the program directly once they have been computed. We note that such an oracle was first used by Deng, Goyal, and Sahai [DGS09] to construct the first resettably-sound resettable zero-knowledge protocol for NP.

It should be clear that the actual simulation will be performed by a “main” simulator $S_{\mathrm{main}}$ which will not have access to any inversion oracle, and will run in (strict) polynomial time. The main simulator will run in the same manner as the alternative simulator $S^{\langle\cdot\rangle}$ except that instead of using $I$, it will run the obfuscated programs (only once for each session) to recover the secrets. To ensure efficient simulation, once a session secret has been recovered, it will be stored in a global table $T$ (which will be used to simulate answers of $I$). Therefore the “fake witness” will now have the form $\omega = \langle u, \Pi^{\langle\cdot\rangle}, T \rangle$, but the statements will still have the same form $\lambda = \langle h, c, r \rangle$; and we require that $\Pi^{T}$ outputs $r$ within finite steps. These requirements will be formally captured by defining a relation $R_{\mathrm{sim}}$ w.r.t. the preamble GenStat in Section 4. We will discuss the overview of the four round construction in Section 6.
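The effect of the table $T$ can be illustrated with a toy cost model. The sketch below is not the simulator itself: it only counts how many times an obfuscated program would be executed in the nested schedule, under the assumption that the program of session $i$ replays every session $j > i$ and needs each secret $s_j$. The names `run_obfuscated` and `total_runs` are hypothetical.

```python
def run_obfuscated(i, n, table, stats):
    """Model one execution of the obfuscated program of session i (returns s_i).

    In the nested schedule, the program committed in session i replays every
    session j > i, and each replay needs the secret s_j.
    """
    stats["runs"] += 1
    for j in range(i + 1, n + 1):
        if table is not None and j in table:
            continue  # s_j is read from the table T (a simulated oracle answer)
        s_j = run_obfuscated(j, n, table, stats)  # recompute s_j from scratch
        if table is not None:
            table[j] = s_j
    return f"s_{i}"


def total_runs(n, use_table):
    """Total executions of obfuscated programs over all n nested sessions."""
    stats = {"runs": 0}
    table = {} if use_table else None
    for i in range(n, 0, -1):  # the innermost session n completes first
        s_i = run_obfuscated(i, n, table, stats)
        if table is not None:
            table[i] = s_i
    return stats["runs"]


runs_with_table = total_runs(5, use_table=True)
runs_without_table = total_runs(5, use_table=False)
```

In this model, storing recovered secrets gives one execution per session, while recomputation gives a count that grows as $2^n - 1$.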

1.3 Related Work

Concurrent zero-knowledge and non-black-box simulation. From early on, it was understood and explicitly proven in [FS90, GK96] that zero-knowledge is not preserved under parallel repetition, where multiple sessions of the protocol run at the same time. The more complex notion of concurrent zero-knowledge (cZK) was introduced and achieved by Dwork, Naor, and Sahai [DNS98] (assuming “timing constraints” on the underlying network). A large body of research on cZK studied the round complexity of black-box concurrent ZK, with improving lower bounds on the same [KPR98, Ros00, CKPR03]. The state of the art is the lower bound of Canetti, Kilian, Petrank, and Rosen [CKPR03], who prove that black-box cZK requires at least $\Omega(\log n/\log\log n)$ rounds, where $n$ is the length of the statements being proven. Prabhakaran, Rosen, and Sahai [PRS02], building upon the prior works of Richardson and Kilian [RK99] and Kilian and Petrank [KP01], presented a cZK protocol for NP which has $\widetilde{O}(\log n)$ rounds, matching the lower bound of [CKPR03].

5 We assume that it is easy to test that $f$ is injective and that $\tilde{s}$ is in the range of $f$. These requirements are only for simplicity

The central open question in this area is to construct a constant round cZK protocol for NP languages based on standard (or at least reasonable) assumptions. Barak [Bar01] showed that in the bounded concurrent setting, where there is an a-priori upper bound on the number of sessions, there exists a constant round non-black-box cZK protocol for NP; the protocol is based on the existence of collision-resistant hash functions [Bar01] and uses universal arguments [Kil92, Mic94, Kil95, BG02]. The communication complexity of Barak’s protocol depends on the a-priori bound on the sessions.

It has proven difficult to extend Barak’s NBB techniques to the setting of fully concurrent ZK (i.e., to unbounded polynomially many sessions) in $o(\log n)$ rounds. Nevertheless, NBB techniques have enjoyed great success, resulting in the construction of resettable protocols [BGGL01, DL07, DGS09, GM11], non-malleable protocols [Bar02, PR05b, PR05a], leakage-resilient ZK [Pan14], bounded-concurrent secure computation [PR03, Pas04], adaptive security [GS12a], and so on. Bitansky and Paneth [BP12a] showed that it is possible to perform non-black-box simulation using oblivious transfer (instead of collision-resistant hash functions and universal arguments). This eventually led to the construction of resettably-sound ZK under one-way functions [BP13a, CPS13, COPV13]. Goyal [Goy13] presents a non-black-box simulation technique in the fully concurrent setting and achieves the first public-coin cZK protocol in the plain model.⁶

An alternative approach to construct round-efficient zero-knowledge proofs is to use “knowledge assumptions” [Dam91, HT99, BP04]. The recent work of Gupta and Sahai [GS12b] shows that such assumptions also yield a constant round concurrent ZK protocol for NP. However, no known ZK protocol based on knowledge-type assumptions yields an explicit simulator. This is because the knowledge-type assumptions assume the existence of a special “extractor” machine (which is not explicitly known); this extractor is used by the simulator of ZK protocols and only provides an “existential” result.

Chung, Lin, and Pass [CLP13b] recently presented the first construction of a constant-round fully concurrent ZK protocol which has an explicit simulator. Their result is based on a new complexity-theoretic assumption, namely the existence of so called “strong P-certificates.”

Another alternative proposed in the literature is to assume some kind of a setup such as timing constraints, (untrusted) public-key infrastructure, and so on [DNS98, DS98, CGGM00, Dam00, Gol02, PTV10, GJO+13] or switch to super-polynomial time simulation [Pas03, PV08]. We will not consider such models further in this work.

Program obfuscation. After the strong impossibility results of [BGI+01], research in program obfuscation proceeded in two main directions. The first line of research focussed on constructing obfuscation for specific functionalities such as point functions and their variants, proxy re-encryption, encrypted signatures, hyperplanes, conjunctions, and so on [Wee05, LPS04, HRSV07, Had10, CRV10, BR13a]. The other line of research focussed on finding weaker definitions and alternative models. Goldwasser and Rothblum [GR07] considered the notion of best possible obfuscation (which is equivalent to iO when the obfuscator is polynomial time), and Bitansky and Canetti [BC10] considered virtual grey box security. Alternative models for obfuscation such as the hardware model were considered in [GIS+10, BCG+11].

After [GGH+13], an improved construction of iO was presented by Barak et al. [BGK+13]. Further, in an idealized “generic encodings” model it is shown that VBB-obfuscation for all circuits can be achieved [CV13, BR13b, BGK+13]. These results often involve a “bootstrapping step”; Applebaum [App13] presents an improved technique for bootstrapping obfuscation. Further complexity-theoretic results appear in recent works of Moran and Rosen [MR13], and Barak et al. [BBC+14].

6

Sahai and Waters [SW13] show that indistinguishability obfuscation is a powerful tool and use it to successfully construct several (old and new) cryptographic primitives; further applications of iO appear in [HSW13, BZ13, BCP14, KRW13, MO13].

Differing input obfuscation was studied by Ananth et al. [ABG+13], who present a candidate construction of diO for the class of polynomial time Turing machines and demonstrate new applications. Another variant of their construction allows the Turing machines to accept variable length inputs. Concurrent work of Boyle, Chung, and Pass [BCP14] introduces a related notion of extractability obfuscation and shows conditions under which this notion (and diO) are implied by iO. In addition, it also presents obfuscation for the class of polynomial time Turing machines, building upon the work of Brakerski and Rothblum [BR13a].

The issue of auxiliary information in program obfuscation was first considered by Goldwasser and Kalai [GK05], and further explored in [GK13, BCPR13, BP13b]. The work of Bitansky, Canetti, Paneth, and Rosen [BCPR13] shows that if iO exists then “extractability primitives” such as knowledge-type assumptions and extractable one-way functions [CD09] cannot exist in the presence of arbitrary auxiliary information. Boyle and Pass [BP13b] strengthen this result further by showing a pair of (universal) distributions $Z, Z'$ on auxiliary information such that either extractable OWFs w.r.t. $Z$ do not exist or extractability obfuscations w.r.t. $Z'$ do not exist.

2 Preliminaries

We use standard notations which are recalled here. This section can be skipped without affecting readability.

Notation. For a randomized algorithm $A$ we write $A(x; r)$ for the process of evaluating $A$ on input $x$ with random coins $r$. We write $A(x)$ for the process of sampling a uniform $r$ and then evaluating $A(x; r)$. We define $A(x, y; r)$ and $A(x, y)$ analogously. We denote by $\mathbb{N}$ and $\mathbb{R}$ the sets of natural and real numbers respectively. The concatenation of two strings $a$ and $b$ is denoted by $a \| b$.

We assume familiarity with interactive Turing machines (ITMs). For two randomized ITMs $A$ and $B$, we denote by $[A(x, y) \leftrightarrow B(x, z)]$ the interactive computation between $A$ and $B$, with $A$’s inputs $(x, y)$ and $B$’s inputs $(x, z)$, and uniform randomness; and $[A(x, y; r_A) \leftrightarrow B(x, z; r_B)]$ when we wish to specify randomness. We denote by $\mathrm{VIEW}_P[A(x, y) \leftrightarrow B(x, z)]$ and $\mathrm{OUT}_P[A(x, y) \leftrightarrow B(x, z)]$ the view and output of machine $P \in \{A, B\}$ in this computation. Finally, $\mathrm{TRANS}[A(x, y) \leftrightarrow B(x, z)]$ denotes the transcript of the interaction $[A(x, y) \leftrightarrow B(x, z)]$, which consists of all messages exchanged in the computation.

We also assume familiarity with oracle Turing machines, which are ordinary TMs with an extra tape called the oracle communication tape. An oracle TM $A$ will be written as $A^{\langle\cdot\rangle}$ to emphasize that it is an oracle TM; in addition, we write $A^{I}$ when $A$’s oracle is fixed to $I$. Recall that each query to $I$ counts as one step towards the running time of $A^{I}$.


Two ensembles $\{X_n\}_{n \in \mathbb{N}}$ and $\{Y_n\}_{n \in \mathbb{N}}$ are said to be computationally indistinguishable, denoted $\{X_n\} \stackrel{c}{\approx} \{Y_n\}$, if for all non-uniform probabilistic polynomial time (PPT) distinguishers $D$, sufficiently large $n$, and every advice string $z_n$: $|\Pr_{x \leftarrow X_n}[D_n(x) = 1] - \Pr_{y \leftarrow Y_n}[D_n(y) = 1]| \leq \mathrm{negl}(n)$, where we write $D_n(a)$ to denote $D(n, z_n, a)$, and negl is a negligible function. The statistical distance between two probability distributions $X$ and $Y$ over the same support $S$ is denoted by $\Delta(X, Y) = \frac{1}{2} \sum_{a \in S} |\Pr[X = a] - \Pr[Y = a]|$. We say that ensembles $\{X_n\}_{n \in \mathbb{N}}$ and $\{Y_n\}_{n \in \mathbb{N}}$ are statistically indistinguishable (or statistically close), denoted $\{X_n\} \stackrel{s}{\approx} \{Y_n\}$, if there exists a negligible function negl such that $\Delta(X_n, Y_n) \leq \mathrm{negl}(n)$ for all sufficiently large $n$.
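For finite distributions, the statistical distance $\Delta(X, Y)$ is straightforward to compute. The following small Python sketch is an illustration only; representing a distribution as a dictionary from outcomes to probabilities is an assumption made here, and outcomes missing from a dictionary are treated as having probability zero.

```python
def statistical_distance(X: dict, Y: dict) -> float:
    """Delta(X, Y) = 1/2 * sum over a of |Pr[X = a] - Pr[Y = a]|."""
    support = set(X) | set(Y)
    return 0.5 * sum(abs(X.get(a, 0.0) - Y.get(a, 0.0)) for a in support)


uniform_bit = {0: 0.5, 1: 0.5}
constant_zero = {0: 1.0}
distance = statistical_distance(uniform_bit, constant_zero)
```

Identical distributions are at distance 0 and distributions with disjoint supports are at distance 1, the two extremes of the definition.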

Standard primitives. In this work, we will be using a family of injective one-way functions. In addition, unless specified otherwise, we assume that all functions $f \in F_n$ in the family have an efficiently testable range membership: i.e., there exists a polynomial time algorithm to test that $y \in \mathrm{Range}(f)$, where $\mathrm{Range}(f)$ denotes the range of $f$.

We will also be using a family of collision resistant hash functions (CRHF) $\{\mathcal{H}_n\}$ where $h : \{0,1\}^* \to \{0,1\}^{\mathrm{poly}(n)}$ for $h \in \mathcal{H}_n$; recall that $\{\mathcal{H}_n\}$ is a CRHF family if there exists a negligible function negl such that for every non-uniform PPT machine $A$, every sufficiently large $n$, and every advice string $z_n$: $\Pr_{h \leftarrow \mathcal{H}_n}[x \neq y \wedge h(x) = h(y) : (x, y) \leftarrow A(z_n, h)] \leq \mathrm{negl}(n)$.

Finally, we will also be using a non-interactive, perfectly binding commitment scheme for committing strings of polynomial length. A commitment to a string $m$ using randomness $u$ will be denoted by $c = \mathsf{Com}(m; u)$. Without loss of generality, we assume that the message $m$ committed to in $c$ can be recovered given the randomness $u$ and the string $c$. We assume perfectly binding schemes purely for the simplicity of exposition. One can replace Com by the 2-round statistically-binding commitment scheme of Naor [Nao89] without affecting our results.

2.1 Interactive Proofs, Proofs of Knowledge, and Witness Indistinguishability

We recall the standard definitions of interactive proofs [GMR85], witness indistinguishability [FS90], and proofs of knowledge [GMR85, TW87, FFS88, FS90, BG92, PR05b].

Definition 2.1 (Interactive Proofs). A pair of probabilistic polynomial time interactive Turing machines $\langle P, V \rangle$ is called an interactive argument system for a language $L \in$ NP with witness relation $R$ if there exists a negligible function $\mathrm{negl} : \mathbb{N} \to \mathbb{R}$ such that the following two conditions hold:

• Completeness: for every $x \in L$, and every witness $w$ such that $R(x, w) = 1$, it holds that

$\Pr[\mathrm{OUT}_V[P(x, w) \leftrightarrow V(x)] = 1] = 1.$

• Soundness: for every $x \notin L$, every interactive Turing machine $P^*$ running in time at most $\mathrm{poly}(|x|)$, and every $y \in \{0,1\}^*$,

$\Pr[\mathrm{OUT}_V[P^*(x, y) \leftrightarrow V(x)] = 1] \leq \mathrm{negl}(|x|).$

If the soundness condition holds for every (not necessarily PPT) machine $P^*$, then $\langle P, V \rangle$ is called an interactive proof system.


soundness error is defined in terms of the statement length $|x|$, in cryptographic contexts, it is convenient to define it in terms of the security parameter $n$, and write $\mathrm{negl}(n)$. This is without loss of generality, since in our setting $|x| = \mathrm{poly}(n)$. Also, we will use the words “argument” and “proof” interchangeably throughout the paper.

Definition 2.2 (Proof of Knowledge). Let $\langle P, V \rangle$ be an interactive proof system for a language $L \in$ NP with witness relation $R$. We say that $\langle P, V \rangle$ is a proof of knowledge (POK) for relation $R$ if there exists a polynomial $p$ and a probabilistic oracle machine $E$ (called the extractor) such that for every PPT ITM $P^*$, there exists a negligible function negl such that for every $x \in L$, and every $(y, r) \in \{0,1\}^*$ such that $q_{x,y,r} := \Pr[\mathrm{OUT}_V[P^*_{x,y,r} \leftrightarrow V(x)] = 1] > 0$, where $P^*_{x,y,r}$ denotes the machine $P^*$ whose common input, auxiliary input, and randomness are fixed to $x$, $y$, and $r$ respectively and the probability is taken over the randomness of $V$, the following conditions hold:

• the expected number of steps taken by $E^{P^*_{x,y,r}}$ is bounded by $\frac{p(|x|)}{q_{x,y,r}}$, where $E^{P^*_{x,y,r}}$ is machine $E$ with oracle access to $P^*_{x,y,r}$;

• except with negligible probability, $E^{P^*_{x,y,r}}$ outputs $w^*$ such that $R(x, w^*) = 1$.

Definition 2.3 (Witness Indistinguishable Proofs). Let $\langle P, V \rangle$ be an interactive proof system for a language $L \in$ NP with witness relation $R$. We say that $\langle P, V \rangle$ is witness indistinguishable (WI) for relation $R$ if for every PPT ITM $V^*$, every statement $x \in L$, every pair of witnesses $(w_1, w_2)$ such that $R(x, w_i) = 1$ for every $i \in \{1, 2\}$, and every (advice) string $z \in \{0,1\}^*$, it holds that $\{\mathrm{VIEW}^{(1)}_{|x|}\} \stackrel{c}{\approx} \{\mathrm{VIEW}^{(2)}_{|x|}\}$, where $\{\mathrm{VIEW}^{(i)}_{|x|}\} := \mathrm{VIEW}_{V^*}[P(x, w_i) \leftrightarrow V^*(x, z)]$.

As before, w.l.o.g., we can replace |x| by the security parameter n in all definitions above. We remark that there exists a WIPOK with strict polynomial time extraction in constant rounds using non-black-box techniques [BL04] and in ω(1) rounds using black-box techniques [GMR85, Blu87].

Three round, public-coin WIPOK and ZAPs. The classical protocols of [GMR85, Blu87], based on the existence of non-interactive perfectly binding commitment schemes, are 3-round witness indistinguishable proof of knowledge (WIPOK) protocols (for every language in NP). We will use Blum’s protocol [Blu87] as a building block and denote its three messages by ⟨α, β, γ⟩, where β is a random string of sufficient length.

A ZAP for a language L, introduced by Dwork and Naor [DN00], is a two round witness indistinguishable interactive proof for L. ZAPs can be constructed from a variety of assumptions such as non-interactive zero-knowledge proofs [BFM88, BSMP91] (which in turn can be based on trapdoor permutations [FLS99]) and verifiable random functions [MRV99]. In fact, even non-interactive (i.e., one round) constructions for ZAPs for all of NP exist based on bilinear pairings [GOS06] and derandomization techniques [BOV03]. We will use the two round construction of [DN00] based on NIZK as a building block and denote its two messages by ⟨σ, π⟩, where σ is a random string of sufficient length. An important property of this construction is adaptive soundness: the statement to be proven can be chosen after the string σ has been sent by the verifier. We will rely on this property in our security proofs.


2.2 Concurrent Zero Knowledge

We now recall the notion of concurrent zero-knowledge [DNS98] in which one considers a “concurrent adversary” V∗ who interacts in many copies of P, proving adaptively chosen, possibly correlated, polynomially many statements. We follow conventions established in [DNS98, PRS02, Ros04].

Concurrent attack. The concurrent attack on an interactive proof system ⟨P, V⟩ for a language L ∈ NP with witness relation R considers an arbitrary interactive TM V∗ which opens at most m = m(n) sessions for an arbitrary polynomial m with arbitrary auxiliary input z ∈ {0,1}∗. Let $\vec{x} := \{x_i\} \in L^m$ be a set of statements in L of length at most poly(n), and let $\vec{w} := \{w_i\}_{i \in [m]}$ be such that R(x_i, w_i) = 1. The attack proceeds by uniformly fixing the random coins of V∗ and initiating its execution on input the security parameter n ∈ N and auxiliary input z. At each step, V∗ either initiates a new session, in which case a new prover instance P(x_i, w_i) with fresh randomness is fixed who interacts with V∗ in session i; or V∗ schedules the delivery of a message of an existing session, in which case the corresponding prover instance responds with the corresponding message. There is no restriction on how V∗ schedules the messages of various sessions. We say that V∗ launches an m-concurrent attack on ⟨P, V⟩. The output of the attack consists of the view of V∗, denoted $\mathrm{VIEW}^{\langle P,V \rangle}_{V^*}(n, m, \vec{x}, \vec{w}, z)$.
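The scheduling freedom of V∗ in the concurrent attack can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: the names `ProverSession` and `run_concurrent_attack` are hypothetical, the prover's replies are placeholders rather than messages of any real proof system, and the verifier's schedule is given as a fixed list instead of being chosen adaptively.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class ProverSession:
    """Toy stand-in for one prover instance P(x_i, w_i) with fresh randomness."""
    statement: str
    witness: str
    coins: bytes = field(default_factory=lambda: secrets.token_bytes(16))
    rounds_done: int = 0

    def respond(self, message: str) -> str:
        # Placeholder reply: a real prover would compute its next protocol message.
        self.rounds_done += 1
        return f"reply(round={self.rounds_done}, to={message!r})"


def run_concurrent_attack(schedule, statements, witnesses):
    """Replay a verifier-chosen schedule and return the resulting view."""
    sessions = {}
    view = []
    for action, i, *payload in schedule:
        if action == "open":
            # V* initiates session i: a new prover instance is created.
            sessions[i] = ProverSession(statements[i], witnesses[i])
            view.append(("open", i))
        elif action == "deliver":
            # V* schedules a message of an existing session; that prover responds.
            view.append(("deliver", i, sessions[i].respond(payload[0])))
        else:
            raise ValueError(f"unknown action {action!r}")
    return view


# An interleaved schedule: session 1 is advanced before session 0.
schedule = [("open", 0), ("open", 1), ("deliver", 1, "m1"),
            ("deliver", 0, "m1"), ("deliver", 1, "m2")]
view = run_concurrent_attack(schedule, ["x0", "x1"], ["w0", "w1"])
```

The point of the sketch is only that the order of `deliver` actions is entirely up to the verifier; the definition places no restriction on the interleaving.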

Definition 2.4 (Concurrent Zero Knowledge). We say that an interactive proof system ⟨P, V⟩ for a language L ∈ NP (with witness relation R) is concurrent zero knowledge if for every polynomial m : N → N, every PPT ITM V∗ launching an m-concurrent attack, there exists a PPT machine $S_{V^*}$ such that for every set $\vec{x} := \{x_i\} \in L^m$ of statements of length at most poly(n), every $\vec{w} := \{w_i\}_{i \in [m]}$ such that R(x_i, w_i) = 1, and every auxiliary input z ∈ {0,1}∗ it holds that

$$\left\{ S_{V^*}(n, \vec{x}, z) \right\}_{n \in \mathbb{N}} \approx_c \left\{ \mathrm{VIEW}^{\langle P,V \rangle}_{V^*}(n, m, \vec{x}, \vec{w}, z) \right\}_{n \in \mathbb{N}}.$$

Machine $S_{V^*}$ is called the simulator.

In what follows, we will sometimes abuse notation and write V∗ to also mean the description of the Turing machine V∗. However, when we want to be explicit about the description of a Turing machine M (including V∗), we will write desc(M). For the simulator, we may sometimes write $S_{V^*}(\cdot) := S(V^*, \cdot)$ to insist that the program of V∗ is given as an explicit input to the simulator (and drop n from the notation). Further, we will assume a (unique) session identifier for each session represented by a string of length n; this session identifier can be chosen by V∗ so long as it is unique for every session. W.l.o.g. we assume that the all-ones string 1^n (not to be confused with the unary representation of the security parameter) is never used as a session identifier and denotes a special symbol.

3 Differing Input Obfuscation for Turing Machines


3.1 Definitions

Let Steps(M, x) denote the number of steps taken by a TM M on input x; we use the convention that if M does not halt on x then Steps(M, x) is defined to be the special symbol ∞. We define the notion of “compatible Turing machines” and “nice sampler.” A pair of TMs (M_0, M_1) is said to be compatible if they have the same size and, more crucially, for every input x, if M_0 halts on x then M_1 also halts on x in the same number of steps; i.e., for every x, Steps(M_0, x) = Steps(M_1, x). We then consider sampling algorithms Samp which output a pair of compatible TMs (M_0, M_1), and say that Samp is “nice” if no PPT adversary A can produce an x such that M_0(x) ≠ M_1(x) and both M_0, M_1 halt within a polynomial number of steps on input x. This requirement, or some variant of it, is necessary [ABG+13, BCP14].

Definition 3.1 (Compatible TMs). A pair of Turing machines (M_0, M_1) is said to be compatible if |M_0| = |M_1| and for every string x ∈ {0,1}∗ it holds that Steps(M_0, x) = Steps(M_1, x).

By our convention, the second condition implies that M_0 halts on x if and only if M_1 halts on x.
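As a rough illustration of Definition 3.1 and the ∞ convention, the following Python sketch models a "machine" as a function returning an (output, steps) pair, or None when it diverges. This modelling choice, and the names `steps` and `compatible_on`, are assumptions made for the example only; compatibility quantifies over all strings, so a check over a finite list of inputs is a sanity test, not a proof.

```python
import math


def steps(machine, x):
    """Toy Steps(M, x): the step count, or infinity if M does not halt on x."""
    result = machine(x)
    return math.inf if result is None else result[1]


def compatible_on(m0, m1, size0, size1, inputs):
    """Check |M_0| = |M_1| and Steps(M_0, x) = Steps(M_1, x) on the given inputs."""
    return size0 == size1 and all(steps(m0, x) == steps(m1, x) for x in inputs)


def m0(x):
    # Diverges on "loop"; otherwise outputs "s" after len(x) + 1 steps.
    return None if x == "loop" else ("s", len(x) + 1)


def m1(x):
    # Same halting behaviour and step counts as m0, but a different output.
    return None if x == "loop" else ("0", len(x) + 1)


def m2(x):
    # Halts everywhere, so its step count differs from m0 on "loop".
    return ("0", len(x) + 1)


inputs = ["", "ab", "loop"]
same = compatible_on(m0, m1, 10, 10, inputs)
different = compatible_on(m0, m2, 10, 10, inputs)
```

Here `m0` and `m1` disagree on outputs yet agree on step counts, including on the input where both diverge, which is exactly what compatibility permits.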

Definition 3.2 (Nice TM Sampler). We say that a (possibly non-uniform) PPT Turing machine Samp is a nice sampler for Turing machines if the following conditions hold:

1. the output of Samp is a triplet (z, M_0, M_1) such that (M_0, M_1) is always a pair of compatible TMs, and z ∈ {0,1}∗ is a string;

2. there exists a negligible function negl such that for every polynomial a : N → N, every sufficiently large n ∈ N, and every (possibly non-uniform) TM A running in time at most a(n), it holds that:

$$\Pr\left[ (z, M_0, M_1) \leftarrow \mathsf{Samp}(1^n);\ A(z, M_0, M_1) = x;\ \mathsf{Steps}(M_0, x) \le a(n);\ M_0(x) \ne M_1(x) \right] \le \mathsf{negl}(n).$$

Some remarks are in order. First, note that since M_0, M_1 are always compatible and Steps(M_0, x) ≤ a(n), we have that Steps(M_1, x) ≤ a(n). Further, the “event” inside the probability above can actually be tested in polynomial time. This is because every step defining this event can be performed in polynomial time. Finally, note that since the definition quantifies over all polynomials a, it allows A to produce any input x so long as M_0, M_1 halt on x within a polynomial number of steps.

The first output of Samp above will be used as auxiliary information in the definition below. We will denote the distribution of the first output of Samp by Z.

Differing input obfuscator. We now present the definition of a Z-auxiliary differing input obfuscator for Turing machines. Roughly speaking, the notion states that a machine O is a Z-auxiliary diO for (possibly non-uniform) efficiently samplable Z = {Z_n} if the following holds: if there exists a PPT distinguisher D who distinguishes O(M_0) from O(M_1) when given auxiliary input z ← Z_n, then it is easy to find an x (given z) such that M_0(x) ≠ M_1(x). In other words, if it is hard to find the “differing input” x then the two obfuscations are indistinguishable.

We now present the definition below, following [ABG+13]. We note that since we want to be explicit about the distribution of the auxiliary information (the first output of the sampling algorithm Samp), we will denote it by Z.

Definition 3.3 (Z-auxiliary Differing Input Obfuscator for Turing Machines). A uniform PPT machine O is called a differing input obfuscator for a class of Turing machines {M_n} if the following conditions are satisfied:


1. Polynomial slowdown and functionality: there exists a polynomial a_dio such that for every n ∈ N, every M ∈ M_n, every input x such that M halts on x, and every M̃ ← O(n, M), the following conditions hold:

• Steps(M̃, x) ≤ a_dio(n, Steps(M, x))

• M̃(x) = M(x)

Polynomial a_dio is called the slowdown polynomial of O.

2. Indistinguishability: for every nice sampler Samp (i.e., satisfying definition 3.2) whose first output is distributed according to Z, there exists a negligible function negl such that for every polynomial a : N → N, every sufficiently large n ∈ N, and every (possibly non-uniform) TM D running in time at most a(n), it holds that:

$$\Big| \Pr\left[ D(z, O(n, M_0)) = 1 : (z, M_0, M_1) \leftarrow \mathsf{Samp}(1^n) \right] - \Pr\left[ D(z, O(n, M_1)) = 1 : (z, M_0, M_1) \leftarrow \mathsf{Samp}(1^n) \right] \Big| \le \mathsf{negl}(n).$$

Machine D is called the distinguisher.

3.2 Candidate Constructions

As noted earlier, our reduction requires the existence of Z-auxiliary diO for the class of all polynomial size Turing machines which accept inputs of arbitrary polynomial length (in n) and halt within polynomial steps, with respect to all (possibly nonuniform) efficiently samplable distributions Z that are hard over the statements of L^a_sim for every polynomial a. A candidate construction for this primitive appears in the work of [ABG+13]. Their construction is based on diO for the class of all polynomial-size circuits (constructed in [GGH+13]), fully homomorphic encryption (e.g., [Gen09, BV11]), and SNARKs [BCCT13] (which require knowledge-type assumptions). If an a-priori bound on the input is known, then comparatively better constructions are possible [ABG+13, BCP14].

Our requirements from obfuscation are actually weaker than stated above. We do not need obfuscation for the class of all (polynomial size and running time) Turing machines; instead we only require the obfuscation of the machines SimLock, which receive inputs of arbitrary, a-priori unknown, polynomial length and halt within a polynomial number of steps. We also do not need security w.r.t. every hard distribution Z over L^a_sim; instead, we only need to assume that it holds for the statements λ that are transcripts of the GenStat protocol (with an arbitrary cheating prover $P^*_1$). Interestingly, this kind of advice can be simulated using the distribution Z∗ that simply outputs (h, r); therefore distribution Z∗ can actually be the uniform distribution if h is a “public-coin” CRHF [HR04], making it a more plausible assumption.

As we have come to learn [GK05, GK13, BCPR13, BP13b], security w.r.t. arbitrary auxiliary inputs might be too strong an assumption. Bitansky, Canetti, Paneth, and Rosen [BCPR13] show that either indistinguishability obfuscation does not exist for all circuits or for every OWF family F there exists an auxiliary input distribution Z_F w.r.t. which F is not an extractable OWF family [CD09]. Boyle and Pass [BP13b] further strengthen this result by showing a pair of distributions Z, Z′ such that either extractable


necessarily contradict the conjectured security of the candidate construction of [ABG+13] w.r.t. the auxiliary input distributions we need (namely transcripts of GenStat or Z∗ mentioned above). Nevertheless, we hope that candidate constructions based on better complexity-theoretic assumptions will be discovered for this primitive in the future.

4 Relation R_sim and A Nice Sampler

In this section, we define the preamble GenStat, relations R_sim, R^a_sim, and prove that a randomly sampled transcript of GenStat is a hard distribution over the statements of language L_sim (corresponding to relation R_sim). For convenience, we use a non-interactive perfectly binding commitment scheme; the two-round statistically-binding commitment scheme of [Nao89] also works.

4.1 Preamble GenStat

Statement generation protocol. Let {H_n} be a family of collision-resistant hash functions (CRHF) h ∈ H_n such that h : {0,1}∗ → {0,1}^n, and let Com be a non-interactive perfectly-binding commitment scheme for {0,1}^n. The statement generation protocol GenStat := ⟨P_1, V_1⟩ is a three round protocol between P_1 and V_1 which proceeds as follows:

Protocol GenStat := ⟨P_1, V_1⟩:

1. V_1 sends a random h ← H_n

2. P_1 sends a commitment c = Com(0^n; u) where u is randomly chosen

3. V_1 sends a random string r ← {0,1}^n

The transcript of the protocol is λ := ⟨h, c, r⟩.
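The three messages of GenStat can be sketched in Python as follows. This is a toy illustration under loud assumptions: SHA-256 with a random key prefix stands in for a function h sampled from H_n, and a hash-based commitment stands in for Com. The latter is at best computationally binding, not perfectly binding as the protocol requires, so the sketch shows the message flow only and none of the security properties. The helper names are hypothetical.

```python
import hashlib
import secrets

N = 32  # toy security parameter, in bytes


def sample_hash_key() -> bytes:
    """Stand-in for sampling h from H_n: h is SHA-256 with this key as a prefix."""
    return secrets.token_bytes(N)


def h(key: bytes, data: bytes) -> bytes:
    """Toy keyed hash h : {0,1}* -> {0,1}^n (assumption: SHA-256 as the CRHF)."""
    return hashlib.sha256(key + data).digest()


def commit(message: bytes, u: bytes) -> bytes:
    """Toy commitment Com(message; u). NOT perfectly binding; illustration only."""
    return hashlib.sha256(u + message).digest()


def genstat():
    """Run an honest execution of GenStat and return (transcript, prover coins)."""
    key = sample_hash_key()          # message 1: V_1 sends h
    u = secrets.token_bytes(N)       # P_1's commitment randomness
    c = commit(bytes(N), u)          # message 2: P_1 sends c = Com(0^n; u)
    r = secrets.token_bytes(N)       # message 3: V_1 sends a random r
    return (key, c, r), u


transcript, u = genstat()
```

Note that the honest prover commits to the all-zero string, so it never holds a witness for the resulting statement; only a simulator that commits to a program predicting r would.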

4.2 Relation R_sim

We now define the relation R_sim. Let {F_n}_{n∈N} and {H_n}_{n∈N} denote families of injective one-way functions and collision-resistant hash functions respectively. Let Com be a non-interactive, perfectly binding (string) commitment scheme. The relation is formally defined in figure 2. The statements for the relation R_sim are the transcripts λ := (h, c, r) and the witnesses are of the form ω := (u, Π^⟨·⟩, T) such that c is a commitment to the oracle-TM Π^⟨·⟩ using randomness u, T is a table containing answers to all inversion queries that Π^⟨·⟩ makes (for functions f ∈ F_n), and Π^T outputs r.⁸

An important observation regarding R_sim is that since table T is not a part of the commitment c (and it should not be), we must enforce that Π^⟨·⟩ never makes any invalid queries to T. This is because after seeing r, it is easy to design a “bad” table T which will encode r by means of “bad” entries and “satisfy” λ.

Relation R_sim is undecidable in general. For convenience, we define a decidable, polynomial time, version of R_sim, denoted by R^a_sim where a : N → N is a polynomial, as follows.

Relation R^a_sim and language L^a_sim: Let a : N → N be a polynomial; relation R^a_sim is identical to R_sim except that the witness (u, Π^⟨·⟩, T) satisfies the following additional constraints:

8 For simplicity, we assume that it is easy to test that functions f are injective and whether a given element is in the range of


Instance: A tuple ⟨h, c, r⟩ ∈ H_n × {0,1}^poly(n) × {0,1}^n where h : {0,1}∗ → {0,1}^n.

Witness: A tuple ⟨u, Π^⟨·⟩, T⟩ ∈ {0,1}^poly(n) × {0,1}∗ × {0,1}∗ where Π^⟨·⟩ is an oracle Turing machine, and T is a table containing entries of the form (f, s̃, s) such that when queried on (f, s̃), T returns s, denoted T(f, s̃) = s.

Relation: R_sim(⟨h, c, r⟩, ⟨u, Π^⟨·⟩, T⟩) = 1 if and only if all of the following conditions hold:

1. c = Com(h(Π^⟨·⟩); u)

2. ∀(f, s̃, s) ∈ T it holds that f ∈ F_n is an injective function and f(s) = s̃

3. Program Π^T takes no input, outputs the string r, and halts

4. Program Π^T makes oracle queries of the form (f, s̃) such that: ∀ queries (f, s̃) ∃ s s.t. (f, s̃, s) ∈ T

Figure 2: Relation R_sim based on a perfectly binding commitment Com.

1. |T| ≤ a(n),

2. Π^T halts in at most a(n) steps.

Let L_sim (resp. L^a_sim) be the language corresponding to relation R_sim (resp., R^a_sim). Note that L^a_sim ∈ NP, since R^a_sim can be tested in time poly(a(n)) = poly(n).

Hard distributions over L^a_sim. We say that Z = {Z_n} is a hard distribution over the statements of L^a_sim if there exists a negligible function negl such that for every non-uniform PPT algorithm A∗ and every sufficiently large n it holds that

Pr[λ ← Z_n; ω ← A∗(1^n, λ); R^a_sim(λ, ω) = 1] ≤ negl(n).

The following lemma states that the transcripts of GenStat form a hard distribution over L_sim. That is, it is hard for any PPT machine $P^*_1$ to compute a witness ω to statements λ when λ is the transcript of GenStat between $P^*_1$ and an honest V_1. The proof follows [Bar01].

Lemma 4.1 (Hardness of GenStat). Assume that {H_n} is a family of collision-resistant hash functions against (non-uniform) PPT algorithms. There exists a negligible function negl such that for every (non-uniform) PPT Turing machine $P^*_1$, the probability that $P^*_1$, after interacting with an honest V_1 in protocol GenStat, writes a string ω on its (private) output tape such that R_sim(λ, ω) = 1 is at most negl(n), where λ is the transcript of the interaction between $P^*_1$ and V_1, and the probability is taken over the randomness of both $P^*_1$ and V_1.


every polynomial a, but as we shall see in the proof, a does not actually play a role. Therefore, we have chosen to avoid the use of a and directly state the lemma in terms of R_sim.

Proof of lemma 4.1. Assume, on the contrary, that there exist polynomials p, q and a prover $P^*_1$ such that $P^*_1$ takes at most p(n) steps and writes a string ω on its private output tape such that for infinitely many values of n, δ(n) ≥ 1/q(n), where δ(n) is the probability that R_sim(λ, ω) = 1 (where λ is sampled as defined in the lemma). Now consider the machine $P^*_1$ in an execution of GenStat and let (h, c) be the first two messages in this interaction. Let $P^*_{1,h,c}$ denote the machine $P^*_1$ whose state has been frozen up to the point where c is sent in this execution. By a standard averaging argument, it follows that with probability at least δ/2 over the sampling of (h, c) in this interaction, the probability that $P^*_{1,h,c}$ writes a valid witness ω at the end of the interaction is at least δ/2. We call such (h, c) “good.”

The following procedure finds collisions in h provided (h, c) is good: the procedure chooses two random strings r_1, r_2, each of length n, and feeds $P^*_{1,h,c}$ with r_1 and then with r_2 separately; let $\omega_i = (u_i, \Pi_i^{\langle\cdot\rangle}, T_i)$ be the contents of the private output tape of $P^*_{1,h,c}$ when fed with string r_i for i ∈ {1,2}. The procedure outputs (Π_1, Π_2) as the potential collision on h.

We claim that the procedure finds collisions in h with noticeable probability. Note that since (h, c) is good, with probability at least δ²/4 it holds that R_sim(λ_i, ω_i) = 1 for both i ∈ {1,2}, where λ_i = (h, c, r_i). Hence, $\Pi_i^{T_i}$ outputs r_i, and h(Π_1) = h(Π_2) w.h.p. since c is perfectly binding.

Now, define I to be an inversion oracle which on input a query of the form (f, s̃) for f ∈ F_n and s̃ ∈ Range(f) outputs s = f^{−1}(s̃). Then, by definition of R_sim (in particular, due to condition 4 in figure 2), we have that the output of $\Pi_i^{T_i}$ is the same as that of $\Pi_i^{I}$; i.e., $\Pi_i^{I}$ outputs r_i. Since $\Pi_i^{I}$ is a deterministic computation, it holds that Π_1 and Π_2 are different programs whenever r_1 ≠ r_2 (which happens with prob. 1 − 2^{−n}). Further, since $P^*_1$ runs in time at most p(n), programs Π_1, Π_2 are of size at most p(n). Therefore, Π_1 and Π_2 are collisions in h, found with probability at least

$$\frac{\delta^2}{4} \cdot (1 - 2^{-n}) \ge \frac{\delta^2}{8}.$$

It follows that collisions can be found for a noticeable (specifically, at least δ/2) fraction of functions in {H_n} with noticeable probability (specifically, δ²/8). This concludes the proof.
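The probability accounting in this proof can be laid out step by step. The derivation below is a sketch rather than the authors' own presentation: it replaces the product with (1 − 2^{−n}) by a union bound over the event r_1 = r_2, which needs no independence assumption, and it combines the two "noticeable" quantities into one overall bound. The inequalities are meant for the infinitely many n with δ(n) ≥ 1/q(n) and n large enough that 2^{−n} ≤ δ²/8.

```latex
\begin{align*}
\Pr[(h,c)\ \text{good}] &\ge \frac{\delta}{2}, \\
\Pr[\omega_1, \omega_2\ \text{both valid} \mid (h,c)\ \text{good}]
  &\ge \left(\frac{\delta}{2}\right)^{2} = \frac{\delta^{2}}{4}, \\
\Pr[r_1 = r_2] &= 2^{-n}, \\
\Pr[\omega_1, \omega_2\ \text{both valid} \wedge r_1 \ne r_2 \mid (h,c)\ \text{good}]
  &\ge \frac{\delta^{2}}{4} - 2^{-n} \ \ge\ \frac{\delta^{2}}{8}, \\
\Pr[\text{collision found}]
  &\ge \frac{\delta}{2} \cdot \frac{\delta^{2}}{8}
   = \frac{\delta^{3}}{16} \ \ge\ \frac{1}{16\, q(n)^{3}}.
\end{align*}
```

The final quantity is an inverse polynomial, which is what contradicts the collision resistance of {H_n}.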

4.3 A Nice Sampler for TM

Protocol GenStat allows us to build a (non uniform) sampling algorithm Samp which will be nice according to definition 3.2. In addition, the distribution of the first output of Samp will give us a hard distribution over the statements of L^a_sim for every polynomial a.

Samp uses the following simple TM:

SimLock(λ, ω, s):

Test if R_sim(λ, ω) = 1, and if so output s; else, output the empty string 0^n.

Also, for a fixed (λ, s), define SimLock_{λ,s}(·) := SimLock(λ, ·, s). Machine SimLock_{λ,s} essentially tests whether the input is a valid witness to λ, and if so outputs the fixed value s, and nothing otherwise. Note that it is possible that SimLock does not halt on some inputs. Also, w.l.o.g., we assume that Steps(SimLock_{λ,s_1}, ω) = Steps(SimLock_{λ,s_2}, ω) for every λ, ω and (s_1, s_2) ∈ {0,1}^n × {0,1}^n.
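The behaviour of SimLock_{λ,s} can be sketched as a Python closure. The relation used here, `toy_relation`, is a hypothetical decidable stand-in (it accepts ω exactly when SHA-256(ω) equals λ) and has nothing to do with the real R_sim, which is undecidable in general; `make_simlock` is likewise a name invented for this example.

```python
import hashlib


def toy_relation(lam: bytes, omega: bytes) -> bool:
    """Hypothetical stand-in for R_sim: omega is 'valid' iff it hashes to lam."""
    return hashlib.sha256(omega).digest() == lam


def make_simlock(lam: bytes, s: bytes, relation=toy_relation):
    """Return SimLock_{lam,s}: outputs s on valid witnesses and 0^n otherwise."""
    zero = bytes(len(s))  # plays the role of the string 0^n

    def simlock(omega: bytes) -> bytes:
        return s if relation(lam, omega) else zero

    return simlock


witness = b"toy witness"
lam = hashlib.sha256(witness).digest()
secret = b"\x01" * 16

lock_s = make_simlock(lam, secret)        # SimLock_{lam,s}
lock_0 = make_simlock(lam, bytes(16))     # SimLock_{lam,0^n}
```

In this toy model the two closures produce different outputs only on inputs accepted by the relation, which mirrors why a differing input for (SimLock_{λ,s}, SimLock_{λ,0^n}) must be a witness for λ.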


The sampler. The sampling algorithm, $\mathsf{Samp}_{s,P^*_1}$, is defined with respect to a string s ∈ {0,1}^n and an arbitrary PPT interactive TM $P^*_1$. ITM $P^*_1$ follows the instructions of the GenStat protocol and interacts with algorithm V_1. The distribution of the first output of $\mathsf{Samp}_{s,P^*_1}$ is independent of s, and captured by separately defining the distribution $Z_{P^*_1} := \{Z_{n,P^*_1}\}$.

$\mathsf{Samp}_{s,P^*_1}(1^n)$:

• Sample a random transcript λ of GenStat by interacting with $P^*_1$ honestly according to V_1

• Output (λ, SimLock_{λ,s}, SimLock_{λ,0^n})

Distribution $Z_{n,P^*_1}$:

• Output a randomly sampled transcript λ of GenStat, obtained by honestly interacting with $P^*_1$.

Note that the first output of $\mathsf{Samp}_{s,P^*_1}(1^n)$ and $Z_{n,P^*_1}$ are distributed identically for every (n, s). The following lemma is essentially a corollary of lemma 4.1.

Lemma 4.2. For every non-uniform PPT TM $P^*_1$, and every s ∈ {0,1}^n, $\mathsf{Samp}_{s,P^*_1}$ is a nice sampler for Turing machines (according to definition 3.2). Further, for every polynomial a : N → N, $Z_{n,P^*_1}$ is a hard distribution over L^a_sim.

Proof. Observe that the pair (SimLock_{λ,s}, SimLock_{λ,0^n}) is always a pair of compatible TMs, by definition of SimLock. Now suppose that the second property of definition 3.2 is not satisfied. Then there exists an A, running in time at most a(n) for some polynomial a, which outputs an x with noticeable probability such that SimLock_{λ,s}(x) ≠ SimLock_{λ,0^n}(x), and Steps(SimLock_{λ,s}, x) ≤ a(n); here the probability is taken over the sampling of λ (which in turn is distributed according to $Z_{n,P^*_1}$). It follows, from the definition of SimLock_{λ,s}, that x must be a witness to λ and therefore A is a PPT machine which finds witnesses to statements λ ∈ L^a_sim with noticeable probability. We can use A to violate lemma 4.1 as follows.

Consider the machine $B^*_{1,s}$ which incorporates $P^*_1$ and A. It samples λ by routing messages between $P^*_1$ and an external (honest) V_1, and returns the output of A(λ, SimLock_{λ,s}, SimLock_{λ,0^n}). It is straightforward to see that $B^*_{1,s}$ violates lemma 4.1 (for every fixed s). Further, since λ is distributed according to $Z_{n,P^*_1}$ and a is an arbitrary polynomial, this also proves the second part of the lemma.

5 A Simpler Variant of Our Protocol

In this section, we describe the simpler version of our protocol, namely Simple-cZK; it is a (fully) concurrent zero-knowledge protocol in constant (but not four) rounds. Let P denote the prover algorithm and V denote the verifier algorithm. Informally, the protocol has three stages:

1. In stage 1, P and V sample a statement λ = (h, c, r) for the relation R_sim using the protocol GenStat.

2. In stage 2, V sends the image s̃ of a randomly chosen input s under an injective OWF f and also sends f. Additionally,

(a) V also sends an obfuscation of the machine SimLock_{λ,s} which outputs s on every input ω for which R_sim(λ, ω) = 1.


3. In stage 3, P proves that either it knows a witness w to x or it knows the pre-image f^{−1}(s̃).

The formal description of protocol Simple-cZK appears in figure 3. The main result of this section is the following theorem.

Theorem 5.1. Assume the existence of collision-resistant hash functions and injective one-way functions. Further, for every polynomial a : N → N, and every hard distribution Z over the statements of L^a_sim, assume the existence of Z-auxiliary differing-input obfuscation (diO) for the class of all polynomial-size Turing machines that halt in a polynomial number of steps.¹⁰ Then, there exists a constant round, fully concurrent zero-knowledge protocol with negligible soundness error, for all languages in NP.

We prove the above theorem by proving that protocol Simple-cZK is a fully concurrent zero-knowledge protocol with negligible soundness error (Theorem 5.6). It is clear that the protocol has constant rounds and perfect completeness. The soundness and concurrent-ZK properties of this protocol are proven in the next two sections.

Inputs. The common input to P and V is a statement x ∈ L where language L ∈ NP. The prover’s auxiliary input is a witness w such that R(x, w) = 1. The security parameter n is an implicit input to both parties.

Protocol. The protocol proceeds in three stages.

Stage 1: P and V execute the GenStat protocol in which V sends the first message h ← H_n, P sends the second message c = Com(0^n; u) for a random u, and V sends the final message r ← {0,1}^n. Let λ = ⟨h, c, r⟩ be the transcript.

Stage 2: V samples an injective one-way function f ← F_n, a random input s ∈ {0,1}^n, and a sufficiently long random tape ζ ∈ {0,1}^poly(n) and computes:

s̃ = f(s),   M̃_{λ,s} ← O(SimLock_{λ,s}; ζ)   (5.1)

V sends (f, s̃, M̃_{λ,s}), and proves using a constant round ZK protocol (say Π_ZK) that there exist (s, ζ) satisfying equation (5.1) above.

Stage 3: P proves to V, using a 3-round WIPOK (say Π_WIPOK), the knowledge of either:

• w such that R(x, w) = 1; OR

• s such that f(s) = s̃.

Verifier’s output: V accepts if the proof in stage 3 succeeds; otherwise, it rejects.

Figure 3: The simpler variant of our protocol: Simple-cZK.

5.1 Soundness

Lemma 5.2. Simple-cZK has negligible soundness error.

10 We note that we actually do not need obfuscation for the class of all PPT Turing machines. Instead, we only need obfuscation for those Turing machines of the form SimLock^a where a is a polynomial and SimLock^a


Proof. Let P∗ be a non-uniform cheating prover who succeeds in proving a false statement x ∉ L with some non-negligible probability. There are two parts of the proof:

• the first part shows that P∗ cannot compute the secret s with noticeable probability,

• the second part shows that if x ∉ L and P∗ convinces V with noticeable probability, then it can be used to compute s with noticeable probability, violating the first part.

We start with the first part of the proof. Let $P^*_{x,z,\rho}$ denote the prover algorithm P∗ with non-uniform advice z and random tape fixed to ρ. Further define the following two machines:

Machine $P^*_{1,(x,z,\rho)}$: This machine is identical to $P^*_{x,z,\rho}$ except that it only executes stage 1 of the protocol (i.e., the GenStat part), aborts the rest of the execution and halts. The transcripts of this machine’s interactions are of the form λ = (h, c, r).

Machine $P^*_{2,(x,z,\rho)}$: This machine is identical to $P^*_{x,z,\rho}$ except that it only executes the first two stages of the protocol, namely stage 1 and stage 2, aborts the rest of the execution and halts. The transcripts of this machine’s interaction contain λ = ⟨h, c, r⟩, (f, s̃, M̃_{λ,s}), and the transcript of the ZK protocol.

Observe that the sampler $\mathsf{Samp}_{s,P^*_{1,(x,z,\rho)}}$, defined in section 4.3, is a well defined machine for every s with respect to our first algorithm $P^*_{1,(x,z,\rho)}$. Further, it is a nice sampler due to lemma 4.2. We now prove that $P^*_{2,(x,z,\rho)}$ cannot learn the inverse of s̃. That is,

Claim 5.3. The probability that $P^*_{2,(x,z,\rho)}$, after interacting with the honest verifier V in protocol Simple-cZK, writes a string s ∈ {0,1}^n on its private output tape such that

s̃ = f(s)

is at most negl(n), where (λ, (f, s̃, M̃_{λ,s})) is the (partial) transcript of the interaction and the probability is taken over the randomness of V.

Proof. Assume on the contrary that $P^*_{2,(x,z,\rho)}$ does write a string s satisfying the claim with non-negligible probability δ = δ(n). We show how to use this machine to invert the injective function f in polynomial time. We start by considering the following machine $B^*_{2,(x,z,\rho)}$ which takes no input.

Machine $B^*_{2,(x,z,\rho)}$:

1. The machine incorporates $P^*_{2,(x,z,\rho)}$ and interacts with it by playing the role of V honestly until the end of stage 1. Let λ be the transcript of this stage.

2. At the start of stage 2, $B^*_{2,(x,z,\rho)}$ generates (f, s̃, M̃_{λ,s}) honestly.

3. $B^*_{2,(x,z,\rho)}$ employs the simulator of the ZK protocol and samples a view for $P^*_{2,(x,z,\rho)}$.

4. At the end of the simulation, $B^*_{2,(x,z,\rho)}$ outputs the contents of $P^*_{2,(x,z,\rho)}$’s private output tape.

By construction, the view of $P^*_{2,(x,z,\rho)}$ is simulated perfectly by $B^*_{2,(x,z,\rho)}$ until the ZK protocol begins. Therefore, from the properties of the ZK-simulator, it holds that the view of $P^*_{2,(x,z,\rho)}$ at the end of the simulation is computationally indistinguishable from its view in a real execution with V. It follows that the output of $B^*_{2,(x,z,\rho)}$ is a string s such that f(s) = s̃ with probability δ′ ≥ δ − negl(n).

To build the inverter for f, the next step is to slightly modify $B^*_{2,(x,z,\rho)}$: instead of computing the


Machine $B^{**}_{2,(x,z,\rho)}$: This machine is identical to $B^*_{2,(x,z,\rho)}$ except that at the start of stage 2 it sends (f, s̃, M̃_{λ,0^n}) where:

M̃_{λ,0^n} ← O(SimLock_{λ,0^n}; ζ)

always outputs 0^n on all inputs and λ is the transcript of stage 1.

We claim that $B^{**}_{2,(x,z,\rho)}$ outputs s such that f(s) = s̃ with probability δ″ ≥ δ′ − negl(n). To prove this, we construct a hybrid machine $H^*_2$. Machine $H^*_2$ violates the indistinguishability property of the obfuscator O with respect to the nice sampler $\mathsf{Samp}_{s,P^*_{1,(x,z,\rho)}}$ for every s ∈ {0,1}^n.

Machine $H^*_2$: The machine proceeds as follows:

1. It samples a random f ∈ F_n and s ∈ {0,1}^n. It sends s to the challenger, who feeds $H^*_2$ with a challenge (λ, M̃_b) where b is a random bit and:

(λ, SimLock_{λ,s}, SimLock_{λ,0^n}) ← $\mathsf{Samp}_{s,P^*_{1,(x,z,\rho)}}$

M̃_0 ← O(n, SimLock_{λ,s}),   M̃_1 ← O(n, SimLock_{λ,0^n})

Note that the state of P∗ at the end of stage 1 can be completely defined by specifying the transcript of stage 1 (and in particular, (h, r)). Let st_1 denote the state of the prover when the transcript is fixed to λ (sampled above).

2. Run the prover P∗ from the state st_1 and complete stage 2 as follows: send the tuple (f, s̃ = f(s), M̃_b) at the start of stage 2, and proceed exactly as $B^*_{2,(x,z,\rho)}$. I.e., use the simulator of the ZK protocol to complete stage 2 and output the contents of $P^*_{2,(x,z,\rho)}$’s private output tape.

Now observe that since the randomness ρ has been fixed, the state st_1 sampled by $\mathsf{Samp}_{s,P^*_{1,(x,z,\rho)}}$ is distributed identically to the state of $P^*_{x,z,\rho}$ at the end of stage 1. Further, when b = 0, the rest of the execution (and output) of $H^*_2$ is distributed identically to that of $B^*_{2,(x,z,\rho)}$, and when b = 1 it is identical to that of $B^{**}_{2,(x,z,\rho)}$. Due to lemma 4.2, $\mathsf{Samp}_{s,P^*_{1,(x,z,\rho)}}$ is a nice sampler. $H^*_2$ therefore violates the indistinguishability property of the obfuscator O unless |δ″ − δ′| ≤ negl(n).

Therefore, the machine $B^{**}_{2,(x,z,\rho)}$ outputs s with probability δ″. However, note that $B^{**}_{2,(x,z,\rho)}$ does not need to know the value of s, and executes perfectly even if (f, s̃) are given by an outside challenger. Therefore, $B^{**}_{2,(x,z,\rho)}$ is actually an inverter for f ∈ F_n, succeeding with probability at least δ − negl(n). It follows that δ must be negligible in n, establishing the claim.

We now come to the second part of the proof. Suppose that x ∉ L and the success probability of $P^*_{x,z,\rho}$ is not negligible. This means that there exists a polynomial q such that for infinitely many values of n, $P^*_{x,z,\rho}$ succeeds with probability δ_2(n) ≥ 1/q(n). We show how to build a prover $P^*_{2,(x,z,\rho)}$ which violates claim 5.3.

Let Ext_WIPOK be the knowledge extractor corresponding to the 3-round WIPOK. For concreteness, assume that Ext_WIPOK is a black-box extractor which uses the cheating prover $P^*_{\mathsf{WIPOK}}$ as an oracle. Let p := p(n) be the polynomial associated with the running time of the extractor Ext_WIPOK; i.e., Ext_WIPOK extracts the witness in expected time $p(n) / \Pr[P^*_{\mathsf{WIPOK}} \text{ succeeds}]$. The machine $P^{**}_{2,(x,z,\rho)}$ incorporates the original prover P∗


Cheating prover $P^{**}_{2,(x,z,\rho)}$:

1. Initiate an execution of Simple-cZK with an external V, routing its messages to the internal prover $P^*_2$, up to stage 2. Let st_2 be the state of $P^*_2$ at the end of stage 2. Denote the residual prover by $P^*_{st_2}$.

2. Stop the execution with the outside V after stage 2, and apply the extractor Ext_WIPOK to the machine $P^*_{st_2}$.

3. If the extractor halts within 4p/δ_2 steps, output whatever it outputs. Otherwise, output a random string of length n and halt.

Clearly, the running time of $P^{**}_{2,(x,z,\rho)}$ is at most poly(n) + 4p/δ_2 ≤ poly(n) + 4pq, which is polynomial. Further, by a standard averaging argument, it holds that if we fix the prover’s state to st_2, then with probability at least δ_2/2 over the sampling of st_2, the success probability of the residual prover $P^*_{st_2}$ is at least δ_2/2 in the remaining execution (i.e., the WIPOK). Call such states st_2 “good”.

For every good st_2, the expected running time of Ext_WIPOK is at most 2p/δ_2, and it outputs a valid witness (specifically the secret s, since x is false) with probability 1 − negl(n). Therefore, by Markov’s inequality, if st_2 is good, stopping Ext_WIPOK after 4p/δ_2 steps (twice the expectation) still produces s with probability at least (1 − negl(n))/2. Since st_2 is good with probability at least δ_2/2, we have that our extractor outputs s with probability at least δ_2/2 × 1/2 − negl(n) ≥ δ_2/8. This contradicts claim 5.3 unless δ_2 is negligible. This completes the proof of soundness.
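The Markov step and the final bound can be spelled out as follows. This is a sketch of the arithmetic, not the authors' own derivation: T denotes the running time of Ext_WIPOK on the residual prover, the third line uses a union bound (giving 1/2 − negl(n) rather than the (1 − negl(n))/2 quoted in the text), and the last inequality assumes n is large enough that negl(n) ≤ 1/4.

```latex
\begin{align*}
\mathbb{E}[T \mid st_2\ \text{good}]
  &\le \frac{p}{\delta_2/2} = \frac{2p}{\delta_2}, \\
\Pr\!\left[\, T > \frac{4p}{\delta_2} \;\middle|\; st_2\ \text{good} \right]
  &\le \frac{2p/\delta_2}{4p/\delta_2} = \frac{1}{2}, \\
\Pr[\text{truncated extractor outputs } s \mid st_2\ \text{good}]
  &\ge 1 - \frac{1}{2} - \mathsf{negl}(n) = \frac{1}{2} - \mathsf{negl}(n), \\
\Pr[\text{truncated extractor outputs } s]
  &\ge \frac{\delta_2}{2}\left(\frac{1}{2} - \mathsf{negl}(n)\right)
   \ \ge\ \frac{\delta_2}{2} \cdot \frac{1}{4} = \frac{\delta_2}{8}.
\end{align*}
```

Since δ_2 ≥ 1/q(n) for infinitely many n, the last quantity is at least 1/(8q(n)) on those n, which is the noticeable success probability that contradicts claim 5.3.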

Proof of knowledge property of Simple-cZK. We wish to note that the current construction may not satisfy the proof (argument) of knowledge property since for many states st_2, Ext_WIPOK might take too long. Nevertheless, by a simple and standard modification, it is possible to build a witness extended emulator [Lin01, BL04] for our protocol. This is usually sufficient for most applications of POK. Alternatively, we can use a WIPOK with strict polynomial time extraction, in which case protocol Simple-cZK also becomes a POK.

5.2 The Simulator

The simulator is described in two parts. First we describe an “internal simulator” S^⟨·⟩, which requires access to an oracle I that inverts injective one-way functions. This internal simulator is invoked by a “main” simulator S_main, which is described later.

Before jumping into the full description of our simulators, we make a few remarks to aid our description:

1. The internal simulator is essentially a “light weight twin” of the main simulator, meaning that it is identical to the main simulator in all respects except that it does not run the “heavy” computation for computing the simulation trapdoor (i.e., the secret s). The internal simulator simply makes queries to the inversion oracle, denoted I.

2. The main simulator therefore commits to its “light weight twin” as the program which will deterministically predict the string r in every session opened by V∗.


4. The above circularity can be avoided as follows. We allow the internal “twin” simulator to only have a commitment to each bit of the random tape. The twin can access the bit by making a query to the inversion oracle I. Since I only inverts (injective) one-way functions, we implement “commitment to the random tape” as hardcore bits of injective functions. The “committed” tape will be denoted by $\widetilde{\rho} = (\widetilde{\rho}_1, \ldots, \widetilde{\rho}_{poly(n)})$ and the i-th bit of the tape will be defined as:

$\rho_i = \mathrm{hcb}_g(g^{-1}(\widetilde{\rho}_i))$

where g is a global function fixed at the beginning by the main simulator. It will be convenient to define a procedure RandomBit which computes the bit $\rho_i$ as defined here.

5. We note that only the randomness that is used to “interact” with V∗ needs to be the same for both simulators. The main simulator can have additional randomness, which is not available to its twin, for other tasks.

We now describe the twin simulator S to be used internally by the main simulator. Without loss of generality, we assume that the string $1^n$ is never chosen as a session identifier for any session; it will be used as a special trigger during execution of the simulator. We will use the words “internal simulator” and “twin simulator” interchangeably to mean the simulator $S^{\langle\cdot\rangle}$ described below. For concreteness, we recall the inversion oracle I here and fix a procedure $\mathrm{RandomBit}^I_{g,\widetilde{\rho}}$ as follows:

Inversion oracle I. The inversion oracle I(·) takes queries of the form $(f, \widetilde{s})$ where $f \in F_n$ is an injective function from the family of (certified) injective functions $\{F_n\}$, and $\widetilde{s}$ is an element in the range of f. The oracle returns the (unique) value $s = f^{-1}(\widetilde{s})$ if one exists; otherwise it returns a special symbol ⊥.

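Procedure $\mathrm{RandomBit}^I_{g,\widetilde{\rho}}(k)$. The procedure is defined for an injective function $g \in F_n$ and a string $\widetilde{\rho} = (\widetilde{\rho}_1, \widetilde{\rho}_2, \ldots)$ of arbitrary length such that every component $\widetilde{\rho}_k$ is in the range of g. On input an integer k, the procedure returns:

$\rho_k = \mathrm{hcb}_g(\tilde{\rho}_k)$ where $\tilde{\rho}_k \leftarrow I(g, \widetilde{\rho}_k)$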
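The interplay between the inversion oracle and RandomBit can be made concrete with a small Python sketch. Everything here is a toy assumption chosen for illustration: the function `g` (exponentiation modulo the small prime 257) is injective on its domain but certainly not one-way, `hcb` is a placeholder predicate rather than a genuine hardcore bit, and the oracle is simulated by brute-force search over a tiny domain. The names `inversion_oracle`, `random_bit` and `commit_tape` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

P = 257                  # small prime; toy parameter, not from the paper
DOMAIN = range(256)      # tiny domain so the "oracle" can brute-force


def g(x: int) -> int:
    """Toy injective function x -> 3^x mod 257 (3 is a primitive root mod 257).

    This is NOT one-way at this size; it only models injectivity.
    """
    return pow(3, x, P)


def hcb(x: int) -> int:
    """Placeholder for the hardcore bit of g: parity of the bits of x."""
    return bin(x).count("1") & 1


def inversion_oracle(f: Callable[[int], int], y: int) -> Optional[int]:
    """Return the unique preimage of y under f, or None (playing the role of ⊥)."""
    for x in DOMAIN:
        if f(x) == y:
            return x
    return None


def random_bit(f: Callable[[int], int],
               committed_tape: Sequence[int], k: int) -> int:
    """Return the k-th bit (1-indexed) of the implicit random tape."""
    preimage = inversion_oracle(f, committed_tape[k - 1])
    if preimage is None:
        raise ValueError("tape component is not in the range of f")
    return hcb(preimage)


def commit_tape(f: Callable[[int], int], preimages: Sequence[int]) -> List[int]:
    """Build a committed tape by applying f to each chosen preimage."""
    return [f(x) for x in preimages]


# The committed tape determines the bits, but reading them needs the oracle.
tape = commit_tape(g, [5, 200, 17])
bits = [random_bit(g, tape, k) for k in (1, 2, 3)]
```

The point of the design is visible even in the toy version: whoever picks the preimages knows the bits directly, while a party holding only the committed tape must call the oracle once per bit.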

Twin simulator $S^I(i, A^{\langle\cdot\rangle}, B, z, g, \widetilde{\rho})$. Algorithm S is an oracle TM. The input to the algorithm consists of a “session identifier” $i \in \{0,1\}^n$, a string A interpreted as an (interactive) oracle TM, a string B interpreted as an interactive TM, a string $z \in \{0,1\}^*$ interpreted as an advice string, a string g describing an injective one-way function, and a sufficiently long string $\widetilde{\rho} = (\widetilde{\rho}_1, \widetilde{\rho}_2, \ldots) \in \{0,1\}^*$ such that every component $\widetilde{\rho}_i \in \mathrm{Range}(g)$.

Simulator S has access to an inversion oracle I(·) as defined above. Inputs $(g, \widetilde{\rho})$ fix the implicit random tape of S, which is the unique bit-string $\rho = (\rho_1, \rho_2, \ldots)$ where bit $\rho_k = \mathrm{RandomBit}^I_{g,\widetilde{\rho}}(k)$.¹¹ If, during its execution, S needs to access $\rho_k$ it calls $\mathrm{RandomBit}_{g,\widetilde{\rho}}(k)$.

The simulator $S^I$ computes its output as described below. It writes z on the auxiliary input tape of B, and a sufficiently long random string (taken from ρ) on the random tape of B. It then initiates an execution of B. If TM B launches a concurrent attack w.r.t. the protocol Simple-cZK, S proceeds as described below. Otherwise, if B deviates from the concurrent attack, S aborts the execution, outputting a special symbol ⊥.

11