Homomorphic Signatures for Polynomial Functions

(1)

Homomorphic Signatures for Polynomial Functions

DANBONEH∗ Stanford University, USA [email protected]

DAVIDMANDELLFREEMAN† Stanford University, USA [email protected]

April 5, 2011

An extended abstract of this work appears inAdvances in Cryptology — EUROCRYPT 2011, ed. K. Paterson, Springer LNCS6632(2011), 149–168. This is the full version.

Abstract

We construct the first homomorphic signature scheme that is capable of evaluating multivariate polynomials on signed data. Given the public key and a signed data set, there is an efficient algorithm to produce a signature on the mean, standard deviation, and other statistics of the signed data. Previous systems for computing on signed data could only handle linear operations. For polynomials of constant degree, the length of a derived signature only depends logarithmically on the size of the data set.

Our system uses ideal lattices in a way that is a “signature analogue” of Gentry’s fully homomorphic encryption. Security is based on hard problems on ideal lattices similar to those in Gentry’s system.

Keywords.Homomorphic signatures, ideals, lattices.

1 Introduction

While recent groundbreaking work has shown how to compute arbitrary functions onencrypteddata [18,39, 45], far less is known about computing functions onsigneddata.

Informally, the problem of computing on signed data is as follows. Alice has a numerical data set m1, . . . , mk of size k(e.g., final grades in a course withk students). She independently signs each da-tummi, but before signing she augmentsmiwith a tag and an index. More precisely, Alice signs the triple (“grades”, mi, i)fori= 1, . . . , kand obtainskindependent signaturesσ1, . . . , σk. Hereiis the index ofmi in the data set and the tag “grades” serves as a label that names the data set and binds its members together. For convenience we write~σ:= (σ1, . . . , σk). The data set and theksignatures are stored on some untrusted remote server.

Later, the server is asked to compute authenticated functions of the data, such as the mean or standard deviation of subsets of the data. To compute a functionf, the server uses an algorithmEvaluate(pk,·, f, ~σ) that uses~σandf to derive a signatureσon the triple

“grades”, m:=f(m1, . . . , mk), hfi

, (1.1)

wherehfi is an encoding of the functionf, i.e., a string that uniquely describes the function. Note that

Evaluatedoes not need the original messages — it only acts on signatures. Now the pair(m, σ)can be published and anyone can check that the server correctly appliedf to the data set by verifying that σ is a signature on the triple (1.1). The derived signature authenticates both the functionf and the result of

∗

Supported by NSF, DARPA, and AFOSR. †

(2)

applyingf to the data. The pair(m, σ)can be used further to derive signatures on functions ofmand other signed data. We give precise definitions of the system’s syntax and security below.

Our focus here is on functions that perform arithmetic operations on the data set, such as mean, standard deviation, and other data mining algorithms. Current methods for computing on signed data can handle only linearfunctions [26,12,48,8,17,7]. In these systems, givenkindependently signed vectorsv1, . . . ,vk defined over some finite fieldFp, anyone can compute a signature on any vectorvin theFp-linear span of {v1, . . . ,vk}, and no one without the secret key can compute a valid signature on a vectorvoutside this span. The original motivation for these linear schemes comes from thenetwork codingrouting mechanism [15].

In this paper we present the first signature system that supports computingpolynomial functionson signed data. Specifically, our system supports multivariate polynomials of bounded degree. For a constant-degree polynomial with bounded coefficients, the length of a derived signature only depends logarithmically on the size of the data set. Thus, for example, given a signed data set as input, anyone can compute a short signature on the mean, standard deviation, least squares fit, and other functions of arbitrary subsets of the data. Note that computing standard deviation requires only a quadratic multivariate polynomial; other applications, discussed in Section2.4, may require cubic or higher degree polynomials. While our system intrinsically computes on data defined over a finite fieldFp, it can be used to compute on data defined over the integers by

choosing a sufficiently large field sizep.

Our system’s functionality and security derive from properties of certaininteger lattices. As a “warm-up” to our main result, we describe in Section4 a linearly homomorphic scheme built from random integer lattices that uses the same underlying ideas as our polynomial scheme. Interestingly, this construction gives a homomorphic system overF2 that allows linear functions of many more inputs than the best previous such

system [7]. In Section6we show how replacing the random lattices in the linear scheme withideal lattices leads to a polynomially homomorphic scheme — our main result. In Section7 we describe an abstract signature scheme that incorporates the properties of both of these instantiations. We prove the security of our abstract scheme and show how our proof specializes to the concrete instantiations.

We note that a trivial solution to computing on signed data is to have the server send the entire data set to the client along with all the signatures and have the client compute the function itself. With our constructions only the output of the function is sent to the client along with a short signature. Beyond saving bandwidth, this approach also limits the amount of information revealed to the client about the data set, as formalized in Section2.2.

Related work. Before delving into the details of our construction, we mention the related work on non-interactive proofs [29,44,22] where the prover’s goal is to output a certificate that convinces the verifier that a certain statement is correct. Micali’s computationally sound (CS) proofs [29] with Valiant’s extension to proofs of knowledge [44] can solve the problem discussed above as follows: Alice signs the pair(τ, D) whereDis a data set andτ is a short tag used to nameD. She sendsD, τ and the signatureσto the server. Later, for some functionf, the server publishes τ, t:=f(D), πwhereπis a short proof of knowledge that there exists a data setDand signature σsuch thatt = f(D)andσ is a valid signature by Alice on (τ, D). This tuple convinces anyone thattis the result of applyingf to the original data setDlabeledτ and signed by Alice. Security is proved using Valiant’s witness extractor [44] to extract a signature forgery from a cheating server. The construction ofπuses the full machinery of the PCP theorem and soundness is in the random oracle model. Gentry and Wichs [21] recently showed that the security of such succinct argument systems cannot be proved in the standard model by a black-box reduction to a falsifiable assumption. We note that computational soundness is sufficient in these settings since the server is given signed data and is therefore already assumed to be computationally bounded.

(3)

same amount of work as computingf(D). Moreover, anyone can further computet0 :=g(t) =g(f(D))for some functiongand useσ0to derive a signature ont0and the functiong(f(·)). While further computation can also be done with CS proofs [44], it is much simpler with homomorphic signatures.

More recently Goldwasser, Kalai, and Rothblum [22] and Gennaro, Gentry, and Parno [16] as well as [13,4] show how to outsource computation securely. In [16,13,4] and in [22] (in the non-interactive setting) the interaction between the server and the client is tailored to the client and the client uses a secret key to verify the results. In our settings the server constructs a publicly verifiable signature on the result and anyone can verify that signature using Alice’s public key.

We also mention another line of related work that studies “redactable” signatures [42,25,23,5,34,33, 11,10,9,1,36]. These schemes have the property that given a signature on a message, anyone can derive signatures on subsets of the message. Our focus here is quite different — we look at computing arithmetic functions on authenticated data, rather than computing on a subset of a single message. We also require that the derived signature explicitly authenticate the computed functionf.

Another type of signature that admits computation by a third party istransitive signatures, introduced by Rivest and Micali [30,6,24,47,46]. Transitive signatures are assigned to edges of a graph and enable anyone to derive signatures on paths in the graph.

1.1 Overview of our techniques

The intersection method. Our system uses twon-dimensional integer latticesΛ1andΛ2. The latticeΛ1

is used to sign the data (e.g., a student’s grade or the result of a computation), while the latticeΛ2is used to

sign a description of the functionf applied to the data. The message space for these signatures isZn/Λ1,

which for the lattices we consider is simply a vector space over the finite fieldFpfor some primep.

A signature in our system is a short vectorσinZnin the intersection ofΛ1+u1andΛ2+u2for certain

u1,u2 ∈Zn. In other words, we haveσ =u1 mod Λ1andσ =u2 mod Λ2. Loosely speaking, this single

signatureσ“binds”u1 andu2— an attacker cannot generate a newshortvectorσ0fromσsuch thatσ =σ0

mod Λ1 but σ 6= σ0 mod Λ2. We refer to this method of jointly signing two vectorsu1 and u2 as the

intersection method.

More precisely, letτ be a tag,mbe a message, andhfibe an encoding of a functionf. A signatureσ on a triple(τ, m,hfi)is a short vector inZnsatisfyingσ =mmod Λ1andσ = ωτ(hfi) mod Λ2. Here

ωτ is a hash function defined by the tagτ that maps (encodings of) functions to vectors inZn/Λ2. Thisωτ not only must preserve the homomorphic properties of the system, but also must enable simulation against a chosen-message adversary. Note that theΛ1 component of the signatureσ is essentially the same as a

Gentry-Peikert-Vaikuntanathan signature [20] on the (unhashed) messagem.

It is not difficult to see that these signatures areadditivelyhomomorphic. That is, letσ1be a signature on

(τ, m1,hf1i)and letσ2be a signature on(τ, m2,hf2i). With an appropriate hash functionωτ, we can ensure thatσ1+σ2 is a signature on τ, m1+m2, hf1+f2i

. If we setΛ1= (2Z)n, we obtain a more efficient

linearly homomorphic signature overF2than previously known [7].

Now letg∈_Z[x]be a polynomial of degreenand letRbe the ringZ[x]/(g). ThenRis isomorphic to Znand ideals inRcorrespond to integer lattices inZnunder the “coefficient embedding.” We choose our

two latticesΛ1andΛ2to be prime idealspandqinRand a signature on the triple(τ, m, f)to be a short

element inRsuch thatσ=mmodpandσ =ωτ(hfi) modq. With this setup, letσ1andσ2be signatures

on(τ, m1,hf1i)and(τ, m2,hf2i)respectively. Then for an appropriate hash functionωτ,

σ1+σ2 is a signature on τ, m1+m2, hf1+f2i

and σ1·σ2 is a signature on τ, m1·m2, hf1·f2i

.

(4)

particular, the quadratic polynomialv(m1, . . . , mk) := Pki=1(kmi −Pki=1mi)2 that computes a fixed multiple of the variance can easily be evaluated this way. Anyone can calculate the standard deviation from v(m1, . . . , mk)andkby taking a square root and dividing byk.

Our use of ideal lattices is a signature analogue of Gentry’s “somewhat homomorphic” encryption system [18]. Ideal lattices also appear in the lattice-based public key encryption schemes of Stehl´e, Ste-infeld, Tanaka, and Xagawa [41] and Lyubashevsky, Peikert, and Regev [28] and in the hash functions of Lyubashevsky and Micciancio [27].

Unforgeability. Loosely speaking, a forgery under a chosen message attack is a valid signature σ on a triple(τ, m,hfi)such thatm 6= f(m1, . . . , mk), wherem1, . . . , mk is the data set signed using tag τ. We show that a successful forger can be used to solve the Small Integer Solution (SIS) problem in the latticeΛ2, which for randomq-ary lattices and suitable parameters is as hard as standard worst-case lattice

problems [32]. WhenΛ2 is an ideal lattice we can then use the ideal structure to obtain a solution to the

Shortest Independent Vectors Problem (SIVP) for the (average case) distribution of lattices produced by our key generation algorithm. As is the case with existing linearly homomorphic signature schemes, our security proofs are set in the random oracle model, where the random oracle is used to simulate signatures for a chosen message attacker.

Privacy. For some applications it is desirable that derived signatures be private. That is, ifσis a signature on a messagem:=f(m1, . . . , mk)derived from signatures on messagesm1, . . . , mk, thenσshould reveal no information aboutm1, . . . , mkbeyond what is revealed bymandf. Using similar techniques to those in [7], it is not difficult to show that our linearly homomorphic signatures satisfy a privacy property (also defined in [7]) calledweak context hiding. Demonstrating this property amounts to proving that the distribution obtained by summing independent discrete Gaussians depends only on the coset of the sum.

Interestingly, we can show that our polynomially homomorphic signature is not private. It is an open problem either to design a polynomially homomorphic signature scheme that is also private, or to modify our scheme to make it private.

Length efficiency. We require that derived signatures be not much longer than the original signatures from which they were derived; we define this requirement precisely in Section2.3. All of our constructions are length efficient.

2 Homomorphic Signatures: Definitions and Applications

Informally, a homomorphic signature scheme consists of the usual algorithmsKeyGen,Sign,Verifyas well as an additional algorithmEvaluatethat “translates” functions on messages to functions on signatures. If~σis a valid set of signatures on messagesm~, thenEvaluate(f, ~σ)should be a valid signature forf(m~).

To prevent mixing of data from different data sets when evaluating functions, the Sign, Verify, and

Evaluatealgorithms take an additional short “tag” as input. The tag serves to bind together messages from the same data set. One could avoid the tag by requiring that a new public key be generated for each data set, but simply requiring a new tag for each data set is more convenient.

Formally, a homomorphic signature scheme is as follows:

(5)

• Setup(1n, k). Takes a security parameternand a maximum data set sizek. Outputs a public keypk

and a secret keysk. The public keypkdefines a message spaceM, a signature spaceΣ, and a setF of “admissible” functionsf:Mk_{→ M.}

• Sign(sk, τ, m, i). Takes a secret key sk, a tag τ ∈ {0,1}n_{, a message} _m _{∈ M}_{and an index}_i _∈ {1, . . . , k}, and outputs a signatureσ ∈Σ.

• Verify(pk, τ, m, σ, f). Takes a public keypk, a tagτ ∈ {0,1}n_{, a message}_m_{∈ M, a signature}_σ _∈_Σ, and a functionf ∈ F, and outputs either 0 (reject) or 1 (accept).

• Evaluate(pk, τ, f, ~σ). Takes a public key pk, a tagτ ∈ {0,1}n_{, a function}_f _{∈ F}_{, and a tuple of} signatures~σ ∈Σk_{, and outputs a signature}_σ0 _∈_Σ.

Letπi:Mk→ Mbe the functionπi(m1, . . . , mk) =mithat projects onto theith component. We require thatπ1, . . . , πk∈ F for allpkoutput bySetup(1n, k).

For correctness, we require that for each(pk,sk)output bySetup(1n, k), we have:

1. For all tagsτ ∈ {0,1}n_{, all}_m_{∈ M, and all}_i_{∈ {1}_{, . . . , k}_}, ifσ←Sign(sk, τ, m, i), then with overwhelming probability

Verify(pk, τ, m, σ, πi) = 1.

2. For all τ ∈ {0,1}n_{, all tuples} _m_~ _{= (}_m

1, . . . , mk) ∈ Mk, and all functions f ∈ F, if σi ←

Sign(sk, τ, mi, i)fori= 1, . . . , k, then with overwhelming probability

Verify pk, τ, f(m~), Evaluate pk, τ, f,(σ1, . . . , σk)

, f= 1.

We say that a signature scheme as above isF-homomorphic, orhomomorphic with respect toF.

While theEvaluatealgorithm in our schemes can take as input derived signatures themselves produced byEvaluate, doing so for a large number of iterations may eventually reach a point where the input signatures toEvaluateare valid, but the output signature is not. Therefore, to simplify the discussion we limit the correctness property to require only thatEvaluateproduce valid output when given as input signatures~σ produced by theSignalgorithm.

For ease of exposition we describe our systems as if all data sets consist of exactly k items. It is straightforward to apply the systems to data sets of size`for any`≤k, simply by interpreting a function on `variables as a function onkvariables that ignores the lastk−`inputs. The definitions of unforgeability and privacy below can be adapted accordingly.

2.1 Unforgeability

The security model for homomorphic signatures allows an adversary to make adaptive signature queries on data sets of his choosing, each containing (up to)kmessages, with the signer randomly choosing the tagτ for each data set queried. Eventually the adversary produces a message-signature pair(m∗, σ∗)as well as an admissible functionf and a tagτ∗. The winning condition captures the fact that there are two distinct types of forgeries. In atype 1 forgery, the pair(m∗, σ∗)verifies for some data setnotqueried to the signer; this corresponds to the usual notion of signature forgery. In atype 2 forgery, the pair(m∗, σ∗)verifies for some data set thatwasqueried to the signer, but for whichm∗does not equalf applied to the messages queried; in other words, the signature authenticatesm∗asf(m~)but in fact this is not the case.

(6)

Definition 2.2. A homomorphic signature schemeS = (Setup,Sign,Verify,Evaluate)isunforgeableif for allkthe advantage of any probabilistic, polynomial-time adversaryAin the following game is negligible in the security parametern:

Setup: The challenger runs Setup(1n_{, k}₎ _{to obtain}₍_pk_,_sk₎_{and gives}_pk _to_A_{. The public key defines a} message spaceM, a signature spaceΣ, and a setFof admissible functionsf:Mk_{→ M.}

Queries:Proceeding adaptively,Aspecifies a sequence of data setsm~i ∈ Mk. For eachi, the challenger choosesτiuniformly from{0,1}nand gives toAthe tagτiand the signaturesσij ←Sign(sk, τi, mij, j)for j= 1, . . . , k.

Output:Aoutputs a tagτ∗ ∈ {0,1}n_{, a message}_m∗ _{∈ M}_{, a function}_f _{∈ F}_{, and a signature}_σ∗_∈_Σ. The adversarywinsifVerify(pk, τ∗, m∗, σ∗, f) = 1and either

(1) τ∗ 6=τifor alli (atype 1 forgery), or

(2) τ∗ =τifor someibutm∗6=f(m~i) (atype 2 forgery). TheadvantageofAis the probability thatAwins the security game.

2.2 Privacy

Privacy in our setting means that given signatures on a data setm~ ∈ Mk_{, the derived signatures on messages} f1(m~), . . . , fs(m~)do not leak any information aboutm~ beyond what is revealed byf1(m~), . . . , fs(m~)for some functionsf1, . . . , fsknown to the attacker. While we are hiding the original data setm~, we are not hiding the fact the derivation took place. We want the verifier to test that the correct function was applied to the original data.

More precisely, as in [7] we define privacy for homomorphic signatures using a variation of a definition of Brzuska et al. [10]. The definition captures the idea that given signatures on a number of messages derived from two different data sets, the attacker cannot tell which data set the derived signatures came from, and furthermore that this property holds even if the secret key is leaked. We call signatures with this privacy propertyweakly context hiding. The reason for “weak” is that we assume the original signatures on the data set are not public.

The termcontext hidingwas first used by Ahn et al. [1] to define a strong notion of privacy that requires derived signatures to be distributed as independent fresh signatures on the same message. This requirement ensures privacy even if the original signatures are exposed. We call the privacy concept defined belowweak context hidingto clarify that it provides weaker privacy than full context hiding defined in [1].

Definition 2.3. A homomorphic signature schemeS = (Setup,Sign,Verify,Evaluate) isweakly context hidingif for allk, the advantage of any probabilistic, polynomial-time adversaryAin the following game is negligible in the security parametern:

Setup: The challenger runsSetup(1n, k)to obtain(pk,sk)and givespkandsktoA. The public key defines a message spaceM, a signature spaceΣ, and a setF of admissible functionsf:Mk_{→ M.}

Challenge: Aoutputs(m~∗₀, ~m∗₁, f1, . . . , fs) withm~∗0, ~m∗1 ∈ Mk. The functionsf1, . . . , fs are inF and satisfy

fi m~∗0

=fi m~∗1

for alli= 1, . . . , s.

(7)

computes a signatureσi :=Evaluate(pk, τ, fi, ~σ)onfi(m~∗b). It sends the tagτ and the signaturesσ1, . . . , σs toA. Note that the functionsf1, . . . , fscan be output adaptively afterm~∗0, ~m∗1are output.

Output:Aoutputs a bitb0.

The adversaryAwins the game ifb=b0. TheadvantageofAis the probability thatAwins the game. Winning the weak context hiding game means that the attacker was able to determine whether the challenge signatures were derived from signatures onm~∗₀or from signatures onm~∗₁. We say that the signature scheme iss-weakly context hidingif the attacker cannot win the privacy game after seeing at mostssignatures derived fromm~∗₀orm~∗₁.

2.3 Length efficiency

We say that a homomorphic signature scheme islength efficientif for a fixed security parametern, the length of derived signatures depends only logarithmically on the sizekof the data set. More precisely, we have the following:

Definition 2.4. LetS= (Setup,Sign,Verify,Evaluate)be a homomorphic signature scheme. We say that Sislength efficientif there is some functionµ:N→Rsuch that for all(pk,sk)output bySetup(1n, k), all

~

m= (m1, . . . , mk)∈ Mk, all tagsτ ∈ {0,1}n, and all functionsf ∈ F, if

σi ←Sign(pk, τ, mi, i) fori= 1, . . . , k,

then for allk >0, the derived signatureσ:=Evaluate pk, τ, f,(σ1, . . . , σk)

has bit length

len(σ)≤µ(n)·logk

with overwhelming probability1

A trivial scheme. A trivial construction is one where algorithmEvaluateoutputs the functionf along with all of the message-signature pairs in the source data set.2 The recipient can then evaluate the functionf for himself from the originally signed data. Unlike our constructions, this scheme is not length efficient since the length of derived signatures islinearin the data set size. Moreover, the scheme is not context hiding since a derived signature reveals everything about the original data set.

2.4 Applications

Before describing our constructions we first examine a few applications of computing on signed data. In the Introduction we discussed applications to computing statistics on signed data, and in particular the examples of mean and standard deviation. Here we discuss more complex data mining algorithms.

Least squares fits. Recall that given an integerd≥0and a data set{(xi, yi)}ki=1consisting ofkpairs of

real numbers, thedegreedleast squares fitis a polynomialf ∈_R[x]of degreedthat minimizes the quantity Pk

i=1(yi−f(xi))2. The vector of coefficients off is denoted byf~and is given by the formula ~

f = (XTX)−1XT~y∈_Rd+1,

1

See Section3for a technical definition of “overwhelming.”

2_{Strictly speaking,}_Evaluate_{is not given the source messages as input. However, the signing algorithm in a trivial construction}

(8)

whereX∈_Rk×(d+1)_{is the}_{Vandermonde matrix}_{of the}_x

i, whosejth column is the vector(xj−1 1, . . . , x

j−1

k ), and~yis the column vector(y1, . . . , yk). The degreedis usually small, e.g.d= 1for a least squares fit with a line.

Using homomorphic signatures, a server can be given a set of individually signed data points and derive from it a signature on the least squares fitf (or more precisely, a signature on the vector of coefficients

~

f). If the signature is length efficient, then for fixeddthe length of the derived signature depends only logarithmically on the number of data pointsk. If the signatures are private, then the derived signature onf reveals nothing about the original data set beyond what is revealed byf.

We consider two types of data sets. In the first type, thex-coordinates are universal constants inZand

need not be signed. For example, the data set might contain the temperature on each day of the year, in which case thex-coordinates are simply the days of the year and need not be explicitly included in the data. Only they-coordinates are signed. Then the least squares fit is simply a linear function of~y, namelyf~:=A·~y for some fixed matrixA. The signature onf~can thus be derived using anylinearlyhomomorphic signature scheme. One complication is that our schemes sign integer values, while the matrixAwill in general contain non-integer values even if the data are pairs of integers. However, if we factor out the denominator and write A = A0/cfor some integer matrixA0 and scalarc, then the linearly homomorphic signature enables the server to derive a signature onA0~y, and anyone can divideA0~ybycto obtainf~.

The second type of data set is one in which both thex-coordinate and they-coordinate are signed. More precisely, an integer data set{(xi, yi)}ki=1is signed by signing all2kvalues separately (but with the same

tag) to obtain2ksignatures. The server is given the data set and these2ksignatures. Using a homomorphic signature scheme for polynomials of degreed2+d+ 1over the integers, the server can derive a signature on the least squares fit as follows. First, the server derives signatures on the entries of the integer matrix B :=XTX∈Z(d+1)×(d+1). A straightforward computation shows that the degrees of the entries ofBin

the variablesxiare

deg(Bij) = 

   

0 1 · · · d 1 2 · · · d+ 1

..

. ... . .. ... d d+ 1 · · · 2d

     , (2.1)

Now letc:=det(B). It follows from (2.1) that each term in the computation ofchas degreed(d+ 1)in the variablesxi, and thus the server can compute a signature oncif the signature scheme is homomorphic for polynomials of degreed2+d.

Next, the matrixcB−1, known as theadjugateofB [40, p. 117], is an integer matrix whose(i, j)th entry is(−1)i+j_{times the determinant of the}₍_{j, i}_{)th minor of}_B_{. The degrees of the entries of}_cB−1_{in the}

variablesxiare thus

deg((cB−1)ij) = 

   

d2+d d2+d−1 · · · d2 d2+d−1 d2+d−2 · · · d2−1

..

. ... . .. ...

d2 d2 −1 · · · d2−d      .

Since thejth row ofXT has degreej−1in thexi, we see that the entries of the matrixcB−1XT have degree at mostd2+din thexi. Thus if the signature scheme is homomorphic for polynomials of degree d2+d+ 1, the server can compute a signature on each entry of the integer vectorv:=cB−1XT~y∈_Zd+1_.

The signature on the least squares fit consists of the signature oncand the signatures on the entries ofv. Anyone can then computef~:=v/cto obtain the least squares fitf~. The server ends up publishing a vector v∈_Zd+1_{and a scalar}_c_∈

Zalong with thesed+ 2signatures. In particular, for a least squares fit using a

(9)

on three integer scalars. Note that this method is not private since it revealsc, which is more information about the original data set than is revealed byf~alone.

In summary, linearly homomorphic signatures are sufficient when thex-coordinates are absolute constants and polynomially homomorphic signatures are needed for general data sets.

More advanced data mining. If we had fully homomorphic signatures (i.e., supporting arbitrary com-putation on signed data), then an untrusted server could run more complex data mining algorithms on the given data set. For example, given a signed data set, the server could publish a signed decision tree (e.g., as generated by the ID3 algorithm [37]). Length efficiency means that the length of the resulting signature depends only logarithmically on the size of the data set. If the signatures were private, then publishing the signed decision tree would leak no other information about the original data set.

3 Preliminaries

Notation. For any integerq ≥2, we letZqdenote the ring of integers moduloq. Ifqis prime,Zqis a field and is denoted byFq. We letZ`×nq denote the set of`×nmatrices with entries inZq. We say a function f(n)isnegligibleif it isO(n−c)for allc >0, and we usenegl(n)to denote a negligible function ofn. We sayf(n)ispolynomialif it isO(nc)for somec >0, and we usepoly(n)to denote a polynomial function of n. We say an event occurs withoverwhelming probabilityif its probability is1−negl(n). The functionlgx is the base 2 logarithm ofx.

Lattices. Ann-dimensional latticeis a full-rank discrete subgroup ofRn. We will often be interested in integer lattices, i.e., those whose points have coordinates inZn. Among these lattices are the “q-ary” lattices

defined as follows: for any integerq ≥2and anyA∈_Z`×n_q , we define3

Λ⊥_q(A) :=

e∈_Zn_:_A_·_e₌₀_mod_q

Λu_q(A) := e∈_Zn:A·e=umodq .

The latticeΛu_q(A)is a coset ofΛ⊥_q(A); namely,Λu_q(A) = Λ⊥_q(A) +tfor anytsuch thatA·t=umodq. The Gram-Schmidt norm of a basis. Let S be a set of vectorsS = {s1, . . . ,sk} inRm. We use the

following standard notation:

• kSkdenotes the length of the longest vector inS, i.e.,max1≤i≤kksik.

• Se:={˜s₁, . . . ,˜s_k} ⊂Rmdenotes the Gram-Schmidt orthogonalization of the vectorss₁, . . . ,sk.

We refer tokeSkas theGram-Schmidt normofS. Micciancio and Goldwasser [31] showed that a full-rank set Sin a latticeΛcan be converted into a basisTforΛwith an equally low Gram-Schmidt norm:

Lemma 3.1([31, Lemma 7.1]). LetΛbe anm-dimensional lattice. There is a deterministic polynomial-time algorithm that, given an arbitrary basis ofΛand a full-rank setS={s1, . . . ,sm}inΛ, returns a basisTof Λsatisfying

kTek ≤ keSk and kTk ≤ kSk √

m/2.

3

(10)

Ajtai [2] and later Alwen and Peikert [3] showed how to sample an essentially uniform matrixA∈_Zn×m q along with a basisSofΛ⊥_q(A)with low Gram-Schmidt norm.

Theorem 3.2([3, Theorem 3.2] withδ = 1/3). Letq, `, nbe positive integers withq ≥2andn≥6`lgq. There is a probabilistic polynomial-time algorithmTrapGen(q, `, n)that outputs a pair(A∈_Z`×n

q , S ∈

Zn×n)such thatAis statistically close to a uniform inZ`×nq andSis a basis forΛ⊥q(A), satisfying keSk ≤O(p`logq) and kSk ≤O(`logq)

with all but negligible probability in`.

Gaussian distributions. Let Lbe a subset of Zn. For any vectorc ∈ Rn and any positive parameter

σ∈_R_>₀, letρσ,c(x) := exp −πkx−ck2/σ2

be a Gaussian function onRnwith centercand parameterσ.

Letρσ,c(L) :=Px∈Lρσ,c(x)be the discrete integral ofρσ,coverL, and letDL,σ,cbe the discrete Gaussian

distribution overLwith centercand parameterσ. In particular, for ally∈L, we haveDL,σ,c(y) = ρ_ρσ,_σ,_cc₍(_Ly)₎.

For notational convenience,ρσ,0andDL,σ,0are abbreviated asρσandDL,σ, respectively.

Gentry, Peikert, and Vaikuntanathan [20] construct algorithms for sampling from discrete Gaussians:

Theorem 3.3.

(a) [20, Theorem 4.1] There is a probabilistic polynomial-time algorithmSampleGaussianthat, given a basisTof ann-dimensional latticeΛ, a parameterσ ≥ keTk ·ω(√logn), and a centerc ∈_Rn_, outputs a sample from a distribution that is statistically close toDΛ,σ,c.

(b) [20, Theorem 5.9] There is a probabilistic polynomial-time algorithmSamplePrethat, given a basis

Tof ann-dimensional latticeΛ, a parameterσ≥ kTek ·ω( √

logn), and a vectort∈_Rn_{, outputs a} sample from a distribution that is statistically close toDΛ+t,σ.

Algorithm SamplePre(A,T,t, σ) simply calls SampleGaussian(T, σ,−t) and adds t to the result. For Gaussians centered at the origin, we useSampleGaussian(T, σ)to denoteSampleGaussian(T, σ,0). We use the notationSampleGaussian(Zn, σ)to denote sampling from the latticeZnwith a basis consisting of

themunit vectors.

The smoothing parameter. For ann-dimensional latticeΛand positive real >0, thesmoothing parame-terη(Λ)ofΛis defined to be the smallest positivessuch thatρ1/s(Λ∗\ {0})≤[32]. The key property of the smoothing parameter is that ifσ > η(Λ), then every coset ofΛhas roughly equal mass under the distributionDΛ,σ,c. The following lemma bounds the smoothing parameter:

Lemma 3.4([20, Lemma 3.1]). For anyn-dimensional latticeΛand any basisBofΛ, there is a negligible

(n)for whichη(Λ)≤ kBek ·ω( √

logn).

For randomq-ary latticesΛ⊥_q(A)(withqprime) there is a much better bound:

Lemma 3.5([20, Lemma 5.3]). Letqbe a prime and`, nbe integers such thatn≥2`lgq. Letf be some

ω(√logn)function. Then there is a negligible function(n)such that for all but at most aq−`fraction ofA

inZ`×nq we haveη(Λq⊥(A))< f(n).

The following lemma gives a bound on the length of vectors sampled from a discrete Gaussian. The result follows from [32, Lemma 4.4], using Lemma3.4to bound the smoothing parameter.

Lemma 3.6 ([32, Lemma 4.4]). Let Λbe ann-dimensional lattice, let Tbe a basis forΛ, and suppose

σ≥ kTek ·ω( √

logn). Then for anyc∈_Rn_{we have}

Prkx−ck> σ√n:x← DR _Λ_,σ,_c

(11)

Sums of discrete Gaussians. The privacy property of our linearly homomorphic scheme will depend on the fact that the distribution of a linear combination of samples from a discrete Gaussian is itself a discrete Gaussian. The following theorem makes this statement precise.

Theorem 3.7([7, Theorem 4.13]). LetΛ⊆Zmbe a lattice andσ∈R. Fori= 1, . . . , kletti ∈Zmand let

Xibe mutually independent random variables sampled fromDΛ+ti,σ. Letc= (c1, . . . , ck)∈Zk, and define

g:= gcd(c1, . . . , ck), t:=Pki=1citi.

Suppose thatσ >kck ·η(Λ)for some negligible. ThenZ =Pki=1ciXiis statistically close toDgΛ+t,kckσ. We will also need a generalization of Theorem3.7to multivariate distributions. For a matrixA∈_Zs×k of ranks, define

µ(A) :=pdet(A AT₎_.

IfAhas rank less thansthen to defineµ(A)we first define a matrixA¯as follows: scan the rows ofAfrom first to last adding a row toA¯only if it is linearly independent of the rows already inA¯. ThenA¯is a maximal subset of linearly independent rows ofA, and we defineµ(A) :=µ( ¯A).

Theorem 3.8([7, Theorem 4.14]). LetΛbe anm-dimensional lattice,A∈_Zs×k_and_T _∈

Zk×m. LetXbe ak×mrandom matrix whose rows are mutually independent and whoseith row is distributed asDΛ+ti,σ wheretiis theith row ofT. Furthermore, suppose thatσ > µ(A)η(Λ)for some negligible.

Then the random variableAX is statistically close to a distribution parametrized byΛ, σ, A,andAT.

Theorem3.8shows thatAXdepends on the matrixT only via the quantityAT. Consequently, ifX1

andX2are random variables defined asXin the theorem, but using matricesT1andT2respectively, then

AX1is statistically close toAX2wheneverAT1 =AT2. Theorem3.7is a special case wheres= 1.

Uniformity of discrete Gaussians. LetΛ1andΛ2be two lattices inZnsuch thatΛ1+ Λ2=Zn. We will

need the following fact, which shows that a discrete Gaussian sampled fromΛ1is close to uniform inZn/Λ2.

Lemma 3.9. LetΛ1,Λ2ben-dimensional lattices. Then for any∈(0,1₂), andσ≥ν(Λ1∩Λ2), and any

c∈Rn, the distribution(DΛ1,σ,cmod Λ2)is within statistical distance at most2of the uniform distribution

over(Λ1+ Λ2)/Λ2.

Proof. By [20, Corollary 2.8], the distribution of(D_Λ₁_,σ,_cmod Λ1∩Λ2)is within statistical distance at

most2of uniform overΛ1/(Λ1∩Λ2). By the Second Isomorphism Theorem for abelian groups [14, Ch. 3,

Theorem 18],Λ1/(Λ1∩Λ2)∼= (Λ1+ Λ2)/Λ2. Hence, there is a one-to-one correspondence between cosets

ofΛ1∩Λ2inΛ1and cosets ofΛ2inside ofΛ1+ Λ2.

Now, for somev∈Λ1letIv⊆Λ1be the coset(Λ1∩Λ2) +vinΛ1. LetJv ⊆Znbe the cosetΛ2+v

ofΛ2inΛ1+ Λ2. Then

Jv∩Λ1 =Iv. (3.1)

To see why this is true, first observe that allu ∈Iv are inΛ1 and inΛ2+v =Jvby definition, proving

containment in one direction. Second, observe that sincev∈Λ1, allw∈(Λ2+v)∩Λ1satisfyw−v∈

Λ1∩Λ2. Thereforew∈Iv, proving containment in the other direction, and (3.1) follows.

By definition we know that{Iv}v∈Λ1 contains all the cosets ofΛ1∩Λ2inΛ1. Therefore, by the

one-to-one correspondence,{Jv}v∈Λ1 must contain all the cosets ofΛ2inΛ1+ Λ2. By (3.1), the distribution

DΛ1,σ,cassigns the same weight toJvas toIv. But then, sinceDΛ1,σ,cis statistically close to uniform over

(12)

Complexity assumption. We define a generalization of the now-standardSmall Integer Solution(SIS) problem, which is to find a short nonzero vector in a certain class of lattices.

Definition 3.10. LetL = {L_n}be a distribution ensemble of integer lattices, where lattices inL_nhave dimensionn. An instance of theL-SISn,β problem is a latticeΛ ← Ln. A solution to the problem is a nonzero vectorv∈Λwithkvk ≤β.

IfBis an algorithm that takes as input a latticeΛ, we define theadvantageofB, denotedL-SIS-Adv[B,(n, β)], to be the probability thatBoutputs a solution to anL-SISn,β problem instanceΛchosen according to the distributionLn.

We say that theL-SISn,β problem isinfeasibleif for all polynomial-time algorithmsB, the advantage L-SIS-Adv[B,(n, β)]is a negligible function ofn.

WhenLnconsists ofΛ⊥q(A)for uniformly randomA∈Z`×nq , theL-SISn,βproblem is the standardSISq,n,β problem defined by Micciancio and Regev [32]. For this distribution of lattices, an algorithm that solves the L-SISn,β problem can be used to solve worst-case problems on arbitrary`-dimensional lattices [32,§5].

4 Homomorphic Signatures for Linear Functions over Small Fields

As a “warm-up” to our polynomially homomorphic scheme, we describe a signature scheme that can authenticate any linear function of signed vectors defined over small fieldsFp. Previous constructions can

only achieve this functionality for vectors defined over large fields [12,8] or for a small number of vectors [7]. In particular, our scheme easily accommodates binary data (p= 2). Linearly homomorphic signatures over

F2are an example of a primitive that can be built from lattices, but cannot currently be built from discrete-log

or RSA-type assumptions (cf. the discussion in [7]). In AppendixAThe instantiation of our abstract system forF2 is currently the best such system; see Section4.2for further details. we describe a variant of the

scheme in which the data can take values in large fieldsFp.

Security is based on theSISproblem onq-ary lattices for some primeq; Micciancio and Regev [32], building on the work of Ajtai [2], show that this problem is as hard as standard worst-case problems on arbitrary lattices of dimension approximatelyn/lgq. The system in this section is only secure for smallp, specificallyp= poly(n)withp≤√q/nkfor data sets of sizek.

Overview of the scheme. Since our system builds on the “hash-and-sign” signatures of Gentry, Peikert, and Vaikuntanathan [20], let us recall how GPV signatures work in an abstract sense. The public key is a latticeΛ⊂Znand the secret key is a short basis ofΛ. To sign a messagem, the secret key holder hashesm

to an elementH(m)∈_Zn_/_Λ_{and samples a short vector}_σ_{from the coset of}_Λ_{defined by}_H₍_m_{). To verify} σ, one checks thatσis short and thatσ mod Λ =H(m).

Recall that in a homomorphic signature scheme we wish to authenticate triples(τ, m,hfi), whereτ is a “tag” attached to a data set,mis a message inFnp, andhfiis an encoding of a functionf acting onk-tuples of messages. We encode a linear functionf : (Fnp)k → Fnp defined byf(m1, . . . , mk) = Pki=1cimi by interpreting thecias integers in(−p/2, p/2]and defininghfi:= (c1, . . . , ck)∈Zk.

To authenticate both the message and the function as well as bind them together, we compute a single GPV signature that issimultaneouslya signature on the (unhashed) messagem∈Fnp and a signature on a hash ofhfi.

This dual-role signature is computed via what we call the “intersection method,” which works as follows. LetΛ1andΛ2ben-dimensional integer lattices withΛ1+ Λ2 =Zn. Supposem∈Zn/Λ1 is a message and

ωτ is a hash function (depending on the tagτ) that maps encodings of functionsf to elements ofZn/Λ2.

(13)

Chinese remainder theorem the pair m, ωτ(hfi)

defines a unique coset ofΛ1∩Λ2 inZn. We can thus

use a short basis ofΛ1∩Λ2to compute a short vector in this coset; i.e., a short vectorσwith the property

thatσmod Λ1 =mandσ mod Λ2 =ωτ(hfi). The vectorσis a signature on(τ, m,hfi).

The Sign(sk, τ, m, i) algorithm uses the procedure above to generate a fresh signature on the triple (τ, m,hπii) where πi is the ith projection function defined by πi(m1, . . . , mk) = mi and encoded as

hπii=ei, theith unit vector inZk.

The homomorphic property is now obtained as follows. To authenticate the linear combinationm = Pk

i=1cimifor integersci, we compute the signatureσ :=Pki=1ciσi. Ifkandpare sufficiently small, thenσ is a short vector. Furthermore, we have

σmod Λ1=Pk_i₌₁cimi =m , and

σmod Λ2=Pki=1ciωτ(hπii) =Pki=1ciωτ(ei).

Now suppose thatωτ is linear, namelyPk_i₌₁ciωτ(ei) =ωτ (c1, . . . , ck)

for allc1, . . . , ckinZ. Then since

(c1, . . . , ck)is exactly the encoding of the functionfdefined byf(m1, . . . , mk) =Pk_i₌₁cimi, the signature σauthenticates both the messagemand the fact that it was computed correctly (i.e., viaf) from the original messagesm1, . . . , mk.

We now describe the construction concretely. Let pandq be distinct primes, letΛ1 = pZn, and let

Λ2 = Λ⊥q(A)for someA ∈Fq`×n. Messages are elements ofZn/Λ1 ∼=Fnp, with the isomorphism given explicitly by the mapx 7→ (xmod Λ1)that reduces the coefficients of xmodp. The hash functionωτ takes values inZkand outputs elements inZn/Λ2∼=F`q, with the isomorphism given explicitly by the map x7→(xmod Λ2)that sendsxtoA·xmodq. Now given vectorsm∈Fnp andωτ(hfi)∈F`q, it is easy for anyone to compute an elementt∈_Zn_{of the corresponding coset of}_Λ

1∩Λ2, namely

tmod Λ1=m and tmod Λ2=ωτ(hfi).

If we have a short basis ofΛ1∩Λ2, then we sign(τ, m, f)by usingSamplePreto sample a short vector

from the coset(Λ1∩Λ2) +t.

Finally, we define the hash functionωτ. LetH:{0,1}∗ →(F`q)be a hash function, and setαi :=H(τki) fori= 1, . . . , k. Then for a functionf encoded ashfi= (c1, . . . , ck)∈Zk, we define

ωτ(hfi) := k X

i=1

ciαi ∈F`q ∼=Zn/Λ2.

In particular, we haveωτ(ei) =αi, andPciωτ(ei) =ωτ((c1, . . . , ck)), which is the property we needed to

ensure that the scheme is homomorphic.

The linearly homomorphic scheme. We now describe the scheme.

Setup(1n, k). On input a security parameternand a maximum data set sizek, do the following: 1. Choose two primesp, q= poly(n)withq ≥(nkp)2. Define`:=bn/6 logqc.

2. SetΛ1 :=pZn.

3. UseTrapGen(q, `, n)to generate a matrixA∈Fq`×nalong with a short basisTqofΛ⊥q(A). Define Λ2 := Λ⊥q(A)andT:=p·Tq. Note thatTis a basis ofΛ1∩Λ2=pΛ2.

4. Setν :=p·√nlogq·logn. 5. LetH:{0,1}∗_→

(14)

The public keypkdefines the following system parameters:

• The message space isFnp and signatures are short vectors inZn.

• The set of admissible functionsFis allFp-linear functions onk-tuples of messages inFnp.

• For a functionf ∈ F defined byf(m1, . . . , mk) =Pki=1cimi, we encodef by interpreting thecias

integers in(−p/2, p/2]and defininghfi= (c1, . . . , ck)∈Zk.

• To evaluate the hash functionωτ on an encoded functionhfi= (c1, . . . , ck)∈Zk, do the following:

(a) Fori= 1, . . . , k, computeαi←H(τki)inF`q. (b) Defineωτ(hfi) :=Pk_i₌₁ciαi ∈F`q.

Sign(sk, τ, m, i). On input a secret keysk, a tagτ ∈ {0,1}n_{, a message}_m_∈

Fnp, and an indexi, do: 1. Computeαi :=H(τki)∈F`q. Then, by definition,ωτ(hπii) =αi.

2. Computet∈Znsuch thattmodp=mandA·tmodq =αi. 3. Outputσ ←SamplePre(Λ1∩Λ2,T,t, ν) ∈(Λ1∩Λ2) +t.

Verify(pk, τ, m, σ, f). On input a public keypk, a tagτ ∈ {0,1}n_{, a message}_m_∈

Fnp, a signatureσ∈Zn, and a functionf ∈ F, do:

1. If all of the following conditions hold, output1(accept); otherwise output0(reject): (a) kσk ≤k·p₂·ν√n.

(b) σmodp=m.

(c) A·σmodq =ωτ(hfi).

Evaluate(pk, τ, f, ~σ). On input a public keypk, a tagτ ∈ {0,1}n_{, a function}_f _{∈ F} _{encoded as}_h_f_i ₌ (c1, . . . , ck)∈Zk, and a tuple of signaturesσ1, . . . , σk∈Zn, outputσ :=Pki=1ciσi.

Lemma 4.1. The linearly homomorphic signature scheme defined above is correct with overwhelming probability.

Proof. Letτ ∈ {0,1}n_,_m _∈

Fnp, andi ∈ {1, . . . , k}, and setσ ← Sign(sk, τ, m, i). We check the three verification conditions relative to the projection functionπidefined byπi(m1, . . . , mk) =miand encoded ashπii=ei.

(a) By Theorem3.2, we havekTek ≤O(p· √

`logq), and thereforeν≥ kTek ·ω( √

log`). By Lemma3.6, we havekσk ≤ν√nwith overwhelming probability.

(b) By correctness ofSamplePre, we haveσ∈(Λ1∩Λ2) +t, and thusσmodp=tmodp. By definition

oft, we havetmodp=m.

(c) As above, we haveσ∈(Λ1∩Λ2) +t. SinceΛ2 = Λ⊥q(A), we haveA·σ modq=A·tmodq=αi. By our definition ofωτ, we haveωτ(hπii) =αi.

Now letτ ∈ {0,1}n_,_m_~ _{= (}_m

1, . . . , mk) ∈(Fnp)k, andf ∈ F encoded ashfi = (c1, . . . , ck) ∈Zk. Let

~

σ= (σ1, . . . , σk)withσi ←Sign(sk, τ, mi, i). Then by our definition ofEvaluate, we have

σ0:=Evaluate(pk, τ, f, ~σ) =P iciσi

We check the three verification conditions for the signatureσ0on the messagef(m~), relative to the functionf. (a) Since|ci| ≤p/2for alli, we have that with overwhelming probability,

kσ0k ≤ k·p

2·max{kσik:σi∈~σ} ≤ k· p 2·ν

√ n,

(15)

(b) By correctness of individual signatures, we haveσi modp=mifori∈ {1, . . . , k}. We thus have

σ0modp=X i

ciσi modp= X

i

cimi=f(m~).

(c) By correctness of individual signatures, we haveA·σi modq=αifori∈ {1, . . . , k}. It follows that

A·σ0modq =X i

A·ciσi modq= X

i

ciαi modq=ωτ(hfi).

With overwhelming probability, signatures output bySign(sk,·,·,·)have norm at mostν√nand therefore bit length at mostnlg(ν√n). As we saw in the proof of Lemma4.1, applyingEvaluateto a linear function f : (Fnp)k→Fpnandksignatures generated bySignoutputs a signatureσwhose norm is at mostk·

p

2·ν

√ n with high probability. Therefore, the bit length of the derived signatureσsatisfies

len(σ)≤nlgk+nlgp 2·ν

√ n.

Since the bit length of the derived signature depends logarithmically onk, we conclude that the scheme is length efficient.

4.2 Unforgeability

We will show that an adversary that forges a signature for the linearly homomorphic scheme can be used to compute a short vector in the latticeΛ2 chosen in Step3ofSetup. By Theorem3.2, the distribution of

matricesAused to defineΛ2is statistically close to uniform overF`×nq . Thus the distribution of latticesΛ2

output bySetupis statistically close to the distribution of challenges for theSISq,n,βproblem (for anyβ). Our security theorem is as follows.

Theorem 4.2. IfSISq,n,βis infeasible forβ =k·p2·nlogn √

logq, then the linearly homomorphic signature scheme above is unforgeable in the random oracle model.

The full proof of this theorem is given in Section 7.3. In particular, Theorem4.2is a special case of Theorem7.4, which proves unforgeability of an abstract homomorphic signature system. Here we sketch the intuition behind the proof without the formalism of the abstract system.

Sketch of Proof. LetAbe an adversary that plays the security game of Definition2.1. Given a challenge latticeΛ2 := Λ⊥q(A)forA ∈ Z`×nq , we simulate theSignalgorithm on input(τ, m, i)for random τ by sampling a short vectorσfromD_Λ₁₊_m,νand definingH(τki) :=A·σmodq. Thenσis a valid signature on(τ, m, i). Other queries toHare answered similarly, but with a random messagem. By Lemma3.9, the Gaussian parameterνis large enough so thatH(τki)is statistically close to uniform inF`q.

EventuallyAoutputs a tagτ∗, a messagem∗, a functionfencoded ashfi= (c1, . . . , ck)∈Zk, and a

signatureσ∗. Letσibe the short vector chosen when programmingH(τ∗ki), and letσf := P_iciσi. We claim that ifAoutputs a valid forgery, then with high probability the vectorσ∗−σf is a nonzero vector in Λ2of length at mostβ; i.e., a solution to theSISq,n,βproblem.

SupposeAoutputs a type 2 forgery, so the simulator has generated signatures~σ = (σ1, . . . , σk)on messagesm~ = (m1, . . . , mk)using the tagτ∗. First observe that the verification condition (1a) implies that kσ∗kandkσfkare both less thank·p₂·ν

√

(16)

valid, thenm∗ 6=f(m~). The verification condition (1b) implies that(σ∗−σf) modp=m∗−f(m~)6= 0, and thusσ∗−σf 6= 0. On the other hand, verification condition (1c) implies thatA·σ∗ modq=A·σf modq, and thusσ∗−σf ∈Λ2.

The argument for a type 1 forgery is similar, using random messagesm~ instead of queried ones.

Worst-case connections. By [20, Proposition 5.7], ifq≥β·ω(√nlogn), then theSISq,m,βproblem is as hard as approximating theSIVPproblem in the worst case to withinβ·O˜(√n)factors. Our requirement in theSetupalgorithm thatq ≥(nkp)2guarantees thatqis sufficiently large for this theorem to apply. While the exact worst-case approximation factor will depend on the parameterskandp, it is polynomial innin any case.

Comparison with prior work. Boneh and Freeman [7] describe a linearly homomorphic signature scheme that can authenticate vectors overFp for smallp, with unforgeability also depending on theSISproblem. However, for their system to securely sign k messages, the SISq,2n−k,β problem must be difficult for β = ˜O(k3/2·k!·(n/lgq)k/2+1), and therefore their system is designed to only sign a constant number of vectors per data set (k=O(1)) while maintaining a polynomial connection to worst-case lattice problems. On the other hand, for the same value ofqour system remains secure when signingk= poly(n)vectors per data set.

4.3 Privacy

We now show that our linearly homomorphic signature scheme is weakly context hiding. Specifically, we use Theorem3.7to show that a derived signature on a linear combinationm0 =Pk_i₌₁cimidepends (up to negligible statistical distance) only onm0and theci, and not on the initial messagesmi. Consequently, even an unbounded adversary cannot win the privacy game of Definition2.3. To prove privacy when multiple derived signatures are revealed we need the generalization of Theorem3.7given in Theorem3.8.

Theorem 4.3. Suppose thatνdefined in theSetupalgorithm satisfiesν > ps+1·ks·ω(√logn). Then the linearly homomorphic signature scheme described above iss-weakly context hiding for data sets of size at mostk.

Proof. We follow the argument of [7, Theorem 5.4], which is an unconditional argument for weak context hiding. In the privacy game, let(m~∗₀, ~m∗₁, f1, . . . , fs)be the adversary’s output in the challenge phase, where

~

m₀∗ andm~∗₁containkmessages each. For convenience we treatm~∗_b as a matrix inFk×np whosejth row is message numberjinm~∗_b. We know that

mi :=f ~m∗0) =f ~m1∗)∈Fnp for alli= 1, . . . , s. (4.1) LetΛ := Λ1∩Λ2 and letτ be the tag used to answer the challenge. Forj = 1, . . . , k, letσj(0)andσ

(1)

j be the challenger’s signatures (inZn) on thejth vector inm~∗0andm~∗1, respectively. We arrange these signatures

in two matricesE(0), E(1) ∈_Zk×nso that rowjofE(b)is(σ_j(b))T_∈

Zn.

Fori= 1, . . . , sletd(_ib)be a derived signature onmicomputed using theEvaluatealgorithm applied to the signatures {σ_j(b)}1≤j≤k and the function fi. Again, we arrange these derived signatures in two matricesD(0), D(1) ∈_Zs×nso that rowiofD(b)is(d(_ib))T_∈

Zn. The challenger chooses a bitband gives

(17)

By definition of algorithmEvaluate, there is a matrixF ∈_Zs×k_{such that}_D(b) ₌_{F E}(b)_for_b_{= 0}_,_1.

Theith row ofF is determined by the functionfiand is equal toωτ(hfii); in particular, all the entries ofF are in(−p/2, p/2]. By (4.1) we know thatF ~m∗₀=F ~m∗₁(where we identifyZ∩(−p/2, p/2]withFp).

Supposeb = 0. By definition of algorithmSign, every signatureσ(0)_j is sampled from a distribution statistically close toD_Λ+_tj_,ν wheretj is the vector computed in Step (2) of theSignalgorithm, namely tj mod Λ1 = m∗0,j andtj mod Λ2 = ωτ(hπji). All of these signatures are mutually independent and therefore, by Theorem 3.8, the derived signaturesD(0) = F E(0) are statistically close to a distribution parametrized byΛ,σ,F, andF ~m∗₀. (Note that the rows ofm~∗₀describe cosets ofΛso thatm~∗₀ plays the same role as the matrixT in Theorem3.8.) Since the same holds forb= 1and sinceF ~m∗₀ =F ~m∗₁, we see thatD(0)andD(1)are statistically close. Consequently, even an unbounded adversary cannot win the privacy game with non-negligible advantage.

It remains to argue thatνis sufficiently large to apply Theorem3.8toF E(b), namely thatν > µ(F)η(Λ) for some negligible, whereµ(F)is defined as in Theorem3.8. First, recall that the determinant of a matrix is less than the product of the norms of its rows. Therefore, since the entries ofF are in(−p/2, p/2], it is not difficult to see thatµ(F)≤(pk)s. Second, sinceΛ = Λ1∩Λ2 =pZn∩Λ2 =pΛ2, Lemma3.5shows that

with overwhelming probability we haveη(Λ) =p·η(Λ2)< p·ω(

√

logn)for negligible. Combining these two bounds and the bound onν, we obtain

µ(F)η(Λ)≤p·(pk)s·ω(plogn)< ν

as required for Theorem3.8.

Remark 4.4. The exponential dependence on the numbersof allowed function queries in Theorem4.3shows that for the parameterνdefined in theSetupalgorithm, the scheme is private only when a small number of derived signatures are revealed. If instead we requireSetupto use aνsuch thatν > ps+1·ks·ω(√logn), then the scheme is weakly context hiding when an arbitrary number of derived signatures are revealed. Of course, makingν so large forces us to increase the parameterβ in Theorem4.2, thus weakening the unforgeability property.

In addition, we observe that when s > k, the result of Theorem4.3holds whenever ν > p·(pk)k· ω(√logn). In particular, for small data sets the scheme is weakly context hiding: it remains private even against an adversary allowed to make an arbitrary number of function queries.

5 Background on Ideal Lattices

A number field is a finite-degree algebraic extension of the rational numbers Q. Any number field K

can be represented asQ[x]/(f(x))for some monic, irreducible polynomialf(x)with integer coefficients

(and for each K there are infinitely many suchf). The degreeof a number fieldK is its dimension as a vector space over Q, and is also the degree of any polynomial f defining K. For any given f, the

set{1, x, x2_{, . . . , x}degf−1_} _{is a}

Q-basis for K, and we can therefore identifyK withQn by mapping a

polynomial of degree less thannto its vector of coefficients. By identifyingKwithQnusing this “coefficient

embedding,” we can define a length functionk·kon elements ofKsimply by using any norm onQn. This

length function is non-canonical — it depends explicitly on the choice offused to representK. (Here all norms will be the`2norm unless otherwise stated.)

Our identification ofK=Q[x]/(f(x))withQninduces amultiplicativestructure onQnin addition to

the usualadditivestructure. We define a parameter

γf := sup u,v∈K

(18)

This parameter bounds how much multiplication can increase the length of the product, relative to the product of the length of the factors. For our applications we will need to haveγf = poly(n). Ifnis a power of2, then the functionf(x) =xn+ 1hasγf ≤

√

n(cf. [18, Lemma 7.4.3]). We will think of thisf(x)as our “preferred” choice for applications.

Number rings and ideals. Anumber ringis a ring whose field of fractions is a number fieldK. A survey of arithmetic in number rings can be found in [43]; here we summarize the key points.

Every number field has a subring, called thering of integersand denoted byO_K, that plays the same role with respect toKas the integersZdo with respect toQ. The ring of integers consists of all elements ofK

whose characteristic polynomials have integer coefficients. Under the identification ofK withQn, the ring

O_K forms a full-rank discrete subgroup ofQn; i.e., a lattice. InsideOK is the subringR =Z[x]/(f(x)).

Under our identification ofKwithQn, the ringRcorresponds toZn. In generalRis a proper sublattice

ofOK. The “distance” betweenRandOKis measured by theconductorfR, which is the largest ideal of OKcontained inR.

An(integral) idealofRis an additive subgroupI ⊂Rthat is closed under multiplication by elements ofR. By our identification ofRwithZn, the idealI is a sublattice ofRand is therefore also called anideal lattice. Note that this usage of “ideal lattice” to refer to a rank oneR-module differs from that of [41,27], which use the terminology to refer toR-modules of arbitrary rank.

An idealI ⊂Risprimeif forx, y∈ R,xy ∈I implies eitherx ∈ I ory ∈ I. Ifpis a prime ideal, thenR/pis a finite fieldFpe; the integereis thedegreeofpand the primepis thecharacteristicofp. An

idealIisprincipalif it can be written asα·Rfor someα∈R. In general most ideals are not principal; the proportion of principal ideals is 1 over the size of theclass groupofR, which is exponential inn. Thenorm of an idealI is the size of the (additive) groupR/I.

If p is a prime ideal ofR, then by a theorem of Kummer and Dedekind [43, Theorem 8.2] we can writep = p·R+h(x)·Rfor some polynomialh(x)whose reduction modpis an irreducible factor of f(x) modp. Writingpin this “two-element representation” makes it easy to compute the corresponding quotient mapZ[x]/(f(x)) → Fpe; we simply reduce a polynomial in_Z[x]modulo bothp andh(x). In

particular, ifpis a degree-one prime, thenh(x) =x−αfor some integerαand the quotient map is given by z(x)7→z(α) modp.

Generating ideals with a short basis. If we are to use ideals as the latticesΛ1 andΛ2 in our abstract

signature scheme, we will need a method for generating idealspandqinRalong with a short basis forp∩q

(which is equal top·qifpandqare relatively prime ideals). Furthermore, our security proof requires that givenqwithouta short basis, we can still compute a primepwith a short basis.

In our construction we generate ideals using an algorithm of Smart and Vercauteren [39]. This algorithm generates aprincipalprime idealpalong with a short generatorgofp. We can multiplygby powers of x to generate a full-rank set of vectors {g, xg, x2_{g, . . . , x}n−1_g_}_{that spans} _p_{. Since}_k_x_k _{= 1, we have}

kxigk ≤γf · kgk, so ifγf is small then these vectors are all short.

Theorem 5.1([39,§3.1]). There is an algorithmPrincGenthat takes input a monic irreducible polynomial

f(x)∈_Z[x]of degreenand a parameterδ, and outputs a principal degree-one prime idealp= (p, x−a) inK:=Q[x]/(f(x)), along with a generatorgofpsatisfyingkgk ≤δ

√ n.

(19)

6 Homomorphic Signatures for Polynomial Functions

In this section we describe our main construction, a signature scheme that authenticates polynomial functions on signed messages.

Recall the basic idea of our linearly homomorphic scheme from Section4: messages are elements of

Znmod Λ1, functions are mapped (via the hash functionωτ) to elements ofZnmod Λ2, and a signature

on(τ, m,hfi) is a short vector in the coset ofΛ1∩Λ2 defined bymandωτ(hfi). To verify a signature σ, we simply confirm thatσ is a short vector and thatσ mod Λ1 = m andσ mod Λ2 = ωτ(hfi). The homomorphic property follows from the fact that the mapsx7→(xmod Λi)are linear maps — i.e.,vector space homomorphisms— and therefore adding signatures corresponds to adding the corresponding messages and (encoded) functions.

Our polynomial system is based on the following idea: what if the latticeZnhas aringstructure and the

latticesΛ1,Λ2areideals? Then the mapsx7→(xmod Λi)arering homomorphisms, and therefore adding or multiplyingsignatures corresponds to addingor multiplyingthe corresponding messages and functions. Since any polynomial can be computed by repeated additions and multiplications, adding this structure to our lattices allows us to authenticate polynomial functions on messages.

Concretely, we letF(x)∈_Z[x]be a monic, irreducible polynomial of degreen. We define the number fieldK = Q[x]/(F(x))and identify K withQn via the coefficient embedding. Under this embedding,

R=Z[x]/(F(x))corresponds to the latticeZn. We now letΛ1andΛ2be (degree one) prime idealsp,q⊂R

of normp, qrespectively. We fix an isomorphism fromR/ptoFpby representingpasp·R+ (x−a)·Rand mappingh(x)∈Rtoh(a) modp∈_F_p, and similarly forR/q∼=Fq. We can now sign messages exactly as in the linearly homomorphic scheme.

In our linearly homomorphic scheme we used the projection functionsπias a generating set for admissible functions, and we encoded the function f = P

ciπi by its coefficient vector (c1, . . . , ck) (with the ci interpreted as integers in(−p/2, p/2]). When we consider polynomial functions onFp[x1, . . . , xk], the projection functionsπiare exactly the linear monomialsxi, and we can obtain any (non-constant) polynomial function by adding and multiplying monomials. If we fix an ordering on all monomials of the formxe1

1 · · ·x

ek

k , then we can encode any polynomial function as its vector of coefficients, with the unit vectorseirepresenting the linear monomialsxifori= 1, . . . , k.

The hash functionωτ is defined exactly as in our linear scheme: for a functionf inFp[x1, . . . , xk]whose encoding ishfi= (c1, . . . , c`)∈Z`, we define a polynomialfˆ∈Z[x1, . . . , xk]that reduces tofmodp. We then defineωτ(hfi) = ˆf(α1, . . . , αk), whereαi∈Fqare defined to beH(τ, i)for some hash functionH.

We use the same lifting off tofˆ∈_Z[x1, . . . , xk]to evaluate polynomials on signatures; specifically, given a polynomialf and signaturesσ1, . . . , σk ∈ K on messages m1, . . . , mk ∈ Fp, the signature on f(m1, . . . , mk)is given byfˆ(σ1, . . . , σk).

Recall that forv1,v2 ∈ R, the length ofv1 ·v2 is bounded byγF · kv1k · kv2k. Thus if we choose

F(x)so thatγF is polynomial inn, then multiplying together a constant number of vectors of lengthpoly(n) produces a vector of lengthpoly(n). It follows that the derived signaturef(σ1, . . . , σk)is short as long as the

degree off is bounded and the coefficients off are small (when lifted to the integers). The system therefore can support polynomial computations on messages for polynomials with small coefficients and bounded degree.

The polynomially homomorphic scheme. We now describe the scheme formally.

Setup(1n, k). On input a security parameternand a maximum data set sizek, do the following: 1. Choose a monic irreducible polynomialF(x)∈_Z[x]of degreenwithγF = poly(n).

(20)

LetR=Znbe the lattice correspondingZ[x]/(F(x))⊂ OK.

2. Run thePrincGenalgorithm twice on inputsF, nto produce distinct principal degree-one prime ideals

p= (p, x−a)andq= (q, x−b)ofRwith generatorsgp, gq, respectively.

3. LetTbe the basis{gpgq, gpgqx, . . . , gpgqxn−1}ofp·q.

4. Defineν:=γ2_F ·n3logn. Choose integersy= poly(n)andd=O(1). 5. LetH:{0,1}∗→Fqbe a hash function (modeled as a random oracle). 6. Output the public keypk= (F, p, q, a, b, ν, y, d, H)and secret keysk=T.

The public keypkdefines the following system parameters:

• The message space isFp and signatures are short vectors inR.

• The set of admissible functionsFis all polynomials inFp[x1, . . . , xk]with coefficients in{−y, . . . , y},

degree at mostd, and constant term zero. The quantityyis only used in algorithmVerify. • Let ` = k+_dd

−1. Let{Yj}`j=1 be the set of all non-constant monomials x

e1

1 · · ·x

ek

k of degree P

ei≤d, ordered lexicographically.4 Then any polynomial functionf ∈ F is defined byf(m~) = P`

j=1cjYj(m~) for cj ∈ Fp. We interpret the cj as integers in [−y, y]and encode f as hfi = (c1, . . . , c`)∈Z`.

• To evaluate the hash functionωτ on an encoded functionhfi= (c1, . . . , c`)∈Z`, do the following:

(a) Fori= 1, . . . , k, computeαi←H(τki).

(b) Defineωτ(hfi) :=P`j=1cjYj(α1, . . . , αk)∈Fq.

Sign(sk, τ, m, i). On input a secret keysk, a tagτ ∈ {0,1}n_{, a message}_m_∈

Fp, and an indexi, do:

1. Computeαi :=H(τki)∈Fq.

2. Computeh=h(x)∈Rsuch thath(a) modp=mandh(b) modq =αi. 3. Outputσ ←SamplePre(p·q,T, h, ν) ∈(p·q) +h.

Verify(pk, τ, m, σ, f). On input a public keypk, a tagτ ∈ {0,1}n_{, a message} _m _∈

Fp, a signatureσ =

σ(x)∈R, and a functionf ∈ F, do:

1. If all of the following conditions hold, output1(accept); otherwise output0(reject): (a) kσk ≤`·y·γ_Fd−1·(ν√n)d.

(b) σ(a) modp=m. (c) σ(b) modq=ωτ(hfi).

Evaluate(pk, τ, f, ~σ). On input a public keypk, a tagτ ∈ {0,1}n_{, a function}_f _{∈ F} _{encoded as}_h_f_i ₌ (c1, . . . , c`)∈Z`, and a tuple of signaturesσ1, . . . , σk∈Zn, do:

1. Liftf ∈Fp[x1, . . . , xk]toZ[x1, . . . , xk]by settingfˆ:=P`j=1cjYj(x1, . . . , xk).

2. Outputfˆ(σ1, . . . , σk).

Lemma 6.1. The polynomially homomorphic signature scheme defined above is correct with overwhelming probability.

Proof. Letτ ∈ {0,1}n_,_m _∈

Fp, andi∈ {1, . . . , k}, and setσ ← Sign(sk, τ, m, i). We check the three

verification conditions relative to the projection functionπi, which in this context is the monomialxi. (Note thathxii=ei.)

(a) By Theorem5.1the generators gp, gq have norm at mostn1.5, and therefore kgpgqxik ≤ γF2 ·n3 fori= 0, . . . , n−1. It follows from Lemma3.1that the basisThaskTek ≤γ_F2n3, and therefore ν ≥ kTek ·ω(

√

logn). By Lemma3.6, we havekσk ≤ν√nwith overwhelming probability.

4

Specifically, we identify a monomialxe1

1 · · ·x ek

k with the non-decreasing sequence(0, . . . ,0,1, . . . ,1, . . . , k, . . . , k)of

lengthdthat has the integeriappearingeitimes (and0appearingd−Peitimes), and use the lexicographic ordering onZd. In

(21)

(b) By correctness ofSamplePre, we haveσ ∈p·q+h(x), and thusσ(a) modp = h(a) modp. By definition ofh(x), we haveh(a) modp=m.

(c) As above, we haveσ∈p·q+h(x)and thereforeσ(b) modq =h(b) modq =αi.

Now letτ ∈ {0,1}n_,_m_~ _{= (}_m

1, . . . , mk) ∈ Fkp, and f ∈ F encoded ashfi = (c1, . . . , c`) ∈ Z`. Let

~

σ= (σ1, . . . , σk)withσi ←Sign(sk, τ, mi, i). Interpret thecj as integers in[−y, y]. Then by our definition ofEvaluate, we have

σ0 :=Evaluate(pk, τ, f, ~σ) =P

jcjYj(~σ)

(Recall that{Yj}is the set of non-constant monomialsxe1i· · ·x

ek

k of degree at mostd.) We check the three verification conditions for the signatureσ0on the messagef(m~), relative to the functionf.

(a) Since evaluatingYj(σ)consists of multiplying together at mostdof theσi, we have

kYj(~σ)k ≤γd−_F 1·(max{kσik:σi∈~σ})d.

Sincef is admissible we have|cj| ≤yfor allj. Thus we have

kσ0k ≤ `·y·γ_Fd−1·(max{kσik:σi ∈~σ})d ≤ `·y·γ_Fd−1·(ν √

n)d,

where the second inequality follows from Lemma3.6.

(b) The fact that the map fromRtoFpgiven byh(x)7→h(a) modpis a ring homomorphism means that for each monomialYj, we have

(Yj(σ1, . . . , σk))(a) modp=Yj(σ1(a), . . . , σk(a)) modp.

By correctness of individual signatures, we haveσi(a) modp=mifori∈ {1, . . . , k}. It follows that

σ0(a) modp=X j

cj(Yj(σ1, . . . , σk))(a) modp

=X j

cjYj(σ1(a), . . . , σk(a)) modp= X

j

cjYj(m1, . . . , mk) =f(m~).

(c) By correctness of individual signatures, we haveσi(b) modq = αi fori ∈ {1, . . . , k}. The same analysis as in part (b) above, using the ring homomorphismh(x)7→h(b) modq, shows that

σ0(b) modq =X j

cjYj(α1, . . . , αk) modq =ωτ(hfi).

With overwhelming probability, signatures output bySign(sk,·,·,·)have norm at mostν√nand therefore bit length at mostnlg(ν√n). As shown in the proof of Lemma6.1above, applyingEvaluateto a polynomial functionf:Fkp → Fp and ksignatures generated by Signoutputs a signatureσ whose norm is at most `·y·γ_Fd−1·(ν√n)d_{with overwhelming probability. Since}_`_≤_kd_{, the bit length of this derived signature}_σ satisfies

len(σ)≤n·dlgk+nlgy·γ_Fd−1·ν√n.