www.theoryofcomputing.org
Noisy Interpolating Sets
for Low-Degree Polynomials
Zeev Dvir
∗Amir Shpilka
†Received: September 4, 2009; published: February 12, 2011.
Abstract: A Noisy Interpolating Set (NIS) for degree-dpolynomials is a setS⊆Fn, where
Fis a finite field, such that any degree-d polynomialq∈F[x1, . . . ,xn]can be efficiently
interpolated from its values onS, even if an adversary corrupts a constant fraction of the values. In this paper we construct an explicit NIS for every fieldFpof prime orderpand
any degreed. Our sets are of sizeO(nd)and have efficient interpolation algorithms that can
recoverqin the presence of a fraction exp(−Ω(d))of errors.
Our construction is based on a theorem which roughly states that if S is a NIS for degree-1 polynomials thend·S={a1+· · ·+ad|ai∈S}is a NIS for degree-dpolynomials.
Furthermore, given an efficient interpolation algorithm forS, we show how to use it in a black-box manner to build an efficient interpolation algorithm ford·S.
As a corollary we obtain an explicit family of punctured Reed-Muller codes (codes that are restrictions of a Reed-Muller code to a subset of the coordinates) which are good codes and have an efficient decoding algorithm against a constant fraction of errors. To the best of our knowledge, even the existence of punctured Reed-Muller codes that are good codes was not previously known.
ACM Classification:F.2.2
AMS Classification:68P30, 68Q25
Key words and phrases:polynomial interpolation, error-correcting codes
∗Research supported by Binational Science Foundation (BSF) grant, by Israel Science Foundation (ISF) grant and by
Minerva Foundation grant.
1
Introduction
An interpolating set for degree-d polynomials is a set of pointsS⊆Fn such that if q(x
1, . . . ,xn) is a
degree-dpolynomial overFand we are given the set of values(q(s))s∈Sthen we can reconstructq. That
is, we can find all the coefficients ofq. A setSis anoisy interpolating setfor degree-d polynomials if we can reconstructqeven if an adversary corrupts anε fraction of the values in(q(s))s∈S. It is not
difficult to prove that a random set of sizeO(nd)is a noisy interpolating set. In this paper we study the problem of giving an explicit construction of a small noisy interpolating set for degree-dpolynomials that has an efficient interpolation algorithm against a constant fraction of errors. Besides being a very natural problem, this question is related to two other interesting problems: giving explicit constructions of punctured Reed-Muller codes that are good codes and the problem of constructing pseudorandom generators against low-degree polynomials. Before stating our results we describe the connection to these two problems.
Reed-Muller codes are an important family of error-correcting codes. They are obtained by evaluating low-degree polynomials on all possible inputs taken from a finite fieldF. More precisely, in the code RM(n,d)the messages correspond to polynomialsq(x1, . . . ,xn)overFof total degreed, and the encoding is the vector of evaluations(q(α¯))α¯∈Fn. The well-known Hadamard code, obtained from evaluating linear functions, is the code RM(n,1). (For more information on Reed-Muller codes see [10].) Although Reed-Muller codes are notgoodcodes (they do not have constant rate) they play an important role both in practice and in theory. For example Reed-Muller codes were used in the early proofs of the PCP theorem [2,1], in constructions of pseudorandom generators [3,7,12] and extractors [13,11], and more. It is thus an important goal to better understand these codes and their properties.
Apuncturingof a code is the operation of throwing away some of the coordinates of every codeword.
For example, every binary linear code (without repeated coordinates) is a punctured Hadamard code. In particular, a punctured Reed-Muller code is a mapping that sends a degree-dpolynomialqto its vector of values on some subsetS⊆Fn. As Reed-Muller codes are not good codes themselves, it is an interesting
question to find a puncturing that yields good codes. As before, it is not difficult to see that a random subset ofFnof sizeO(nd)gives a good punctured Reed-Muller code. An even more interesting question is to find a puncturing that yields good codes that have efficient encoding and decoding algorithms. (For a randomSwe don’t have an efficient decoding algorithm.) It is not hard to see thatSis a noisy interpolating set (that has an efficient interpolation algorithm) if and only if the mappingq→(q(α¯))α¯∈S
is a code of linear minimal distance (that has an efficient decoding algorithm). Our noisy interpolating sets are of sizeO(nd)and therefore they give a construction of good punctured Reed-Muller codes that
can be decoded efficiently. To the best of our knowledge no such construction was known prior to our result.
Another problem that is related to our results is the problem of constructingpseudorandom generators
for low-degree polynomials. A mappingG:Fs→Fnis anε-Pseudorandom Generator (PRG)against degree-dpolynomials if it fools every degreedpolynomial innvariables. That is, the two distributions
For the case of degree-1 polynomials, it is not hard to see that any noisy interpolating set implies a PRG and vice versa. That is, ifG:Fs→Fnis a PRG for polynomials of degree 1 then the image ofGis a noisy interpolating set. Similarly ifS⊆Fnis a noisy interpolating set for degree-1 polynomials then any one to one mapping whose image isSis a PRG for such polynomials. For larger values ofdthere is no such strong correspondence between PRGs and noisy interpolating sets. As before, given a PRG we get a noisy interpolating set (without an obvious efficient interpolation algorithm), however it is not clear how to get a PRG from a noisy interpolating set. Constructions of PRGs against low degree polynomials over very large fields were given in [5] using tools from algebraic geometry. Independently from our work, Viola [14], building on the works of [6,9], showed (over any field) thatd·Sis PRG for degree-d
polynomials wheneverSis PRG for degree-1 polynomials. In particular this result implies thatd·Sis a noisy interpolating set. However, [14] did not consider the problem of giving a noisy interpolating set and in particular did not give an efficient interpolation algorithm. One can view our work as saying that the construction analyzed in [6,9,14] with respect to minimal distance also has an efficient decoding algorithm.
In the next section we consider two types of noisy interpolating sets. The first one allows for repetition of points. That is, we allowSto be a multiset. This corresponds to repeating coordinates in an error-correcting code. In the second variant (properNIS) we do not allow such repetitions, and requireS
to be a simple set (and not a multiset). This is what allows us to get punctured Reed-Muller codes. For each of these types we prove a composition theorem, saying that the sumset of a NIS for degree-1 polynomials is a NIS for higher-degree polynomials. (The non-multiset version of this theorem requires an additional condition on the initial NIS.) We then combine these theorems with known constructions of error-correcting codes to obtain our final results.
1.1 Our results
In the followingFwill always denote a finite field of prime order.1 We denote byF[x1, . . . ,xn]the space
of all polynomials in the variablesx1, . . . ,xnwith coefficients inF. We also denote byFd[x1, . . . ,xn]the
set of all polynomials of degree≤d inF[x1, . . . ,xn]. Since we are interested in polynomials only as
functionsoverFnwe will assume in the following that the individual degree of each variable is smaller
thanp=|F|. This is without loss of generality since for everya∈Fwe haveap=a. For example, over F2, the field of two elements, we will only consider multilinear polynomials. We denote the Hamming
distance between two stringsxandyas dist(x,y).
Definition 1.1(Noisy Interpolating Set (NIS)). Let S= (a1, . . . ,am)∈(Fn)m be a list of points (not necessarily distinct) inFn. We say thatSis an(n,d,ε)-Noisy Interpolating Set(NIS) if there exists an algorithmASsuch that for everyq∈Fd[x1, . . . ,xn]and for every vectore= (e1, . . . ,em)∈Fmsuch that
|{i∈[m]|ei6=0}| ≤ε·m,
the algorithmAS, when given as input the list of values(q(a1) +e1, . . . ,q(am) +em), outputs the
polyno-mialq(as a list of its coefficients). We say thatSis aproperNIS if the pointsa1, . . . ,amare distinct. IfS
is a proper NIS we can treat it as a subsetS⊆Fn.
1It is possible to extend our results to other finite fields (see Section5); however, it requires more notation and can easily be
LetS= S(n)n∈
Nbe a sequence such that for everyn∈Nwe have thatS
(n)is an(n,d,
ε)-NIS (dand εare fixed for alln). We say thatShas an efficient interpolation algorithm if there exists a polynomial time algorithmM(n,L)that takes as input an integernand a listLof values inFsuch that the restriction
M(n,·)has the same behavior as the algorithmAS(n) described above (in places where there is no risk of
confusion we will omit the superscript(n)and simply say thatShas an efficient interpolation algorithm).
1.1.1 Multiset NIS
Our first theorem shows how to use a NIS for degree-1 polynomials in order to create a NIS for higher-degree polynomials. In order to state the theorem we need the following notation. For two lists2of points
S= (a1, . . . ,am)∈(Fn)mandT= (b1, . . . ,b`)∈(Fn)`we define the sumST to be the list
(ai+bj)i∈[m],j∈[`]∈(Fn)m·`.
The reason for the notation is to make a clear distinction between sets and multi-sets. In particular we shall use the notationS+T to denote set addition. That is, for two setsSandT we define
S+T ={s+t|s∈S,t∈T}.
Theorem 1.2. Let0<ε1<1/2be a real number. Let S1be an (n,1,ε1)-NIS and for each d >1let
Sd=Sd−1S1. Then for every d>1the set Sd is an(n,d,εd)- NIS withεd= (ε1/2)d.Moreover, if S1
has an efficient interpolation algorithm, then so does Sd.
CombiningTheorem 1.2with known constructions of error-correcting codes (in order to defineS1)
we get the following corollary:
Corollary 1.3. For every field F of prime order and for every d>0, there exists an ε >0 and a familyS= S(n)n∈
Nsuch that for every n∈N, S
(n)is an(n,d,
ε)-NIS,and such thatShas an efficient
interpolation algorithm. Moreover, for each n∈Nwe haveS(n)=O(nd)and it is possible to generate
the set S(n)in timepoly(n).
1.1.2 Proper NIS
We are able to prove an analog ofTheorem 1.2for proper NIS. We would like to show that ifS1is a proper(n,1,ε1)-NIS then the setsSd =Sd−1+S1 (this time we use the operation of set addition) are
proper(n,d,εd)-NISs. To show this we will require the initial setS1to satisfy a certain natural condition
on the intersections of different ‘shifts’ of the setsSd−1. Roughly speaking, this condition will guarantee
that there is no small set of errors inSd=Sd−1+S1that has a large intersection with many shifts ofSd−1
(one can see from the proof ofTheorem 1.2why this is useful). The operation of set addition is defined as follows: LetAandBbe two subsets ofFn. We defineA+B={a+b|a∈A,b∈B}. For an element
c∈Fnand for a subsetA⊆
Fnwe denoteA+c={a+c|a∈A}.
2Instead of a list one can have a multi-set in mind, however it is slightly easier to describe and prove our results using the
Definition 1.4(Condition3Fk). LetS⊆Fn. Let S0={0}and for eachd≥1 letSd=Sd−1+S. Let
k>0 be an integer. Forx∈Sdwe letNd(x) =|{b∈S|x∈Sd−1+b}|. We say thatSsatisfies condition
Fkif for every 0<d≤kwe have
|{x∈Sd|Nd(x)>d}| ≤ |Sd−2|.
We are now ready to state the analog ofTheorem 1.2for proper NISs.
Theorem 1.5. Let0<ε1<1/2be a real number and let k>0an integer. There exists a constant C0,
depending only onεand k, such that for all n>C0the following holds: Let S1be a proper(n,1,ε1)-NIS
and for each d>1let Sd=Sd−1+S1. Suppose S1satisfies conditionFk. Then, for every1<d≤k the
set Sdis a (proper)(n,d,εd)-NIS with
εd=
1
d!·
ε1
3
d .
Moreover, if S1has an efficient interpolation algorithm, then so does Sd.
As before, combiningTheorem 1.5with known constructions of error-correcting codes gives the following corollary:
Corollary 1.6. For every fieldFof prime order and for every d>0, there exists anε>0and a family S= S(n)n∈
Nsuch that for every n∈N, S
(n)is a proper(n,d,
ε)-NIS with an efficient interpolation
algorithm. Moreover, for each n∈Nwe haveS(n)=O(nd)and it is possible to generate the set S(n)in
timepoly(n).
It is easy to see that any(n,d,ε)-proper NIS induces a puncturing of the Reed-Muller code RM(n,d) that has an efficient decoding from a fractionε of errors. (We keep the coordinates of the RM-code corresponding to the points in the NIS.) The following corollary is therefore immediate. (SeeSection 2.2
for the definition of a good family of codes.)
Corollary 1.7. For every fieldFof prime order and for every d>0, there exists anε>0and a family
{Cn}of good codes such that Cnis a puncturedRM(n,d)-code. Furthermore, the family{Cn}has an
efficient encoding and decoding algorithms from a constant fraction of errors.
1.2 Organization
InSection 2we give preliminary results that we will use in the rest of the paper. InSection 3we prove
Theorem 1.2andCorollary 1.3. InSection 4, we proveTheorem 1.5andCorollary 1.6.Section 5contains a sketch of an argument showing how to extend our results from fields of prime order to any finite field.
Section 6describes a construction of a family of error-correcting codes with certain special properties that is used inSection 4.
2
Preliminaries
Letq∈F[x1, . . . ,xn]be a polynomial. We denote by deg(q)the total degree ofq. Thei-th homogeneous
part ofqis the sum of all monomials inqof degreei, and is denoted withqi. We denoteq≤i=∑ij=0qj.
We now give some useful facts and definitions regarding partial derivatives of polynomials and error-correcting codes.
2.1 Partial and directional derivatives
In this section we define partial and directional derivatives of polynomials over finite fields and discuss some of their properties.
We start by defining the partial derivatives of a monomial. LetM(x) =xc1
1 ·. . .·x
cn
n be a monomial in nvariables so thatci≥0 for alli. The partial derivative ofM(x)with respect toxiis zero ifci=0 and
∂M
∂xi(x) =ci·x c1
1 ·. . .·x
ci−1
i ·. . .·x cn n
ifci>0. The definition extends to general polynomials by linearity. It is easy to verify that if deg(q) =d
then all of the partial derivatives ofqare of degree at mostd−1. Also, since we assume that the individual degrees are smaller than the characteristic of the field we have that∂∂xq
i(x)≡0 iffxi does not appear in q(x). We denote by
∆q(x) =
∂q
∂x1
(x), . . . , ∂q ∂xn
(x)
the partial derivative vector ofq. The directional derivative of a polynomialq∈F[x1, . . . ,xn]in direction
a∈Fnis denoted by
∂q(x,a)and is defined as
∂q(x,a) = n
∑
i=1ai·∂q ∂xi(x).
That is,∂q(x,a)is the inner product ofaand∆q(x).
The following lemma gives a more “geometric” view of the partial derivatives ofqby connecting them to difference ofqalong certain directions.
Lemma 2.1. Let q∈Fd[x1, . . . ,xn]and let qd be its homogeneous part of degree d. Let a,b∈Fn. Then
q(x+a)−q(x+b) =∂qd(x,a−b) +E(x),
wheredeg(E)≤d−2(we use the convention thatdeg(0) =−∞). That is, the directional derivative of qd
in direction a−b is given by the homogeneous part of degree d−1in the difference q(x+a)−q(x+b).
Proof. It is enough to prove the lemma for the case thatqis a monomial of degreed. The result will then
dare of degree at mostd−2. LetM(x) =xc1
1 ·. . .·x
cn
n be a monomial of degreed. We observe that the
expressionM(x+a)can be written as
M(x+a) =
n
∏
i=1(xi+ai)ci
= M(x) +
n
∑
i=1ai·∂M ∂xi
(x) +E1(x),
where deg(E1)≤d−2. Now, subtractingM(x+b)fromM(x+a), the termM(x)cancels out and we are
left with the desired expression.
The next easy lemma shows that a homogeneous polynomialqcan be reconstructed from its partial derivatives.
Lemma 2.2. Let q∈Fd[x1, . . . ,xn]be a homogeneous polynomial of degree at least 1. Given the vector
of partial derivatives∆q(x), it is possible to reconstruct q in polynomial time.
Proof. We go over all monomials of degree≤dand find out the coefficient they have inqusing the
following method. For a monomialMletibe the first index such thatxiappears in the monomialMwith
positive degree. Consider ∂q
∂xi(x)and check whether the coefficient of the derivative of that monomial is
zero or not. For example, if we want to “decode” the monomialx51·x32we will look at the coefficient of
x41·x32in∂∂xq
1(x). To get the coefficient inqwe have to divide it by the degree ofxiin the monomial, that
is by 5 (remember that the individual degrees are smaller thanp=|F|=char(F)).
2.2 Error-correcting codes
We start with the basic definitions of error-correcting codes. LetFbe a field and letE:Fn→Fm. Denote byC=E(Fn)the image ofE. ThenCis called an[m,n,k]-codeif for any two codewordsE(v),E(u)∈C, whereu6=v, we have that dist(E(u),E(v))≥k(“dist” denotes the Hamming distance). We refer tomas
theblock lengthofCand tokas theminimal distanceofC. We denote byR=n/mtherelative rateofC
and byδ =k/mtherelative minimal distanceofC, and say thatCis an[R,δ]-code. WhenEis a linear mapping we say thatCis a linear code and call them×nmatrix computingEthegenerator matrixofC. A mapD:Fm→Fncan correctterrors (with respect toE) if for anyv∈Fnand anyw∈Fmsuch that dist(E(v),w)≤twe have thatD(w) =v. Such aDis called adecoding algorithmforC.
A family of codes{Ci}, whereCi is an[Ri,δi]-code of block lengthmi, hasconstant rateif there exists a constantR>0 such that for all codes in the family it holds thatRi≥R. The family has alinear
distanceif there exists a constantδ >0 such that for all codes in the family we haveδi≥δ. In such a
case we say that the family is a family of[R,δ]-codes. If a family of codes as above has limi→∞mi=∞, a constant rate and a linear minimal distance then we say that the family is a family ofgood codesand that the codes in the family are good. Similarly, we say that the family of codes has a decoding algorithm for a fractionτ of errors if for eachCithere is a decoding algorithmDithat can correct aτ fraction of errors. WhenCis a linear code we define the dual ofCin the following way:
Lemma 2.3. Let C be an[m,n,k]-code overFsuch that C has an efficient decoding algorithm that can
correct anα fraction of errors. For i∈[m]let ai∈Fnbe the i-th row of the generator matrix of C. Then,
1. Let S0= (a1, . . . ,am,¯0, . . . ,¯0)∈(Fn)2m. Then S0is an(n,1,α/2)-NIS with an efficient interpolation algorithm.
2. Let S= (a1, . . . ,am)∈(Fn)mand suppose that the maximal Hamming weight of a codeword in C is
smaller than(1−2α)·m. Then S is an(n,1,α)-NIS with an efficient interpolation algorithm.
Proof.
1. Letqbe a degree-1 polynomial. The interpolation algorithm forS0will work as follows: we first look at the values ofq(x)on the lastmpoints (the zeros). The majority of these values will beq(0)
which will give us the constant term inq. Letq1(x) =q(x)−q(0)be the linear part ofq. Subtracting
q(0)from the values in the firstmcoordinates we reduce ourselves to the problem of recovering the homogeneous linear functionq1from its values on(a1, . . . ,am)with at mostα·merrors. This
task is achieved using the decoding algorithm forC, since the vector(q1(a1), . . . ,q1(am))is just
the encoding of the vector of coefficients ofq1with the codeC.
2. Here we cannot decipher the constant term ofq(x)so easily. Let (v1, . . . ,vm)∈Fbe the list of input values given to us. (vi=q(ai)for a 1−α fraction of the valuesi.) What we will do is we will go over all p=|F|possible choices for q(0) and for each “guess”c∈Fdo the following: Subtractcfrom the values(v1, . . . ,vm)and then use the decoding algorithm ofCto decode the
vectorVc= (v1−c, . . . ,vm−c). It is clear that forc=q(0), this procedure will give us the list of
coefficients ofq(x)(without the constant term). Our task is then to figure out which invocation of the decoding algorithm is the correct one.
Say that the decoding algorithm, on inputVcreturned a linear polynomialqc(x)(there is no constant term). We can then check to see whetherqc(ai) +cis indeed equal tovifor a 1−α fraction of the valuesi. If we could show that this test will succeed only for a singlec∈Fthen we are done and the lemma will follow. Suppose for contradiction that there are two linear polynomialsqc(x)and
qc0(x)such that both agree with a fraction 1−α of the input values. This means that there exist
two codewordsWc,Wc0∈Fm(the encodings ofqcandqc0) inCsuch that dist(Vc,Wc)≤α·mand
dist(Vc0,Wc0)≤α·m. Therefore
dist(Vc−Vc0,Wc−Wc0)≤2α·m.
The vectorVc−Vc0 has the valuec0−c6=0 in all of its coordinates and so we get that the Hamming
weight of the codewordWc−Wc0 is at least(1−2α)·m, contradicting the properties ofC.
Lemma 2.4. Let C be an[m,n,k]-code overFsuch that the dual C⊥ has minimal distance>r. Let
S⊆Fn be the set of points corresponding to the rows of the generator matrix of C. Then S satisfies
Proof. Let 0<d≤r/2 be an integer and letSd=S+. . .+S ,dtimes. We will prove the lemma by
showing that
{x∈Sd|Nd(x)>d} ⊆Sd−2,
(the reader is encouraged to reviewDefinition 1.4for the definition ofNd(x)). Letx∈Sd. We can thus write
x=a1+· · ·+ad,
witha1, . . . ,ad∈S. Suppose now thatNd(x)>d. This means that there exists a way to writexas a sum
of elements inSas follows
x=b1+· · ·+bd−1+c,
withc6∈ {a1, . . . ,ad}. Combining the two ways to writexwe get
a1+· · ·+ad−b1− · · · −bd−1−c=0.
Since the distance of the dual is larger than 2dthe above equality must be a trivial one. That is, every vector must appear with a coefficient that is a multiple ofp(the characteristic). Sincec∈ {a/ 1, . . . ,ad}, we infer thatcmust appear as one of the valuesbi(otherwise it would have coefficient−1). This means
thatxis inSd−2.
3
Constructing a NIS
In this section we proveTheorem 1.2andCorollary 1.3. We start with the proof ofCorollary 1.3, which is a simple application of known results in coding theory.
Proof ofCorollary 1.3. To prove the corollary, all we need to do is to construct, for alln∈N, an(n,1,ε)
-NISS(1n)with an efficient interpolation algorithm andεwhich does not depend onn. The corollary will then follow by applyingTheorem 1.2.
To constructS1we take a good family of linear codes{Cn}, whereCnis an[mn,n,kn]-code overF (seeSection 2.2) that has an efficient decoding algorithm that can decode a constant fraction of errors and such that the generator matrix ofCncan be found in polynomial time (such families of codes are known to exists, see [10, Chap. 10]). Leta1, . . . ,amn ∈Fnbe the rows of its generator matrix. We defineS
(n)
1 to
be the list of points(a1, . . . ,amn,b1, . . . ,bmn), where for each j∈[mn]we setbj=0. That is,S
(n)
1 contains
the rows of the generator matrix of a good code, together with the points 0, taken with multiplicitymn.
Section 2.3now shows thatS(1n)satisfies all the required conditions.
3.1 Proof ofTheorem 1.2
LetS1= (a1, . . . ,am)∈(Fn)mbe an(n,1,ε1)-NIS of size|S1|=m. LetSd−1= (b1, . . . ,br)∈(Fn)r be an(n,d−1,εd−1)-NIS of size|Sd−1|=r. LetAd−1andA1be interpolation algorithms forSd−1andS1
respectively, as described inDefinition 1.1. LetSd=Sd−1S1. We will prove the theorem by showing
thatSdhas an interpolation algorithm that makes at most a polynomial number of calls toAd−1and toA1
and can “correct” a fraction
εd=
ε1·εd−1
of errors. We will describe the algorithmAdand prove its correctness simultaneously (the number of calls
toAd−1andA1will clearly be polynomial and so we will not count them explicitly).
Fixq∈Fd[x1, . . . ,xn]to be some degree-dpolynomial and letqdbe its homogeneous part of degree d. Let us denoteSd= (c1, . . . ,cmr), where eachciis inFn. We also denote bye= (e1, . . . ,emr)∈Fmr the list of “errors,” so that|{i∈[mr]|ei6=0}| ≤εd·mr. The listSdcan be partitioned in a natural way into
all the “shifts” of the listSd−1. Let us define for eachi∈[m]the listT(i)= (b1+ai, . . . ,br+ai)∈(Fn)r. We thus have thatSdis the concatenation ofT(1), . . . ,T(m)(the order does not matter here). We can also partition the list of errors in a similar way intomlistse(1), . . . ,e(m), each of lengthr, such thate(i)is the list of errors corresponding to the points inT(i).
We say that an indexi∈[m]isgoodif the fraction of errors inT(i)is at mostεd−1/2, otherwise we
say thatiisbad. That is,iis good if
|{j∈[r]|e(ji)6=0}| ≤(εd−1/2)· |T(i)|= (εd−1/2)·r.
LetE={i∈[m]|iis bad}. From the bound on the total number of errors we get that
|E| ≤ε1·m. (3.1)
Overview The algorithm is divided into three steps. In the first step we look at all pairs(T(i),T(j))and from each one attempt to reconstruct, usingAd−1, the directional derivative∂qd(x,ai−aj). We will claim
that this step gives the correct output for most pairs(T(i),T(j))(namely, for all pairs in which bothiand j
are good). In the second step we take all the directional derivatives obtained in the previous step (some of them erroneous) and from them reconstruct, usingA1, the vector of derivatives∆qd(x)and so also recover
qd(x). In the last step of the algorithm we recover the polynomialq≤d−1(x) =q(x)−qd(x), again using
Ad−1. This will give usq(x) =q≤d−1(x) +qd(x).
Step I: Let i6= j∈[m] be two good indices as defined above. We will show how to reconstruct ∂qd(x,ai−aj)from the values inT
(i),T(j). Recall that we have at our disposal two lists of values
Li=q(b1+ai) +e(
i)
1 , . . . ,q(br+ai) +e
(i)
r
and
Lj=
q(b1+aj) +e
(j)
1 , . . . ,q(br+aj) +e
(j)
r
.
Taking the difference of the two lists we obtain the list
Li j=q(b1+ai)−q(b1+aj) +e(
i)
1 −e
(j)
1 , . . . ,q(br+ai)−q(br+aj) +e
(i)
r −e( j)
r
.
Observe that since i and j are both good we have that the term e`(i)−e(`j) is non zero for at most εd−1·rvalues of`∈[r]. Therefore, we can use algorithmAd−1to recover the degree-(d−1)polynomial
We carry out this first step for allpairs (T(i),T(j)) and obtain m2 homogeneous polynomials of degreed−1, let us denote them byRi j(x). We know that ifiand jare both good then
Ri j(x) =∂qd(x,ai−aj).
If eitherior jare bad, we do not know anything aboutRi j(x)(we can assume without loss of generality
that it is homogeneous of degreed−1 since if it is not then clearly the pair is bad and we can modify it to any polynomial we wish, without increasing the total number of errors).
Step II: In this step we take the polynomials Ri j obtained in the first step and recover from them
the polynomial∆qd(x)(then, usingSection 2.2, we will also haveqd(x)). We start by setting up some
notations: The set of degree-(d−1)monomials is indexed by the set
Id−1={(α1, . . . ,αn)|0≤αi≤p−1,α1+. . .+αn=d−1}.
We denote by xα =
∏ixαii and also denote by coef(xα,h) the coefficient of the monomial xα in a
polynomialh(x). Letα ∈Id−1and define the degree-1 polynomial
Uα(y1, . . . ,yn) =
n
∑
`=1
coef
xα,∂qd
∂x`
·y`.
Observe that
∂qd(x,ai−aj) = n
∑
`=1
(ai−aj)`·
∂qd
∂x`
(x)
=
∑
α∈Id−1
xα·U
α(ai−aj).
Therefore, for each pairi,jsuch that bothiand jare good we can get the (correct) valuesUα(ai−aj)for
allα∈Id−1by observing the coefficients ofRi j.
Fix someα∈Id−1. Using the procedure implied above for all pairsi6=j∈[m], we get m2
values
ui j∈Fsuch that ifiand jare good then
ui j=Uα(ai−aj).
We now show how to recoverUα from the valuesui j. Repeating this procedure for allα ∈Id−1will give
us∆qd(x).
Since we fixedα let us denote byU(y) =Uα(y). We have at out disposal a list of values(ui j)i,j∈[m]
such that there exists a setE ={i∈[m]|iis bad} of size |E| ≤ε1·m such that if i and j are not
in E then ui j =U(ai−aj). Let us partition the list (ui j) according to the index j into m disjoint lists: Bj = (u1j, . . . ,um j). If j6∈E than the list Bj contains the values of the degree-1 polynomial
Uj(y) =U(y−aj)on the setS1with at mostε1·merrors (the errors will correspond to indicesi∈E).
Therefore, we can useA1in order to reconstructUj, and from itU. The “problem” is that we do not
know which values jare good. This problem can be easily solved by applying the above procedure for all
j∈[m]and then taking the majority vote. Since all the good values of jwill return the correctU(y)we will have a clear majority of at least a 1−ε1fraction. Combining all of the above gives us the polynomial
Step III: Having recoveredqdin the last step we now subtract the valueqd(ci)from the input list of
values (these are the values ofq(x)onSd, with≤εd fraction of errors). This reduces us to the problem of
recovering the degree-(d−1)polynomialq≤d−1(x) =q(x)−qd(x)from its values inSd with a fraction
εdof errors. Solving this problem is easy to do by using the algorithmAd−1on the values in each listT(j)
(defined in the beginning of this section) and then taking the majority. Since for a good value j, the list
T(j)contains at mostεd−1·rerrors, and since more than half of the values jare good, we will get a clear
majority and so be able to recoverq≤d−1.
4
Constructing a proper NIS
In this section we proveTheorem 1.5andCorollary 1.6. As before we start with the proof ofCorollary 1.6
and then go on to prove the theorem.
Proof ofCorollary 1.6. This will be similar to the proof ofCorollary 1.3. All we need to do is find, for
eachn, a proper(n,1,ε)-NIS that satisfies conditionFd. In order to do so we will need a good family of
codes{Cn}, such that eachCnis an[mn,n,kn]-code, which satisfies the following three conditions:
1. {Cn}has an efficient decoding algorithm that can decode a constant fractionα=Ω(1)of errors.
2. The minimal distance of the dualCn⊥is larger than 2·d.
3. Each codeword inCnhas Hamming weight at most(1−2α)·mn.
Such a family of codes can be constructed explicitly using standard techniques (seeSection 6).
The points inS1are given by the rows of the generator matrix of the codeCn. The rows are all distinct
since otherwise we would have a codeword of weight 2 in the dual. FromSection 2.4we get thatS1
satisfies conditionFdand from item 2. ofSection 2.3we get thatS1is an(n,1,α)-proper NIS with an efficient interpolation algorithm.
4.1 Proof ofTheorem 1.5
The proof ofTheorem 1.5uses the same arguments as the proof ofTheorem 1.2. The only difference is that the “shifts” ofSd−1(denoted byTi,i=1, . . . ,m) might intersect. However, using the fact thatS1
satisfies conditionFk fork≥d, we will be able control the effect of these intersections. To be more
precise, the only place where the proof will differ from the proof ofTheorem 1.2is where we claim that there are not too many “bad” indicesi∈[m](see below for details). The rest of the proof will be identical and so we will not repeat the parts that we do not have to.
We start by setting the notations for the proof. Let S1⊆Fn be a proper (n,1,ε1)-NIS such that
|S1|=mand letSd−1⊆Fn be a proper(n,d−1,εd−1)-NIS obtained by summingS1 with itselfd−1
times. LetAd−1 andA1be interpolation algorithms forSd−1andS1 respectively. LetSd =Sd−1+S1.
As before, the theorem will follow by giving an interpolation algorithm for Sd that makes at most a polynomial number of calls toAd−1and toA1and can “correct” a fraction
εd=
εd−1·ε1
of errors (the expression forεd appearing in the theorem can be deduced from this equation by a simple
induction).
Fix a degree-dpolynomialq∈Fd[x1, . . . ,xn]and letqdbe its homogeneous part of degreed. Write
S1={a1, . . . ,am} (theai are distinct). The setSd can be divided intom (possibly overlapping) sets T1, . . . ,Tmsuch that
Ti=Sd−1+ai and Sd=
[
i∈[m]
Ti.
Our algorithm is given a list of values ofq(x)on the setSd with at mostεd· |Sd|errors. LetB⊆Sd be the
set of points inSdsuch that the value ofq(x)on these points contains an error. We therefore have
|B| ≤εd· |Sd|. (4.2)
We say that an indexi∈[m]isbadif
|Ti∩B|>(εd−1/2)· |Ti|= (εd−1/2)· |Sd−1|.
The next claim bounds the number of bad valuesi.
Claim 4.1. Let E={i∈[m]|i is bad}. Then
|E| ≤ε1·m.
If we can proveClaim 4.1then the rest of the proof ofTheorem 1.5will be identical to the proof of
Theorem 1.2(starting from the part titled “Step I”), as the reader can easily verify. IndeedClaim 4.1is the analog of Equation (3.1) for the case of proper sets. We therefore end this section with a proof of
Claim 4.1.
4.1.1 Proof ofClaim 4.1
In order to proveClaim 4.1we will require the following auxiliary claim on the growth rate of the sumsets ofS.
Claim 4.2. Let S⊆Fnbe of size m such that S satisfies conditionFkand let the sets S0,S1, . . . ,Sk be
as inDefinition 1.4(the sumsets of S). Then there exists a constant C1depending only on k such that if
|S|>C1then for every d≤k we have
|Sd| ≥ 1
2d· |Sd−1| · |S|. (4.3)
Proof. The proof is by induction ond. Ford=1 the bound in (4.3) is trivial. Suppose the bound holds
ford−1 and lets prove it ford. Let us denoteS={a1, . . . ,am}andTi=Sd−1+ai. ClearlySd=
Sm
i=1Ti.
We will lower bound the number of points inSdin the following way: For eachiwe will take all points
x∈Ti such thatNd(x)≤d and than divide the number of points we got byd. Since there are at most |Sd−2|elementsx∈Sd withNd(x)>d, we get the bound
|Sd| ≥ 1 d·
m
∑
i=1(|Ti| − |Sd−2|)
= 1
d· |S| · |Sd−1| −
1
Using the inductive hypothesis on the second term yields
|Sd| ≥
1
d· |S| · |Sd−1| −
2d−2
d · |Sd−1|
≥ 1
2d· |S| · |Sd−1|,
where the last inequality holds for sufficiently large|S|(as a function ofd).
Proof ofClaim 4.1. Lett=|E|and suppose in contradiction thatt>ε1·m=ε1· |S|. This means that
there are at leastt setsTi1, . . . ,Tit such that|B∩Tij| ≥(εd−1/2)· |Sd−1|. From conditionFd we know
that, apart from a set of size at most|Sd−2|, each “bad” element appears in at mostdsetsTij. Therefore
εd· |Sd| ≥ |B| ≥
1
d·ε1· |S| · εd−1
2 · |Sd−1| − |Sd−2|
= εd−1·ε1
2d · |Sd−1| · |S| − ε1
d · |S| · |Sd−2| ≥ εd−1·ε1
2d · |Sd| − ε1
d · |S| · |Sd−2|.
UsingClaim 4.2twice we get that
|S| · |Sd−2| ≤2(d−1)· |Sd−1| ≤4d(d−1)·
|Sd| |S| .
Plugging this inequality into the previous one we get that
εd· |Sd| ≥
εd−1·ε1
2d · |Sd| −ε1·4·(d−1)· |Sd|
|S| ,
which after division by|Sd|gives
εd ≥
εd−1·ε1
2d −
ε1·4·(d−1)
|S|
> εd−1·ε1
3d ,
(whenm=|S|is sufficiently large, as a function ofε1andd) contradicting Equation (4.2).
5
Other finite fields
We briefly sketch how one can extend Theorems1.2and1.5to non-prime finite fields.
The main difference is that over extension fields, one has to considerthe weight-degreeinstead of the degree. Specifically, letFbe a field of order prfor a primep. Given an integerd<prletd=∑ri=−01aipi,
0≤ai<p, be its base-pexpansion. The weight ofd, denoted wt(d), is defined as wt(d) =∑ri=−01ai. For
example, the weight of 14 overF9is 4 as 14=1·9+1·3+2 and 1+1+2=4. The weight-degree of a
monomialxdis simply the weight ofd. Similarly the weight-degree of a multivariate monomial∏ni=1x
is∑ni=1wt(di). The weight-degree of a polynomialq(x1, . . . ,xn)is the maximum of the weight-degrees of
its monomials. Intuitively, the weight-degree is the correct notion of degree as the polynomialxpi is in fact a linear map overF, and soxaip
i
“behaves” like a polynomial of degreeai.
It is not hard to see that polynomials of weight-degreed in Fpr[x1, . . . ,xn]that map (Fpr)n toF
correspond to degree-dpolynomials innrvariables overFpand vice versa (see, e. g., [4]).
Now, given a polynomialqof weight-degreedoverFpr and anyα∈Fpr, letqα(x1, . . . ,xn)be defined
asqα=Tr(α·q(x1, . . . ,xn)), where Tr is the trace operator; Tr(x) =∑ri=−01x
pi
. It is easy to see thatqα is of weight-degree at mostdand thatqα:(Fpr)n→F. It is also not hard to see that if one can learn all the
valuesqαthen one can easily reconstructq(by a simple linear transformation).
The argument above shows that in order to construct NIS for polynomials of weight-degreedover Fpr it is sufficient to construct NIS for polynomials of degree d in nrvariables. This explains how
Theorems1.2and1.5can be extended to non-prime finite fields. We thus obtain the following Theorem that enables an immediate translation of the results stated in Theorems1.2and1.5and Corollaries1.3,
1.6and1.7to the context of fields of prime power order.
Theorem 5.1. Let p be a prime number, n,r and d integers, andε>0a real number. Fix a basis forFpr overFp. If Sdis an(nr,d,ε)-NIS overFpthen, when viewed as a collection of n-tuples overFpr according
to the fixed basis, Sd is an(n,d,ε)-NIS overFpr (i. e., it is a NIS for polynomials of weight-degree d
overFpr). Moreover, if Sd has an interpolation algorithm that runs in time T overFpthen it has such
an algorithm also when viewed as a NIS overFpr. The algorithm can handle the same amount of errors
and its running time is O(rT) +poly(n). In addition, if Sdis a proper NIS overFpthen it is a proper NIS
overFpr.
We leave the exact details of the proof to the reader.
6
A family of codes with special properties
In the proof ofCorollary 1.6we used a family of good linear error-correcting codes{Cn}, over a fieldF of prime orderp, such thatCnhas message lengthnand such that the following three properties
1. The family{Cn}has an efficient encoding/decoding algorithms from a constant fraction of errors.
2. The dual distance ofCnisω(1).
3. If the block length ofCnismnthen every codeword inCnhas Hamming weight at most(1−δ)·mn for some constantδ >0.
Efficient codes with good dual distance are known to exists over any finite field. Results of this type can be found, e. g., in [10]. It is possible that some of these codes already posses property (3) above, however, we could not find a proof for that anywhere. Since we only require the dual distance to be super-constant we can give a simple self-contained construction satisfying all three conditions above. We sketch the argument below without giving a complete proof.
Say we wish to construct a code as above for messages of lengthn. We pick an integer`such that
that` <log(n). We construct a fieldEof orderp`(this can be done deterministically in time poly(n)). We now take a Reed-Solomon codeRoverE, of block length p`, whose messages are univariate polynomials of degreeα·p`, for some constantα. Notice that this code has minimal distance(1−α)·p` (over
E), relative rateα, and its dual distance is at leastα·p`. Letϕ:E→F`be the natural representation ofEas`-tuples overF. We can extendϕ in the natural way to be a mapping fromEp
`
to(F`)p
`
. In particular we can think ofϕ(R)as a code overFof the same relative rate and of at least the same minimal distance. It is not hard to see that there exists an invertible linear transformationA:F`→F`, such that
v= (v1, . . . ,vp`)∈R⊥if and only if4(Aϕ(v1), . . . ,Aϕ(vp`))∈ϕ(R)⊥(there is a slight abuse of notation
here, but the meaning is clear). In particular we get that the minimal distance ofϕ(R)⊥is the same as the minimal distance ofR⊥.
Now, for every 1≤i≤p`we pick a code ˜Ci:F`→Fmi, such thatmi=O(`)and such that∑p
`
i=1mi=n.
Moreover, we would like the codes ˜Cito satisfy the three properties stated above. It is not hard to play a little with the parameters to get that such ˜Ci andmiexist. We also note that the ˜Cican be constructed recursively. We now “concatenate” the code5ϕ(R)with the ˜Cisuch thatCiis applied on thei-th block, of
length`, ofϕ(R). Denote the resulting code withCn. It is now easy to verify that indeed the block length ofCnisn, that it has constant rate and linear minimal distance, and thatC⊥n has minimal distanceω(1).
Acknowledgement
We would like to thank Parikshit Gopalan for bringing the problem to our attention, Emanuele Viola for sharing the manuscript of [14] with us, and Shachar Lovett and Chris Umans for helpful discussions. We also thank the anonymous reviewers for comments that improved the presentations of the results. We are thankful to Laci Babai for helpful comments.
References
[1] SANJEEV ARORA, CARSTEN LUND, RAJEEV MOTWANI, MADHU SUDAN, AND MARIO SZEGEDY: Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, 1998. [doi:10.1145/278298.278306] 2
[2] SANJEEVARORA ANDSHMUELSAFRA: Probabilistic checking of proofs: A new characterization of NP. J. ACM, 45(1):70–122, 1998. [doi:10.1145/273865.273901] 2
[3] L ´ASZLO´ BABAI, LANCEFORTNOW, NOAMNISAN,ANDAVIWIGDERSON: BPP has subexponen-tial time simulations unless EXPTIME has publishable proofs. Comput. Complexity, 3(4):307–318, 1993. [doi:10.1007/BF01275486] 2
[4] ELI BEN-SASSON AND MADHUSUDAN: Limits on the rate of locally testable affine-invariant codes. Electron. Colloq. on Comput. Complexity (ECCC), 17:108, 2010. [ECCC:TR10-108] 15 4We would like to haveϕ(R)⊥=ϕ(R⊥), however this is not true, so we haveAthat “fixes” this.
[5] ANDREJBOGDANOV: Pseudorandom generators for low degree polynomials. InProc. 37th STOC, pp. 21–30. ACM Press, 2005. [doi:10.1145/1060590.1060594] 3
[6] ANDREJ BOGDANOV AND EMANUELE VIOLA: Pseudorandom bits for polynomials. SIAM J.
Comput., 39(6):2464–2486, 2010. [doi:10.1137/070712109] 3
[7] RUSSELL IMPAGLIAZZO AND AVI WIGDERSON: P=BPP unless E has subexponential cir-cuits: Derandomizing the XOR lemma. InProc. 29th STOC, pp. 220–229. ACM Press, 1997. [doi:10.1145/258533.258590] 2
[8] J. JUSTESEN: A class of constructive asymptotically good algebraic codes. IEEE Trans. Inform.
Theory, 18(5):652–656, 1972. [doi:10.1109/TIT.1972.1054893] 16
[9] SHACHARLOVETT: Unconditional pseudorandom generators for low degree polynomials. Theory
of Computing, 5(1):69–82, 2009. [doi:10.4086/toc.2009.v005a003] 3
[10] F. J. MACWILLIAMS AND N. J. A. SLOANE: The Theory of Error-Correcting Codes, Part II. North-Holland, 1977. 2,9,15
[11] RONENSHALTIEL ANDCHRISTOPHERUMANS: Simple extractors for all min-entropies and a new pseudorandom generator. J. ACM, 52(2):172–216, 2005. [doi:10.1145/1059513.1059516] 2
[12] MADHUSUDAN, LUCATREVISAN,ANDSALILVADHAN: Pseudorandom generators without the XOR lemma. J. Comput. System Sci., 62(2):236–266, 2001. [doi:10.1006/jcss.2000.1730] 2
[13] AMNONTA-SHMA, DAVIDZUCKERMAN,ANDSHMUELSAFRA: Extractors from Reed-Muller codes. J. Comput. System Sci., 72(5):786–812, 2006. [doi:10.1016/j.jcss.2005.05.010] 2
[14] EMANUELEVIOLA: The sum ofdsmall-bias generators fools polynomials of degreed. Comput.
Complexity, 18(2):209–217, 2009. [doi:10.1007/s00037-009-0273-5] 3,16
AUTHORS
Zeev Dvir
post-doctoral researcher
Princeton University, Princeton, NJ zeev dvir gmail com
http://www.cs.princeton.edu/~zdvir/
Amir Shpilka associate professor
Technion – Israel Institute of Technology, Haifa, Israel shpilka cs technion ac il
ABOUT THE AUTHORS
ZEEVDVIRis currently a postdoc in the Computer Science department atPrinceton Univer-sity. He received his Ph. D. from theWeizmann Institutein Israel in 2008. His advisors wereRan RazandAmir Shpilka. He is interested mainly in questions in computational complexity that have a connection with classical mathematics.