2.5 Disscussion
2.5.2 An O(n 1/3 )-query algorithm for our distributions
The idea sketched above can be applied to our far from monotone distributionDno from Section2.2. It is slightly more complicated, since now the algorithm must attack two levels of Talagrand functions, which will incur the query cost of ˜O(n1/3)rather than O(n1/4).
Similarly to Section2.5.1above, we will give a high level description rather than a formal analysis. The goal is to show the main obstacle one faces in improving the lower bound.
Assume g is in the support ofDno. The algorithm works in stages and follows a similar pattern to the one described in Section2.5.1above. We may assume the algorithm has a string x∈ {0, 1}nwhere x satisfies a unique term Ti, and falsifies no clauses, so g(x) = 1 (this happens with Ω(1) probability for a random string x).
Stage 1. Repeat the following for n1/3times: pick a random subset R⊂ A1of size
√nand query g(x(R)). Let A′1denote A1 after removing those indices of R with
g(x(R))) = 1encountered. Then we have Ti ⊂ A′1and most likely, C1 = A1\ A′1
has size Θ(n5/6). Suppose this is the case and let’s fix the new (smaller) set A1 = A1 (as well as C1).
The following stage will be repeated for n1/6 many times, and each makes n1/6 many queries.
Stage 2. Pick a random subset C0 ⊂ A0 of size n5/6. Let y = x(C1∪C0)and query g(y). With probability Ω(1), g(y) satisfies the unique term Ti (as did x), falsifies a unique clause Ci,j, and g(y) = hi,j(y) = 0. Additionally, with probability
Ω(n−1/6), hi,j(y) = yℓ, where ℓ∈ C0.
Ti
C1
C
Ci,j
C0 ℓ
Figure 2.4: A visual representation of the algorithm for finding violations in the two-level Talagrand construction. The whole rectangle represents the set [n], which is shaded for indices which are set to 1, and clear for indices which are set to 0 (at the beginning of Stage 1, indices on the left of the centerline are set to 1’s, and indices on the right are set to 0’s). Tiis the unique term satisfied and Ci,j is the unique clause falsified. The functions hi,j is an anti-dictator of index ℓ. The sets illustrated represent the current knowledge at the end of Stage 3 of the algorithm. Note that|C1| = Θ(n5/6),|C| = Θ(n2/3),|C0| = n5/6,
|Ti| = |Ci,j| = Θ(√ n).
Overall the event ℓ∈ C0 happens with probability Ω(1) since we repeat Stage 2 for n1/6 times. Fixing C0, y and ℓ such that the above event happens, we will likely find a violation with help of them as follows:
Stage 3. Repeat the following for n1/6times: pick a random subset R⊂ A0\ C0 of size√
nand query g(y(R)). Let A′0denote A0\ C0 after removing those indices of R with g(y(R)) = 0. Let C = (A0 \ C0)\ A′0, where very likely|C| = Θ(n2/3). Fix C = C such that it’s of size Θ(n2/3). Our sets satisfy the following three
conditions: 1) Ti ⊂ A1, 2) Ci,j ⊂ A′0∪ C1\ C0, and 3) ℓ∈ C0. See Figure2.4for a visual representation of these sets.
Stage 4. Partition C0 into O(n1/6)many disjoint parts C0,1, C0,2, . . ., each of size Θ(n2/3)and query g(y(C0,j∪C)). For each C0,j with ℓ /∈ C0,j and no new terms are satisfied, g must return 0. If for some sets C0,j, g returns 1, then either ℓ∈ C0,j and
no new terms are satisfied, or new terms are satisfied; however, we can easily distinguish these cases with a statistical test.
The final stage is very similar to the final stage of Section2.5.1. After Stage 4, we assume we have found a set C0,j containing ℓ. We further partition C0,j(when g(y(C0,j∪C)) = 1) into O(n1/6)parts of size√
nto find a violation. One can easily generalize the above algorithm sketch to O(1)-many levels of Talagrand. This suggests that the simple extension of our construction to O(1) many levels (which still gives a far-from-monotone function) cannot achieve lower bounds better than ˜Ω(n1/3).
This page intentionally left blank.
Chapter 3
Distribution-free Testing of k-juntas
In this chapter we will start to discuss about the more general problem of distribution-free testing. As the first example, we will study testing of k-juntas under this setting.
We will present an adaptive distribution-free tester for k-juntas with query complexity O(k˜ 2/ϵ)(independent of n), and also show that any non-adaptive distribution-free testers for k-juntas must make at least ˜Ω(2k/3) queries (for some constant distance parameter ϵ). Combining these two results together we know adaptivity provides an exponential im-provement (in terms of query complexity) for this problem, which stands in sharp contrast to the standard uniform distribution testing of k-juntas. More formally, we will start with some preparation in Section3.1. Then we will present our algorithm and prove Theorem 1.3.2in Section3.2, and present the proof of our lower bound, stated as Theorem1.3.3, in Section3.3.
3.1 Preparation
In this section we give some formal definitions and notation that will be useful.
We study the distribution-free testing of k-juntas in this chapter. Recall that a Boolean function f is a junta if it depends on at most k variables. More precisely, f is a k-junta if there exists a subset J = {i1, . . . , ik} ⊂ [n] of size k and a Boolean function g : {0, 1}k → {0, 1} over k variables such that f(x1, . . . , xn) = g(xi1, . . . , xik)for all x = (x1, . . . , xn)∈ {0, 1}n.
We say that a Boolean function f is a literal if f depends on exactly one variable, i.e.
Procedure BinarySearch(f, x, y)
Input: Black-box oracle access to f : {0, 1}n→ {0, 1} and two strings x, y ∈ {0, 1}n with f(x) ̸= f(y).
Output: Two strings x′, y′ ∈ {0, 1}nwith f(x′)̸= f(y′)and x′ = y′(i)for some i∈ diff(x, y).
1. Let B = diff(x, y)⊆ [n].
2. If|B| = 1 return x and y.
3. Partition (arbitrarily) B into B1 and B2 of size⌊|B|/2⌋ and ⌈|B|/2⌉, respectively.
4. Query f(x(B1)).
5. If f(x)̸= f(x(B1)), return BinarySearch(f, x, x(B1)).
6. Otherwise, return BinarySearch(f, x(B1), y).
Figure 3.1: Description of the standard binary search procedure.
for some i∈ [n], we have that either f(x) = xi for all x or f(x) = xi for all x. Note that the two constant (all-1 and all-0) functions are one-juntas but are not literals.
We will often work with restrictions of Boolean functions. Given f : {0, 1}n → {0, 1}, B ⊆ [n] and a string z ∈ {0, 1}B, the restriction of f over B by z, denoted by f↾z, is the Boolean function g : {0, 1}B → {0, 1} defined by g(x) = f(x ◦ z) for all x∈ {0, 1}B. We will also use the term “block” to refer to a nonempty subset of [n], which should be interpreted as a nonempty subset of the n input variables of a Boolean function f :{0, 1}n→ {0, 1}. The following definition of distinguishing pairs and relevant blocks will be heavily used in our algorithms.
Definition 3.1.1 (Distinguishing pairs and relevant blocks). Given x, y ∈ {0, 1}n, a Boolean function f, and a block B ⊆ [n], we say that (x, y) is a distinguishing pair of f for B if xB = yB(and therefore diff(x, y)⊂ B) and f(x) ̸= f(y). We say B is a relevant block of f if such a distinguishing pair exists for B (or equivalently, the influence of B in f is positive).
When B ={i} is a relevant block we simply say that the ith variable is a relevant variable of f.
Clearly an empty set B can never has a distinguish pair, and from now on whenever
we say there is a set with a distinguish pair, it can be assumed to be a (nonempty) block automatically.
An important ingredient of our algorithms is the following binary search procedure (see Figure 3.1): it takes as input two strings x, y ∈ {0, 1}n with f(x)̸= f(y), makes O(log n) queries on f, and returns a pair of strings x′, y′ ∈ {0, 1}nwith f(x′)̸= f(y′)and x′ = y′(i)for some i ∈ diff(x, y), i.e., a distinguishing pair for the ith variable for some i∈ diff(x, y).