Monotonicity testing, alternating paths,
directed isoperimetry, and strawberries
C. Seshadhri
(Sandia National Labs, Livermore)
Chapter I
Getting to know the problem
Property Testing in a Slide
• f: {0,1}^n -> R
– Amen
• R = {0,1}, or the reals
• A “property” P is some subset of these functions
• Hamming distance between two functions: d(f,g) = (# x s.t. f(x) ≠ g(x))/2^n
• d(f,P) = min_{g in P} d(f,g). The minimum fraction of values you need to change so f “has P”
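The two distances on this slide can be computed by brute force for tiny n. The sketch below is illustrative only: it assumes a function is stored as a tuple of 2^n values, and the helper names and the example property (`is_constant`) are hypothetical, not from the talk.

```python
from itertools import product

def hamming_distance(f, g):
    """d(f,g): fraction of inputs on which f and g disagree."""
    assert len(f) == len(g)
    return sum(a != b for a, b in zip(f, g)) / len(f)

def distance_to_property(f, has_property, values=(0, 1)):
    """d(f,P): minimum of d(f,g) over all g in P.

    Brute force over every function g with the given range, so this
    is only usable when len(f) is very small.
    """
    best = 1.0
    for g in product(values, repeat=len(f)):
        if has_property(g):
            best = min(best, hamming_distance(f, g))
    return best

def is_constant(g):
    """Example property (an assumption for illustration): g is constant."""
    return len(set(g)) == 1
```

A property tester is meant to avoid exactly this kind of exhaustive search; the code only makes the definitions concrete.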
Property Testing in (another) Slide
• Suppose you have query access to f
• Standard decision problem: is f in P or d(f,P) ≠ 0?
• Relaxed version with ε ∈ (0,1): is f in P or d(f,P) > ε (f is ε-far from P)?
• Can hopefully decide with much fewer queries than 2^n (using randomization)
• Property tester: if input f has P, accept with prob > 2/3. If input f is ε-far from P, reject with prob > 2/3.
Monotonicity
• Take product ordering on {0,1}^n: x ≤ y if for all i, x_i ≤ y_i
• Standard partial order given by containment in {0,1}^n
• f is monotone if for all x < y, f(x) ≤ f(y)
[Figure: example function on the cube from (0,0,0) to (1,1,1), with values 5 4 5 8 6 7 3 9]
Distance to monotonicity
• ε_f = distance to monotonicity = min fraction of values to be changed to make f monotone
• Can change to any real number
• ε_f = min fraction of values to remove to make remaining f monotone
• This definition is independent of range of f
[Figure: the example values 5 4 5 8 6 7 3 9 with some values removed; for every remaining pair x < y, f(x) ≤ f(y), and the removed values can be “filled in”]
Monotonicity testing
• f is monotone if for all x < y, f(x) ≤ f(y)
• [Goldreich Goldwasser Lehman Ron Rubinfeld Samorodnitsky 00]:
Property testing monotonicity
• f:{0,1}^n -> {0,1}
• Tester accepts when f is monotone, rejects when ε_f > ε
• Violation is pair (x,y) where x < y, but f(x) > f(y)
• How to find violation in poly(n/ε) queries, when ε_f > ε?
[Figure: boolean function on the cube with a violated pair]
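To make “violation” concrete, here is a small brute-force sketch. It assumes points of {0,1}^n are encoded as integers (bit i of x is coordinate i) and f is a list of 2^n values; both choices and the function names are illustrative assumptions.

```python
def violations(f, n):
    """All pairs (x, y) with x < y in the product order and f[x] > f[y].

    x <= y in the product order exactly when every 1-bit of x is also
    a 1-bit of y, i.e. (x & y) == x.
    """
    size = 2 ** n
    return [(x, y)
            for x in range(size)
            for y in range(size)
            if x != y and (x & y) == x and f[x] > f[y]]

def is_monotone(f, n):
    """f is monotone iff it has no violation."""
    return not violations(f, n)
```

This reads all 2^n values, which is what a tester with poly(n/ε) queries has to avoid.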
The edge tester
• Edge tester: sample q uniform random edges. If any violation
detected, reject. Otherwise, accept.
• [GGL+00]: f:{0,1}^n -> {0,1}. There are Ω(ε_f 2^n) violated edges
• Total number of edges = n 2^{n-1}. Fraction of violated edges = Ω(ε_f/n)
• So O(n/ε) query edge tester finds violation when ε_f > ε
• Tight. ε_f = ½, 2^{n-1} violated edges
[Figure: boolean function with ε_f = ½ and 2^{n-1} violated edges]
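A minimal sketch of the edge tester described above, assuming f is given as a Python callable on integer-encoded points (bit i of x is coordinate i). The names `edge_tester` and `count_violated_edges` are illustrative.

```python
import random

def edge_tester(f, n, q, rng=random):
    """Sample q uniform random hypercube edges; reject on a violated edge.

    Picking x and a coordinate i uniformly, then taking the edge that
    flips bit i at x, gives a uniform edge (each edge arises from
    exactly two choices of x).
    Returns True for "accept", False for "reject".
    """
    for _ in range(q):
        x = rng.randrange(2 ** n)
        i = rng.randrange(n)
        lower, upper = x & ~(1 << i), x | (1 << i)
        if f(lower) > f(upper):
            return False
    return True

def count_violated_edges(f, n):
    """Exhaustive count of violated edges, for checking small examples."""
    return sum(1
               for x in range(2 ** n)
               for i in range(n)
               if not (x >> i) & 1 and f(x) > f(x | (1 << i)))
```

The tester never rejects a monotone f, so all of the error is one-sided. For the function 1 − x_1 every violated edge lies in one dimension, which matches the 2^{n-1} count on the slide.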
General ranges
• Edge tester: sample q uniform random edges. If any violation
detected, reject. Otherwise, accept.
• [Dodis Goldreich Lehman Raskhodnikova Ron Samorodnitsky 99]:
f:{0,1}^n -> R. There are Ω(ε_f 2^n/(log |R|)) violated edges
• Queries required by tester = O(n log |R|/ε)
• log |R| can be n, so worst case tester O(n^2/ε)
• Not known to be tight
Questions
• [GGL+00] f:{0,1}^n -> {0,1}. There are ≥ ε_f 2^n violated edges. Bound is tight
• [DGL+99] f:{0,1}^n -> R. There are ≥ ε_f 2^n/(log |R|) violated edges.
• For boolean range, can we beat the O(n/ε)-query bound of the edge
tester?
• For general range, can we get better than O(n(log |R|)/ε) tester?
Can we improve bound on violated edges? Does it depend on |R|?
• [Blais Brody Matulef 11, Brody 13] For |R| > n^{1/2}, Ω(n/ε) queries required by any tester
• [Fischer Lehman Newman Raskhodnikova Rubinfeld 02]
For R = {0,1}, Ω(n^{1/2}) queries required by any non-adaptive tester
• Ω(log n) required by any tester
Answers
• For f:{0,1}^n -> R, there are ≥ ε_f 2^{n-1} violated edges
• This gives O(n/ε) monotonicity tester. Optimal when |R| > n^{1/2}
• For f:{0,1}^n -> {0,1}, there is O(ε^{-5/3} n^{5/6}) tester for monotonicity
• So edge tester can be beaten for boolean functions
• Key tool: new analysis using alternating paths in violation graph
• New “isoperimetric bounds” for directed hypercube
Chapter II
Previous work
The fixing argument
• [GGL+00] Suppose there are < ε 2^n violated edges.
• Change f at these endpoints and “few” other points to get monotone function. Bound total change by O(ε 2^n), so ε_f = O(ε)
• Works for boolean range, get bound of violated edges ≥ ε_f 2^{n-1}
• Fixing very hard with larger ranges
[Figure: hypercube from (0,0,…,0) to (1,1,…,1); the only violated edges are 1-0 edges]
The range reductions
• [DGL+99] Given f, consider boolean f_r: f_r(x) = 0 if f(x) < r, f_r(x) = 1 otherwise
• Violated edge in f_r is violated edge in f
• Argue about violated edges in various f_r (using previous analysis) to give bound in f
• Get ε_f 2^n/log |R| violated edges
[Figure: edge (x,y) with x < y, f(x) > r and f(y) < r, so f_r(x) = 1 and f_r(y) = 0]
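The threshold functions f_r are easy to experiment with. The sketch below assumes f is a list indexed by integer-encoded points; the placement of the example values 5 4 5 8 6 7 3 9 on the cube is an arbitrary assumption, and the helper names are illustrative.

```python
def threshold(f, r):
    """Boolean f_r: 0 where f(x) < r, 1 otherwise."""
    return [0 if value < r else 1 for value in f]

def violated_edges(f, n):
    """Set of hypercube edges (x, y), y = x with bit i set, f[x] > f[y]."""
    return {(x, x | (1 << i))
            for x in range(2 ** n)
            for i in range(n)
            if not (x >> i) & 1 and f[x] > f[x | (1 << i)]}

# Example values from the earlier slides; their positions are assumed.
f = [5, 4, 5, 8, 6, 7, 3, 9]
n = 3
edges_of_f = violated_edges(f, n)
edges_of_thresholds = [violated_edges(threshold(f, r), n) for r in set(f)]
```

Every set in `edges_of_thresholds` is contained in `edges_of_f`, and together they cover it: an edge with f(x) > f(y) is violated in f_r for r = f(x).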
The routing approach
• Any path from s to t contains a violated edge
• There are at least ε_f 2^{n-1} disjoint violated pairs
• (s_1, t_1), (s_2, t_2),… all violations and disjoint
• [Lehman Ron 01] Given k (s_i, t_i) pairs, how many can be routed with edge disjoint paths?
• [LR01] conjectured this to be k. Would give ε_f 2^{n-1} violated edges for general ranges
[Figure: (s,t) violation, s < t]
The routing approach
• [LR01] Given k (s_i, t_i) pairs, how many can be routed with edge disjoint paths? Conjectured this to be k. Would give ε_f 2^{n-1} violated edges for general range
• [Briet Chakraborty GarciaSoriano Matsliah 10]
No! There are Ω(2^n) pairs with only 2^n/n^{1/2} routable with edge-disjoint paths
• Shows [LR01] approach can only give O(n^{3/2}/ε) tester
• No non-trivial upper bound given
[Figure: (s,t) violation, s < t, f(s) > f(t)]
Beating O(n) testers for boolean range?
• Edge tester: pick u.a.r x, take random walk of length 1 to get y, compare f(x), f(y)
• Path tester: pick u.a.r x, take random walk of length t to get y, compare f(x), f(y)
• t ≈ n^{1/2} is the “right distance”
• Try to find distant violations
[Figure: walk from x to y; take directed hypercube]
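A hedged sketch of the path tester. The slides only say “random walk of length t” on the directed hypercube; the version below assumes the walk sets t distinct random 0-coordinates of x to 1 (or all of them if x has fewer than t zeros), which is one natural reading and not necessarily the exact walk analysed in the talk. Points are integers, f is a callable, and the name `path_tester` is illustrative.

```python
import random

def path_tester(f, n, t, q, rng=random):
    """Repeat q times: pick x u.a.r., walk up t steps to y, compare.

    Assumption: an upward step flips a uniformly random 0-coordinate to
    1 and coordinates are not repeated. Because y >= x always holds,
    f(x) > f(y) is a genuine violation.
    Returns True for "accept", False for "reject".
    """
    for _ in range(q):
        x = rng.randrange(2 ** n)
        zeros = [i for i in range(n) if not (x >> i) & 1]
        y = x
        for i in rng.sample(zeros, min(t, len(zeros))):
            y |= 1 << i
        if f(x) > f(y):
            return False
    return True
```

With t = 1 this is close to the edge tester, except that x is always the lower endpoint.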
Beating O(n) testers for boolean range?
• Path tester: pick u.a.r x, take random walk of length n^{1/2} to get y, compare f(x), f(y)
• [Ron Rubinfeld Safra Weinstein 11] For monotone boolean f, path tester with n^{1/2} queries can approximate average sensitivity
• For anti-monotone f, path tester can approximate number of violated edges
• But can this give a o(n) monotonicity tester?
[Figure: walk from x to y; take directed hypercube]
Chapter III
Directed edge isoperimetry
• f:{0,1}^n -> {0,1}
• Think of set S indexed by 1s. Let μ = |S|/2^n, μ ≤ ½
• Φ(S) = E(S, S^c)/2^{n-1}
• [Folklore] Φ(S) ≥ μ
• (Follows from E(S, S^c) ≥ |S|)
• What about directed hypercube?
• Φ^+(S) = E^+(S, S^c)/2^{n-1}
• E^+(S, S^c) = no. of violated edges
[Figure: S and its complement]
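The quantities μ, Φ(S) and Φ^+(S) can be computed exhaustively for small n. This sketch assumes f is a 0/1 list indexed by integer-encoded points and S = {x : f[x] = 1}; the function name `boundaries` is illustrative.

```python
def boundaries(f, n):
    """Return (mu, Phi, Phi_plus) for S = {x : f[x] = 1}.

    Phi counts all edges between S and its complement; Phi_plus counts
    only edges (x, y) with x < y, f[x] = 1 and f[y] = 0, i.e. violated
    edges. Both are normalised by 2^(n-1) as on the slide.
    """
    cut = directed = 0
    for x in range(2 ** n):
        for i in range(n):
            if (x >> i) & 1:
                continue            # only look at edges upward from x
            y = x | (1 << i)
            if f[x] != f[y]:
                cut += 1
            if f[x] > f[y]:
                directed += 1
    mu = sum(f) / 2 ** n
    return mu, cut / 2 ** (n - 1), directed / 2 ** (n - 1)
```

For example, a dimension cut f(x) = x_1 has Φ = 1 and Φ^+ = 0, while its complement 1 − x_1 has Φ = Φ^+ = 1.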
Edge testing = directed isoperimetry
• Φ(S) ≥ μ
• Φ^+(S) = E^+(S, S^c)/2^{n-1}
• Implicit from [GGL+00]: f:{0,1}^n -> {0,1}, Φ^+(S) ≥ ε_f
• We give new proof that generalizes to f:{0,1}^n -> R
• And prove that # violated edges ≥ ε_f 2^{n-1}
Edge and vertex isoperimetry
• Vertex expansion = Γ(S) = |Nbrs(S)|/2^{n-1}
• [Harper 66] For size s, Γ(S) is minimized by Hamming ball of size s
• S is “dimension cut”: Φ(S) = 1 (small), but Γ(S) = 1 (large)
• S is “majority”: Φ(S) ≈ n^{1/2} (large), but Γ(S) ≈ 1/n^{1/2} (small)
• Size of middle layer is ≈ 2^n/n^{1/2}
[Figure: example sets S]
Relating edge and vertex boundaries
• [Margulis 74] Φ(S) Γ(S) = Ω(μ^2)
• (μ = |S|/2^n)
• Actually proves it for the product distribution
• If S is dimension cut, Φ(S) = 1, Γ(S) = 1
• If S is majority, Φ(S) ≈ n^{1/2}, Γ(S) ≈ 1/n^{1/2}
[Figure: S]
A directed version of Margulis’ theorem
• [Margulis 74] Φ(S) Γ(S) = Ω(μ^2)
• We prove Φ^+(S) Γ^+(S) = Ω((ε_f)^2)
• Γ^+(S) = |Nbrs^+(S)|/2^{n-1}
• Either “many” violated edges or “large 1-0 boundary”
• Proof also works with Γ^+(S) := (max # disjoint 1-0 edges)/2^{n-1}
• Either “many” violated edges or “quite a few” disjoint violated edges
Applications to monotonicity
• Assume ε_f is constant, and we want to find violation in o(n) time
• We have Φ^+(S) Γ^+(S) = Ω(1)
• If Φ^+(S) > n^{1/6}, then #violated edges > n^{1/6} 2^n
• So edge tester with n^{5/6} queries suffices ((n 2^n)/(n^{1/6} 2^n) = n^{5/6})
• Otherwise, Γ^+(S) > n^{-1/6}
Applications to monotonicity
• Run edge tester with n^{5/6} queries
• If edge tester fails, Φ^+(S) < n^{1/6}. So Γ^+(S) > n^{-1/6}
• So there are 2^n/n^{1/6} disjoint violated edges
• Only 2^n/n^{1/2} in
• Argue that the path tester finds violation with prob > n^{-5/6}
[Figure: S with 2^n/n^{1/6} disjoint violated edges]
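The n^{5/6} balance in the two “Applications” slides can be written out as a short calculation. This is only a restatement of the slide arithmetic under the same assumptions (constant ε_f, constant factors ignored); nothing here is a new bound.

```latex
% Case 1: many violated edges, so the edge tester works.
\[
  \frac{\#\text{violated edges}}{\#\text{edges}}
  \;>\; \frac{n^{1/6}\,2^{n}}{n\,2^{n-1}}
  \;=\; 2\,n^{-5/6}
  \quad\Longrightarrow\quad
  O\!\left(n^{5/6}\right)\ \text{edge queries suffice.}
\]
% Case 2: few violated edges, so the directed vertex boundary is large.
\[
  \Phi^{+}(S) \le n^{1/6}
  \ \text{ and }\
  \Phi^{+}(S)\,\Gamma^{+}(S) = \Omega(1)
  \quad\Longrightarrow\quad
  \Gamma^{+}(S) = \Omega\!\left(n^{-1/6}\right).
\]
% In case 2 the path tester succeeds with probability > n^{-5/6} per
% trial, so O(n^{5/6}) repetitions find a violation with constant
% probability.
```

Both cases cost about n^{5/6} queries, which is why the threshold n^{1/6} is the balancing choice.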
Chapter IV
But how to prove the isoperimetry?
• f:{0,1}^n -> R
• Assume that for edge (x,y), f(x) ≠ f(y)
• Will show that there are ε_f 2^{n-1} violated edges
The violation graph
• Violation graph (VG): vertices = domain, undirected edge (u,v) if (u,v) is violation
• Consider any vertex cover S of VG.
• No edges if we delete S ≡ No violations if f(x) undefined for all x in S
• ε_f 2^n = |min vertex cover|
• [GGL+00, DGL+99, FLN+02] Let M be maximal violation matching. |M| ≥ ε_f 2^{n-1}
[Figure: violation pair (u,v) in the violation graph]
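For a tiny example the violation graph, its minimum vertex cover and a maximal matching can all be computed by brute force. The sketch assumes integer-encoded points and a list f; the placement of the values 5 4 5 8 6 7 3 9 on the cube is an assumption, and all function names are illustrative.

```python
from itertools import combinations

def violation_graph(f, n):
    """Edges (u, v) with u < v in the product order and f[u] > f[v]."""
    size = 2 ** n
    return [(u, v) for u in range(size) for v in range(size)
            if u != v and (u & v) == u and f[u] > f[v]]

def min_vertex_cover_size(edges, num_vertices):
    """Smallest k such that some k vertices touch every edge (brute force)."""
    for k in range(num_vertices + 1):
        for cover in combinations(range(num_vertices), k):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return k

def greedy_maximal_matching(edges):
    """Scan the edges once, keeping each edge whose endpoints are free."""
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

f = [5, 4, 5, 8, 6, 7, 3, 9]   # slide values, positions assumed
n = 3
edges = violation_graph(f, n)
cover = min_vertex_cover_size(edges, 2 ** n)   # equals eps_f * 2^n
M = greedy_maximal_matching(edges)
```

The endpoints of any maximal matching form a vertex cover, which is where |M| ≥ ε_f 2^{n-1} comes from: 2|M| ≥ |min vertex cover|.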
From matchings to edges
• Let VG be weighted with “magnitude of violation”
• Key idea: let M be matching of maximum weight. (M must be maximal, so |M| ≥ ε_f 2^{n-1})
• Theorem: # violated edges ≥ |M|
[Figure: violation (u,v) with u < v, f(u) > f(v) and weight w(u,v) = f(u) - f(v); example values 10 and 6]
Dimension by dimension
• Let M_i be pairs in M crossing dimension i.
• Not a partition, but ∪_i M_i = M
• Lemma: #violated dimension i edges ≥ |M_i|
• So total # violated edges ≥ Σ_i |M_i| ≥ |M|
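The theorem and the dimension-by-dimension lemma can be sanity-checked on a tiny example. The sketch below finds a maximum weight matching by exhaustive recursion (fine for a handful of edges, hopeless in general). The integer encoding, the positions of the example values and every function name are assumptions for illustration. Note that this example has equal values across one edge, so it sits slightly outside the distinct-values assumption.

```python
def violation_graph(f, n):
    """Edges (u, v) with u < v in the product order and f[u] > f[v]."""
    size = 2 ** n
    return [(u, v) for u in range(size) for v in range(size)
            if u != v and (u & v) == u and f[u] > f[v]]

def max_weight_matching(edges, weight):
    """Exhaustive search over all matchings of a small graph."""
    best_weight, best_matching = 0, []

    def extend(start, used, total, chosen):
        nonlocal best_weight, best_matching
        if total > best_weight:
            best_weight, best_matching = total, list(chosen)
        for j in range(start, len(edges)):
            u, v = edges[j]
            if u not in used and v not in used:
                chosen.append((u, v))
                extend(j + 1, used | {u, v}, total + weight(u, v), chosen)
                chosen.pop()

    extend(0, frozenset(), 0, [])
    return best_matching

def pairs_crossing(matching, i):
    """M_i: pairs of the matching whose endpoints differ in coordinate i."""
    return [(u, v) for u, v in matching if ((u ^ v) >> i) & 1]

def violated_edges_in_dimension(f, n, i):
    """Number of violated hypercube edges along coordinate i."""
    return sum(1 for x in range(2 ** n)
               if not (x >> i) & 1 and f[x] > f[x | (1 << i)])

f = [5, 4, 5, 8, 6, 7, 3, 9]   # slide values, positions assumed
n = 3
M = max_weight_matching(violation_graph(f, n), lambda u, v: f[u] - f[v])
```

Here `M` has two pairs, one crossing dimension 0 and one crossing dimension 1, and each dimension has at least as many violated edges as pairs crossing it.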
Trying to find a violated edge
• Start with (x,y) in M_i. So f(x) > f(y)
• Project down along dimension i to get edge (s_1, y).
• If violation, done. Otherwise f(s_1) < f(y) (using assumption of distinct values at edges)
• f(x) – f(s_1) > f(x) – f(y)
• Suppose s_1 was M-unmatched. Replace (x,y) by (x,s_1) in M to increase weight and contradict maximum weight property of M
[Figure: x, y and s_1, with (s_1, y) an edge along dimension i]
Trying to find a violated edge
• Start with (x,y) in M_i. So f(x) > f(y), and s_1 is matched in M
• Suppose s_1 < s_2, so f(s_1) > f(s_2). Also x < s_2
• f(x) – f(s_2) = [f(x) – f(s_1)] + [f(s_1) – f(s_2)] > [f(x) – f(y)] + [f(s_1) – f(s_2)]
• Max weight of M contradicted
[Figure: x, y, s_1, s_2 around dimension i]
A few more steps…
[Figure: x, y, s_1, s_2, s_3 around dimension i; f(x) > f(y), s_1 > s_2, f(s_2) > f(s_1)]
• Project up s_2 to get s_3. So s_3 < y
• If (s_2, s_3) violated, done. So f(s_2) < f(s_3)
• Suppose s_3 is M-unmatched.
• [f(x) – f(s_1)] + [f(s_3) – f(y)] = [f(x) – f(y)] + [f(s_3) – f(s_1)] > [f(x) – f(y)] + [f(s_2) – f(s_1)] (Contradiction!)
A few more steps…
[Figure: x, y, s_1, s_2, s_3, s_4 around dimension i; f(x) > f(y), s_1 > s_2, f(s_2) > f(s_1)]
• If (s_2, s_3) violated, done. So f(s_2) < f(s_3)
• Suppose s_3 is M-unmatched.
• [f(x) – f(s_1)] + [f(s_3) – f(y)] = [f(x) – f(y)] + [f(s_3) – f(s_1)] > [f(x) – f(y)] + [f(s_2) – f(s_1)] (Contradiction!)
• Suppose s_4 < s_3, so s_4 < y
• [f(x) – f(s_1)] + [f(s_4) – f(y)] > [f(x) – f(y)] + [f(s_2) – f(s_1)]
Where we are
• No violated edge encountered so far. Hence, s_4 exists and s_3 < s_4.
• Project down to s_4, and continue…?
[Figure: x, y, s_1, …, s_5 around dimension i]
Alternating paths: what’s really going on
• Let H_i be the set of dimension-i edges. It’s a perfect matching.
• Symmetric diff of H_i and M is collection of alternating paths and cycles
• Partition this into segments between M_i-pairs
• Claim: Each segment has a violating edge.
[Figure: alternating path of H_i edges and M pairs through x, y, s_1, s_2, …, s_5, s_6]
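The “paths and cycles” structure follows because every vertex has at most one H_i edge and at most one M pair. A small sketch, assuming integer-encoded points and a matching M given as a list of pairs (u, v) with u below v; the function name and the example matching are illustrative.

```python
def symmetric_difference_components(n, i, M):
    """Connected components of (H_i symmetric-difference M) with an edge.

    H_i is the perfect matching of all hypercube edges along coordinate
    i. Every vertex has degree at most 2 in the symmetric difference, so
    each component is a path or a cycle.
    """
    H = {(x, x | (1 << i)) for x in range(2 ** n) if not (x >> i) & 1}
    diff = H ^ {tuple(sorted(pair)) for pair in M}
    neighbours = {v: [] for v in range(2 ** n)}
    for u, v in diff:
        neighbours[u].append(v)
        neighbours[v].append(u)
    assert all(len(nbrs) <= 2 for nbrs in neighbours.values())

    seen, components = set(), []
    for start in range(2 ** n):
        if start in seen or not neighbours[start]:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            u = stack.pop()
            component.append(u)
            for w in neighbours[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components

# Hypothetical matching on the 3-cube, used only as an example.
example = symmetric_difference_components(3, 0, [(0, 1), (4, 6)])
```

In the example the pair (0,1) is itself an H_0 edge and cancels out, leaving the path 5, 4, 6, 7 and the single edge 2, 3.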
The basic tools
• For even i: if s_i exists, so does s_{i+1}. (H_i is perfect matching.)
• What happens at odd i?
– s_i not matched by M, so alternating path ends
– s_i matched by M and (s_i, z) in M_i, so segment ends
– s_i matched by M and (s_i, s_{i+1}) not in M_i, so segment continues
• We show: if no violated edge seen so far, only the last can happen
• If no violated edge ever seen, segment goes on forever. Contradiction
[Figure: alternating path through x, y, s_1, …, s_6]
The basic tools
• For even i: if s_i exists, so does s_{i+1}. (H_i is perfect matching.)
• What happens at odd i?
• Suppose no violated edge seen so far.
• Structure lemma: For j = 1 (mod 4), s_j > s_{j+1}. For j = 3 (mod 4), s_j < s_{j+1}
• Progress lemma: s_i is matched in M, so s_{i+1} also exists. (Structure implies (s_i, s_{i+1}) not in M_i.)
• Hence, if no violated edge ever seen, segment goes on forever.
[Figure: alternating path through x, y, s_1, …, s_6]
Proving structure
• s_i exists. For j < i: if j = 1 (mod 4), s_j > s_{j+1}. If j = 3 (mod 4), s_j < s_{j+1}
• Prove by induction on j. (And a contradiction)
• Suppose true for all j’ < j. And s_j > s_{j+1} (the case j = 3 (mod 4), as with s_7 > s_8 in the weights here)
• Then remove red pairs from M and add blue pairs.
• Argue (index chasing) that weight has increased.
• Induction hypothesis gives handle on red weight
• w(Blue) = [f(x) – f(s_1)] + [f(s_3) – f(y)] + [f(s_2) – f(s_5)] + [f(s_8) – f(s_4)]
• w(Red) = [f(x) – f(y)] + [f(s_2) – f(s_1)] + [f(s_3) – f(s_4)] + [f(s_6) – f(s_5)] + [f(s_8) – f(s_7)]
[Figure: alternating path through x, y, s_1, …, s_6 with red (removed) and blue (added) pairs]
Proving structure
• s_i exists. For j < i: if j = 1 (mod 4), s_j > s_{j+1}. If j = 3 (mod 4), s_j < s_{j+1}
• Then remove red pairs from M and add blue pairs.
• Argue (index chasing) that weight has increased.
• w(Blue) = w(Red) + [f(s_7) – f(s_6)]
• Positive because (s_6, s_7) not violated edge!
• w(Blue) > w(Red). Contradiction.
[Figure: alternating path through x, y, s_1, …, s_7 with red and blue pairs]
Proving progress
• s_i exists. What about s_{i+1}?
• Suppose s_i unmatched in M
• Replace red by blue, and go through the motions
• I’ll spare you the details…
[Figure: alternating path through x, y, s_1, …, s_6]
Recap
• Segment of alternating path between two M_i pairs has violated edge
• Hence, # violated H_i edges ≥ |M_i|
• Hence, #violated edges ≥ ∑_i |M_i| ≥ ε_f 2^{n-1}
• The rearrangement idea: alternating paths generated from max weight matchings in VG are highly structured
[Figure: alternating paths through x, y, s_1, …, s_6]
Chapter V
The directed version of Margulis’ theorem
For the boolean range
• Assumption of distinct value removed by perturbation argument
• #violated edges ≥ ∑_i |M_i| ≥ ε_f 2^{n-1}
• Consider (x,y) in M. This belongs to |y – x|_1 different M_i’s
• #violated edges ≥ ∑_{(x,y) in M} |y – x|_1
[Figure: set S with a pair (x,y) in M, f(x) = 1, f(y) = 0, crossing dimension i]
For the boolean range
• #violated edges ≥ ∑_{(x,y) in M} |y – x|_1 ≥ ε_f 2^{n-1}
• Φ^+(S) = (#violated edges)/2^n = Ω(1)
• If Φ^+(S) < r, then 2^{-n} ∑_{(x,y) in M} |y – x|_1 < r
• Average distance between pairs in M < r
• For constant ε_f, want to show Φ^+(S) Γ^+(S) = Ω(1)
⇒ Want to show: if Φ^+(S) < r, Γ^+(S) > 1/r
⇒ If Φ^+(S) < r, (#disjoint violated edges) > 2^n/r
[Figure: set S with a pair (x,y) in M]
A routing theorem
• [Lehman Ron 01] Consider k comparable (s_i, t_i) pairs in two levels. (For all i, s_i < t_i) We can route k (s_i, t_j) pairs in vertex disjoint paths
• If k pairs are in M, there are k disjoint edge violations in between the levels
[Figure: sources s_1, s_2, s_3, s_4 (all with value 1) and targets t_1, t_2, t_3, t_4 in two levels]
Getting the disjoint edges
• Take pairs of M between two levels, apply LR theorem to get disjoint violated edges
• Average distance between pairs in M < r, so mostly violated edges obtained between close levels
• Total number of disjoint violated edges > |M|/r = Ω(2^n/r)
• If number of violated edges < r 2^n, then number of disjoint violated edges > 2^n/r
Chapter VI
Summarizing
• Number of violated edges ≥ ε_f 2^{n-1}
• If number of violated edges ≤ ε_f r 2^{n-1}, then number of disjoint violated edges ≥ ε_f 2^{n-1}/r
• We can get a O(ε^{-5/3} n^{5/6}) tester for boolean monotonicity by running edge tester and path testers with O(ε^{-5/3} n^{5/6}) queries
Monotonicity for hypergrids
• f: [k]^n -> R. x ≤ y if for all i, x_i ≤ y_i
• k=2, hypercube. n=1, total order. And everything in between
• [DGL+99, EKK+99, AC04, HK04] O(ε^{-1} n log k log|R|) and O(ε^{-1} 2^n log k) testers
• Theorem: There is a O(ε^{-1} n log k)-query tester for monotonicity on hypergrids
• [Blais-Raskhodnikova-Yaroslavtsev 13, Chakrabarty S 13]
The Lipschitz property
• f: [k]^n -> R is c-Lipschitz if for all nbrs (x,y), |f(x) – f(y)| ≤ c
• For all (x,y) |f(x) – f(y)| ≤ c|x – y|_1
• [Jha Raskhodnikova 11] “Testing c-Lipschitz”. Applications to differential
privacy
• [JR11, Awasthi Jha Molinari Raskhodnikova 12] R = δ Z. There is O((δε)^{-1} n^2 k log k) tester.
• Theorem: There is a O(ε^{-1} n log k)-query tester for c-Lipschitz on hypergrids.
[Figure: example values on a grid: 4 5 4 4 5 4 3 3 4 3 2 2 3 2 2 1]
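The two forms of the c-Lipschitz condition (neighbours only, and all pairs with the L1 distance) can be compared by brute force on a small hypergrid. The sketch assumes f is a dict from points of {0,…,k−1}^n to numbers; reading the sixteen example values as a 4×4 grid in row-major order is an assumption, and the function names are illustrative.

```python
from itertools import product

def is_c_lipschitz(f, k, n, c):
    """Check |f(x) - f(y)| <= c for all neighbouring grid points x, y."""
    for x in product(range(k), repeat=n):
        for i in range(n):
            if x[i] + 1 < k:
                y = x[:i] + (x[i] + 1,) + x[i + 1:]
                if abs(f[x] - f[y]) > c:
                    return False
    return True

def is_c_lipschitz_all_pairs(f, k, n, c):
    """Check |f(x) - f(y)| <= c * |x - y|_1 for every pair of points."""
    points = list(product(range(k), repeat=n))
    return all(abs(f[x] - f[y]) <= c * sum(abs(a - b) for a, b in zip(x, y))
               for x in points for y in points)

# Example values from the slide, laid out row by row (layout assumed).
rows = [[4, 5, 4, 4], [5, 4, 3, 3], [4, 3, 2, 2], [3, 2, 2, 1]]
grid = {(i, j): rows[i][j] for i in range(4) for j in range(4)}
```

On a hypergrid the two checks agree, since any two points are joined by a path of |x – y|_1 neighbour steps; the neighbour version is the one a tester can sample edge by edge.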
The obvious open problems
• Monotonicity for f:{0,1}^n -> {0,1}
• We have O(ε^{-5/3} n^{5/6}) tester. Improve it!
• [FLN+02] Best lower bounds: Ω(log n) for general testers, Ω(n^{1/2}) for non-adaptive testers
• [Blais Brody Matulef 11] Communication complexity reductions
– [BRY 13, CS 13] Progress on lower bounds for general ranges
• f:{0,1}^n -> R, where |R| < n^{1/2}
• [DGL+99] Reducing general R to {0,1}
The path tester
• We think path tester with O(n^{1/2}) queries is a bona fide tester
• But our approach won’t get there. Can probably chip off exponent in O(n^{5/6})
• Path and edge tester are “pair testers”. Define some distribution on pairs (x,y). Sample repeatedly
• [BCGM10] Any pair tester requires Ω(ε^{-1} n/log n) queries
•
Look beyond path testers. Correlate queries, or use
Appendix
Proving path tester works
• Two sets S and T, perfectly matched by directed edges
• Let |S|/2^n = μ
• What is the probability that directed random walk of length Θ(n^{1/2}) starts in S and lands in T?
• What is the probability that walk starts and lands in S?
• Claim: Prob = Ω(μ^2)
Proving path tester works
• What is the probability that walk starts and lands in S?
• Claim: Prob = Ω(μ^2)
• Pr(x,y) is prob that path tester starts at x and ends at y
• Claim: Pr(x,y’) ≥ Pr(x,y)/n^{1/2}
• Probability of starting in S and ending at T is Ω(μ^2/n^{1/2})
[Figure: sets S and T with points x and y]