Primal-Dual Methods for p-Modulus on Graphs
Dominique Zosso
Montana State University | Department of Mathematical Sciences http://www.math.montana.edu/zosso | [email protected]
SIAM Central States Section Meeting University of Arkansas at Little Rock | 2016-10-02
[p-Modulus]
Definitions I
Definition (p-Energy)
Consider a weighted undirected graph G(V , E, σ), where V is the set of vertices, E the set of edges, and σ : E → R+ non-negative edge-weights.
Let ρ: E → R and 1 ≤ p < ∞.
Then the p-energy is the quantity
$$\mathcal{E}_{p,\sigma}(\rho) := \sum_{e \in E} \sigma(e)\,|\rho(e)|^p.$$
Definition
Γ: family of objects γ (walks, trees, ...) on G
Definitions II
Definition (object cost)
Given G(V , E, σ) and ρ we define the cost of an object γ:
$$\ell_\rho(\gamma) := \sum_{e \in E} N_\Gamma(\gamma, e)\,\rho(e) = N_\gamma \rho = (N_\Gamma \rho)(\gamma)$$
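Definition (ρ admissibility)
The function ρ is admissible if
$$\forall \gamma \in \Gamma : \ell_\rho(\gamma) \ge 1, \quad \text{equivalently} \quad \inf_{\gamma \in \Gamma} \ell_\rho(\gamma) \ge 1.$$
Set of admissible ρ:
$$A(\Gamma) := \{\rho \mid \forall \gamma \in \Gamma : \ell_\rho(\gamma) \ge 1\}$$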
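The definitions so far translate almost directly into array operations. The following is a minimal sketch, assuming the usage matrix $N_\Gamma$ is stored as a dense NumPy array with one row per object and one column per edge; the function names and the small five-edge example (two walks from vertex 1 to vertex 4) are hypothetical and chosen only for illustration.

```python
import numpy as np

def p_energy(rho, sigma, p):
    """p-energy: sum over edges of sigma(e) * |rho(e)|**p."""
    return float(np.sum(sigma * np.abs(rho) ** p))

def costs(N, rho):
    """Cost of every object at once: entry gamma is (N rho)(gamma)."""
    return N @ rho

def is_admissible(N, rho, tol=1e-12):
    """True if every object has cost at least 1 (up to a small tolerance)."""
    return bool(np.all(costs(N, rho) >= 1.0 - tol))

# Hypothetical example: edges (1,2), (1,3), (2,3), (2,4), (3,4), unit weights.
sigma = np.ones(5)
# Rows: the walks 1-2-4 and 1-3-4, written as edge-usage counts.
N = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1]], dtype=float)
rho = np.array([0.5, 0.5, 0.0, 0.5, 0.5])

print(costs(N, rho), is_admissible(N, rho), p_energy(rho, sigma, 2))
```

Here both walks have cost exactly 1, so this ρ sits on the boundary of the admissible set for the two-walk family.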
Definitions III
Definition (p-Modulus)
Given G(V , E, σ), p, and Γ, we define the p-Modulus of the family Γ:
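$$\mathrm{Mod}_{p,\sigma}(\Gamma) := \inf_{\rho \in A(\Gamma)} \mathcal{E}_{p,\sigma}(\rho)$$
Computational goal:
$$\min_{\rho}\; \mathcal{E}_{p,\sigma}(\rho) \quad \text{s.t.} \quad N_\Gamma \rho \ge 1 \ \text{(elementwise)}. \quad (1)$$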
[Primal-Dual Hybrid Gradients]
Primal Dual formulation
The Legendre-Fenchel transform is used in the following primal-dual equivalence:
Theorem (Ekeland and Témam, 1976)
Let F : W → R be a closed and convex functional on the set W , G : X → R a closed and convex functional, and K : X → W be a continuous linear operator. Then we have the following equivalence:
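$$\underbrace{\min_{x \in X} \big\{ F(Kx) + G(x) \big\}}_{\text{Primal}} = \underbrace{\min_{x \in X} \max_{\varphi \in W^*} \big\{ \langle Kx, \varphi \rangle - F^*(\varphi) + G(x) \big\}}_{\text{Primal-Dual}} \quad (2)$$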
where x and φ are the primal and dual variables, respectively, and $F^*$ is the convex conjugate (Legendre-Fenchel transform) of F.
Primal-Dual Hybrid Gradients
Theorem (Chambolle and Pock, 2011) The iterative scheme
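$$\varphi^{n+1} = \arg\min_{\varphi \in W^*} \Big\{ -\langle K\bar{x}^n, \varphi \rangle + F^*(\varphi) + \frac{1}{2r_1} \|\varphi^n - \varphi\|_2^2 \Big\} \quad (3)$$
$$x^{n+1} = \arg\min_{x \in X} \Big\{ \langle x, K^*\varphi^{n+1} \rangle + G(x) + \frac{1}{2r_2} \|x^n - x\|_2^2 \Big\} \quad (4)$$
$$\bar{x}^{n+1} = x^{n+1} + \theta\,(x^{n+1} - x^n) \quad (5)$$
for θ ∈ [0, 1], converges to the saddle point for $r_1 r_2 \le 1/L^2$, where L is the operator (induced) norm of K, i.e., $L^2$ is the spectral radius of $K^*K$. θ = 0 corresponds to the Arrow-Hurwicz algorithm.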
Primal-Dual formulation I
We rewrite
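$$\min_{\rho}\; \mathcal{E}_{p,\sigma}(\rho) \quad \text{s.t.} \quad N_\Gamma \rho \ge 1 \ \text{(elementwise)}$$
as
$$\min_{\rho}\; \mathcal{E}_{p,\sigma}(\rho) + \chi(N\rho)$$
with the barrier function
$$\chi(\mu) := \begin{cases} 0 & \text{if } \mu \ge 1 \\ \infty & \text{otherwise} \end{cases}$$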
Primal-Dual formulation II
We compute the dual (convex conjugate) of the barrier function:
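$$\chi^*(\lambda) := \sup_{\mu}\, \{ \langle \lambda, \mu \rangle - \chi(\mu) \} \quad (6a)$$
$$= \begin{cases} \langle \lambda, 1 \rangle & \text{if } \lambda \le 0 \\ \infty & \text{otherwise} \end{cases} \quad (6b)$$
Thus the original problem becomes equivalent to:
$$\min_{\rho} \max_{\lambda \le 0}\; \mathcal{E}_{p,\sigma}(\rho) + \langle \lambda, N\rho \rangle - \chi^*(\lambda) = \min_{\rho} \max_{\lambda \le 0}\; \mathcal{E}_{p,\sigma}(\rho) + \langle \lambda, N\rho - 1 \rangle \quad (7)$$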
In hindsight this is “obvious”: (7) is the Lagrangian of the constrained problem, with multipliers −λ ≥ 0.
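For readers who want the intermediate steps behind (6b) and (7), the short derivation below spells them out. It is a sketch that only uses the definition of χ; the component index γ on λ and µ is notation introduced here for convenience.

```latex
\begin{aligned}
\chi^*(\lambda)
  &= \sup_{\mu}\,\{\langle\lambda,\mu\rangle - \chi(\mu)\}
   = \sup_{\mu \ge 1} \langle\lambda,\mu\rangle
   = \sum_{\gamma} \sup_{\mu_\gamma \ge 1} \lambda_\gamma \mu_\gamma, \\
\sup_{\mu_\gamma \ge 1} \lambda_\gamma \mu_\gamma
  &= \begin{cases}
       \lambda_\gamma & \text{if } \lambda_\gamma \le 0
         \quad (\text{attained at } \mu_\gamma = 1), \\
       \infty & \text{if } \lambda_\gamma > 0
         \quad (\mu_\gamma \to \infty),
     \end{cases} \\
\max_{\lambda \le 0} \langle \lambda, N\rho - 1 \rangle
  &= \begin{cases}
       0 & \text{if } N\rho \ge 1 \quad (\text{take } \lambda = 0), \\
       \infty & \text{otherwise},
     \end{cases}
   \;=\; \chi(N\rho).
\end{aligned}
```

The last line confirms that maximizing over λ ≤ 0 in (7) recovers the constrained problem.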
Primal-Dual core algorithm I
Now, PDHG (Chambolle-Pock) suggests the following iterative scheme:
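$$\rho^{n+1} = \arg\min_{\rho}\; \mathcal{E}_{p,\sigma}(\rho) + \langle N^T \bar\lambda^n, \rho \rangle + \frac{1}{2r_1} \|\rho - \rho^n\|_2^2 \quad (8a)$$
$$\lambda^{n+1} = \arg\min_{\lambda \le 0}\; \langle \lambda, 1 - N\rho^{n+1} \rangle + \frac{1}{2r_2} \|\lambda - \lambda^n\|_2^2 \quad (8b)$$
$$\bar\lambda^{n+1} = 2\lambda^{n+1} - \lambda^n \quad (8c)$$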
Primal-Dual core algorithm II
Primal Update:
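$$\rho^{n+1} = \arg\min_{\rho}\; \mathcal{E}_{p,\sigma}(\rho) + \frac{1}{2r_1} \|\rho - (\rho^n - r_1 N^T \bar\lambda^n)\|_2^2$$
Depends on p, e.g.,
p = 1:
$$\rho^{n+1} = \mathrm{shrink}(\rho^n - r_1 N^T \bar\lambda^n,\; r_1 \sigma),$$
where
$$\mathrm{shrink}(z, \tau)(e) := \begin{cases} z(e) - \tau(e) & \text{if } z(e) \ge \tau(e) \\ 0 & \text{if } |z(e)| < \tau(e) \\ z(e) + \tau(e) & \text{if } z(e) \le -\tau(e) \end{cases}$$
p = 2:
$$\rho^{n+1} = (I + 2 r_1 S)^{-1} (\rho^n - r_1 N^T \bar\lambda^n),$$
where $S := \mathrm{diag}(\sigma)$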
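Both closed-form primal updates are elementwise operations, so they are cheap to implement. The sketch below assumes NumPy arrays for ρ, σ, and the (extrapolated) dual variable; the function names `shrink` and `primal_update` are illustrative, and general p is deliberately left unimplemented because it needs an inner solver.

```python
import numpy as np

def shrink(z, tau):
    """Soft-thresholding: move each entry of z toward 0 by tau, clipping at 0."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def primal_update(rho, lam_bar, N, sigma, r1, p):
    """One primal step for p = 1 or p = 2 (closed-form proximal maps)."""
    z = rho - r1 * (N.T @ lam_bar)          # gradient-like step on the coupling term
    if p == 1:
        return shrink(z, r1 * sigma)        # prox of r1 * sum sigma |rho|
    if p == 2:
        return z / (1.0 + 2.0 * r1 * sigma) # (I + 2 r1 S)^{-1} z with S diagonal
    raise NotImplementedError("general p requires an inner optimization")
```

Because S is diagonal, the matrix inverse for p = 2 reduces to an elementwise division.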
Primal-Dual core algorithm III
Dual update:
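$$\lambda^{n+1} = \arg\min_{\lambda \le 0}\; \frac{1}{2r_2} \|\lambda - (\lambda^n + r_2 (N\rho^{n+1} - 1))\|_2^2$$
Projection onto the (closed convex) set of non-positive λ:
$$\lambda^* = \lambda^n + r_2 (N\rho^{n+1} - 1) \quad (9a)$$
$$\lambda^{n+1} = \min(\lambda^*, 0) \ \text{(elementwise)} \quad (9b)$$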
Primal-Dual core algorithm IV
Complete instruction set: (for p = 2)
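$$\rho^{n+1} = (I + 2 r_1 S)^{-1} (\rho^n - r_1 N^T \bar\lambda^n) \quad (10a)$$
$$\lambda^* = \lambda^n + r_2 (N\rho^{n+1} - 1) \quad (10b)$$
$$\lambda^{n+1} = \min(\lambda^*, 0) \ \text{(elementwise)} \quad (10c)$$
$$\bar\lambda^{n+1} = 2\lambda^{n+1} - \lambda^n \quad (10d)$$
Note: in the general case 1 < p ≠ 2, (10a) involves an inner optimization.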
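The four-line instruction set for p = 2 can be turned into a short loop. The following is a minimal sketch, not the implementation behind the results in this talk: the step sizes $r_1 = r_2 = 0.99/L$, the stopping rule, and the two-walk example matrix are assumptions made here for illustration.

```python
import numpy as np

def pdhg_modulus_p2(N, sigma, iters=20000, tol=1e-12):
    """Iterate (10a)-(10d) for p = 2 on a fixed usage matrix N.

    Returns (rho, lam). Assumes N has at least one row.
    """
    L = np.linalg.norm(N, 2)            # operator norm of N
    r1 = r2 = 0.99 / L                  # assumed choice, gives r1 * r2 < 1 / L**2
    rho = np.zeros(N.shape[1])
    lam = np.zeros(N.shape[0])
    lam_bar = lam.copy()
    for _ in range(iters):
        rho_new = (rho - r1 * (N.T @ lam_bar)) / (1.0 + 2.0 * r1 * sigma)  # (10a)
        lam_star = lam + r2 * (N @ rho_new - 1.0)                          # (10b)
        lam_new = np.minimum(lam_star, 0.0)                                # (10c)
        lam_bar = 2.0 * lam_new - lam                                      # (10d)
        step = max(np.abs(rho_new - rho).max(), np.abs(lam_new - lam).max())
        rho, lam = rho_new, lam_new
        if step < tol:                  # assumed stopping rule: iterates stalled
            break
    return rho, lam

# Hypothetical example: two edge-disjoint walks of two unit-weight edges each.
N = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1]], dtype=float)
sigma = np.ones(5)
rho, lam = pdhg_modulus_p2(N, sigma)
print(rho, lam, np.sum(sigma * rho**2))
```

For this small example the iterates should settle at ρ = 0.5 on the four used edges and λ = −1 for both constraints, which satisfies the stationarity condition 2Sρ + Nᵀλ = 0.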
[Essential family of objects]
How to get $N_\Gamma$? I
Given a graph and objects of interest (walks, trees, ...), the family Γ can become extremely big, thus NΓ extremely tall.
There exists an essential subfamily of objects that “spans” the set A(Γ) of admissible ρ.
Equivalently:
- inessential objects = rows of $N_\Gamma$ corresponding to inactive constraints: λ(γ) = 0;
- essential family = rows of $N_\Gamma$ for which λ(γ) < 0.
How to get $N_\Gamma$? II
Idea: Construct essential NΓ greedily.
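1. Start with $\Gamma^{(0)} := \{\}$, i.e., $N^{(0)}$ empty ($0 \times |E|$), and $\rho^{(0)} = 0$
2. Find a $\gamma^{(n+1)} \in \Gamma$ s.t. $\ell_{\rho^{(n)}}(\gamma^{(n+1)}) < 1\ (-\varepsilon)$ (if none found: done)
3. Add $\gamma^{(n+1)}$ to the constraints: $\Gamma^{(n+1)} := \Gamma^{(n)} \cup \{\gamma^{(n+1)}\}$, i.e., $N^{(n+1)} := \begin{bmatrix} N^{(n)} \\ N_{\gamma^{(n+1)}} \end{bmatrix}$
4. Compute $\mathrm{Mod}_{p,\sigma}(\Gamma^{(n+1)})$ and let $\rho^{(n+1)}$ achieve the minimum
5. Optional housekeeping = remove inactive constraints: $\Gamma^{(n+1)} := \Gamma^{(n+1)} \setminus \{\gamma\}$ for each γ with $\lambda^{(n+1)}(\gamma) = 0$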
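The greedy steps above can be sketched end to end for walks between two vertices with p = 2. Everything in this block is an illustrative assumption rather than the code used for the talk: the graph (five unit-weight edges on four vertices, walks from vertex 0 to vertex 3), the helper names, the tolerance `eps`, and the inner solver settings. The optional housekeeping step is omitted.

```python
import heapq
import numpy as np

def shortest_path(edges, rho, n_vertices, s, t):
    """Dijkstra with edge lengths rho; returns (cost, edge-usage row)."""
    adj = [[] for _ in range(n_vertices)]
    for k, (u, v) in enumerate(edges):
        adj[u].append((v, k))
        adj[v].append((u, k))
    dist = [float("inf")] * n_vertices
    prev = [None] * n_vertices                 # (previous vertex, edge index)
    dist[s] = 0.0
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                           # stale heap entry
        for v, k in adj[u]:
            nd = d + max(rho[k], 0.0)          # guard against tiny negative rho
            if nd < dist[v]:
                dist[v], prev[v] = nd, (u, k)
                heapq.heappush(heap, (nd, v))
    row = np.zeros(len(edges))
    v = t
    while v != s:                              # walk back from t to s
        u, k = prev[v]
        row[k] += 1.0
        v = u
    return dist[t], row

def solve_modulus_p2(N, sigma, iters=50000, tol=1e-12):
    """Inner solve for p = 2 by primal-dual iterations with equal step sizes."""
    r = 0.99 / np.linalg.norm(N, 2)
    rho, lam = np.zeros(N.shape[1]), np.zeros(N.shape[0])
    lam_bar = lam.copy()
    for _ in range(iters):
        rho_new = (rho - r * (N.T @ lam_bar)) / (1.0 + 2.0 * r * sigma)
        lam_new = np.minimum(lam + r * (N @ rho_new - 1.0), 0.0)
        lam_bar = 2.0 * lam_new - lam
        step = max(np.abs(rho_new - rho).max(), np.abs(lam_new - lam).max())
        rho, lam = rho_new, lam_new
        if step < tol:
            break
    return rho, lam

def greedy_modulus_p2(edges, sigma, n_vertices, s, t, eps=1e-6, max_outer=100):
    """Grow the constraint set one most-violated walk at a time."""
    N = np.zeros((0, len(edges)))              # step 1: empty family
    rho = np.zeros(len(edges))
    for _ in range(max_outer):
        cost, row = shortest_path(edges, rho, n_vertices, s, t)   # step 2
        if cost >= 1.0 - eps:
            break                              # no violated walk: done
        N = np.vstack([N, row])                # step 3
        rho, _ = solve_modulus_p2(N, sigma)    # step 4
    return float(np.sum(sigma * rho**2)), rho, N

# Hypothetical graph: vertices 0..3, five unit-weight edges.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
sigma = np.ones(len(edges))
mod, rho, N = greedy_modulus_p2(edges, sigma, n_vertices=4, s=0, t=3)
print(mod, rho, N.shape[0])
```

On this graph the loop should stop after two walks have been added, with ρ = 0.5 on the four outer edges and a modulus estimate of 1.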
How to get $N_\Gamma$? III
Reasonable questions:
- How to find $\gamma^{(n+1)} \in \Gamma$?
  - pick the one that violates “most” (not unique)
  - object = walk on G from v to w (v, w ∈ V): shortest path (Dijkstra)
  - object = spanning tree of G: minimal spanning tree (Kruskal)
- Housekeeping:
  - Necessary?
  - Do we get the essential family?
  - Cycles?
- Numerical stability?
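For the spanning-tree family, the most violated object under the current ρ is a minimum spanning tree with edge lengths ρ(e). The sketch below uses Kruskal's algorithm with a small union-find; the function name, the edge-list representation, and the example values are assumptions made for illustration, and the graph is assumed to be connected.

```python
import numpy as np

def min_spanning_tree(edges, rho, n_vertices):
    """Kruskal's algorithm with edge lengths rho.

    Returns (cost, row), where row is the 0/1 edge-usage vector of the tree.
    """
    parent = list(range(n_vertices))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    row = np.zeros(len(edges))
    cost = 0.0
    # Scan edges from cheapest to most expensive; keep those joining two components.
    for k in sorted(range(len(edges)), key=lambda k: rho[k]):
        ru, rv = find(edges[k][0]), find(edges[k][1])
        if ru != rv:
            parent[ru] = rv
            row[k] = 1.0
            cost += rho[k]
    return cost, row

# Hypothetical example: four vertices, five edges, rho supported on one tree.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
rho = np.array([1/3, 1/3, 0.0, 1/3, 0.0])
cost, row = min_spanning_tree(edges, rho, n_vertices=4)
print(cost, row)
```

In this example the cheapest tree uses both zero-cost edges plus one edge of cost 1/3, so the constraint for that tree is violated and its row would be appended to N.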
Bounds on $\mathrm{Mod}_{p,\sigma}(\Gamma)$ I
As we search for the essential family, based on current iterates, what can we say about the value of Modp,σ(Γ)?
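Monotonicity: $\Gamma^{(n)} \subset \Gamma^{(n+1)} \subseteq \Gamma$ implies
$$\mathrm{Mod}_{p,\sigma}(\Gamma^{(n)}) \le \mathrm{Mod}_{p,\sigma}(\Gamma^{(n+1)}) \le \mathrm{Mod}_{p,\sigma}(\Gamma)$$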
Bounds on $\mathrm{Mod}_{p,\sigma}(\Gamma)$ II
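Also: given the current $\rho^{(n)}$, let $\gamma^* = \arg\min_{\gamma \in \Gamma} \ell_{\rho^{(n)}}(\gamma)$, with $\ell_{\rho^{(n)}}(\gamma^*) < 1$.
Then the rescaled
$$\rho^* := \frac{\rho^{(n)}}{\ell_{\rho^{(n)}}(\gamma^*)} \in A(\Gamma).$$
Now:
$$\mathrm{Mod}_{p,\sigma}(\Gamma) = \inf_{\rho \in A(\Gamma)} \mathcal{E}_{p,\sigma}(\rho) \le \mathcal{E}_{p,\sigma}(\rho^*) = \frac{1}{\ell_{\rho^{(n)}}(\gamma^*)^p}\, \mathcal{E}_{p,\sigma}(\rho^{(n)}) = \frac{1}{\ell_{\rho^{(n)}}(\gamma^*)^p}\, \mathrm{Mod}_{p,\sigma}(\Gamma^{(n)})$$
Therefore:
$$\mathrm{Mod}_{p,\sigma}(\Gamma^{(n)}) \le \mathrm{Mod}_{p,\sigma}(\Gamma) \le \frac{1}{\ell_{\rho^{(n)}}(\gamma^*)^p}\, \mathrm{Mod}_{p,\sigma}(\Gamma^{(n)}).$$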
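The two-sided bound is a one-line computation once the cheapest object under the current ρ is known. The following sketch assumes that minimal cost has already been found (for example by a shortest-path or minimum-spanning-tree search); the function name and the treatment of the edge cases (cost ≥ 1 and cost ≤ 0) are choices made here. The example inputs are assumed values for a five-edge, unit-weight graph, picked so that the outputs can be compared with the bounds reported for the simple spanning-tree example.

```python
import numpy as np

def modulus_bounds(rho_n, sigma, p, min_cost):
    """Lower and upper bounds on the modulus from the current iterate rho_n.

    min_cost is the smallest object cost over the full family under rho_n.
    """
    lower = float(np.sum(sigma * np.abs(rho_n) ** p))   # energy of rho_n
    if min_cost >= 1.0:
        return lower, lower            # rho_n is admissible for the full family
    if min_cost <= 0.0:
        return lower, float("inf")     # rescaling is not possible
    return lower, lower / min_cost ** p

sigma = np.ones(5)
# Assumed iterate: rho = 1/3 on three edges, cheapest tree has cost 1/3.
print(modulus_bounds(np.array([1/3, 1/3, 0.0, 1/3, 0.0]), sigma, 2, 1/3))
# Assumed iterate: rho = 0.5 on one edge, 0.25 on four, cheapest tree costs 0.75.
print(modulus_bounds(np.array([0.5, 0.25, 0.25, 0.25, 0.25]), sigma, 2, 0.75))
```

Under these assumed inputs the bounds come out as roughly (0.3333, 3) and (0.5, 0.8889).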
[Results]
Walks on simple graph
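[Figure: graph on vertices 1-4 with five edges, each labeled σ(e), ρ(e); all σ(e) = 1. Three successive panels.]
Panel 1: ρ = 0 on all edges; $0 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le \infty$
Panel 2: ρ = 0.5 on two edges, 0 elsewhere; $0.5 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le \infty$
Panel 3: ρ = 0.5 on four edges, 0 on one; $1 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 1$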
Walks on less simple graph I
[Figure: 16 vertices in a 4 × 4 layout with 24 weighted edges, each labeled σ(e), ρ(e); ρ = 0 on all edges.]
$0 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le \infty$
Walks on less simple graph II
[Figure: 16 vertices in a 4 × 4 layout with 24 weighted edges, each labeled σ(e), ρ(e); ρ = 1 on the edge with σ = 0.614, ρ = 0 elsewhere.]
$0.6144 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le \infty$
Walks on less simple graph III
[Figure: 16 vertices in a 4 × 4 layout with 24 weighted edges, each labeled σ(e), ρ(e); ρ is nonzero on a few edges.]
$0.7676 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le \infty$
Walks on less simple graph IV
[Figure: 16 vertices in a 4 × 4 layout with 24 weighted edges, each labeled σ(e), ρ(e); ρ is nonzero on more edges than in the previous step.]
$0.9192 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le \infty$
Walks on less simple graph V
[Figure: 16 vertices in a 4 × 4 layout with 24 weighted edges, each labeled σ(e), ρ(e); ρ is nonzero on most edges.]
$1.015 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 2.751$
Walks on less simple graph VI
[Figure: 16 vertices in a 4 × 4 layout with 24 weighted edges, each labeled σ(e), ρ(e); ρ is nonzero on most edges.]
$1.06 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 2.112$
Walks on less simple graph VII
[Figure: 16 vertices in a 4 × 4 layout with 24 weighted edges, each labeled σ(e), ρ(e); ρ is nonzero on all edges.]
$1.096 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 1.096$
Spanning tree on simple graph I
[Figure: graph on vertices 1-4 with five edges, each labeled σ(e), ρ(e); all σ(e) = 1. Three successive panels.]
Panel 1: ρ = 0 on all edges; $0 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le \infty$
Panel 2: ρ = 0.333 on three edges, 0 on two; $0.3333 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 3$
Panel 3: ρ = 0.5 on one edge, 0.25 on four; $0.5 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 0.8889$
Spanning tree on simple graph II
[Figure: graph on vertices 1-4 with five edges, each labeled σ(e), ρ(e); all σ(e) = 1. Three further panels.]
Panel 1: ρ = 0.385, 0.385, 0.308, 0.231, 0.308; $0.5385 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 0.6319$
Panel 2: ρ = 0.364, 0.364, 0.364, 0.273, 0.273; $0.5455 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 0.66$
Panel 3: ρ = 0.333 on the four edges listed; no bounds given.
Spanning tree on large graph I
Random geometric graph:
- 6,000 vertices drawn from $[0, 1]^2$
- ε-neighbors graph: ε = 0.05
- 134,826 edges
Family of spanning trees; each tree uses 5,999 edges
$|\Gamma| \approx 10^{9787}$
p = 2
Spanning tree on large graph II
Some hours of number crunching:
Bounds: lower ≤ $\mathrm{Mod}_{2,\sigma}(\Gamma)$ ≤ upper

  n     lower        upper
  1     0            ∞
  2     0.0001985    3.099
  3     0.0003876    1.203
  4     0.0005676    0.5809
  5     0.0007387    0.3366
  10    0.00138      0.0941
  20    0.002533     0.03805
  50    0.003681     0.005329
  100   0.003728     0.004021
  200   0.003735     0.003821
  500   0.003736     0.00375
Spanning tree on large graph III
Numerical breakdown as |Γ(n)|increases:
- core computation gets obviously larger (size of λ, size of N)
- slower: $\|N\|$ grows from ∼80 to ∼690
- worse conditioned: $\kappa(N N^T)$ grows from 1 to ∼$10^7$
- convergence of core computation worsens: from 10 iterations to 1,000
- convergence error on ρ adversely affects path selection, bounds
- roundoff error on constraints, λ becomes relevant
Last example I
Unit resistor network:
[Figure: square grid graphs with n × n vertices for n = 3, 5, 7, ...; every edge has σ(e) = 1 and is labeled σ(e), ρ(e).]
Last example II
[Plot: $\mathrm{Mod}_{p,\sigma}(\Gamma)$ (vertical axis, 1.7 to 2) versus n (horizontal axis, 0 to 20).]
At n = 23, ε = 0.0001, inner convergence tolerance $10^{-9}$: 316 paths in the constraint, and $1.99585 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 1.99624$
At n = 23, ε = 0.00001, inner convergence tolerance $10^{-10}$: 371 paths in the constraint, and $1.99585 \le \mathrm{Mod}_{2,\sigma}(\Gamma) \le 1.99589$
Acknowledgements
Thank you!
Definitions and greedy algorithm:
Pietro’s talk at MSU in April 2016
Support from:
NSF DMS-1461138 (co-PI with Braxton Osting, U Utah)