Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required)
Ryan Williams
January 17, 2007
0-0
Introduction
Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing
1
Introduction
Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing How fast can
n × n
matrix-vector multiplication be?1-a
Introduction
Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing How fast can
n × n
matrix-vector multiplication be?Θ(n
2)
steps just to read the matrix!1-b
Introduction
Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing How fast can
n × n
matrix-vector multiplication be?Θ(n
2)
steps just to read the matrix!Main Result: If we allow
O(n
2+ε)
preprocessing, then matrix-vector multiplication over any finite semiring can be done inO(n
2/(ε log n)
2)
.1-c
Better Algorithms for Matrix Multiplication
Three of the major developments:
2
Better Algorithms for Matrix Multiplication
Three of the major developments:
•
Arlazarov et al., a.k.a. “Four Russians” (1960’s):O(n
3/ log n)
operationsUses table lookups
Good for hardware with short vector operations as primitives
2-a
Better Algorithms for Matrix Multiplication
Three of the major developments:
•
Arlazarov et al., a.k.a. “Four Russians” (1960’s):O(n
3/ log n)
operationsUses table lookups
Good for hardware with short vector operations as primitives
•
Strassen (1969):n
log 7log 2= O(n
2.81)
operationsAsymptotically fast, but overhead in the big-O
Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006)
2-b
Better Algorithms for Matrix Multiplication
Three of the major developments:
•
Arlazarov et al., a.k.a. “Four Russians” (1960’s):O(n
3/ log n)
operationsUses table lookups
Good for hardware with short vector operations as primitives
•
Strassen (1969):n
log 7log 2= O(n
2.81)
operationsAsymptotically fast, but overhead in the big-O
Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006)
•
Coppersmith and Winograd (1990):O(n
2.376)
operationsNot yet practical
2-c
Focus: Combinatorial Matrix Multiplication Algorithms
3
Focus: Combinatorial Matrix Multiplication Algorithms
•
Also called non-algebraic; let’s call them non-subtractive E.g. Four-Russians is combinatorial, Strassen isn’t3-a
Focus: Combinatorial Matrix Multiplication Algorithms
•
Also called non-algebraic; let’s call them non-subtractive E.g. Four-Russians is combinatorial, Strassen isn’tMore Non-Subtractive Boolean Matrix Mult. Algorithms:
•
Atkinson and Santoro:O(n
3/ log
3/2n)
on a(log n)
-word RAM•
Rytter and Basch-Khanna-Motwani:O(n
3/ log
2n)
on a RAM•
Chan: Four Russians can be implemented onO(n
3/ log
2n)
on a pointer machine3-b
Main Result
The
O(n
3/ log
2n)
matrix multiplication algorithm can be “de-amortized”4
Main Result
The
O(n
3/ log
2n)
matrix multiplication algorithm can be “de-amortized”More precisely, we can:
Preprocess an
n × n
matrixA
over a finite semiring inO(n
2+ε)
Such that vector multiplications with
A
can be done inO(n
2/(ε log n)
2)
4-a
Main Result
The
O(n
3/ log
2n)
matrix multiplication algorithm can be “de-amortized”More precisely, we can:
Preprocess an
n × n
matrixA
over a finite semiring inO(n
2+ε)
Such that vector multiplications with
A
can be done inO(n
2/(ε log n)
2)
Allows for “non-subtractive” matrix multiplication to be done on-line
4-b
Main Result
The
O(n
3/ log
2n)
matrix multiplication algorithm can be “de-amortized”More precisely, we can:
Preprocess an
n × n
matrixA
over a finite semiring inO(n
2+ε)
Such that vector multiplications with
A
can be done inO(n
2/(ε log n)
2)
Allows for “non-subtractive” matrix multiplication to be done on-line Can be implemented on a pointer machine
4-c
Main Result
The
O(n
3/ log
2n)
matrix multiplication algorithm can be “de-amortized”More precisely, we can:
Preprocess an
n × n
matrixA
over a finite semiring inO(n
2+ε)
Such that vector multiplications with
A
can be done inO(n
2/(ε log n)
2)
Allows for “non-subtractive” matrix multiplication to be done on-line Can be implemented on a pointer machine
This Talk: The Boolean case
4-d
Preprocessing Phase: The Boolean Case
Partition the input matrix
A
into blocks of⌈ε log n⌉ × ⌈ε log n⌉
size:A1,1
A2,1
A n
εlog n,1
A1,2 A1, n
εlog n
A n
εlog n,εlog nn
· · ·
...
· · · · · ·
...
...
Ai,j
ε log n
ε log n
A =
5
Preprocessing Phase: The Boolean Case
Build a graph
G
with partsP
1, . . . , P
n/(ε log n),Q
1, . . . , Q
n/(ε log n) 2ε log nP
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
Each part has 2ε log n vertices, one for each possible ε log n vector
6
Preprocessing Phase: The Boolean Case
Edges of
G
: Each vertexv
in eachP
i has exactly one edge into eachQ
j2
ε log nP i 2ε log n Q j
v
A
j,iv
7
Preprocessing Phase: The Boolean Case
Edges of
G
: Each vertexv
in eachP
i has exactly one edge into eachQ
j2
ε log nP i 2ε log n Q j
v
A
j,iv
Time to build the graph:
n
ε log n ·
n
ε log n · 2
ε log n
· ( ε log n) 2 = O(n 2+ε )
number of Qj
number of Pi
number of nodes in Pi
matrix-vector mult of Aj,i and v
7-a
How to Do Fast Vector Multiplications
Let
v
be a column vector. Want:A · v
.8
How to Do Fast Vector Multiplications
Let
v
be a column vector. Want:A · v
.(1)
Break upv
intoε log n
sized chunks:v =
v
1v
2...
v
nεlog n
8-a
How to Do Fast Vector Multiplications
(2)
For eachi = 1, . . . , n/(ε log n)
, look upv
i inP
i.9
How to Do Fast Vector Multiplications
(2)
For eachi = 1, . . . , n/(ε log n)
, look upv
i inP
i.2ε log n
P
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
v1
v2
vn/(ε log n)
Takes
O(n) ˜
time.9-a
How to Do Fast Vector Multiplications
(2)
For eachi = 1, . . . , n/(ε log n)
, look upv
i inP
i.2ε log n
P
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
v1
v2
vn/(ε log n)
Takes
O(n) ˜
time.10
How to Do Fast Vector Multiplications
(3)
Look up the neighbors ofv
i, mark each neighbor found.11
How to Do Fast Vector Multiplications
(3)
Look up the neighbors ofv
i, mark each neighbor found.2ε log n
P
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
v1
v2
vn/(ε log n)
A1,1 · v1
A2,1 · v1
A n
ε log n,1 · v1
11-a
How to Do Fast Vector Multiplications
(3)
Look up the neighbors ofv
i, mark each neighbor found.2ε log n
P
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
v1
v2
vn/(ε log n)
A1,2 · v2
A2,2 · v2
A n
ε log n,2 · v2
12
How to Do Fast Vector Multiplications
(3)
Look up the neighbors ofv
i, mark each neighbor found.2ε log n
P
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
v1
v2
vn/(ε log n)
A1, n
ε log n · vn/(ε log n)
A2, n
ε log n · vn/(ε log n)
A n
ε log n,ε log nn · vn/(ε log n)
Takes
O
n ε log n
213
How to Do Fast Vector Multiplications
(4)
For eachQ
j, definev
j′ as the OR of all marked vectors inQ
j2ε log n
P
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
v1
v2
vn/(ε log n)
⇒
⇒
⇒
v
1′v
2′v
n/(ε log n)′∨
∨
∨
Takes
O(n ˜
1+ε)
time14
How to Do Fast Vector Multiplications
(4)
For eachQ
j, definev
j′ as the OR of all marked vectors inQ
j2ε log n
P
2... ... ...
...
P
nε log n
P
1 2ε log n2ε log n 2ε log n
2ε log n 2ε log n
Q
1Q
2Q
nε log n
v1
v2
vn/(ε log n)
⇒
⇒
⇒
v
1′v
2′v
n/(ε log n)′∨
∨
∨
Takes
O(n ˜
1+ε)
time15
How to Do Fast Vector Multiplications
(5)
Outputv
′:=
v
1′v
2′...
v
′ nεlog n
. Claim:
v
′= A · v
.16
How to Do Fast Vector Multiplications
(5)
Outputv
′:=
v
1′v
2′...
v
′ nεlog n
. Claim:
v
′= A · v
.Proof: By definition,
v
j′= W
n/(ε log n)i=1
A
j,i· v
i.16-a
How to Do Fast Vector Multiplications
(5)
Outputv
′:=
v
1′v
2′...
v
′ nεlog n
. Claim:
v
′= A · v
.Proof: By definition,
v
j′= W
n/(ε log n)i=1
A
j,i· v
i.Av =
A
1,1· · · A
1,n/(ε log n)... . . . ...
A
n/(ε log n),1· · · A
n/(ε log n),n/(ε log n)
v
1...
v
nεlog n
16-b
How to Do Fast Vector Multiplications
(5)
Outputv
′:=
v
1′v
2′...
v
′ nεlog n
. Claim:
v
′= A · v
.Proof: By definition,
v
j′= W
n/(ε log n)i=1
A
j,i· v
i.Av =
A
1,1· · · A
1,n/(ε log n)... . . . ...
A
n/(ε log n),1· · · A
n/(ε log n),n/(ε log n)
v
1...
v
nεlog n
= ( W
n/(ε log n)i=1
A
1,i· v
i, . . . , W
n/(ε log n)i=1
A
1,n/(ε log n)· v
i) = v
′.16-c
Some Applications
Can quickly compute the neighbors of arbitrary vertex subsets
Let
A
be the adjacency matrix ofG = (V, E)
.Let
v
S be the indicator vector for aS ⊆ V
.17
Some Applications
Can quickly compute the neighbors of arbitrary vertex subsets
Let
A
be the adjacency matrix ofG = (V, E)
.Let
v
S be the indicator vector for aS ⊆ V
.Proposition:
A · v
S is the indicator vector forN (S)
, the neighborhood ofS
.17-a
Some Applications
Can quickly compute the neighbors of arbitrary vertex subsets
Let
A
be the adjacency matrix ofG = (V, E)
.Let
v
S be the indicator vector for aS ⊆ V
.Proposition:
A · v
S is the indicator vector forN (S)
, the neighborhood ofS
.Corollary: After
O(n
2+ε)
preprocessing, can determine the neighborhood of any vertex subset inO(n
2/(ε log n)
2)
time.(One level of BFS in
o(n
2)
time)17-b
Graph Queries
Corollary: After
O(n
2+ε)
preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all inO(n
2/(ε log n)
2)
time.18
Graph Queries
Corollary: After
O(n
2+ε)
preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all inO(n
2/(ε log n)
2)
time.Proof: Let
S ⊆ V
.18-a
Graph Queries
Corollary: After
O(n
2+ε)
preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all inO(n
2/(ε log n)
2)
time.Proof: Let
S ⊆ V
.S
is dominating⇐⇒ S ∪ N (S) = V
.18-b
Graph Queries
Corollary: After
O(n
2+ε)
preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all inO(n
2/(ε log n)
2)
time.Proof: Let
S ⊆ V
.S
is dominating⇐⇒ S ∪ N (S) = V
.S
is independent⇐⇒ S ∩ N (S) = ∅
.18-c
Graph Queries
Corollary: After
O(n
2+ε)
preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all inO(n
2/(ε log n)
2)
time.Proof: Let
S ⊆ V
.S
is dominating⇐⇒ S ∪ N (S) = V
.S
is independent⇐⇒ S ∩ N (S) = ∅
.S
is a vertex cover⇐⇒ V − S
is independent.18-d
Graph Queries
Corollary: After
O(n
2+ε)
preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all inO(n
2/(ε log n)
2)
time.Proof: Let
S ⊆ V
.S
is dominating⇐⇒ S ∪ N (S) = V
.S
is independent⇐⇒ S ∩ N (S) = ∅
.S
is a vertex cover⇐⇒ V − S
is independent.Each can be quickly determined from knowing
S
andN (S)
.18-e
Triangle Detection
19
Triangle Detection
Problem: Triangle Detection
Given: Graph
G
and vertexi
.Question: Does
i
participate in a 3-cycle, a.k.a. triangle?19-a
Triangle Detection
Problem: Triangle Detection
Given: Graph
G
and vertexi
.Question: Does
i
participate in a 3-cycle, a.k.a. triangle?Worst Case: Can take
Θ(n
2)
time to check all pairs of neighbors ofi
19-b
Triangle Detection
Problem: Triangle Detection
Given: Graph
G
and vertexi
.Question: Does
i
participate in a 3-cycle, a.k.a. triangle?Worst Case: Can take
Θ(n
2)
time to check all pairs of neighbors ofi
Corollary: After
O(n
2+ε)
preprocessing onG
, can solve triangle detection for arbitrary vertices inO(n
2/(ε log n)
2)
time.19-c
Triangle Detection
Problem: Triangle Detection
Given: Graph
G
and vertexi
.Question: Does
i
participate in a 3-cycle, a.k.a. triangle?Worst Case: Can take
Θ(n
2)
time to check all pairs of neighbors ofi
Corollary: After
O(n
2+ε)
preprocessing onG
, can solve triangle detection for arbitrary vertices inO(n
2/(ε log n)
2)
time.Proof: Given vertex
i
, letS
be its set of neighbors (gotten inO(n)
time).S
is not independent⇐⇒ i
participates in a triangle.19-d
Conclusion
A preprocessing/multiplication algorithm for matrix-vector multiplication that builds on lookup table techniques
20
Conclusion
A preprocessing/multiplication algorithm for matrix-vector multiplication that builds on lookup table techniques
•
Is there a preprocessing/multiplication algorithm for sparse matrices? Can we do multiplication in e.g.O(m/poly(log n) + n)
,where
m
= number of nonzeroes?20-a
Conclusion
A preprocessing/multiplication algorithm for matrix-vector multiplication that builds on lookup table techniques
•
Is there a preprocessing/multiplication algorithm for sparse matrices? Can we do multiplication in e.g.O(m/poly(log n) + n)
,where
m
= number of nonzeroes?•
Can the algebraic matrix multiplication algorithms (Strassen, etc.) be applied to this problem?20-b
Conclusion
A preprocessing/multiplication algorithm for matrix-vector multiplication that builds on lookup table techniques
•
Is there a preprocessing/multiplication algorithm for sparse matrices? Can we do multiplication in e.g.O(m/poly(log n) + n)
,where
m
= number of nonzeroes?•
Can the algebraic matrix multiplication algorithms (Strassen, etc.) be applied to this problem?•
Can our ideas be extended to achieve non-subtractive Boolean matrix multiplication ino(n
3/ log
2n)
?20-c
Thank you!
21