Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required)

(1)

Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required)

Ryan Williams

January 17, 2007

0-0

(2)

Introduction

Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing

1

(3)

Introduction

Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing How fast can

n × n

matrix-vector multiplication be?

1-a

(4)

Introduction

n × n

Θ(n

²

)

steps just to read the matrix!

1-b

(5)

Introduction

n × n

Θ(n

²

)

steps just to read the matrix!

Main Result: If we allow

O(n

^2+ε

)

preprocessing, then matrix-vector multiplication over any finite semiring can be done in

O(n

²

/(ε log n)

²

)

^.

1-c

(6)

Better Algorithms for Matrix Multiplication

Three of the major developments:

2

(7)

Better Algorithms for Matrix Multiplication

•

Arlazarov et al., a.k.a. “Four Russians” (1960’s):

O(n

³

/ log n)

^operations

Uses table lookups

Good for hardware with short vector operations as primitives

2-a

(8)

Better Algorithms for Matrix Multiplication

• O(n

³

/ log n)

^operations

•

Strassen (1969):

n

^{log 7}^{log 2}

= O(n

^2.81

)

^operations

Asymptotically fast, but overhead in the big-O

Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006)

2-b

(9)

Better Algorithms for Matrix Multiplication

• O(n

³

/ log n)

^operations

•

Strassen (1969):

n

^{log 7}^{log 2}

= O(n

^2.81

)

^operations

Asymptotically fast, but overhead in the big-O

Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006)

•

Coppersmith and Winograd (1990):

O(n

^2.376

)

^operations

Not yet practical

2-c

(10)

Focus: Combinatorial Matrix Multiplication Algorithms

3

(11)

Focus: Combinatorial Matrix Multiplication Algorithms

•

Also called non-algebraic; let’s call them non-subtractive E.g. Four-Russians is combinatorial, Strassen isn’t

3-a

(12)

Focus: Combinatorial Matrix Multiplication Algorithms

•

Also called non-algebraic; let’s call them non-subtractive E.g. Four-Russians is combinatorial, Strassen isn’t

More Non-Subtractive Boolean Matrix Mult. Algorithms:

•

Atkinson and Santoro:

O(n

³

/ log

^3/2

n)

^{on a}

(log n)

^{-word RAM}

•

Rytter and Basch-Khanna-Motwani:

O(n

³

/ log

²

n)

^{on a RAM}

•

Chan: Four Russians can be implemented on

O(n

³

/ log

²

n)

on a pointer machine

3-b

(13)

Main Result

The

O(n

³

/ log

²

n)

matrix multiplication algorithm can be “de-amortized”

4

(14)

Main Result

The

O(n

³

/ log

²

n)

More precisely, we can:

Preprocess an

n × n

^matrix

A

over a finite semiring in

O(n

^2+ε

)

Such that vector multiplications with

A

can be done in

O(n

²

/(ε log n)

²

)

4-a

(15)

Main Result

The

O(n

³

/ log

²

n)

Preprocess an

n × n

^matrix

A

O(n

^2+ε

)

A

can be done in

O(n

²

/(ε log n)

²

)

Allows for “non-subtractive” matrix multiplication to be done on-line

4-b

(16)

Main Result

The

O(n

³

/ log

²

n)

Preprocess an

n × n

^matrix

A

O(n

^2+ε

)

A

can be done in

O(n

²

/(ε log n)

²

)

Allows for “non-subtractive” matrix multiplication to be done on-line Can be implemented on a pointer machine

4-c

(17)

Main Result

The

O(n

³

/ log

²

n)

Preprocess an

n × n

^matrix

A

O(n

^2+ε

)

A

can be done in

O(n

²

/(ε log n)

²

)

Allows for “non-subtractive” matrix multiplication to be done on-line Can be implemented on a pointer machine

This Talk: The Boolean case

4-d

(18)

Preprocessing Phase: The Boolean Case

Partition the input matrix

A

into blocks of

⌈ε log n⌉ × ⌈ε log n⌉

^size:

A_1,1

A_2,1

A ⁿ

εlog n,1

A_1,2 A_1, ⁿ

εlog n

A ⁿ

εlog n,_ε_{log n}ⁿ

· · ·

...

· · · · · ·

...

A_i,j

ε log n

A =

5

(19)

Preprocessing Phase: The Boolean Case

Build a graph

G

^{with parts}

P

₁

, . . . , P

n/(ε log n),

Q

₁

, . . . , Q

n/(ε log n) 2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

2^{ε log n} 2^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

Each part has 2^{ε log n} vertices, one for each possible ε log n vector

6

(20)

Preprocessing Phase: The Boolean Case

Edges of

G

^: Each vertex

v

^{in each}

P

_i has exactly one edge into each

Q

_j

2

^{ε log n}

P _i 2

^{ε log n}

Q _j

v

A

_j,i

v

7

(21)

Preprocessing Phase: The Boolean Case

Edges of

G

^: Each vertex

v

^{in each}

P

_i has exactly one edge into each

Q

_j

2

^{ε log n}

P _i 2

^{ε log n}

Q _j

v

A

_j,i

v

Time to build the graph:

n

ε log n ·

n

ε log n · 2

ε log n

· ( ε log n) ² = O(n ^2+ε )

number of Q_j

number of P_i

number of nodes in P_i

matrix-vector mult of A_j,i and v

7-a

(22)

How to Do Fast Vector Multiplications

Let

v

be a column vector. Want:

A · v

^.

8

(23)

How to Do Fast Vector Multiplications

Let

v

be a column vector. Want:

A · v

^.

(1)

^{Break up}

^v

^into

^{ε log n}

sized chunks:

v =







v

₁

v

₂

...

v

ⁿ

εlog n







8-a

(24)

How to Do Fast Vector Multiplications

(2)

^{For each}

i = 1, . . . , n/(ε log n)

^{, look up}

v

_i ⁱⁿ

P

_i^.

9

(25)

How to Do Fast Vector Multiplications

(2)

^{For each}

i = 1, . . . , n/(ε log n)

^{, look up}

v

_i ⁱⁿ

P

_i^.

2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

v₁

v₂

vn/(ε log n)

Takes

O(n) ˜

^time.

9-a

(26)

How to Do Fast Vector Multiplications

(2)

^{For each}

i = 1, . . . , n/(ε log n)

^{, look up}

v

_i ⁱⁿ

P

_i^.

2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

v₁

v₂

vn/(ε log n)

Takes

O(n) ˜

^time.

10

(27)

How to Do Fast Vector Multiplications

(3)

Look up the neighbors of

v

_i, mark each neighbor found.

11

(28)

How to Do Fast Vector Multiplications

(3)

v

2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

v₁

v₂

vn/(ε log n)

A_1,1 · v1

A_2,1 · v¹

A ⁿ

ε log n,1 · v¹

11-a

(29)

How to Do Fast Vector Multiplications

(3)

v

2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

v₁

v₂

vn/(ε log n)

A_1,2 · v2

A_2,2 · v²

A ⁿ

ε log n,2 · v2

12

(30)

How to Do Fast Vector Multiplications

(3)

v

2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

v₁

v₂

vn/(ε log n)

A_1, ⁿ

ε log n · vn/(ε log n)

A_2, ⁿ

ε log n · vn/(ε log n)

A ⁿ

ε log n,_{ε log n}ⁿ · vn/(ε log n)

Takes

O

n ε log n

2

13

(31)

How to Do Fast Vector Multiplications

(4)

^{For each}

^Q

j, define

v

_j^′ ^{as the} ^OR of all marked vectors in

Q

_j

2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

v₁

v₂

vn/(ε log n)

⇒

v

₁^′

v

₂^′

v

n/(ε log n)^′

∨

Takes

O(n ˜

^1+ε

)

^time

14

(32)

How to Do Fast Vector Multiplications

(4)

^{For each}

^Q

j, define

v

_j^′ ^{as the} ^OR of all marked vectors in

Q

_j

2^{ε log n}

P

₂

... ... ...

...

P

ⁿ

ε log n

P

₁ ²^{ε log n}

Q

₁

Q

₂

Q

ⁿ

ε log n

v₁

v₂

vn/(ε log n)

⇒

v

₁^′

v

₂^′

v

n/(ε log n)^′

∨

Takes

O(n ˜

^1+ε

)

^time

15

(33)

How to Do Fast Vector Multiplications

(5)

^Output

^v

^′

^:=







v

₁^′

v

₂^′

...

v

^′ ⁿ

εlog n







. Claim:

v

^′

= A · v

^.

16

(34)

How to Do Fast Vector Multiplications

(5)

^Output

^v

^′

^:=







v

₁^′

v

₂^′

...

v

^′ ⁿ

εlog n







. Claim:

v

^′

= A · v

^.

Proof: By definition,

v

_j^′

= W

n/(ε log n)

i=1

A

_j,i

· v

_i^.

16-a

(35)

How to Do Fast Vector Multiplications

(5)

^Output

^v

^′

^:=







v

₁^′

v

₂^′

...

v

^′ ⁿ

εlog n







. Claim:

v

^′

= A · v

^.

v

_j^′

= W

n/(ε log n)

i=1

A

_j,i

· v

_i^.

Av =







A

_1,1

· · · A

1,n/(ε log n)

... . . . ...

A

n/(ε log n),1

· · · A

n/(ε log n),n/(ε log n)













v

₁

...

v

ⁿ

εlog n







16-b

(36)

How to Do Fast Vector Multiplications

(5)

^Output

^v

^′

^:=







v

₁^′

v

₂^′

...

v

^′ ⁿ

εlog n







. Claim:

v

^′

= A · v

^.

v

_j^′

= W

n/(ε log n)

i=1

A

_j,i

· v

_i^.

Av =







A

_1,1

· · · A

1,n/(ε log n)

... . . . ...

A

n/(ε log n),1

· · · A

n/(ε log n),n/(ε log n)













v

₁

...

v

ⁿ

εlog n







= ( W

n/(ε log n)

i=1

A

_1,i

· v

_i

, . . . , W

n/(ε log n)

i=1

A

1,n/(ε log n)

· v

_i

) = v

^′^.

16-c

(37)

Some Applications

Can quickly compute the neighbors of arbitrary vertex subsets

Let

A

be the adjacency matrix of

G = (V, E)

^.

Let

v

_S be the indicator vector for a

S ⊆ V

^.

17

(38)

Some Applications

Let

A

G = (V, E)

^.

Let

v

S ⊆ V

^.

Proposition:

A · v

_S is the indicator vector for

N (S)

, the neighborhood of

S

^.

17-a

(39)

Some Applications

Let

A

G = (V, E)

^.

Let

v

S ⊆ V

^.

Proposition:

A · v

_S is the indicator vector for

N (S)

, the neighborhood of

S

^.

Corollary: After

O(n

^2+ε

)

preprocessing, can determine the neighborhood of any vertex subset in

O(n

²

/(ε log n)

²

)

^time.

(One level of BFS in

o(n

²

)

^time)

17-b

(40)

Graph Queries

Corollary: After

O(n

^2+ε

)

preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all in

O(n

²

/(ε log n)

²

)

^time.

18

(41)

Graph Queries

Corollary: After

O(n

^2+ε

)

O(n

²

/(ε log n)

²

)

^time.

Proof: Let

S ⊆ V

^.

18-a

(42)

Graph Queries

Corollary: After

O(n

^2+ε

)

O(n

²

/(ε log n)

²

)

^time.

Proof: Let

S ⊆ V

^.

S

^is ^dominating

⇐⇒ S ∪ N (S) = V

^.

18-b

(43)

Graph Queries

Corollary: After

O(n

^2+ε

)

O(n

²

/(ε log n)

²

)

^time.

Proof: Let

S ⊆ V

^.

S

^is ^dominating

⇐⇒ S ∪ N (S) = V

^.

S

^is independent

⇐⇒ S ∩ N (S) = ∅

_.

18-c

(44)

Graph Queries

Corollary: After

O(n

^2+ε

)

O(n

²

/(ε log n)

²

)

^time.

Proof: Let

S ⊆ V

^.

S

^is ^dominating

⇐⇒ S ∪ N (S) = V

^.

S

^is independent

⇐⇒ S ∩ N (S) = ∅

_.

S

^{is a} vertex cover

⇐⇒ V − S

is independent.

18-d

(45)

Graph Queries

Corollary: After

O(n

^2+ε

)

O(n

²

/(ε log n)

²

)

^time.

Proof: Let

S ⊆ V

^.

S

^is ^dominating

⇐⇒ S ∪ N (S) = V

^.

S

^is independent

⇐⇒ S ∩ N (S) = ∅

_.

S

^{is a} vertex cover

⇐⇒ V − S

is independent.

Each can be quickly determined from knowing

S

^and

N (S)

^.

18-e

(46)

Triangle Detection

19

(47)

Triangle Detection

Problem: Triangle Detection

Given: Graph

G

^{and vertex}

i

^.

Question: Does

i

participate in a 3-cycle, a.k.a. triangle?

19-a

(48)

Triangle Detection

Given: Graph

G

^{and vertex}

i

^.

Question: Does

i

Worst Case: Can take

Θ(n

²

)

time to check all pairs of neighbors of

i

19-b

(49)

Triangle Detection

Given: Graph

G

^{and vertex}

i

^.

Question: Does

i

Θ(n

²

)

i

Corollary: After

O(n

^2+ε

)

preprocessing on

G

, can solve triangle detection for arbitrary vertices in

O(n

²

/(ε log n)

²

)

^time.

19-c

(50)

Triangle Detection

Given: Graph

G

^{and vertex}

i

^.

Question: Does

i

Θ(n

²

)

i

Corollary: After

O(n

^2+ε

)

preprocessing on

G

, can solve triangle detection for arbitrary vertices in

O(n

²

/(ε log n)

²

)

^time.

Proof: Given vertex

i

^{, let}

S

be its set of neighbors (gotten in

O(n)

^time).

S

is not independent

⇐⇒ i

participates in a triangle.

19-d

(51)

Conclusion

A preprocessing/multiplication algorithm for matrix-vector multiplication that builds on lookup table techniques

20

(52)

Conclusion

•

Is there a preprocessing/multiplication algorithm for sparse matrices? Can we do multiplication in e.g.

O(m/poly(log n) + n)

^,

where

m

= number of nonzeroes?

20-a

(53)

Conclusion

• O(m/poly(log n) + n)

^,

where

m

•

Can the algebraic matrix multiplication algorithms (Strassen, etc.) be applied to this problem?

20-b

(54)

Conclusion

• O(m/poly(log n) + n)

^,

where

m

•

Can the algebraic matrix multiplication algorithms (Strassen, etc.) be applied to this problem?

•

Can our ideas be extended to achieve non-subtractive Boolean matrix multiplication in

o(n

³

/ log

²

n)

^?

20-c

(55)

Thank you!

21