
woodruff_2014.ppt


(1)

Turnstile Streaming Algorithms Might as Well Be Linear Sketches

(2)

Turnstile Streaming Model

• Underlying n-dimensional vector x initialized to $0^n$

• Long stream of updates $x \leftarrow x + e_i$ or $x \leftarrow x - e_i$ for standard unit vector $e_i$

• At end of the stream, $x \in \{-m, -m+1, \ldots, m-1, m\}^n$ for some bound $m \le \mathrm{poly}(n)$

• Output an approximation to f(x) whp

• Goal: use as little space (in bits) as possible

Only consider 1 pass over data
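
As a concrete illustration of the model above, the following minimal Python sketch applies a turnstile stream to an explicitly stored vector. Storing x in full is the baseline that small-space algorithms try to beat. The function name `apply_stream` and the `(index, sign)` encoding of updates are assumptions made for this example only.

```python
def apply_stream(n, updates):
    """Return the vector x obtained by applying turnstile updates to 0^n.

    Each update is a pair (i, sign) with sign in {+1, -1}, meaning
    x <- x + sign * e_i (0-indexed coordinates; an assumed encoding).
    """
    x = [0] * n
    for i, sign in updates:
        if sign not in (1, -1):
            raise ValueError("turnstile updates are +e_i or -e_i")
        x[i] += sign
    return x


stream = [(0, +1), (2, +1), (0, +1), (2, -1), (1, -1)]
x = apply_stream(3, stream)
print(x)  # [2, -1, 0]
```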

(3)

Example: Norms

• Suppose you want $|x|_p^p = \sum_{i=1}^n |x_i|^p$

• Want Z for which $(1-\epsilon)\,|x|_p^p \le Z \le (1+\epsilon)\,|x|_p^p$

• Many applications

• p = 2

– Geometry, linear algebra

• p = 1

(4)

Algorithm for 2-Norm

• Let $r = 1/\epsilon^2$

• Choose an r x n matrix A of i.i.d. N(0,1/r) normal random variables (with precision 1/poly(n))

• Maintain Ax in the stream

• Output $|Ax|_2^2$
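
The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the matrix A is stored explicitly as floats (the slide's 1/poly(n) precision and any derandomization are ignored), and the class name `GaussianSketch`, the seeds, and the test stream are hypothetical.

```python
import random


class GaussianSketch:
    """Maintains Ax for an r x n matrix A of i.i.d. N(0, 1/r) entries."""

    def __init__(self, n, r, seed=0):
        rng = random.Random(seed)
        sigma = (1.0 / r) ** 0.5  # standard deviation of N(0, 1/r)
        self.A = [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(r)]
        self.Ax = [0.0] * r  # sketch of the all-zeros vector

    def update(self, i, sign):
        """Process x <- x + sign * e_i by adding sign * (column i of A)."""
        for j, row in enumerate(self.A):
            self.Ax[j] += sign * row[i]

    def estimate(self):
        """Return |Ax|_2^2, an estimate of |x|_2^2."""
        return sum(v * v for v in self.Ax)


n, r = 50, 400
sketch = GaussianSketch(n, r, seed=1)
x = [0] * n  # kept only to compare against the true value
rng = random.Random(2)
for _ in range(2000):
    i, sign = rng.randrange(n), rng.choice((1, -1))
    x[i] += sign
    sketch.update(i, sign)

true_value = sum(v * v for v in x)
print(true_value, round(sketch.estimate(), 1))
```

With r = 400 rows the relative error is typically a few percent, in line with r = 1/ε² for ε around 0.05 to 0.1.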
(5)

Algorithm for 1-Norm [Indyk]

• Let $r = 1/\epsilon^2$

• Choose an r x n matrix A of i.i.d. Cauchy random variables (with precision 1/poly(n))

• Maintain Ax in the stream

• Output $\mathrm{median}(|(Ax)_1|, \ldots, |(Ax)_r|)$

• Proof: 1-stability of Cauchy distribution
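
A minimal Python sketch of this estimator follows, under the same simplifying assumptions as any toy version: A is stored explicitly in floating point, and the class name `CauchySketch`, the seeds, and the test stream are hypothetical. By 1-stability each coordinate of Ax is distributed as $|x|_1$ times a standard Cauchy variable, and the median of the absolute value of a standard Cauchy is 1, which is why the median is used as the output.

```python
import math
import random
import statistics


class CauchySketch:
    """Maintains Ax for an r x n matrix A of i.i.d. standard Cauchy entries."""

    def __init__(self, n, r, seed=0):
        rng = random.Random(seed)
        # tan(pi * (U - 1/2)) is standard Cauchy when U is uniform on (0, 1).
        self.A = [[math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
                  for _ in range(r)]
        self.Ax = [0.0] * r

    def update(self, i, sign):
        """Process x <- x + sign * e_i."""
        for j, row in enumerate(self.A):
            self.Ax[j] += sign * row[i]

    def estimate(self):
        """Return median(|(Ax)_1|, ..., |(Ax)_r|), an estimate of |x|_1."""
        return statistics.median(abs(v) for v in self.Ax)


n, r = 20, 1001
sketch = CauchySketch(n, r, seed=3)
x = [0] * n  # kept only to compare against the true value
rng = random.Random(4)
for _ in range(300):
    i, sign = rng.randrange(n), rng.choice((1, -1))
    x[i] += sign
    sketch.update(i, sign)

true_value = sum(abs(v) for v in x)
print(true_value, round(sketch.estimate(), 1))
```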

(6)

Common Features

Algorithms for 2-norm and 1-norm have the following form:

1. Choose a random matrix A independent of x

2. Maintain Ax in the stream

3. Output a function of Ax

Question (?!): does the optimal algorithm for approximating any function in the streaming model have this form?

Some functions f(x) may be weird: What is $x_{x_{x_1}}$?

The state of these algorithms only depends on the underlying vector x, not on the specific stream of updates
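
The dependence of the state on x alone can be checked directly for a linear sketch. In the hedged example below, two different streams with the same net vector leave the sketch in the same state; the small integer matrix A and both streams are made up for illustration, and integer entries are used so the comparison is exact.

```python
import random


def sketch_state(A, updates):
    """Return Ax after processing updates (i, sign), starting from x = 0."""
    state = [0] * len(A)
    for i, sign in updates:
        for j, row in enumerate(A):
            state[j] += sign * row[i]
    return state


rng = random.Random(0)
n, r = 6, 3
A = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(r)]

stream_1 = [(0, +1), (3, -1), (0, +1)]
# Same net vector 2*e_0 - e_3, reached with reordering and a cancelling detour.
stream_2 = [(3, -1), (5, +1), (0, +1), (5, -1), (0, +1)]

print(sketch_state(A, stream_1) == sketch_state(A, stream_2))  # True
```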

(7)

Our Results

• Yes, up to a factor of log n

• Theorem: for any relation f(x) for $x \in \{-m, -m+1, \ldots, m\}^n$, there is a family F of O(n log m) matrices A with polynomially bounded integer entries, and a correct (whp) algorithm in the streaming model which:

1. uniformly samples an $A \in F$

2. maintains Ax in the stream

3. outputs a function of Ax

Logarithm of number of possibilities (“states”) of Ax, for $x \in \{-m, -m+1, \ldots, m\}^n$, is optimal up to a log n factor

- For earlier examples, the matrices in F are n log m samples from matrices of i.i.d. normals or Cauchys

- Can show n log m samples suffice by Newman’s Theorem

(8)

Consequences

Alice: $a \in \{0,1\}^n$

Create stream s(a)

Bob: $b \in \{0,1\}^n$

Create stream s(b)

Lower Bound Technique

1. Run Alg on s(a), transmit state of Alg(s(a)) to Bob

2. Bob computes Alg(s(a), s(b))

(9)

Consequences

Alice: $a \in \{0,1\}^n$

Create stream s(a) with underlying vector x(a)

Bob: $b \in \{0,1\}^n$

Our main theorem implies:

It suffices to look at simultaneous communication complexity of g: weaker public-coin model in which Alice and Bob simultaneously send a message to a referee

• Use public coin to sample A

• Alice sends A*x(a) and Bob sends A*x(b) to referee

• Referee uses linearity to compute A*(x(a) + x(b))

If referee can solve g(a,b), then space of Alg at least the simultaneous communication complexity of g
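
The simultaneous protocol can be simulated in a few lines. This is a toy sketch: the shared seed stands in for the public coin, the inputs a and b are used directly as the underlying vectors x(a) and x(b) (an assumption, since the reduction from inputs to streams is problem-specific), and the helper names `matvec` and `referee` are hypothetical.

```python
import random


def matvec(A, x):
    """Return the matrix-vector product Ax."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def referee(msg_alice, msg_bob):
    """Combine the two messages using linearity: Ax(a) + Ax(b)."""
    return [u + v for u, v in zip(msg_alice, msg_bob)]


public_coin = random.Random(42)  # shared randomness visible to all parties
n, r = 8, 3
A = [[public_coin.randint(-3, 3) for _ in range(n)] for _ in range(r)]

a = [1, 0, 1, 1, 0, 0, 1, 0]  # Alice's input, used here as x(a)
b = [0, 0, 1, 0, 1, 0, 1, 1]  # Bob's input, used here as x(b)

msg_alice = matvec(A, a)  # sent to the referee
msg_bob = matvec(A, b)    # sent to the referee, with no interaction
combined = referee(msg_alice, msg_bob)

# The referee now holds the sketch of the combined vector x(a) + x(b).
print(combined == matvec(A, [u + v for u, v in zip(a, b)]))  # True
```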

(10)

The log n Factor Loss

• Main Theorem: The logarithm of the number of states of Ax, as x ranges over $\{-m, -m+1, \ldots, m\}^n$, plus the amount of randomness to store A, is optimal up to a log n factor

• The log n loss is necessary

(11)

Non-Uniformity Restriction

• Careful wording: “exists a family F of O(n log m) matrices A with polynomially bounded integer entries…”

• Algorithm is non-uniform

– Each A is hardwired

– Output of each state for each A also hardwired

• Alternatively, allow algorithm to use more space to process a stream update, provided it only retains Ax and its randomness

(12)

Comment on the Model

• For each random seed, algorithm is a deterministic automaton with a finite number of states

• Main theorem only requires correctness for $x \in \{-m, -m+1, \ldots, m\}^n$

It counts the number of states as x varies in this range

• While processing the stream, may have $|x|_\infty > m$

(13)

Related Work

• Ganguly

– Deterministic algorithms

– Specific to heavy hitters problem

– Shows algorithm might as well be a linear sketch over the reals

(14)

Talk Outline

Proof of Main Theorem

1. Reduce optimal automaton to path-independent automata

2. From path-independent automata to linear sketches

(15)

Stream Automaton for Fixed Randomness

[Figure: automaton with a Start state and transitions labeled by updates +e1, -e1, +e2, …, -en, +en, +e5, …; the vector $0^n$ appears in two different states]

Want each state of the automaton to only depend on x, not how it got there

Path-Independent Automaton

• Each $x \in \mathbb{Z}^n$ in a unique state

For each randomness, can we modify the automaton to make it path-independent?

(17)

Path-Reversible Automaton

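• Path-reversible: $\forall$ states s, if σ is a stream $(+e_{i_1}, -e_{i_2}, -e_{i_3}, \ldots, +e_{i_r})$ of updates, resulting in a state t, then from t the stream $\sigma^{-1} = (-e_{i_r}, \ldots, +e_{i_3}, +e_{i_2}, -e_{i_1})$ returns us to s

[Figure: states s1, s2, s3, s4 with transitions labeled +e2, -e1, +e5 and reverse transitions labeled -e5, +e1, -e2]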

(18)

Strategy

Arbitrary Automaton

Path-Reversible Automaton

Path-Independent Automaton

For stream σ, $\mathrm{freq}(\sigma) \in \mathbb{Z}^n$ is “net update” to each coordinate

Idea:

1. if in a state s, and update by a stream σ, with freq(σ) = 0, answers ought to be similar

2. collapse all states s, s’ for which s+σ = s’ and freq(σ) = 0 for some stream σ

(19)

Zero-Frequency Graph

• Directed graph G = (V,E)

• V = states of old automaton $A_{\mathrm{old}}$ (for fixed randomness)

• $(s,t) \in E$ if there is a stream σ of length at most L with s+σ=t and freq(σ) = 0

– Finite bound on L

• Terminal equivalence class: strongly connected component with no outgoing edge

– Path in G lands in a terminal equivalence class
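
To make the definition concrete, here is a small Python sketch that finds the terminal equivalence classes of a directed graph given as an adjacency dictionary. The five-state graph is hypothetical, its zero-frequency edges are assumed to have been computed already, and every node is assumed to appear as a key. The simple reachability-based method is chosen for clarity, not efficiency.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start (including start)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def terminal_classes(graph):
    """Return the strongly connected components with no outgoing edge."""
    reach = {v: reachable(graph, v) for v in graph}
    classes = set()
    for v in graph:
        # v lies in a terminal class iff everything it reaches can reach it back.
        if all(v in reach[u] for u in reach[v]):
            classes.add(frozenset(reach[v]))
    return classes


# Hypothetical zero-frequency graph: {s2, s3} is strongly connected but has
# an outgoing edge to s4, so only {s4, s5} is terminal.
G = {
    "s1": ["s2"],
    "s2": ["s3", "s4"],
    "s3": ["s2"],
    "s4": ["s5"],
    "s5": ["s4"],
}

print(sorted(sorted(c) for c in terminal_classes(G)))  # [['s4', 's5']]
```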

(20)

New Transition Function

• Suppose in terminal equivalence class C

• Given an update $e_i$

• Let $v \in C$ be an arbitrary node

• Compute $v+e_i$ using transition function of $A_{\mathrm{old}}$

• Walk from $v+e_i$ until reach terminal equivalence class C’

– C’ is unique

• Does not depend on choice of v

• Only one terminal equivalence class reachable

(21)

[Figure: nodes u and v in a terminal equivalence class, each with an outgoing $+e_i$ transition toward a terminal equivalence class]

(22)

Output Function of $A_{\mathrm{new}}$

• In each terminal equivalence class C, sample node u from the stationary distribution of a random walk in C (add self-loops)

– Output of $A_{\mathrm{new}}$ on C = Output of $A_{\mathrm{old}}$ on u

• If v is starting vertex of $A_{\mathrm{old}}$,

– take a random walk in G from v

– let starting vertex of $A_{\mathrm{new}}$ be terminal equivalence class C reached

(23)

Correctness

• Let Π be an arbitrary distribution on streams σ

• Choose fixed randomness so that $A_{\mathrm{old}}$ is correct on Π’:

– Long sequence of zero streams,

– Followed by σ sampled from Π,

– Followed by long sequence of zero streams

• Output of $A_{\mathrm{new}}$ on Π statistically close to output of $A_{\mathrm{old}}$ on Π’

(24)

Arbitrary Automaton

Path-Reversible Automaton

Path-Independent Automaton

(25)

Talk Outline

Proof of Main Theorem

1. Reduce optimal automaton to path-independent automata

2. From path-independent automata to linear sketches

(26)

Path Independent Automata and Submodules

• Let o be the initial state

• $M = \{x \in \mathbb{Z}^n \text{ such that } x \text{ in } o\}$

• $0^n \in M$

• If $x \in M$, then $-x \in M$

• If $x, y \in M$, then $x+y \in M$

• M is a free submodule of $\mathbb{Z}^n$ (a lattice)

• M has a basis

(27)

• States of automaton are elements (cosets) of the quotient module $\mathbb{Z}^n/M$

• Space of automaton is log of the number of cosets containing an $x \in \{-m, \ldots, m\}^n$

• $\mathbb{Z}^n/M$ examples:

– $\mathbb{Z}^n/e_1$ is free. It remembers all but first coordinate

– $\mathbb{Z}^n/(2e_1, 2e_2, \ldots, 2e_n)$ not free. It remembers coordinate parities
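
The two examples can be checked by brute force for tiny parameters. The sketch below counts how many cosets contain some x in $\{-m, \ldots, m\}^n$, using a canonical label for each coset; the helper `count_states` and the choice n = 3, m = 2 are illustrative assumptions.

```python
from itertools import product
from math import log2


def count_states(n, m, coset_of):
    """Count distinct cosets x + M hit by x in {-m, ..., m}^n.

    coset_of maps a vector x to a canonical label of its coset.
    """
    return len({coset_of(x) for x in product(range(-m, m + 1), repeat=n)})


def drop_first(x):
    """Coset label for M generated by e_1: forget the first coordinate."""
    return x[1:]


def parities(x):
    """Coset label for M generated by 2e_1, ..., 2e_n: coordinate parities."""
    return tuple(v % 2 for v in x)


n, m = 3, 2
print(count_states(n, m, drop_first))      # 25, i.e. (2m+1)^(n-1)
print(count_states(n, m, parities))        # 8, i.e. 2^n
print(log2(count_states(n, m, parities)))  # 3.0 bits of space
```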

(28)

Smith Normal Form

• Smith Normal Form: $\exists$ a basis $y_1, \ldots, y_n$ of $\mathbb{Z}^n$ for which the generators of M are $q_i \cdot y_i$ for $i = 1, \ldots, r$, where $q_i \mid q_{i+1}$ are positive integers, and r = rank(M)

• If $q_1 = \ldots = q_s = 1$ but $q_{s+1} > 1$, the generators of $\mathbb{Z}^n/M$ are $y_{s+1} + M, \ldots, y_n + M$
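
A small worked instance may help fix the notation. The choice of n = 2 and of the submodule M below is purely illustrative and does not come from the talk.

```latex
\[
\begin{aligned}
M &= \{(x_1, x_2) \in \mathbb{Z}^2 : x_1 \equiv x_2 \pmod 2\} \\
y_1 &= (1, 1), \quad y_2 = (0, 1) \quad \text{form a basis of } \mathbb{Z}^2,
  \qquad (x_1, x_2) = x_1\, y_1 + (x_2 - x_1)\, y_2 \\
M &= \langle 1 \cdot y_1,\; 2 \cdot y_2 \rangle,
  \qquad q_1 = 1 \mid q_2 = 2, \qquad r = \operatorname{rank}(M) = 2 \\
s &= 1: \quad \mathbb{Z}^2 / M \text{ is generated by } y_2 + M,
  \qquad \mathbb{Z}^2 / M \cong \mathbb{Z}/2\mathbb{Z}
\end{aligned}
\]
```

In this instance the automaton has two states and remembers only the parity of $x_2 - x_1$.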

(29)

Remaining Issues

• Counting argument: if we replace Bx mod q with Bx, we get a linear sketch with a similar space complexity

• Issue: entries of B may be exponentially large

Compression: reduce coefficients of random linear

(30)

Applications and Open Questions

• Simpler proof of $\tilde{\Omega}(n^{1-2/p})$ bit lower bound for estimating $F_p$, p > 2

– No communication complexity

• Many dimension lower bounds known for sketching norms over the reals

– $F_p$, matrix norms, adaptive sketching
