(1)

Ran Raz (Weizmann & IAS)

joint work with

Anat Ganor (Weizmann), Gillat Kol (IAS)

Exponential Separation of Information and Communication

(2)

Alice has a string x, chosen according to a publicly known distribution.

She wants to send a message to Bob, so that Bob can retrieve x with high probability.

How many bits does Alice need to send?

Answer [Shannon ‘48, Huffman ‘52]: about H(x) bits, the entropy of x!

Message Compression

[Figure: Alice (A) sends Bob (B) a single message of about H(x) bits [S48,H52].]
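The message-compression statement can be checked numerically on a toy example. The following is a minimal sketch, assuming a hypothetical four-symbol distribution `mu` (not from the talk): it builds a Huffman code and compares the expected codeword length with the entropy. The function names are illustrative only.

```python
import heapq
import math


def entropy(probs):
    """Shannon entropy in bits of a distribution given as {symbol: prob}."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def huffman_code_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code of `probs`."""
    # Heap entries: (probability, unique tie-breaker, {symbol: depth so far}).
    heap = [(p, idx, {sym: 0}) for idx, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]


# Hypothetical publicly known distribution (an assumption for illustration).
mu = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_code_lengths(mu)
expected_len = sum(mu[s] * lengths[s] for s in mu)
print(entropy(mu), expected_len)
```

For this dyadic distribution the two quantities coincide; in general the expected Huffman length lies between H(x) and H(x) + 1.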

(3)

Interactive-Compression Problem [BBCR ‘09]:

What if Alice and Bob engage in an

interactive communication protocol?

Can the protocol’s transcript be compressed to its "information content"?

Message-Compression Theorem [S48,H52]: a single message can be compressed to its information content.

(4)

Alice gets x, Bob gets y, where (x, y) are chosen according to a joint distribution μ.

They want to compute f(x, y). (f, μ are publicly known)

How many bits do they need to exchange?

Communication Complexity

[Yao]

[Figure: Alice (A) holds x, Bob (B) holds y; they exchange messages to compute f(x, y).]

(5)

Players can use private and public random strings.

They have to compute f(x, y) with high probability (over (x, y) and the random strings).

Communication complexity of a protocol π, CC(π, μ): maximal number of bits exchanged, over (x, y) and the random strings.

(6)

Can every protocol be compressed to its

information content?

But how do we measure information content

of a protocol…???

Answer: Information Complexity! […,CSWY ‘01, BYJKS ‘04, BBCR ‘09,…]

(7)

X, Y, Z: random variables

I(X; Y | Z) = the information that a player who knows Z learns about X by seeing Y (on average)

Conditional Mutual Information
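Conditional mutual information can be computed directly from a joint distribution. The following is a small sketch, assuming the standard formula I(X; Y | Z) = Σ p(x,y,z) log [ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]; the function name and the two toy distributions are illustrative assumptions, not part of the talk.

```python
import math
from collections import defaultdict


def conditional_mutual_information(joint):
    """I(X; Y | Z) in bits, for a joint distribution {(x, y, z): prob}."""
    p_z = defaultdict(float)
    p_xz = defaultdict(float)
    p_yz = defaultdict(float)
    for (x, y, z), p in joint.items():
        p_z[z] += p
        p_xz[(x, z)] += p
        p_yz[(y, z)] += p
    total = 0.0
    for (x, y, z), p in joint.items():
        if p > 0:
            total += p * math.log2(p * p_z[z] / (p_xz[(x, z)] * p_yz[(y, z)]))
    return total


# X is a uniform bit, Y = X, Z is constant: seeing Y reveals all of X.
z_uninformative = {(0, 0, 0): 0.5, (1, 1, 0): 0.5}
# X is a uniform bit, Y = X, Z = X: a player who knows Z learns nothing new.
z_reveals_x = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

print(conditional_mutual_information(z_uninformative))  # 1 bit
print(conditional_mutual_information(z_reveals_x))      # 0 bits
```

The second case is the situation exploited later in the talk: when the observer's own knowledge already determines X, a message about X carries no information.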

(8)

Information Complexity

IC(π, μ) = I(Π; Y | X) + I(Π; X | Y)

(X, Y) are distributed according to μ.

Π denotes the transcript of π.

I(Π; Y | X): what Alice learns about Y from Π.

I(Π; X | Y): what Bob learns about X from Π.

IC(π, μ): the amount of information that the players learn about each other's inputs.

(9)

IC(π, μ) = I(Π; Y | X) + I(Π; X | Y)

Information Complexity

I(Π; Y | X): what Alice learns about Y from Π.

I(Π; X | Y): what Bob learns about X from Π.

IC(f, μ) = inf { IC(π, μ) : π computes f over μ }

(10)

 ,: CC(,)  IC(,). Hence,

 f,: CC(f,)  IC(f,) .

Other Direction:

CC(,) can be much larger than IC(,).

The Compression Problem:

Given a protocol , can  be simulated by ’, s.t. CC(’,)  IC(,) ?

Information vs. Communication

(11)

A protocol π with CC(π, μ) = C and IC(π, μ) = I (I ≤ C) can be simulated by π′, with

1) CC(π′, μ) ≤ Õ(√(C·I)) [BBCR ‘09]

(at most quadratic compression)

2) over product distributions:

CC(π′, μ) ≤ O(I · polylog(C)) [BBCR ‘09]

3) for protocols with const number of rounds:

CC(π′, μ) ≈ I [BR ‘10]

4) CC(π′, μ) ≤ 2^{O(I)} [Bra ‘12]

Hence, ∀ f, μ: CC(f, μ) ≤ 2^{O(IC(f, μ))}

(12)

 f,: IC(f,)  CC(f,) 

No gap between IC and CC was known

[KLLRX ‘12]: Almost all known techniques for lower bounding CC give the same bound for IC New functions and techniques may be needed

Information vs. Communication

(13)

First gap between IC and CC: ∀ k, ∃ explicit f, μ, such that

IC(f, μ) = O(k)

CC(f, μ) ≥ 2^{Ω(k)}

Hence: Interactive protocols cannot always be compressed to their information content!

By [Bra ‘12]: largest possible gap.

Input size: triple exponential in k.

Protocol with IC O(k) has double exp CC.

(14)

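Alice gets x_1,…,x_n. Bob gets y_1,…,y_n (each pair (x_j, y_j) chosen independently according to μ).

Goal: compute f(x_1, y_1),…,f(x_n, y_n).

CC(f^n, μ^n) = CC of best protocol that answers correctly with high probability on each coordinate.

Does CC(f^n, μ^n) ≈ n · CC(f, μ)?

Equivalently, let AC(f, μ) = lim_{n→∞} CC(f^n, μ^n) / n (the amortized CC).

Does AC(f, μ) ≈ CC(f, μ)?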

(15)

Connections known for a long time: [R ‘94], [CSWY ‘01], [BBCR ‘09]

[BR ‘10]: AC(f, μ) = IC(f, μ)

Our result hence gives the first gap between AC and CC: ∀ k, ∃ explicit f, μ, such that

AC(f, μ) = O(k)

CC(f, μ) ≥ 2^{Ω(k)}

Hence: a strong direct-sum theorem for CC does not hold!

(16)

(17)

Multilayer = 100k layers.

Depth: c = 2^{4^k} multilayers.

Alice gets x, Bob gets y.

Each input contains a bit for every vertex v in the tree.

Complete Binary Tree

[Figure: complete binary tree divided into multilayers of 100k layers each.]

(18)

Non-noisy vertices: choose x_v = y_v at random.

Noisy vertices: choose x_v, y_v independently at random.

Randomly select a multilayer i.

Set all vertices in multilayers < i to be non-noisy.

Set all vertices in multilayer i to be noisy.

Select x_v, y_v for multilayers 1,…,i.

The Distribution μ: First i Multilayers

(19)

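[Figure: the tree with the noisy multilayer i and the last multilayer c marked; example vertex labels x_v = 0, y_v = 1. Legend: non-noisy: x_v = y_v; noisy.]

Typical Vertices

Alice owns odd layers.

Bob owns even layers.

The player who owns v dictates the correct child of v: if Alice owns v and x_v = 0, left is correct, otherwise right (and likewise for Bob with y_v).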

(20)

Typical Vertices

v in multilayer > i is typical if the sub-path in multilayer i leading to v has ≥ 80% correct children.

[Figure: the tree below the noisy multilayer i, with the typical leaves marked. Legend: non-noisy: x_v = y_v; noisy: x_v, y_v iid.]

(21)

Non-noisy vertices: choose x_v = y_v at random.

Noisy vertices: choose x_v, y_v independently at random.

i is randomly chosen.

Multilayers < i are non-noisy.

Multilayer i is noisy.

Multilayers > i: Bursting noise: set all non-typical vertices to be noisy.

v is typical if the sub-path in multilayer i leading to v has ≥ 80% correct children.

The Distribution μ

[Figure: the tree with the noisy multilayer i, a layer j below it, and the typical leaves marked.]

(22)

Players’ Goal: Find and output the same typical leaf.

Remarks: c = 2^{4^k}, multilayer = 100k layers.

The Bursting Noise Game

[Figure: the tree with the noisy multilayer i and the typical leaves marked. Legend: non-noisy: x_v = y_v; noisy: x_v, y_v iid.]

(23)

Remarks: c = 2^{4^k}, multilayer = 100k layers.

Typical leaves are rare (a random leaf is typical with probability exponentially small in k).

If the players know i, they can solve the game by exchanging O(k) bits.

A binary search can find i by exchanging O(log c) bits.

That’s why we set c to be so large (c = 2^{4^k}).

The Bursting Noise Game

(24)

Why the CC seems high (≥ 2^{Ω(k)}):

Hard to guess a typical leaf (success probability exponentially small in k).

Hard to find i (this also requires high CC).

Full lower bound proof is hard… makes use of the bursting noise.

The Bursting Noise Game

(25)

Update μ: After selecting (x, y):

- Randomly select a bit b.

- For every leaf v, add b to y_v (mod 2).

Define: f(x, y) = b

Remark:

- For any typical leaf, x_v ⊕ y_v = b (as we started with x_v = y_v).

- For any non-typical leaf, x_v ⊕ y_v is random.

Hence, to determine b it suffices to find a typical leaf.

Converting to a Function
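The remark about leaves can be verified by enumerating the possible leaf bits. This is a toy sketch with hypothetical helper names; it only checks the XOR identity at a single leaf and does not model the tree.

```python
from itertools import product


def leaf_xor_after_update(x_v, y_v, b):
    """Apply the update y_v <- y_v + b (mod 2) and return x_v XOR y_v."""
    return x_v ^ ((y_v + b) % 2)


def xor_counts(pairs, b):
    """Count how often the XOR is 0 and 1 over the given (x_v, y_v) pairs."""
    counts = {0: 0, 1: 0}
    for x_v, y_v in pairs:
        counts[leaf_xor_after_update(x_v, y_v, b)] += 1
    return counts


# Typical (non-noisy) leaf: x_v = y_v before the update.
typical_pairs = [(0, 0), (1, 1)]
# Non-typical (noisy) leaf: all four pairs are equally likely.
noisy_pairs = list(product((0, 1), repeat=2))

for b in (0, 1):
    print(b, xor_counts(typical_pairs, b), xor_counts(noisy_pairs, b))
```

At a typical leaf the XOR always equals b, while at a noisy leaf it is 0 or 1 equally often for either value of b, so it carries no information about b.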

(26)

(27)

Starting from the root, on every vertex v, the player who owns v sends her bit w.p. 90% and sends the negation w.p. 10%.

Both players move to the child indicated by the sent bit.

When reaching a leaf v, players output x_v ⊕ y_v.

By Chernoff, they reach a typical leaf w.h.p.

First Attempt

[Figure: the tree with the noisy multilayer i and the typical leaves marked. Example: at a vertex with x_v = 0, the bit 0 is sent w.p. 90%; at a vertex with y_v = 1, the bit 0 is sent w.p. 10%.]
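The Chernoff step can be checked with exact binomial tails. The sketch below assumes that each of the 100k steps inside the noisy multilayer goes to the correct child independently: with probability 0.9 under the protocol above, and with probability 0.5 for a path chosen without looking at the inputs. The function names are hypothetical.

```python
from math import comb


def prob_at_least(n, p, threshold):
    """Exact P[Binomial(n, p) >= threshold]."""
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(threshold, n + 1)
    )


def typical_probability(k, p_correct):
    """Probability that at least 80% of the 100k steps in the noisy
    multilayer go to the correct child, assuming independent steps that
    are each correct with probability p_correct."""
    n = 100 * k
    return prob_at_least(n, p_correct, 80 * k)


for k in (1, 2, 5):
    # Protocol (90% truthful bits) versus a blind guess of the path.
    print(k, typical_probability(k, 0.9), typical_probability(k, 0.5))
```

The first column tends to 1 and the second decays exponentially in k, matching the two claims that the protocol reaches a typical leaf w.h.p. and that typical leaves are hard to guess.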

(28)

Why 90% and not 100%??

If the players always send their true bit, i is revealed, thus IC ≥ H(i) = log(c) = 4^k.

(29)

Intuitively, a player learns very little information at non-noisy vertices, since both inputs are the same.

W.h.p. the players reach only a few noisy vertices (those in multilayer i).

Problem: with small probability, the players reach a non-typical vertex at the end of multilayer i, and then reach additional noisy vertices.

Solution: we add a mechanism that aborts if the players reach a non-typical vertex.

Why the IC Seems Low
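The intuition can be made concrete for a single vertex owned by Alice. The sketch below makes a simplifying assumption that is not part of the actual analysis: Bob is taken to know whether the vertex is noisy, so the computation is just I(M; x_v | y_v) for the sent bit M under each of the two vertex types. All names are hypothetical.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a bit that equals 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def info_revealed_to_bob(joint_xy, flip_prob):
    """I(M; x_v | y_v) in bits, where M is x_v negated w.p. flip_prob
    and joint_xy = {(x_v, y_v): prob}."""
    total = 0.0
    for y in (0, 1):
        p_y = sum(p for (_, yy), p in joint_xy.items() if yy == y)
        if p_y == 0:
            continue
        # P[M = 1 | y_v = y], averaging over x_v given y_v.
        p_m1 = sum(
            (p / p_y) * ((1 - flip_prob) if x == 1 else flip_prob)
            for (x, yy), p in joint_xy.items()
            if yy == y
        )
        # I(M; X | Y=y) = H(M | Y=y) - H(M | X, Y=y); the last term is h(flip_prob).
        total += p_y * (binary_entropy(p_m1) - binary_entropy(flip_prob))
    return total


non_noisy = {(0, 0): 0.5, (1, 1): 0.5}                       # x_v = y_v
noisy = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}       # x_v, y_v iid

print(info_revealed_to_bob(non_noisy, 0.1))  # essentially 0 bits
print(info_revealed_to_bob(noisy, 0.1))      # about 0.531 bits
```

Under this assumption a non-noisy vertex reveals nothing, and each noisy vertex reveals 1 − h(0.1) ≈ 0.531 bits, so the information is governed by the number of noisy vertices visited.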

(30)

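For a vertex v owned by Alice, let P_v be the distribution of the bit sent by Alice. P_v is either (0.9, 0.1) or (0.1, 0.9).

Let Q_v be Bob’s best estimation for P_v, given his input and the transcript so far.

It is known that the information Bob learns equals the expected sum, over these vertices, of the divergence D(P_v ‖ Q_v).

We prove that this is at most the expected sum of D(P_v ‖ Q′_v), where Q′_v is any distribution known to Bob at that time.

Let Q′_v be (0.9, 0.1) or (0.1, 0.9), based on Bob’s bit. Then D(P_v ‖ Q′_v) = 0 on every non-noisy vertex.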

(31)

(32)

Rectangle based methods:

Lower bound CC using properties of large rectangles.

[KLLRX ‘12]: Almost all known rectangle methods give the same bound for IC.

Our contribution: New rectangle method powerful enough to separate IC and CC.

Idea: Measure the size of a rectangle relative to a new (arbitrary) distribution.

(33)

Definition: (f, μ) has the (ε, δ)-Relative Discrepancy Property (RDP) if ∃ distribution ρ on (x, y) s.t. ∀ rectangle R with ρ(R) > δ:

μ(R ∩ f^{-1}(0)) ≥ (½ − ε) ρ(R), and similarly for f^{-1}(1).

(34)

Theorem: If (f, μ) has the (ε, δ)-RDP then

CC(f, μ) ≥ log(…)

(35)

Theorem: If (f, μ) has the (ε, δ)-RDP then

CC(f, μ) ≥ log(…)

We prove the (ε, δ)-RDP with …

(36)

The Distribution ρ

RD: μ(R ∩ f^{-1}(0)) ≥ (½ − ε) ρ(R)

Randomly select multilayer i, set its vertices to noisy.

Set all vertices before multilayer i to non-noisy.

Set all vertices after multilayer i to noisy.

ρ is "close" to μ but "simpler":

Before multilayer i: Same as μ.

After multilayer i: Only differs on typical vertices.

(37)

Fix randomly and an assignment for vertices in

multilayers . Let be the restriction of the rectangle to inputs with .

Let

Show that is almost uniformly distributed.

Fix . Then the typical vertices are almost uniformly distributed.

If the measures of the rectangle according to μ and ρ are significantly different, a non-negligible amount of information is known about the inputs on typical vertices.

Use Shearer’s inequality to show that the

information known about is very large, and hence is small.

(38)

(39)

First gap between IC and CC: ∀ k, ∃ explicit f, μ, such that

IC(f, μ) = O(k)

CC(f, μ) ≥ 2^{Ω(k)}

Hence: Interactive protocols cannot always be compressed to their information content.

By [Bra ‘12]: largest possible gap.

By [BR ‘10]: Implies that a strong direct-sum theorem for CC does not hold.