• No results found

New Constructions and Practical Applications for Private Stream Searching (Extended Abstract)

N/A
N/A
Protected

Academic year: 2021

Share "New Constructions and Practical Applications for Private Stream Searching (Extended Abstract)"

Copied!
19
0
0

Loading.... (view fulltext now)

Full text

(1)

1

New Constructions and Practical Applications for

Private Stream Searching

(Extended Abstract)

John Bethencourt

CMU

Dawn Song

CMU

Brent Waters

SRI

?

? ?

?

(2)

2

Searching for Information

Too much on-line info to download

– WWW pages

– Message boards

– Web email, USENET

Search

– User specifies search criteria

● Textual keywords

– Server returns relevant content

I'm looking for ...

(3)

3

Motivation for Private Searches

What if keywords are secret?

– Personal privacy

(example query: “lice removal”)

– Commercial interests

(“takeover bid”)

– Intelligence gathering

(“Al Qaeda”)

We want

private

searches

– Client gives server encrypted query

– Server runs search algorithm, returns data – Client recovers matching content

– Server does not know what client searched for

They'll know what I'm looking for!

(4)

4

Practical Example:

Google News Alerts

Google News

– Google continuously crawls 4,500 news sources – Estimated 135,000 news articles each day

– Alerts service

● User registers search keywords

(5)

5

Private

Google News Alerts

What about a

private

alerts service?

– User registers encrypted search keywords – Periodically receives matching articles

– Google remains oblivious!

Our techniques enable private alerts service

– Previous schemes not practical for this scenario – Our results later in talk

Google doesn't know what I want, but

they can still give it to me!

(6)

6

Contributions

New scheme for private stream searching

Several novel constructions

– Reconstructing matching documents by solving linear

systems

– Encrypted Bloom filters

Practical analysis demonstrating feasibility

– Orders of magnitude more efficient than previous work

(7)

7 See spot run.

Basic Architecture:

Overview

?

? ?

rover, spot See spot run. Foo bar who. Zim blip fop. See spot run. Oops that is. The red fox.

?

client server
(8)

8

Basic Architecture

Step 1: Query Construction

Client runs

QueryConstruction

algorithm to

produce encrypted query

– Queries are disjunctions of textual keywords

(simplest case)

– Variations and extensions possible

Sends encrypted query to server

rover, spot

raw keyword list encrypted query QueryConstruction

(9)

9

Basic Architecture

Step 2: Executing the Search

Server gets encrypted query

Server runs

StreamSearch

algorithm

– Processes documents to produce encrypted buffers – Server remains unaware of search keywords

?

See spot

run.

StreamSearch

opaque, encrypted results documents

(not encrypted)

(10)

10

Basic Architecture

Step 3: Recovering the documents

Client gets encrypted results

Runs

FileReconstruction

algorithm

– Uses private key

– Recovers original matching documents

?

spotSee

run.

FileReconstruction encrypted results

client's private key

(11)

11

Scheme Highlights:

Homomorphic Encryption

How is this accomplished?

– Scheme in full paper

– Just highlights, general flavor here

Homomorphic encryption

– Paillier cryptosystem – Public key system

– Additive homomorphism:

E

a

⋅

E

b

 =

E

a

b

– Allows rudimentary computations on encrypted data – Lets server meaningfully use encrypted query

(12)

12

rover, spot

Queries are encrypted hash table

– Each cell is encryption of zero or one – Probabilistic encryption – Homomorphic encryption E(1) E(0) E(0) E(0) E(1) E(0) E(0) h(“rover”) h(“spot”)

Scheme Highlights:

Encrypted Queries

raw keyword list hash function

(13)

13

See

spot

run.

Scheme Highlights:

Conducting the Search

E(1) E(0) E(0) E(0) E(1) E(0) E(0)

Server can then obtain encryption of number

of keyword matches

– Used to bootstrap rest of algorithm

E(0 + 1 + 0) = E(c)

Document doesn't match: Get an encryption of 0.

Document matches:

Get an encryption of the number matching keywords, c.

encrypted query

(14)

14 Singular with exponentially low probability with respect to size! [Komlós, Tao-Vu STOC 2005]

Scheme Highlights:

Recovering the Documents

Client decrypts returned

data with private key

Can construct linear system

in the matching documents

– Special metadata returned

from server

– Encrypted Bloom filter

Solves linear system for

documents

– System solveable if random

0,1 matrix is non-singular

– Almost always the case

d1 d2 d3 d4

=

1 0 1 1 0 1 1 0 1 0 0 1 1 1 0 1

−1 ⋅

a1 a2 a3 a4

contents of decrypted buffer random (0,1) matrix
(15)

15

Communication Complexity

Performance / robustness tradeoffs in size of

queries and results

New scheme

– O(n) in the size of the returned content for the bulk

content of the matching files

– Some metadata is O(n log(t/n)) or O(n log n)

Previous scheme

– O(n log n) for bulk content

(16)

16

Practical Analysis:

Private News Alerts

Private news alerts scenario

– 135,000 news articles searched each day – Retrieve results four times per day

– 5KB article size (text only, compressed) – Up to 2000 matching articles per day

Performance of our scheme

– Query size: 4-30MB

– Server processing time: 500ms per article

– Communication size: 500KB – 7MB per time period – Client reconstruction time: 0 – 15min

(17)

17

Communication Overhead:

Our Scheme

Low overhead

– Factor of 1.2 before inflation due to Paillier – 2.4 after

(18)

18

Communication Overhead:

Previous Scheme

Previous scheme impractical

– Communication overhead: 40 times – Reconstruction time: 4.7 hours

(19)

19

Conclusion

Efficient system for private stream searching

– Low communication overhead

● Factor of 2.4 over actual matching documents ● Previous system up to factor of 40

– Extensions also developed

● Arbitrary length documents ● More complex queries

● Other stuff

For more info see full version of paper

– http://www.cs.cmu.edu/~bethenco/search.ps – Includes details of actual scheme!

http://www.cs.cmu.edu/~bethenco/search.ps

References

Related documents

(E) qRT-PCR analysis of p70s6K1 mRna expression in MCF7 and MDa-MB-231 cells cotransfected with miR-145-5p mimics or miR-nC, and circZnF609 vector or control vector. (F) Western

Effects of dietary cobalt source and concentration on performance, vitamin B12 status, and ruminal and plasma metabolites in growing and finishing steers. A theoretical

different characters of the original parent plants A , a and B , b , four constant combinations are possible, and thus the hybrid produces the corresponding four forms of germ

Percentage inhibition (probit scale) of mycelial growth rates (mm/day) of the North Carolina (NC) and Louisiana (LA) isolates of Lagenidium giganteum on nutrient (PYG) agar

CALDECOTT and SMITH (1952a) have obtained somewhat similar results following heat treatments of dormant lbarley seeds before and after X-radiation. They reported

As shown in table 3, the activity of antioxidant enzyme Cu-Zn SOD level in the erythrocyte lysate from type 2 diabetic patients with retinopathy was significantly lower

aesthetics are important to management. It would be useful to continue monitoring these shrubs for several more years. Unfortunately, due to mining activities, they

With the great growth of the Web, it has become a massive challenge for the all-purpose single process crawlers (A crawler is a program that downloads and stores Web