Scalable Private Database Querying for Arbitrary Formulas

Academic year: 2021

(1)

Vladimir Kolesnikov (Bell Labs)

Seung Geol Choi, Angelos Keromytis, Fernando Krell, Tal Malkin, Vasilis Pappas and Binh Vo (Columbia)

Wesley George (UToronto)

(2)

Outline

Problem description

The cost of secure computation and how to scale

Our system

(3)

COPYRIGHT © 2011 ALCATEL-LUCENT. ALL RIGHTS RESERVED.

IARPA SPAR: Security and Privacy Assurance Research

(4)

Required features

100M records, 10TB DB

Preserve query and data privacy

Robust query support:

select * where NAME=Bob AND AGE >20

Boolean query expressions (including at least three conjunctions)

Range queries and inequalities for integer numeric, date/time, etc.

Matching of keywords “close” to a specified value (stemming)

Text fields with many keywords (e.g. 100s)

Matching of values with wildcards

Matching of values with a specified subsequence

m-of-n conjunctions

Ranking of results

…

Allowed overhead: up to 2-10x compared to MySQL

(5)

Basic Architecture

[Figure: architecture diagram; labels: Database Owner, Encrypted Database, Client, Index Server (S)]

S holds permuted encrypted indexed DB

(6)

Overview:

1.  Alice prepares encrypted version C’ of C

2.  Sends encrypted form x’ of her input x

3.  Allows Bob to obtain encrypted form y’ of his input y

4.  Bob can compute from C’,x’,y’ the “encryption” z’ of z=C(x,y)

5.  Bob sends z’ to Alice and she decrypts and reveals to him z
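The five steps above can be sketched for a single AND gate. This is a toy illustration only, not the system's implementation: the hash-based pad, the all-zero tag used to recognise the correct row, and every function name are assumptions made for the example. Step 3 would normally use oblivious transfer; here Bob is simply handed his label.

```python
import hashlib
import secrets

LABEL_LEN = 16           # bytes per wire label (assumed size)
TAG = b"\x00" * 16       # redundancy that lets the evaluator spot the right row


def _pad(key_a, key_b):
    """Derive a 32-byte pad from two wire labels (SHA-256 stands in for a cipher)."""
    return hashlib.sha256(key_a + key_b).digest()


def _xor(x, y):
    return bytes(a ^ b for a, b in zip(x, y))


def garble_and_gate():
    """Step 1 (Alice): pick two random labels per wire, encrypt the AND truth table."""
    wires = {w: (secrets.token_bytes(LABEL_LEN), secrets.token_bytes(LABEL_LEN))
             for w in ("x", "y", "z")}
    table = []
    for bx in (0, 1):
        for by in (0, 1):
            out_label = wires["z"][bx & by]
            table.append(_xor(_pad(wires["x"][bx], wires["y"][by]), out_label + TAG))
    secrets.SystemRandom().shuffle(table)  # hide which row encodes which inputs
    return wires, table


def evaluate(table, label_x, label_y):
    """Step 4 (Bob): with one label per input wire, exactly one row decrypts cleanly."""
    pad = _pad(label_x, label_y)
    for row in table:
        plain = _xor(pad, row)
        if plain[LABEL_LEN:] == TAG:
            return plain[:LABEL_LEN]
    raise ValueError("no row decrypted")


def decode(wires, label_z):
    """Step 5 (Alice): map the returned output label back to the bit z."""
    return wires["z"].index(label_z)
```

For inputs x and y, `decode(wires, evaluate(table, wires["x"][x], wires["y"][y]))` returns `x & y`, while Bob only ever sees random-looking labels.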

[Figure: example Boolean circuit built from AND, OR, and NOT gates, with Alice’s inputs and Bob’s inputs]

(7)

[Figure: example circuit of AND, OR, and NOT gates over Alice’s inputs and Bob’s inputs]
Secure Computation: Cost

Circuit encryption includes encrypting the truth table of each gate

For each gate of C, need to compute and send about 4 encryptions, one per truth-table row (AES needs 50-150 cycles to encrypt 128 bits)

Very fast for small problems

Does not scale to large functions
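A back-of-the-envelope sketch of this cost, using only the figures on the slide (4 encryptions per gate, 128-bit ciphertexts, 50-150 cycles per AES call). The gate counts and clock rate passed in are hypothetical inputs, and the function names are made up for the example.

```python
AES_BLOCK_BYTES = 16  # one 128-bit ciphertext per truth-table row


def garbling_cycles(num_gates, cycles_per_aes, encryptions_per_gate=4):
    """CPU cycles spent encrypting all gate truth tables."""
    return num_gates * encryptions_per_gate * cycles_per_aes


def garbled_bytes(num_gates, encryptions_per_gate=4):
    """Bytes that must be sent for the encrypted truth tables."""
    return num_gates * encryptions_per_gate * AES_BLOCK_BYTES


def garbling_seconds(num_gates, cycles_per_aes, clock_hz):
    """Wall-clock estimate on a single core running at clock_hz."""
    return garbling_cycles(num_gates, cycles_per_aes) / clock_hz
```

For a hypothetical circuit of 10^9 gates at 100 cycles per encryption, this gives 4 × 10^11 cycles and 64 GB of ciphertext, which is why evaluating a whole database scan inside one circuit is impractical.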

(8)

Secure Computation: how to scale

If OK to have some security loss (as efficiency tradeoff):

Identify privacy-critical subroutines and implement them securely

Insecure implementation of the rest

(9)

Natural Trade Offs

Deterministic encryption

Because of scale, comparison of encrypted values used in search must be very fast. Not clear how to approach with probabilistic encryption

Access patterns

Clearly not a bad leakage. Seems quite expensive to avoid, so natural to live with it.

(10)

Bloom Filter

Constant-time querying

Efficient storage (ca. 10 bits per keyword)

Fixed access pattern (same for both match and non-match)
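A minimal plaintext Bloom filter illustrating the three properties above. This is a sketch under assumptions: deriving the hash positions from SHA-256 with a counter prefix is a choice made for the example, not something the slides specify, and the false-positive estimate is the standard approximation.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def indexes(self, keyword):
        """k positions, each from SHA-256 of a counter byte plus the keyword."""
        return [
            int.from_bytes(
                hashlib.sha256(bytes([i]) + keyword.encode()).digest()[:8], "big"
            ) % self.num_bits
            for i in range(self.num_hashes)
        ]

    def add(self, keyword):
        for i in self.indexes(keyword):
            self.bits[i] = 1

    def __contains__(self, keyword):
        # Read all k positions before deciding (no early exit), so a match
        # and a non-match touch the same number of bits.
        return all([self.bits[i] for i in self.indexes(keyword)])


def false_positive_rate(bits_per_keyword, num_hashes):
    """Standard approximation (1 - e^(-k / bits_per_keyword))^k."""
    return (1.0 - math.exp(-num_hashes / bits_per_keyword)) ** num_hashes
```

With 10 bits per keyword and 7 hash functions, `false_positive_rate(10, 7)` comes out a little under 1%.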

Encrypted BF:

(11)

11 | Columbia U / Bell Labs

Occluded BF

Idea:

Mask BF with a (pseudo-)random pad

Let Client know the pad (via seed)

Then Client and Server run SFE for computing match, where C inputs pad. GC is very efficient: 10-20 gates per term, plus gates to implement formula.

Query: C sends Enc(kw), S computes match

OK for single keyword searches
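The masking idea can be sketched as follows. The pad generator (SHA-256 in counter mode) and all names are assumptions for illustration, and `match` is computed in the clear here; in the system it would be the function evaluated inside the garbled circuit, with S supplying the masked bits and C supplying the pad bits.

```python
import hashlib


def prg_bits(seed, n):
    """Expand a seed into n pseudo-random bits (illustrative PRG only)."""
    out = []
    counter = 0
    while len(out) < n:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        for byte in block:
            for shift in range(8):
                out.append((byte >> shift) & 1)
        counter += 1
    return out[:n]


def occlude(bf_bits, seed):
    """What S stores: the Bloom filter XORed with the pad."""
    pad = prg_bits(seed, len(bf_bits))
    return [b ^ p for b, p in zip(bf_bits, pad)]


def match(masked_bits, pad_bits_at_indexes, indexes):
    """AND over the unmasked bits at the queried positions.

    One XOR and one AND per position, which is consistent with a small,
    fixed number of gates per search term.
    """
    result = 1
    for idx, pad_bit in zip(indexes, pad_bits_at_indexes):
        result &= masked_bits[idx] ^ pad_bit
    return result
```

Without the seed, the masked array on its own looks pseudo-random to S, and the client never needs the full filter, only the pad bits at the positions it queries.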

(12)

DB Search

[Figure: DB records, S, C]

Solution: Evaluate via Secure Computation

(13)

Security Guarantee

We leak to S at most the following access patterns:

- the query pattern of a set of queries (e.g., S can distinguish between simple and complex queries)

- tree search pattern of each query

- returned records access pattern

Above types of leakage seem necessary to achieve efficient sublinear performance.

(14)

Advanced Queries Based on AND/OR formulas:

Range Queries

We cover the range of our data type with a collection of intervals

(15)

(16)

Advanced Queries Based on AND/OR formulas:

Range Queries

To search for any value within a range, we search for the smallest covering collection of intervals, using an OR formula
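One way to realise this, shown as a sketch: assume the covering intervals are dyadic (aligned, power-of-two length), which the slides do not state. Each record value is then indexed under every dyadic interval that contains it, and a range query becomes an OR over the intervals in the cover. All function names are hypothetical.

```python
def dyadic_cover(lo, hi):
    """Partition [lo, hi] (inclusive) into maximal aligned power-of-two intervals."""
    cover = []
    while lo <= hi:
        # Largest block size that is aligned at lo ...
        size = lo & -lo if lo > 0 else 1 << hi.bit_length()
        # ... shrunk until it no longer runs past hi.
        while lo + size - 1 > hi:
            size >>= 1
        cover.append((lo, lo + size - 1))
        lo += size
    return cover


def intervals_containing(value, bits):
    """All dyadic intervals of a bits-wide domain that contain value (one per level)."""
    out = []
    for level in range(bits + 1):
        start = (value >> level) << level
        out.append((start, start + (1 << level) - 1))
    return out


def range_match(value, lo, hi, bits):
    """OR formula: the record matches if any cover interval is among its keywords."""
    keywords = set(intervals_containing(value, bits))
    return any(interval in keywords for interval in dyadic_cover(lo, hi))
```

For example, `dyadic_cover(3, 12)` yields `[(3, 3), (4, 7), (8, 11), (12, 12)]`, so the query 3 ≤ x ≤ 12 becomes a four-term OR, and each record carries only `bits + 1` interval keywords per numeric field.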

(17)

Advanced Queries Based on AND/OR formulas:

Negations

Note that the set of points other than some fixed value has a small interval cover
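To see why the cover is small, here is a short derivation under the assumption of an n-bit unsigned domain and dyadic (aligned, power-of-two) intervals; the notation I_i(v) is introduced only for this sketch. Group every x ≠ v by the highest bit position i at which it differs from v:

```latex
\{0,\dots,2^n-1\}\setminus\{v\} \;=\; \bigcup_{i=0}^{n-1} I_i(v),
\qquad
I_i(v) \;=\; \Bigl\{\, x \;:\; \lfloor x/2^{i+1} \rfloor = \lfloor v/2^{i+1} \rfloor
\ \text{ and }\ \lfloor x/2^{i} \rfloor \neq \lfloor v/2^{i} \rfloor \Bigr\}
```

Each I_i(v) is a single aligned interval of length 2^i, and every x ≠ v falls in exactly one of them, so the negation x ≠ v becomes an OR over n intervals. For n = 4 and v = 5 the intervals are {4}, {6, 7}, {0, ..., 3}, and {8, ..., 15}.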

(18)
(19)

Experimental Results

(20)

Policy Compliance

GC is strategically at the center of our approach because it is easy to compose.

Requirement: secure policy checking:

Policy rejection should look like a query no-match to C and S

Implement the policy as a GC computation whose output is an input to the BF tree node GC computation.
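The composition can be sketched in the clear as below. Both functions stand in for garbled-circuit computations, and the example policy (rejecting queries that touch a forbidden field) is purely hypothetical; the slides do not describe concrete policies.

```python
def policy_circuit(query_fields, forbidden_fields):
    """Hypothetical policy: output 0 if the query touches any forbidden field."""
    return 0 if set(query_fields) & set(forbidden_fields) else 1


def node_match(bf_bits_at_indexes, policy_ok):
    """Stand-in for the BF tree node circuit, gated by the policy output bit."""
    result = policy_ok
    for bit in bf_bits_at_indexes:
        result &= bit
    return result
```

Because the policy bit is ANDed into the node result, a rejected query and a genuine no-match both produce 0, so the output alone does not tell the two cases apart.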

(21)

Subtlety 1: inexact data representation by BF

Let A, B, C collide under hash functions of BF, s.t. every index of C is an index for either A or B.

Then A ∧ B ⇒ C

Well-known “issue” – BF false positive

Does not reveal knowledge of underlying data, just representation.
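A toy demonstration of the collision condition. The index sets are hand-picked hypothetical hash outputs chosen so that every index of C is an index of A or B; they are not derived from any real hash function.

```python
# Hypothetical hash outputs with idx(C) a subset of idx(A) | idx(B).
IDX = {"A": {1, 4, 7}, "B": {2, 5, 9}, "C": {1, 5, 7}}


def build_bf(num_bits, index_sets):
    """Set the bits for every inserted keyword's index set."""
    bits = [0] * num_bits
    for indexes in index_sets:
        for i in indexes:
            bits[i] = 1
    return bits


def bf_contains(bits, indexes):
    """A keyword matches when all of its positions are set."""
    return all(bits[i] for i in indexes)
```

A filter built with `build_bf(12, [IDX["A"], IDX["B"]])` reports C as present even though C was never inserted, which is the A ∧ B ⇒ C implication at the level of the BF representation. Equivalently, any filter in which A matches and C does not cannot contain B.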

[Figure: labels A and B]

(22)

Subtlety 1: inexact data representation by BF

Let A, B, C collide under hash functions of BF, s.t. every index of C is an index for either A or B.

Then A ∧ ¬C ⇒ ¬B

Issue: learn B without querying, even in secure eval of A ∧ ¬C

Pertains to original data, not just BF representation

We calculate advantage    Adv  ≤​*(+*/,)↑+

[Figure: labels A and B]

(23)

0-1 Result Set Size Indistinguishability

Goal: hide from S whether there was a 0 or 1 match.

Example: S is an airline and C is a government agency querying for a person of interest (POI). Expect 0 hits; S learning of a match can cause panic.

Def 1: Consider probability of bad event, prove it’s small

(24)

0-1 Result Set Size Indistinguishability

Goal: hide from S whether there was a 0 or 1 match.

Def 2: If distinguishable, guarantee that D’s confidence is not very high

- if the a-priori probability of a 1-case is π, then conditioned on any possible view, the a-posteriori probability of a 1-case is at most (1+ε)π.

Solution: C adds p fake tree-traversal paths, where p is a random variable drawn from a distribution like this:

[Figure: distribution of p; axis label: N paths]
