Secure Mediation of Join Queries by Processing Ciphertexts
Joachim Biskup, Christian Tsatedem and Lena Wiese
Lehrstuhl VI, Fachbereich Informatik
Universität Dortmund, Germany
SECOBAP’07
Marmara Hotel, Istanbul, April 20, 2007
Overview
- Introduction and Problem Statement
- Encryption Scheme 1: Database-As-a-Service
- Encryption Scheme 2: Commutative Encryption
- Encryption Scheme 3: Homomorphic Encryption/Private Matching
- Comparison
- Conclusion
Introduction: Mediation
Basic mediation system
- A client directs a global query to a mediator
- The mediator gathers data by sending partial queries to datasources
- The mediator constructs a global result out of the partial results and sends it back to the client
[Figure: the client sends the global query to the mediator; the mediator sends partial queries 1, ..., n to Source 1, ..., Source n, receives partial results 1, ..., n, and returns the global result to the client.]
Introduction: Secure Mediation
Secure mediation with the Multimedia Mediator (MMM) system
- Altenschmidt/Biskup/Flegel/Karabulut, 2003
- Confederations of clients, mediators and datasources (flexible: contract-based, short-term, ...)
- Aims: anonymity of clients and confidentiality of data
Introduction: Secure Mediation
MMM Protocol
- Preparatory phase: Client acquires credentials (public key & properties) and identity certificates (public key & identity)
- Request phase:
  1. Client sends global query with appropriate credentials to mediator
  2. Mediator forwards credentials with partial queries to datasources
  3. Datasources execute access control based on client’s properties
  4. Datasources execute partial queries to get partial results
- Delivery phase:
  1. Datasources encrypt partial results
  2. Mediator computes encrypted global result and returns it to client
Introduction: Secure Mediation
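[Figure: MMM protocol overview. Legend: p = client’s properties, id = client’s identity, kpub = client’s public key. A certification authority issues the client’s credentials (kpub, p) and identity certificates (kpub, id). The client sends the global query q together with (kpub, p) to the mediator; the mediator sends partial queries q1, ..., qn with (kpub, p) to Source 1, ..., Source n. Each source applies the encryption scheme with kpub to its partial result R1, ..., Rn; the mediator combines them (SQL2 algebra) into the encrypted global result R for the client.]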
Problem Statement: Delivery Phase
How can the mediator compute a global result if it is not eligible to access the data in the partial results?
Previous solution: mobile code (Biskup/Sprick/Wiese, 2005)
- Mediator constructs executable that computes the global result
- Client executes mobile code on decrypted partial results
New solution: computation on encrypted data
- Delivery phase:
  1. Datasources encrypt partial results with appropriate encryption scheme
  2. Mediator computes encrypted global result from encrypted partial results
  3. Client decrypts global result (according to encryption scheme)
Notation
- One client, one mediator, two datasources S1 and S2
- Global query q: JOIN over two partial queries
- Partial queries q1 over relation R1 and q2 over relation R2
- Single, common attribute (“join attribute”) Ajoin
[Figure: the client sends q to the mediator; the mediator sends q1 to Source 1, which holds R1(...,Ajoin,...), and q2 to Source 2, which holds R2(...,Ajoin,...).]
q = "select * from R1, R2 where R1.Ajoin=R2.Ajoin"
q1 = "select * from R1"
q2 = "select * from R2"
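As a plaintext baseline for the notation above, the following minimal sketch shows what the global query q computes from the two partial results when no encryption is involved. The relation contents and the function names `partial_query` and `global_join` are illustrative assumptions, not part of the talk.

```python
def partial_query(relation):
    """q1 / q2: "select * from Ri" returns every tuple of the relation."""
    return list(relation)


def global_join(r1, r2, join_attr="Ajoin"):
    """q: equi-join of the two partial results on the common join attribute."""
    by_value = {}
    for t2 in r2:
        by_value.setdefault(t2[join_attr], []).append(t2)
    return [(t1, t2) for t1 in r1 for t2 in by_value.get(t1[join_attr], [])]


# Illustrative relations (assumed values).
R1 = [{"A1": "alpha", "Ajoin": 10}, {"A1": "beta", "Ajoin": 10}, {"A1": "gamma", "Ajoin": 15}]
R2 = [{"A1": "delta", "Ajoin": 12}, {"A1": "epsilon", "Ajoin": 15}, {"A1": "zeta", "Ajoin": 15}]

baseline = global_join(partial_query(R1), partial_query(R2))
for t1, t2 in baseline:
    print(t1, t2)
```

In the secure setting the mediator must obtain the same pairs without ever seeing `R1` or `R2` in the clear, which is what the three encryption schemes address.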
Encryption Scheme 1: DAS Model
“Database As a Service” (Hacıgümüş et al., 2002)
- Data owner outsources data to service provider in encrypted format
- Partitioning of attribute domains (“bucketization”)
- One distinct index value for each partition of a domain
- One query executed on index values at service provider site (server query: superset of exact result)
- Second query executed on data owner site (client query: exact result)
Encryption Scheme 1: DAS Model
Delivery Phase based on “Database As a Service”
- Datasources encrypt partial results
- Datasources define partitions (“buckets”) and index values of join attribute
- Client constructs server query for mediator
- Mediator executes server query on encrypted partial results (→ encrypted superset of global result)
- Client decrypts mediator’s result and executes client query (→ global result)
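[Figure: Source 1 (R1(...,Ajoin,...)) and Source 2 (R2(...,Ajoin,...)) send their DAS-encrypted partial results R1 and R2, together with partitions and index values encrypted under kpub, to the mediator; the client sends the server query to the mediator, receives a superset of the global result, and applies the client query.]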
Encryption Scheme 2: Commutative Encryption
Commutative encryption function $f_e$ (as in Agrawal et al., 2003)
- Polynomial-time computable function (with key e) such that:
  1. [Commutativity] For all keys e1 and e2: $f_{e_1} \circ f_{e_2} = f_{e_2} \circ f_{e_1}$
  2. [Bijectivity] Each $f_e$ is a bijection
  3. [Invertibility] The inverse $f_e^{-1}$ is polynomial-time computable given e
  4. [Secrecy] Distributions of $\langle x, f_e(x), y, f_e(y) \rangle$ and $\langle x, f_e(x), y, z \rangle$ are indistinguishable
- Property 4 is indispensable for security proofs
- Use hash values of original inputs to ensure randomness (random oracle model)
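Properties 1 to 3 can be checked concretely on a toy function. The sketch below assumes modular exponentiation, f_e(x) = x^e mod P, as the commutative function, with P the Mersenne prime 2^521 − 1 chosen only because its primality is well known. The names `P`, `keygen`, `f` and `f_inv` are illustrative assumptions. Property 4 (secrecy) is not claimed for this toy group; a real deployment would need a group and parameters for which the security proofs apply.

```python
import math
import secrets

P = 2**521 - 1  # toy modulus (assumption): a well-known Mersenne prime


def keygen():
    """Pick a key e invertible modulo P - 1, so that f_e is a bijection on 1..P-1."""
    while True:
        e = secrets.randbelow(P - 3) + 2  # 2 <= e <= P - 2
        if math.gcd(e, P - 1) == 1:
            return e


def f(e, x):
    """Commutative encryption candidate: x^e mod P."""
    return pow(x, e, P)


def f_inv(e, y):
    """Inverse of f_e, computable from e via d = e^-1 mod (P - 1) (Python 3.8+)."""
    d = pow(e, -1, P - 1)
    return pow(y, d, P)


e1, e2 = keygen(), keygen()
x = 123456789
print(f(e1, f(e2, x)) == f(e2, f(e1, x)))  # commutativity
print(f_inv(e1, f(e1, x)) == x)            # invertibility
```

Commutativity holds here because both orders of application compute x^(e1·e2) mod P.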
Encryption Scheme 2: Commutative Encryption
Delivery Phase based on two-party protocol for join (Agrawal et al., 2003)
1. Tuples with same join attribute value are encrypted with client’s key
2. Join attribute values are hashed: $h(a)$
3. S1 has key e1 and encrypts hashes: $f_{e_1}(h(a))$; S2 has key e2 and encrypts hashes: $f_{e_2}(h(a'))$
4. Exchange and second encryption give $f_{e_2}(f_{e_1}(h(a)))$ and $f_{e_1}(f_{e_2}(h(a')))$
5. Mediator checks whether $f_{e_2}(f_{e_1}(h(a))) = f_{e_1}(f_{e_2}(h(a')))$
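[Figure: Source 1 (R1(...,Ajoin,...)) encrypts h(a) with e1 and the other source encrypts h(a’) with e2; after the exchange each value is additionally encrypted with the other key, and the doubly encrypted values are compared for equality; the tuples of R1 are encrypted with kpub for the client.]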
Encryption Scheme 3: Homomorphic Encryption
Additively homomorphic encryption function E (as in Freedman et al., 2004)
- Semantically secure public key encryption function such that:
  1. Given two ciphertexts $E(a)$ and $E(b)$, there is a way to efficiently compute $E(a + b)$.
  2. Given a constant $\gamma$ and a ciphertext $E(a)$, there is a way to efficiently compute $E(\gamma \cdot a)$.
- For a polynomial $P(x) = \sum_{k=0}^{n} c_k x^k$, given only encryptions $E(c_k)$ of the coefficients and a cleartext input value $a$ (such that $b = P(a)$), one can efficiently compute $E(b) = E(P(a)) = E(\sum_{k=0}^{n} c_k a^k)$
- For a constant value $\gamma$ and “payload data” $p$, one can likewise efficiently compute $E(\gamma \cdot P(a) + (a \| p))$, where $\|$ means concatenation
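The two homomorphic properties and the evaluation of an encrypted polynomial can be tried out with a toy additively homomorphic scheme. The sketch below assumes a Paillier-style construction with generator n + 1; the talk does not prescribe a concrete scheme. The modulus is built from two well-known Mersenne primes purely for convenience, which makes it trivially factorable and therefore insecure. All names (`encrypt`, `decrypt`, `add`, `scale`, `eval_encrypted_poly`) are illustrative.

```python
import secrets

# Toy key material (assumption): Mersenne primes, convenient but NOT secure.
P_, Q_ = 2**521 - 1, 2**607 - 1
N = P_ * Q_                  # public modulus
N2 = N * N
PHI = (P_ - 1) * (Q_ - 1)    # private key part
MU = pow(PHI, -1, N)         # private key part (Python 3.8+)


def encrypt(m):
    """E(m) = (1 + m*N) * r^N mod N^2 with fresh randomness r."""
    r = secrets.randbelow(N - 1) + 1
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """Recover m mod N from a ciphertext."""
    return (pow(c, PHI, N2) - 1) // N * MU % N


def add(c1, c2):
    """Property 1: from E(a) and E(b) compute E(a + b)."""
    return c1 * c2 % N2


def scale(c, gamma):
    """Property 2: from E(a) and a constant gamma compute E(gamma * a)."""
    return pow(c, gamma % N, N2)


def eval_encrypted_poly(enc_coeffs, a):
    """Horner's rule on ciphertexts: enc_coeffs[k] = E(c_k), returns E(P(a))."""
    acc = encrypt(0)
    for c in reversed(enc_coeffs):
        acc = add(scale(acc, a), c)
    return acc


# P(x) = 1 + 2x + 3x^2 evaluated at the cleartext input a = 4.
enc_coeffs = [encrypt(c) for c in (1, 2, 3)]
print(decrypt(eval_encrypted_poly(enc_coeffs, 4)))
```

Only `decrypt` needs the private values `PHI` and `MU`; the other functions use public information, which is why a party without the private key can still compute on the ciphertexts.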
Encryption Scheme 3: Homomorphic Encryption
Delivery Phase based on Private Matching protocol for intersection (Freedman et al., 2004)
1. Client’s kpub is public key of homomorphic encryption scheme
2. Each datasource has polynomial with join attribute values as roots (P1 and P2)
3. Mediator exchanges encrypted coefficients
4. Each datasource evaluates the other datasource’s encrypted polynomial on its own cleartext join attribute values, adding its tuples as payload data
5. Client decrypts data and finds either random values or matching tuples
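[Figure: Source 1 (R1(...,Ajoin,...)) and Source 2 (R2(...,Ajoin,...)) each send the coefficients of their polynomial (P1, P2), homomorphically encrypted under kpub, to the mediator, which exchanges them; each source returns homomorphically encrypted evaluations carrying its R1 or R2 tuples, and the mediator forwards them to the client.]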
Comparison
Assumptions
- Cryptographic strength of encryption schemes as stated by security proofs in original articles
- Cryptographic models are respected (random oracle model, large domains, ...)
- Datasources include only those data records in partial results for which access permissions could be established (based on client’s credentials)
Comparison
Client’s extra knowledge
- Database-As-a-Service: client retrieves superset of global result (extra data records) and partitions and index tables
- Commutative Encryption: no extra knowledge (client just retrieves exact global result)
- Homomorphic Encryption/Private Matching: client knows number of different join attribute values at each datasource
Comparison
Mediator’s extra knowledge
- All three Delivery Phase protocols: confidentiality ensured (data records are encrypted such that only the client can decrypt them)
- Database-As-a-Service: mediator learns sizes of partial results and size of server query result (upper bound of size of global result); partition sizes and domain sizes may be crucial (trade-off confidentiality/efficiency)
- Commutative Encryption: mediator learns number of join attribute values at each datasource and size of intersection (lower bound of size of global result)
- Homomorphic Encryption/Private Matching: mediator learns number of join attribute values at each datasource
Conclusion
Secure mediation with ciphertext processing
- Confidentiality of transmitted data
- Anonymity of client
- Reduced need for trust in the mediator (in comparison to mobile code)
- Reduced workload for client (in comparison to mobile code)
Appendix
- Encryption Scheme 1: Database-As-a-Service
- Encryption Scheme 2: Commutative Encryption
- Encryption Scheme 3: Homomorphic Encryption/Private Matching
Encryption Scheme 1: DAS Model
Delivery Phase based on “Database As a Service” (Hacıgümüş et al., 2002)
1. Each datasource Si partitions the active domain of the join attribute, $dom_{active}(R_i.A_{join})$, and assigns each partition an index value in an index table $ITable_{R_i.A_{join}}$.

R1:
  ...   R1.Ajoin   ...
  ...   100        ...
  ...   120        ...
  ...   150        ...
⇒ index table for R1 ($ITable_{R_1.A_{join}}$):
  partition     index value
  [100, 150)    1
  [150, 200]    2

R2:
  ...   R2.Ajoin   ...
  ...   120        ...
  ...   150        ...
  ...   210        ...
⇒ index table for R2 ($ITable_{R_2.A_{join}}$):
  partition     index value
  [100, 200)    11
  [200, 300]    12

2. Si encrypts Ri row-wise (with client’s keys) and adds a column for the index values

$R_1^S$:
  $R_1^S.t^S$      $R_1^S.A_{join}^S$
  1010101...       1
  0101011...       1
  010111...        2

$R_2^S$:
  $R_2^S.t^S$      $R_2^S.A_{join}^S$
  00101001...      11
  1101001...       11
  100111...        12
3. Si sends encrypted partial result and encrypted index table to the mediator: $\langle R_i^S, encrypt(ITable_{R_i.A_{join}}) \rangle$, where encrypt is encryption with client’s keys
4. Mediator forwards index tables to client
5. Client decrypts index tables and constructs:
   a. Server query $q^S$ (selects tuples from overlapping partitions in partial results)
      $R^C := q^S(R_1^S, R_2^S) = \sigma_{Cond^S}(R_1^S \times R_2^S)$
      where $Cond^S = (R_1^S.A_{join}^S = 1 \wedge R_2^S.A_{join}^S = 11) \vee (R_1^S.A_{join}^S = 2 \wedge R_2^S.A_{join}^S = 11) \vee (R_1^S.A_{join}^S = 2 \wedge R_2^S.A_{join}^S = 12)$
      (index tables for reference: $ITable_{R_1.A_{join}}$: [100, 150) → 1, [150, 200] → 2; $ITable_{R_2.A_{join}}$: [100, 200) → 11, [200, 300] → 12)
   b. Client query $q^C$ (postprocesses mediator’s result to find correct join tuples)
      $q^C(decrypt(R^C)) = \sigma_{R_1.A_{join} = R_2.A_{join}}(decrypt(R^C))$
7. Mediator executes $q^S$ on encrypted partial results; returns $R^C$ to client

$R^C$:
  $R_1^S.t^S$    $R_1^S.A_{join}^S$    $R_2^S.t^S$     $R_2^S.A_{join}^S$
  1010101...     1                     00101001...     11
  1010101...     1                     1101001...      11
  0101011...     1                     00101001...     11
  0101011...     1                     1101001...      11
  010111...      2                     00101001...     11
  010111...      2                     1101001...      11
  010111...      2                     100111...       12

8. Client decrypts $R^C$ and executes client query $q^C$

global result R:
  ...   R1.Ajoin   ...   R2.Ajoin   ...
  ...   120        ...   120        ...
  ...   150        ...   150        ...
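The DAS-based delivery phase can be replayed on the example values with a minimal sketch. It assumes each tuple consists only of its join attribute, and it uses a placeholder `seal`/`unseal` pair (JSON serialization, providing no confidentiality) where the protocol uses encryption with the client’s keys. All function and variable names are illustrative.

```python
import json


def seal(row):
    """Placeholder for encryption with the client's keys (assumption: no real crypto)."""
    return json.dumps(row)


def unseal(ciphertext):
    """Placeholder for decryption by the client."""
    return json.loads(ciphertext)


# Index tables from the example: (low, high, high_is_inclusive, index value).
ITABLE_R1 = [(100, 150, False, 1), (150, 200, True, 2)]
ITABLE_R2 = [(100, 200, False, 11), (200, 300, True, 12)]


def contains(partition, v):
    low, high, closed, _ = partition
    return low <= v < high or (closed and v == high)


def index_of(v, itable):
    for part in itable:
        if contains(part, v):
            return part[3]
    raise ValueError(f"{v} lies outside every partition")


def outsource(relation, itable):
    """Step 2: encrypt each row and attach the index value of its join attribute."""
    return [(seal(row), index_of(row["Ajoin"], itable)) for row in relation]


def cond_s(it1, it2):
    """Step 5a: pairs of index values whose partitions overlap."""
    pairs = set()
    for p1 in it1:
        for p2 in it2:
            start = max(p1[0], p2[0])  # overlap exists iff both contain this point
            if contains(p1, start) and contains(p2, start):
                pairs.add((p1[3], p2[3]))
    return pairs


def server_query(rs1, rs2, cond):
    """Step 7: the mediator filters on index values only."""
    return [(c1, c2) for c1, i1 in rs1 for c2, i2 in rs2 if (i1, i2) in cond]


def client_query(rc):
    """Step 8: the client decrypts and keeps the exact join tuples."""
    rows = [(unseal(c1), unseal(c2)) for c1, c2 in rc]
    return [(t1, t2) for t1, t2 in rows if t1["Ajoin"] == t2["Ajoin"]]


R1 = [{"Ajoin": 100}, {"Ajoin": 120}, {"Ajoin": 150}]
R2 = [{"Ajoin": 120}, {"Ajoin": 150}, {"Ajoin": 210}]
RS1, RS2 = outsource(R1, ITABLE_R1), outsource(R2, ITABLE_R2)
COND = cond_s(ITABLE_R1, ITABLE_R2)
RC = server_query(RS1, RS2, COND)
RESULT = client_query(RC)
print(len(RC), RESULT)
```

The gap between the size of `RC` and the size of `RESULT` is the superset overhead that the client has to filter out; coarser partitions reveal less to the mediator but enlarge this overhead.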
Encryption Scheme 2: Commutative Encryption
Delivery Phase based on two-party protocol for join (Agrawal et al., 2003)
1. Datasource Si generates key ei for commutative encryption function f; Si encrypts hash values of join attribute values (ideal hash function h)

R1:
  R1.A1   R1.Ajoin
  α       10
  β       10
  γ       15
⇒ encrypted hash values: $f_{e_1}(h(10)), f_{e_1}(h(15))$

R2:
  R2.A1   R2.Ajoin
  δ       12
  ε       15
  ζ       15
⇒ encrypted hash values: $f_{e_2}(h(12)), f_{e_2}(h(15))$

2. Si builds tuple sets for same join attribute value:
   $Tup_i(a) := \{t \in R_i \mid t[A_{join}] = a\}$
   $Tup_1(10) = \{\langle \alpha, 10 \rangle, \langle \beta, 10 \rangle\}$
   $Tup_1(15) = \{\langle \gamma, 15 \rangle\}$
   $Tup_2(12) = \{\langle \delta, 12 \rangle\}$
   $Tup_2(15) = \{\langle \varepsilon, 15 \rangle, \langle \zeta, 15 \rangle\}$
   Si encrypts them with client’s keys to ciphertexts $encrypt(Tup_i(a))$
3. Si sends set of messages $M_i := \{\langle f_{e_i}(h(a)), encrypt(Tup_i(a)) \rangle\}$ to mediator
   $M_1 := \{\langle f_{e_1}(h(10)), encrypt(Tup_1(10)) \rangle, \langle f_{e_1}(h(15)), encrypt(Tup_1(15)) \rangle\}$
   $M_2 := \{\langle f_{e_2}(h(12)), encrypt(Tup_2(12)) \rangle, \langle f_{e_2}(h(15)), encrypt(Tup_2(15)) \rangle\}$
4. Mediator exchanges message sets (sends M1 to S2 and M2 to S1)
5. For each $\langle f_{e_2}(h(a)), encrypt(Tup_2(a)) \rangle$ from S2: S1 computes $\langle f_{e_1}(f_{e_2}(h(a))), encrypt(Tup_2(a)) \rangle$ and sends it to the mediator
6. For each $\langle f_{e_1}(h(a)), encrypt(Tup_1(a)) \rangle$ from S1: S2 computes $\langle f_{e_2}(f_{e_1}(h(a))), encrypt(Tup_1(a)) \rangle$ and sends it to the mediator
7. Mediator looks for messages with identical first component $f_{e_1}(f_{e_2}(h(a))) = f_{e_2}(f_{e_1}(h(a)))$ (bijectivity and commutativity properties of f) and sends result messages $\langle encrypt(Tup_1(a)), encrypt(Tup_2(a)) \rangle$ to the client
8. Client decrypts result messages with his private keys and constructs result tuples over the attributes R1.A1, R1.Ajoin, R2.A1, R2.Ajoin
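The eight steps can be traced end to end with a minimal sketch. It assumes modular exponentiation modulo the Mersenne prime 2^521 − 1 as a stand-in commutative function (no secrecy claim for this toy group), SHA-256 as the hash h, and a placeholder `seal`/`unseal` pair (JSON, no confidentiality) in place of encryption with the client’s keys. Tuple names are spelled out in ASCII, and every identifier is illustrative.

```python
import hashlib
import json
import math
import secrets

P = 2**521 - 1  # toy modulus (assumption): a well-known Mersenne prime


def keygen():
    while True:
        e = secrets.randbelow(P - 3) + 2
        if math.gcd(e, P - 1) == 1:  # makes f_e a bijection on 1..P-1
            return e


def f(e, x):
    return pow(x, e, P)


def h(a):
    """Hash a join attribute value into the domain of f."""
    return int.from_bytes(hashlib.sha256(str(a).encode()).digest(), "big")


def seal(tuples):
    """Placeholder for encryption with the client's keys (no real crypto)."""
    return json.dumps(tuples)


def unseal(ciphertext):
    return json.loads(ciphertext)


def source_messages(relation, e):
    """Steps 1-3: group tuples by join value and emit (f_e(h(a)), encrypt(Tup(a)))."""
    groups = {}
    for a1, a in relation:
        groups.setdefault(a, []).append([a1, a])
    return [(f(e, h(a)), seal(tup)) for a, tup in groups.items()]


def reencrypt(messages, e):
    """Steps 5-6: the other source applies its own key to the first component."""
    return [(f(e, c), sealed) for c, sealed in messages]


def mediator_match(m1_twice, m2_twice):
    """Step 7: pair messages whose doubly encrypted hashes are identical."""
    by_hash = dict(m2_twice)
    return [(sealed1, by_hash[c]) for c, sealed1 in m1_twice if c in by_hash]


def client_join(result_messages):
    """Step 8: decrypt and combine every tuple pair of a matching message."""
    return [tuple(t1) + tuple(t2)
            for s1, s2 in result_messages
            for t1 in unseal(s1) for t2 in unseal(s2)]


R1 = [("alpha", 10), ("beta", 10), ("gamma", 15)]
R2 = [("delta", 12), ("epsilon", 15), ("zeta", 15)]
e1, e2 = keygen(), keygen()
M1, M2 = source_messages(R1, e1), source_messages(R2, e2)
M1_twice = reencrypt(M1, e2)  # S2 re-encrypts the messages of S1
M2_twice = reencrypt(M2, e1)  # S1 re-encrypts the messages of S2
result = client_join(mediator_match(M1_twice, M2_twice))
print(result)
```

The mediator sees `len(M1)`, `len(M2)` and the number of matches, which corresponds to the extra knowledge listed for this scheme in the comparison.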
Encryption Scheme 3: Homomorphic Encryption
Delivery Phase based on Private Matching protocol for intersection (Freedman et al., 2004)
1. Assumption: Client has one public key for homomorphic encryption scheme E
2. S1 forms a polynomial whose roots are the join attribute values $a_1, \dots, a_n$:
   $P_1(x) := (a_1 - x) \cdot (a_2 - x) \cdots (a_n - x) = \sum_{k=0}^{n} c_k x^k$
   S1 encrypts coefficients using client’s key and sends all $E(c_k)$ to mediator
3. S2 forms a polynomial whose roots are the join attribute values $a'_1, \dots, a'_m$:
   $P_2(x) := (a'_1 - x) \cdot (a'_2 - x) \cdots (a'_m - x) = \sum_{l=0}^{m} d_l x^l$
   S2 encrypts coefficients using client’s key and sends all $E(d_l)$ to mediator
4. Mediator exchanges coefficients ($E(d_l)$ to S1 and $E(c_k)$ to S2)
5. S1 evaluates polynomial P2 on its cleartext join attribute values $a_1, \dots, a_n$:
   $e_k := E(r_k \cdot P_2(a_k) + (a_k \| Tup_1(a_k)))$
   ($r_k$ is a fresh random number and $Tup_1(a_k)$ is the set of tuples with join attribute value $a_k$)
   S1 returns all $e_k$ values to mediator
6. S2 evaluates polynomial P1 on its cleartext join attribute values $a'_1, \dots, a'_m$:
   $e'_l := E(r'_l \cdot P_1(a'_l) + (a'_l \| Tup_2(a'_l)))$
   S2 returns all $e'_l$ values to mediator
7. Mediator forwards all $e_k$ and $e'_l$ values to client
8. Client decrypts them to either a random value, a value $(a_k \| Tup_1(a_k))$ or a value $(a'_l \| Tup_2(a'_l))$
   For values $(a_k \| Tup_1(a_k))$ and $(a'_l \| Tup_2(a'_l))$ where $a_k = a'_l$, the tuples are joined in the global result
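The protocol steps can be exercised with a minimal sketch. It assumes a toy Paillier-style scheme as the homomorphic encryption E, with a modulus built from two well-known Mersenne primes (trivially factorable, so insecure), and it encodes the payload (a || Tup(a)) as JSON behind a fixed marker so that the client can tell payloads from random values. The marker, the encoding and all identifiers are illustrative assumptions. In this single-process sketch the private key sits next to the public one; in the protocol only the client holds it.

```python
import json
import secrets

P_, Q_ = 2**521 - 1, 2**607 - 1  # toy primes (assumption), NOT secure
N = P_ * Q_
N2 = N * N
PHI = (P_ - 1) * (Q_ - 1)        # client's private key part
MU = pow(PHI, -1, N)             # client's private key part
MAGIC = b"MATCHING-PAYLOAD"      # hypothetical marker for recognizable payloads


def encrypt(m):
    r = secrets.randbelow(N - 1) + 1
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    return (pow(c, PHI, N2) - 1) // N * MU % N


def add(c1, c2):
    return c1 * c2 % N2          # E(a), E(b) -> E(a + b)


def scale(c, gamma):
    return pow(c, gamma % N, N2)  # E(a), gamma -> E(gamma * a)


def encode(a, tuples):
    return int.from_bytes(MAGIC + json.dumps([a, tuples]).encode(), "big")


def decode(m):
    raw = m.to_bytes((m.bit_length() + 7) // 8, "big")
    return json.loads(raw[len(MAGIC):]) if raw.startswith(MAGIC) else None


def group(relation):
    groups = {}
    for a1, a in relation:
        groups.setdefault(a, []).append([a1, a])
    return groups


def encrypted_coefficients(values):
    """Steps 2-3: expand prod (a_i - x) and encrypt the coefficients."""
    coeffs = [1]
    for a in values:
        nxt = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += a * c
            nxt[k + 1] -= c
        coeffs = nxt
    return [encrypt(c) for c in coeffs]


def evaluate_and_mask(enc_coeffs, groups):
    """Steps 5-6: E(r * P(a) + (a || Tup(a))) for every own join value a."""
    out = []
    for a, tuples in groups.items():
        acc = encrypt(0)
        for c in reversed(enc_coeffs):   # Horner's rule on ciphertexts
            acc = add(scale(acc, a), c)
        r = secrets.randbelow(N - 1) + 1
        out.append(add(scale(acc, r), encrypt(encode(a, tuples))))
    return out


R1 = [("alpha", 10), ("beta", 10), ("gamma", 15)]
R2 = [("delta", 12), ("epsilon", 15), ("zeta", 15)]
G1, G2 = group(R1), group(R2)
EC1, EC2 = encrypted_coefficients(G1), encrypted_coefficients(G2)
from_s1 = evaluate_and_mask(EC2, G1)     # S1 evaluates P2
from_s2 = evaluate_and_mask(EC1, G2)     # S2 evaluates P1

# Step 8: the client decrypts and joins the payloads with equal join value.
found1 = dict(filter(None, (decode(decrypt(c)) for c in from_s1)))
found2 = dict(filter(None, (decode(decrypt(c)) for c in from_s2)))
result = [tuple(t1) + tuple(t2) for a in found1 if a in found2
          for t1 in found1[a] for t2 in found2[a]]
print(result)
```

A non-matching value yields r·P(a) plus the payload with P(a) ≠ 0, which decrypts to a value that looks random and lacks the marker, so the client learns nothing about the tuples behind it.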