(1)

© CY Lin, 2015 Columbia University

E6895 Advanced Big Data Analytics — Lecture 9

E6895 Advanced Big Data Analytics Lecture 8: Encrypted Domain Data Mining

Ching-Yung Lin, Ph.D.

Adjunct Professor, Dept. of Electrical Engineering and Computer Science

(2)

Encrypted Domain Recommendation

How can users contribute their private data without compromising their privacy?

(3)

2011/11/11


(4)


TOO MUCH

(5)

How Do Recommendations Work in End-to-End Encrypted Scheme?

End to End Encryption

Similarity Measure in Encrypted Domain

Encrypted Measurement Result Ranking

Encrypted Domain Operations

Secure Communication Channel

(6)


(7)

Data submission

1. Patients' diagnosis results are encrypted before the doctor submits them to the database.

2. The encrypted records are sent to the server via secure channels.

Query (Private Information Retrieval)

I. Doctor submits an encrypted query to the system.

II. Server measures encrypted-domain similarity on the encrypted database.

III. Privacy-preserving ranking; encrypted results are returned to the doctor.

Figure labels: Alice, Bob, Carol; Doctors, Surgeons, Physicians

(8)


(9)


(10)

I. Doctor submits an encrypted query to the system in order to provide treatment (Private Information Retrieval).

(11)

Encrypted medical record database

        Item 1     Item 2     Item 3     ……
P1      ε(m_11)    ε(m_12)    ε(m_13)
P2      ε(m_21)    ε(m_22)    ε(m_23)
P3      ε(m_31)    ε(m_32)    ε(m_33)

Using a homomorphic encryption algorithm to calculate similarity in the encrypted domain according to the doctor's request.

[Equation: the encrypted similarity ε(W_ak), built from sums over items i of the products ε(m_ai) · ε(m_ki)]
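The similarity on this slide is computed from products of encrypted entries summed over items. As a rough illustration of that idea only, the sketch below uses a toy integer masking scheme (ciphertext = message + secret modulus × random integer). The scheme, the modulus `P`, the helper names, and the sample records are all assumptions made for this example; it is not the lecture's polynomial-ring construction and it is not secure.

```python
import secrets

# Hypothetical secret modulus (a Mersenne prime). Toy value, for illustration only.
P = (1 << 61) - 1

def enc(m):
    # Mask m with a random non-zero multiple of the secret modulus.
    return m + P * (secrets.randbelow(1 << 40) + 1)

def dec(c):
    # Reducing modulo P strips every multiple of P and leaves the message.
    return c % P

def encrypted_dot(enc_row_a, enc_row_k):
    # The server multiplies and adds ciphertexts; it never sees plaintext scores.
    return sum(x * y for x, y in zip(enc_row_a, enc_row_k))

record_a = [3, 0, 5]   # hypothetical scores for record a on items 1..3
record_k = [4, 2, 1]   # hypothetical scores for record k on items 1..3

enc_a = [enc(m) for m in record_a]
enc_k = [enc(m) for m in record_k]

score = dec(encrypted_dot(enc_a, enc_k))
plain = sum(x * y for x, y in zip(record_a, record_k))
print(score, plain)
```

Decryption is exact here only because the true dot product stays below `P`; a real scheme has to manage this bound (and noise) explicitly.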

(12)


(13)


(14)

[Equation: the encrypted similarity ε(W_ak), built from sums over items i of the products ε(m_ai) · ε(m_ki)]

The doctor chooses two private keys and applies order-preserving encryption to the encrypted-domain similarity results, so that the proxy can rank them. The proxy learns only the rank, not the true similarity measurement of each patient.

$$\varepsilon_{op}(W_{ak}) = \varepsilon(R1_{doctor}) \, \varepsilon(W_{ak}) + \varepsilon(R2_{doctor})$$

ε(R1_doctor), ε(R2_doctor): order-preserving encryption keys

Figure labels: Doctor, Secure channel
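Why an affine mask lets a proxy rank results without learning them can be seen on plain integers. The sketch below is only an illustration under stated assumptions: it works on plaintext scores rather than ciphertexts, the names `affine_mask` and `rank` are hypothetical, and the key ranges are arbitrary. The one property it relies on is that the first key is strictly positive.

```python
import secrets

def affine_mask(scores, r1, r2):
    # r1 must be positive so that a larger score stays larger after masking.
    return [r1 * s + r2 for s in scores]

def rank(values):
    # Indices ordered from the highest value to the lowest.
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)

scores = [17, 42, 8, 23]             # hypothetical similarity results
r1 = secrets.randbelow(10**6) + 1    # first key, strictly positive
r2 = secrets.randbelow(10**6)        # second key

masked = affine_mask(scores, r1, r2)
proxy_ranking = rank(masked)         # what the proxy can compute
true_ranking = rank(scores)          # what only the doctor could compute
print(proxy_ranking, true_ranking)
```

The proxy sees only `masked`, yet its ranking matches the true one because `x -> r1 * x + r2` is strictly increasing for `r1 > 0`.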

(15)
(16)


(17)


(18)


Secure Channel

(19)

Encrypted-Domain Operations

Ring homomorphic encryption

Privacy-preserving ranking

Secure channel


(20)

[Figure: the expression a × b + c is evaluated on encrypted operands; decrypting the encrypted result yields the plaintext a × b + c]

Algebraic operation supports both addition and multiplication
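The a × b + c picture can be checked with a few lines of arithmetic. The sketch below is a toy, assuming a very simple integer scheme in which a ciphertext is the message plus a random multiple of a secret odd integer; `keygen`, `encrypt`, and `decrypt` are hypothetical names. It is not the ring homomorphic scheme from the lecture and offers no real security; it only shows what "supports both addition and multiplication" means.

```python
import secrets

def keygen(bits=64):
    # Secret key: a random odd integer with its top bit set (toy size).
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1

def encrypt(p, m):
    # Hide m behind a random non-zero multiple of the secret key.
    return m + p * (secrets.randbelow(1 << 32) + 1)

def decrypt(p, c):
    # Every multiple of p vanishes modulo p, leaving the message.
    return c % p

p = keygen()
a, b, c = 7, 5, 3
ea, eb, ec = encrypt(p, a), encrypt(p, b), encrypt(p, c)

e_sum = ea + eb            # addition on ciphertexts
e_prod = ea * eb           # multiplication on ciphertexts
e_expr = ea * eb + ec      # the full expression a × b + c

print(decrypt(p, e_sum), decrypt(p, e_prod), decrypt(p, e_expr))
```

The results decrypt correctly as long as the plaintext value of the expression stays smaller than the key.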

(21)

Ring Homomorphic Encryption

Polynomial rings setting

Key generation

Encryption of scores

Encrypted domain similarity measure

Decryption of results

(22)

An Example of Ring Homomorphic Encryption

(23)
(24)
(25)
(26)
(27)
(28)
(29)
(30)
(31)
(32)
(33)
(34)
(35)

Encrypted domain similarity measure

(36)
(37)

Apply second private key for decryption

(38)

Apply second private key for decryption

(39)

Recover the exact plaintext similarity measurement results

(40)


(41)


(42)

Using the affine transformation for privacy-preserving ranking

Toorani, M. and Falahati, A. 2011. A secure cryptosystem based on affine transformation. Security and Communication Networks, pp. 207-215.

(43)


(44)


(45)

Diffie-Hellman key exchange scheme for secure channel
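For readers who have not seen Diffie-Hellman before, the exchange can be sketched in a few lines. The parameters below (the Mersenne prime 2^127 - 1 as modulus and 3 as base) and the helper names are assumptions chosen for readability; a real deployment would use a standardized group or an elliptic-curve variant, plus authentication of the public values.

```python
import secrets

P = 2**127 - 1   # toy prime modulus (assumption, not a recommended group)
G = 3            # toy base (assumption)

def keypair():
    # Private exponent drawn uniformly from [2, P - 2]; public value is G^private mod P.
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)

def shared_secret(own_private, peer_public):
    # Each side raises the other's public value to its own private exponent.
    return pow(peer_public, own_private, P)

doctor_priv, doctor_pub = keypair()
server_priv, server_pub = keypair()

# Only the public values cross the network.
doctor_key = shared_secret(doctor_priv, server_pub)
server_key = shared_secret(server_priv, doctor_pub)
print(doctor_key == server_key)
```

Both sides arrive at the same value, G^(doctor_priv × server_priv) mod P, which can then seed a symmetric cipher for the channel.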

(46)
(47)

Outline

Introduction

Privacy-Preserving Recommendation System

Encrypted Domain Operations

Experiments

Conclusion

(48)

Experiment Sets

Prototype system

Microsoft Windows 7 (32-bit) operating system

Intel Core 2 Quad CPU and 2 GB RAM

Jester dataset from UC Berkeley

Book-Crossing dataset

(49)

Experiments

Performance

Accuracy

Security

(50)


(51)
(52)


(53)

Canny, J. 2002. Collaborative filtering with privacy. In Proceedings of the IEEE Symposium on Security and Privacy (Oakland, CA, USA, May 2002), 45-57.

Polat, H. and Du, W. 2005. Privacy-Preserving Collaborative Filtering on Vertically Partitioned Data. PKDD 2005: 651-658.

(54)

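Dot Products in Random Perturbation

$$A' \cdot B' = \sum_{i=1}^{n} (a_i + r_i)(b_i + v_i)$$

$$= \sum_{i=1}^{n} a_i b_i + \sum_{i=1}^{n} a_i v_i + \sum_{i=1}^{n} r_i b_i + \sum_{i=1}^{n} r_i v_i \approx \sum_{i=1}^{n} a_i b_i$$

Zero mean: the perturbations $r_i$ and $v_i$ have zero mean, so the three cross terms average out for large $n$.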
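The zero-mean argument can be checked numerically. The sketch below is an illustration under assumptions not stated on the slide: ratings are integers from 1 to 5, the perturbations are uniform on [-1, 1], and the name `perturbed_dot` is hypothetical. It shows that the perturbed dot product is close to, but not exactly equal to, the true one.

```python
import random

def perturbed_dot(a, b, rng, spread=1.0):
    # Each party adds independent zero-mean noise to its own vector before sharing it.
    r = [rng.uniform(-spread, spread) for _ in a]
    v = [rng.uniform(-spread, spread) for _ in b]
    return sum((ai + ri) * (bi + vi) for ai, ri, bi, vi in zip(a, r, b, v))

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 10_000
a = [rng.randint(1, 5) for _ in range(n)]   # hypothetical ratings of user A
b = [rng.randint(1, 5) for _ in range(n)]   # hypothetical ratings of user B

exact = sum(ai * bi for ai, bi in zip(a, b))
approx = perturbed_dot(a, b, rng)
relative_error = abs(approx - exact) / exact
print(exact, round(approx, 1), relative_error)
```

The residual error shrinks relative to the true value as `n` grows, but it never vanishes, which is the accuracy cost of the statistical approach compared with lossless encrypted-domain computation.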

(55)

Drawbacks of User Profile Distribution

Canny, J. 2002. Collaborative filtering with privacy. In Proceedings of the IEEE Symposium on Security and Privacy (Oakland, CA, USA, May 2002), 45-57.

Distribution

Clustering based approaches

(56)


(57)
(58)

Contributions

Ring homomorphism operation

Secret sharing key management

End-to-end encryption


(59)

Conclusion

High accuracy recommendation

Lossless number theory approaches

Statistical approaches

Practical and implementable

Well-suited for cloud computing applications

Highly Secure

Post-quantum cryptography


(60)

Future Work

More complex mathematical operations in the encryption domain

Robust privacy-preserving data mining

(61)

Multimodality Intelligent Sensor for Homecare

S: sensors

Sensors in the home to detect or prevent accidents

Sleep monitoring

Fall down

Stroke

Tension

(62)

Multimodality Intelligent Sensor for Homecare

S: sensors

Long-Term Monitoring and Logging of Personal Activities for Chronic Disease and Health Maintenance

Watching TV

Talking on the phone

Sleep

(63)

Intelligent Multimodality Sensor

□ Commodity:

Fixed Low-Cost Multimodality Recognition Sensor installed under the ceiling or mounted on the wall. A replacement or extension of the smoke detector, intruder detector, baby monitor, etc.

□ Technical Innovation:

Distributed Intelligence:

  □ Integrated recognition-driven feature extraction module in the sensor

  □ Only transmitting required features through wireless channel (e.g., MVs, Color histogram, Sound MFCC coefficients, etc.)

  □ Separate inference engines for behavior and event recognition.

Benefits:

Low-Energy Consumption

Low Data Transmission Bandwidth

(64)

Intelligent Multimodality

Behavior-Recognition Software Engine

□ Commodity:

Extensible software applications residing in PC or other computing devices for behavior or event inference.

Standard-compliant wireless signal receiver for interfacing with the wireless sensor.

□ Technical Innovation:

Distributed Intelligence:

  □ Developing Machine Learning and Data Mining Algorithms for Human Behavior or Environmental Context Recognition

  □ Recognition is based on the received feature signals

Benefits:

Scalability: Consumers can buy any combination of software modules for different applications (surveillance, healthcare, etc.)

Low Maintenance Cost

[Figure: visual features, audio features, and features from other data modalities feed the signal receiver and software inference engines, which output events such as fall down, sleep, and smoke]

(65)

Simple Multimodality Sensors for Sleep Situation Inference

[Peng et al., ISCAS 2006] [Peng et al., D2H2 2006]

□ Understand human night-time activity: Sleep

□ What we have achieved:

■ Using visual-audio sensors to monitor a person's sleep patterns

(66)

Develop Hardware Multimodality Semantic Sensor

□ Extract Features of Audio/Visual/PIR Signals using FPGA.

(67)

Issues about Sensor Network

□ The coverage range (required number) of sensors installed in the target environment

□ The power consumption of mobile sensors; different strengths (power fading) of transmitted signals from mobile sensors in different regions

□ Hand-off of signals/communication in mobile sensor networks

□ Information Fusion of Sensors

(68)

Issues about Activity Understanding

□ How to learn human activity automatically: need to develop novel learning algorithms

□ Hybrid choices of supervised learning and unsupervised learning

□ Frameworks for activity object tracking, multimodality feature extraction.

□ Multimodality Joint Activity and Behavior Understanding

(69)

Distributed Sensor Information Processing, Mining and Management

□ Distributed Signal Processing:

■ Decide the optimal solutions to distribute the stages of recognition modules into multiple sensors and servers based on resource constraints

□ Sensor Information Management:

■ Record, Index, and Manage the recognition results and the original sensor signals for activity mining

□ Sensor Network Information Aggregation:

■ Federated information reasoning based on aggregated recognition results from sensors and sensor networks at multiple sites

(70)

Sleep-Monitoring and Sensors

■ Sleep Research Laboratory equipped with polysomnographic sleep recording system

(http://www.son.washington.edu/departments/bnhs/bnhs-tg/sleep-lab.asp)

(71)

Sleep Monitoring

Sleep occupies almost 1/3 of human life!

Sleep deprivation due to sleep-related disorders may introduce severe health impairments [WHO 2004]

Long-term sleep monitoring can provide information for detection of early symptoms of sleep related

(72)

Previous Studies in Sleep Monitoring

Previous works in sleep monitoring

Static charge sensitive bed, heuristic thresholding, detection of wake, quiet-sleep, and active-sleep [Salmi et al. 1986]

EEG, HMM, detection of arousal [Huang et al. 1996]

Air cushion, finite-state machine, detection of 6 stages of sleep [Watanabe et al. 2004]

(73)

Traditional Methods to Measure Sleep (1)

Subjective approach

Self-rated questionnaires and sleep diaries

Pittsburgh Sleep Quality Index (PSQI)

How many hours of actual sleep did you get at night?

Cough or snore loudly?

Limited capabilities

Less reliable than

(74)

Traditional Methods to Measure Sleep (2)

Objective approach

Polysomnography (PSG)

EEG, ECG, EMG, and EOG

Data scored by human examiners

~$1000/study

Intrusive

Actigraphy

Limb movement via accelerometer

Less intrusive than PSG

(75)

Innovation and Advantage

□ Single vs. Multimodality

■ Using multimodality data fusion, the accuracy can be increased or the false-alarm rate (FA) can be reduced compared to previous works using a single data modality

■ Example: sleep-wake detection using PIR & HR sensors

Modality      Fusion    Motion only    Heart-rate only
Miss Rate     0.1610    0.1867         0.0737
FA            0.0615    0.3604         0.2693

□ Specific vs. Generic framework

■ Compared to previous works, which impose many conditions and are thus specific to certain activity recognition tasks, our method would be a framework for general activity understanding
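The slide does not define how Miss Rate and FA are computed. The sketch below assumes the usual definitions (misses divided by true events, false alarms divided by true non-events) and uses made-up labels; the function name and the sample data are hypothetical and are unrelated to the figures in the table.

```python
def miss_and_false_alarm(truth, predicted):
    # truth / predicted: sequences of booleans, True meaning the event (e.g. "awake") is present.
    positives = sum(truth)
    negatives = len(truth) - positives
    misses = sum(t and not p for t, p in zip(truth, predicted))
    false_alarms = sum(p and not t for t, p in zip(truth, predicted))
    # Assumed definitions: miss rate = FN / (TP + FN), FA = FP / (FP + TN).
    return misses / positives, false_alarms / negatives

truth     = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]

miss_rate, fa_rate = miss_and_false_alarm(truth, predicted)
print(miss_rate, fa_rate)
```

Under these definitions a detector can trade one error for the other, which is why the table reports both numbers for each modality.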

(76)

2012/6/3

Jyh-Ren Shieh, Ching-Yung Lin, Ja-Ling Wu, "Recommendation in End-to-End Encrypted Domain," the 20th ACM Conference on Information and Knowledge Management (CIKM 2011), Glasgow, Scotland, UK, 2011. Acceptance rate 15% (oral presentation).

(77)

Publications

■ Y.-T. Peng, C.-Y. Lin, M.-T. Sun, and C. A. Landis, “Multimodality sensor system for long-term sleep quality monitoring,” in IEEE Trans. on Biomedical Circuits and Systems, Fall 2007 (to appear).

■ Y.-T. Peng, C.-Y. Lin, and M.-T. Sun, “Audio event classification using binary hierarchical classifiers with feature selection for healthcare applications,” in IEEE ISCAS 2008 (submitted).

■ Y.-T. Peng, C.-Y. Lin, and M.-T. Sun, “Multimodality Sensors for sleep quality monitoring and logging,” in IEEE Workshop on Electronic Chronicles (eCHRONICLE), April 2006.

■ Y.-T. Peng, C.-Y. Lin, M.-T. Sun, and M.-W. Feng, “Sleep condition inferencing using simple multimodality sensors,” in IEEE ISCAS 2006.

■ Y.-T. Peng, C.-Y. Lin, and M.-T. Sun, “A distributed multimodality sensor system for home-used sleep condition inference and monitoring,” in 1st Transdisciplinary Conference on Distributed Diagnosis in Home Healthcare (D2H2), April 2006.
