Practical Data Integrity Protection in Network-Coded Cloud Storage
Henry C. H. Chen
Department of Computer Science and Engineering, The Chinese University of Hong Kong
Outline
Introduction
FMSR in NCCloud
FMSR-DIP
Publications
Yuchong Hu, Henry C. H. Chen, Patrick P. C. Lee, and Yang Tang
NCCloud: Applying Network Coding for the Storage Repair in a Cloud-of-Clouds
Proceedings of the 10th USENIX Conference on File and Storage Technologies (FAST ’12)
Henry C. H. Chen and Patrick P. C. Lee
Practical Data Integrity Protection in Regenerating-Coding-Based Storage
Cloud Storage
On-demand storage outsourcing
Supports RESTful APIs:
Data Integrity Protection
Corruption detection
• Addressed in this work (FMSR-DIP)
Fault-tolerance and repair
• Addressed in NCCloud
Desirable properties
• Minimize cost
• Works on thin clouds (i.e., clouds with only basic file access semantics)
Related Work
Single node, smart clouds
• PDP [Ateniese et al. ’07]
• POR [Juels et al. ’07]
Multi-node, different storage schemes
• MR-PDP [Curtmola et al. ’08]
Our Work
Build FMSR-DIP, a corruption detection scheme that allows byte-sampling
• Works on thin clouds
• Works on functional minimum storage regenerating (FMSR) code
NCCloud
[Figure: users upload and download files through a proxy, which stores the data across Cloud 1, Cloud 2, Cloud 3, and Cloud 4.]
Contributions of NCCloud
Propose an implementable design of functional minimum storage regenerating (FMSR) code
• Support basic read/write operations and the repair function on thin clouds
• Preserve storage requirements as in optimal erasure codes, while reducing repair traffic
Repairing a Failed Cloud
How to repair:
[Figure: the proxy and Cloud 1 through Cloud 5. Repair traffic = the sum of the data read from the clouds involved in the repair.]
Reed-Solomon Codes
Conventional repair:
• Reconstruct whole file and generate data in new node
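[Figure: Reed-Solomon codes, n = 4, k = 2. A file of size M is split into chunks A and B; Node 1 to Node 4 store A, B, A+B, and A+2B. To repair a failed node, the proxy downloads enough chunks to reconstruct A and B, then regenerates the lost chunk. Repair traffic = M.]
(n, k) MDS code: any k out of n storage nodes (clouds) can rebuild the original file.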
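The MDS property for the (4, 2) example above can be checked with a minimal sketch. It assumes arithmetic over the small prime field of integers mod 257 purely for readability (practical Reed-Solomon implementations typically work in GF(2^8)), and the names `COEFFS`, `encode`, and `decode` are illustrative, not taken from NCCloud.

```python
from itertools import combinations

P = 257  # illustrative prime field (assumption); not the field NCCloud uses

# Coefficients of (A, B) stored on each node: A, B, A+B, A+2B
COEFFS = [(1, 0), (0, 1), (1, 1), (1, 2)]

def encode(a, b):
    """Return the symbol stored on each of the four nodes."""
    return [(ca * a + cb * b) % P for ca, cb in COEFFS]

def decode(i, yi, j, yj):
    """Recover (a, b) from the symbols held by any two nodes i and j."""
    (p, q), (r, s) = COEFFS[i], COEFFS[j]
    det = (p * s - q * r) % P
    inv = pow(det, -1, P)  # modular inverse, Python 3.8+
    a = ((s * yi - q * yj) * inv) % P
    b = ((p * yj - r * yi) * inv) % P
    return a, b

# Any 2 of the 4 nodes are enough to rebuild the original symbols.
stored = encode(42, 200)
for i, j in combinations(range(4), 2):
    assert decode(i, stored[i], j, stored[j]) == (42, 200)
```

Each pair of coefficient rows has a nonzero determinant mod 257, which is why every two-node subset decodes.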
FMSR in NCCloud
Code chunk Pi = linear combination of original native chunks
Repair in FMSR:
• Download one code chunk from each surviving node
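• Reconstruct new code chunks (via random linear combination) in the new node
[Figure: FMSR codes, n = 4, k = 2. A file of size M is split into native chunks A, B, C, D; Node 1 to Node 4 store code chunks P1 to P8. To repair a failed node, the proxy downloads P3, P5, and P7 from the surviving nodes and generates P1’ and P2’ for the new node.]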
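The repair traffic of the two schemes can be compared with a short worked calculation. This is a sketch under the assumptions stated on the slides: the file is split into k(n-k) equal-size native chunks, each code chunk has the same size as a native chunk, and FMSR repair downloads one code chunk from each of the n-1 surviving nodes. The function names are illustrative.

```python
from fractions import Fraction

def conventional_repair_traffic(file_size):
    """Conventional repair reconstructs the whole file, so it downloads M."""
    return Fraction(file_size)

def fmsr_repair_traffic(file_size, n, k):
    """FMSR repair downloads one chunk from each of the n-1 surviving nodes."""
    chunk_size = Fraction(file_size) / (k * (n - k))  # file split into k(n-k) chunks
    return (n - 1) * chunk_size

# For n = 4, k = 2 and a file of size M = 1:
print(conventional_repair_traffic(1))  # 1
print(fmsr_repair_traffic(1, 4, 2))    # 3/4
```

With n = 4 and k = 2, each chunk is M/4 and three chunks are downloaded, so the repair traffic drops from M to 0.75M.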
FMSR Property
[Figure: n = 4, k = 2. The proxy partitions the file into k(n-k) = 4 native chunks (A, B, C, D), encodes them into n(n-k) = 8 code chunks (P1 to P8), and distributes the code chunks to the storage nodes.]
Encoding matrix:
\[
\begin{pmatrix} P_1 \\ \vdots \\ P_8 \end{pmatrix}
=
\begin{pmatrix}
c_{1,1} & c_{1,2} & c_{1,3} & c_{1,4} \\
\vdots & \vdots & \vdots & \vdots \\
c_{8,1} & c_{8,2} & c_{8,3} & c_{8,4}
\end{pmatrix}
\begin{pmatrix} A \\ B \\ C \\ D \end{pmatrix}
\]
rank = k(n-k)
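A minimal sketch of the encoding step and the rank condition follows. It reads "rank = k(n-k)" as the requirement that the encoding rows held by any k nodes have rank k(n-k), which is what lets any k nodes rebuild the file; that reading, the prime field mod 257, and all function names are assumptions for illustration rather than NCCloud's implementation.

```python
import random
from itertools import combinations

P = 257  # illustrative prime field (assumption)

def rank_mod_p(rows, p=P):
    """Rank of a matrix over the integers mod p, by Gaussian elimination."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                f = m[r][col]
                m[r] = [(x - f * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def is_mds(matrix, n, k):
    """True if the rows held by every set of k nodes have rank k(n-k)."""
    per_node = n - k  # each node stores n-k code chunks
    for nodes in combinations(range(n), k):
        rows = [matrix[v * per_node + i] for v in nodes for i in range(per_node)]
        if rank_mod_p(rows) != k * (n - k):
            return False
    return True

def random_mds_matrix(n, k, seed=0):
    """Draw random n(n-k) x k(n-k) matrices until one passes the rank check."""
    rng = random.Random(seed)
    while True:
        matrix = [[rng.randrange(P) for _ in range(k * (n - k))]
                  for _ in range(n * (n - k))]
        if is_mds(matrix, n, k):
            return matrix

def encode(matrix, native):
    """Code symbol P_i = sum over j of c_{i,j} * native_j (mod P)."""
    return [sum(c * x for c, x in zip(row, native)) % P for row in matrix]

matrix = random_mds_matrix(4, 2)
code = encode(matrix, [10, 20, 30, 40])  # 8 code symbols, 2 per node
```

Each symbol here stands in for one position of a chunk; a real chunk applies the same coefficients to every position.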
NCCloud: Experiments
Testbed environment
• Local cloud
  • OpenStack Swift 1.4.2
  • 1 proxy node connected to 15 storage nodes (LAN)
  • NCCloud deployed on proxy node
• Commercial cloud
  • Microsoft Azure
Storage schemes
Response time: Local Cloud
FMSR has a higher response time due to encoding/decoding overhead
FMSR has a slightly lower response time in repair because less data is downloaded
[Figure: response time (s) vs. file size (1, 10, 50, 100, 200, 300, 400, 500 MB), three panels. UPLOAD and DOWNLOAD: RS vs. FMSR. REPAIR: RS (native chunk repair), RS (code chunk repair), and FMSR. Some legends label RS as RAID-6 and FMSR as F-MSR.]
Response time: Commercial Cloud
No distinct difference in response time, as network fluctuations play a bigger role in the actual response time
[Figure: response time (s) vs. file size (1, 2, 5, 10 MB), three panels. UPLOAD and DOWNLOAD: RS vs. FMSR. REPAIR: RS (native chunk repair), RS (code chunk repair), and FMSR. Some legends label RS as RAID-6 and FMSR as F-MSR.]
FMSR-DIP: Design Goals
Preserves advantage of FMSR
Works on thin clouds
Supports sampling to minimize cost
Works against a Byzantine, mobile adversary
• Exhibits arbitrary behaviors
FMSR-DIP: Overview
[Figure: users upload and download files through the proxy, which now runs FMSR-DIP, to Cloud 1, Cloud 2, Cloud 3, and Cloud 4.]
FMSR-DIP: Upload
Apply an error-correcting code (ECC)
XOR each byte with a pseudorandom value
For each chunk, calculate the MAC of the first 3 bytes
Upload the chunks to clouds
Encrypt the metadata from NCCloud (which contains the encoding matrix)
Append all MACs to metadata
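The masking and MAC steps of the upload can be sketched as follows. This is only an illustration: it assumes HMAC-SHA256 in counter mode as the pseudorandom function, HMAC-SHA256 as the MAC, and a MAC computed over the masked bytes (following the slide order). The ECC and metadata encryption steps are left out, and every function name is hypothetical.

```python
import hashlib
import hmac

def prf_stream(key, chunk_id, length):
    """Pseudorandom bytes from HMAC-SHA256 in counter mode (assumed PRF)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        msg = chunk_id.to_bytes(4, "big") + counter.to_bytes(8, "big")
        out += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])

def mask(key, chunk_id, data):
    """XOR each byte with a pseudorandom value; applying it twice undoes it."""
    stream = prf_stream(key, chunk_id, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))

def mac_prefix(key, chunk_id, masked, prefix_len=3):
    """MAC over the first prefix_len bytes of a chunk, bound to its id."""
    msg = chunk_id.to_bytes(4, "big") + masked[:prefix_len]
    return hmac.new(key, msg, hashlib.sha256).digest()

def prepare_upload(prf_key, mac_key, chunks):
    """Return (masked chunks, MACs); ECC and metadata encryption are omitted."""
    masked = [mask(prf_key, i, c) for i, c in enumerate(chunks)]
    macs = [mac_prefix(mac_key, i, m) for i, m in enumerate(masked)]
    return masked, macs

masked, macs = prepare_upload(b"prf-key", b"mac-key", [b"chunk-one", b"chunk-two"])
# Unmasking is the same XOR with the same pseudorandom values.
assert mask(b"prf-key", 0, masked[0]) == b"chunk-one"
```

Because XOR is its own inverse, the download and repair paths can remove the pseudorandom values by calling the same `mask` function with the same key and chunk id.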
FMSR-DIP: Check
XOR with the previous pseudorandom values, and check their consistency
Recall: FMSR Encoding
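\[
\begin{pmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \\ P_5 \\ P_6 \\ P_7 \\ P_8 \end{pmatrix}
=
\begin{pmatrix}
c_{1,1} & c_{1,2} & c_{1,3} & c_{1,4} \\
c_{2,1} & c_{2,2} & c_{2,3} & c_{2,4} \\
c_{3,1} & c_{3,2} & c_{3,3} & c_{3,4} \\
c_{4,1} & c_{4,2} & c_{4,3} & c_{4,4} \\
c_{5,1} & c_{5,2} & c_{5,3} & c_{5,4} \\
c_{6,1} & c_{6,2} & c_{6,3} & c_{6,4} \\
c_{7,1} & c_{7,2} & c_{7,3} & c_{7,4} \\
c_{8,1} & c_{8,2} & c_{8,3} & c_{8,4}
\end{pmatrix}
\begin{pmatrix} A \\ B \\ C \\ D \end{pmatrix}
\]
Encoding matrix
rank = k(n-k)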
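One way to read the consistency check is as a rank test: the symbols at the same position of all eight code chunks are consistent if appending them as an extra column does not raise the rank of the encoding matrix. The sketch below assumes that reading, uses integers mod 257 in place of the real field, and uses a Vandermonde matrix as a hypothetical stand-in for the stored encoding matrix.

```python
P = 257  # illustrative prime field (assumption)

def rank_mod_p(rows, p=P):
    """Rank of a matrix over the integers mod p, by Gaussian elimination."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                f = m[r][col]
                m[r] = [(x - f * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def is_consistent(matrix, symbols):
    """True if symbols (one per code chunk) lie in the matrix's column space."""
    augmented = [list(row) + [s] for row, s in zip(matrix, symbols)]
    return rank_mod_p(augmented) == rank_mod_p(matrix)

def vandermonde(rows, cols, p=P):
    """Hypothetical stand-in encoding matrix with rows [1, x, x^2, ...]."""
    return [[pow(x, j, p) for j in range(cols)] for x in range(1, rows + 1)]

matrix = vandermonde(8, 4)
native = [10, 20, 30, 40]
symbols = [sum(c * v for c, v in zip(row, native)) % P for row in matrix]
assert is_consistent(matrix, symbols)

corrupted = list(symbols)
corrupted[5] = (corrupted[5] + 1) % P  # flip one symbol on one chunk
assert not is_consistent(matrix, corrupted)
```

The check only needs the encoding matrix from the metadata and the sampled symbols, not the original file.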
FMSR-DIP: Download
Download chunks from any 2 nodes and verify with their MACs
Remove pseudorandom values and pass to NCCloud for decoding
FMSR-DIP: Repair
Download 1 chunk from each of the other nodes and verify with their MACs
Remove pseudorandom values and pass to NCCloud
Process the newly generated chunks as before
Upload chunks and update metadata on all nodes
FMSR-DIP: Experiments
Testbed environment
• OpenStack Swift 1.4.2
• 1 proxy node connected to 15 storage nodes (LAN)
• NCCloud and FMSR-DIP deployed on proxy node
• NCCloud uses RAMDisk as storage
Storage scheme
Running Time vs. File Size
FMSR-DIP overhead is comparable to network transfer time in a LAN environment
[Figure: time taken (s) vs. file size (1, 5, 10, 20, 50, 100 MB), three panels. UPLOAD: Transfer-Up, DIP-Encode, FMSR. DOWNLOAD: Transfer-Down, DIP-Decode, FMSR. REPAIR: Transfer-Up, Transfer-Down, DIP-Encode, DIP-Decode, FMSR.]
The Check Operation
Bottleneck in network transfer
[Figure: time taken (s), broken down into Misc., Transfer-Down, Rank Checking, and PRF, in two panels: vs. download block size (256B, 1KB, 4KB, 7KB, 25KB, 256KB) and vs. checking percentage (100%, 75%, 50%, 25%, 10%, 5%, 1%). Panel annotations: 1% check; 256KB download block size.]
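The two parameters in the plots, checking percentage and download block size, can be pictured with a small sampling sketch. It picks which blocks of a chunk to download for a check; the function name and the use of Python's `random` module are assumptions for illustration, and a real checker would want sampling the adversary cannot predict.

```python
import random

def sample_blocks(chunk_size, block_size, percentage, seed=None):
    """Pick (offset, length) ranges covering about `percentage` percent of a chunk."""
    num_blocks = -(-chunk_size // block_size)  # ceiling division
    wanted = round(num_blocks * percentage / 100)
    num_sampled = min(num_blocks, max(1, wanted))  # always check at least one block
    rng = random.Random(seed)  # not cryptographically secure; illustration only
    picked = sorted(rng.sample(range(num_blocks), num_sampled))
    return [(b * block_size, min(block_size, chunk_size - b * block_size))
            for b in picked]

KB = 1024
ranges = sample_blocks(chunk_size=25 * 1024 * KB, block_size=256 * KB, percentage=10)
print(len(ranges))  # 10 blocks out of 100
```

Larger blocks mean fewer requests for the same number of bytes, while a lower checking percentage means fewer bytes downloaded overall.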
Conclusions
Propose a design for efficient data integrity protection using FMSR on thin clouds
Implement and evaluate the efficiency of the design
Source code:
• NCCloud
http://ansrlab.cse.cuhk.edu.hk/software/nccloud/
• FMSR-DIP
Error Localization
Form a system with bytes from k other nodes
Mark all involved bytes as correct if the system is consistent
Cloud Storage Pricing
                              S3        Rackspace   Azure
Storage (per GB)              $0.125    $0.15       $0.125
Data transfer in (per GB)     free      free        free
Data transfer out (per GB)    $0.12     $0.18       $0.12
PUT (per 10,000 requests)     $0.10     free        $0.01
GET (per 10,000 requests)     $0.01     free        $0.01
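The prices above show why both the bytes downloaded and the number of requests matter for a check. The sketch below turns the outbound-transfer and GET rows into a rough cost estimate; the dictionary layout and the `check_cost` helper are illustrative assumptions, and only the prices themselves come from the table.

```python
# Prices in USD, copied from the table above.
PRICES = {
    "S3":        {"out_gb": 0.12, "get_10k": 0.01},
    "Rackspace": {"out_gb": 0.18, "get_10k": 0.0},
    "Azure":     {"out_gb": 0.12, "get_10k": 0.01},
}

def check_cost(provider, gb_downloaded, get_requests):
    """Estimated cost of one check: outbound transfer plus GET requests."""
    price = PRICES[provider]
    transfer = gb_downloaded * price["out_gb"]
    requests = get_requests / 10_000 * price["get_10k"]
    return transfer + requests

# Checking 1% of 100 GB (1 GB downloaded) with 4,096 GET requests:
for provider in PRICES:
    print(provider, round(check_cost(provider, 1.0, 4096), 4))
```

Inbound transfer is free on all three providers, so a check is billed mainly for what it downloads and how many requests it issues.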