A Cloud Computing Approach
for Big DInSAR Data Processing
through the P-SBAS Algorithm
Zinno I. (1), Elefante S. (1), Mossucca L. (2), De Luca C. (1,3), Manunta M. (1), Terzo O. (2), Lanari R. (1), Casu F. (1)
(1) IREA - CNR, Napoli, Italy ([email protected])
(2) ISMB, Torino, Italy
Outline
• Motivations
• Parallel SBAS (P-SBAS) processing chain
• P-SBAS deployment within Amazon Web Services (AWS) cloud
• Analysis of the parallel performance and of the costs relevant to the P-SBAS processing within AWS
• Case study: P-SBAS processing of the Southern California area within AWS
Advanced DInSAR: the SBAS algorithm
Earth’s surface displacement detection and deformation temporal evolution
Background
[Figure: interferograms]
SBAS Application Scenario
Motivations
Earthquakes, Volcanoes, Water Resources
[Figure: SAR dataset, SBAS processing]
Past, present and future SAR Satellite Constellations
Motivations
swath width: ≈ 40 km; revisit time: 4 - 11 days
swath width: ≈ 250 km; revisit time: 12 - 6 days
Sentinel
Time
swath width: ≈ 100 km; revisit time: ≈ monthly
Motivations
Cloud platforms exploitation
huge amount of data; a lot of applications
“Infinite” computing resources at
P-SBAS Cloud Deployment
Results
NFS* based computing architecture implemented within
Amazon Web Services (AWS) cloud
Parallel Performance Analysis in AWS:
ENVISAT benchmark data set
Results
Napoli bay area
Area: 100 x 100 km²
#Images: 64
#Pixel: 18000 x 5000
#Interf: 195
[Figure: deformation map, color scale from < -1 to > 1 cm/year]
distribution in the temporal/perpendicular baseline plane of the employed SAR acquisitions
Results
Parallel Performance Analysis:
exploited Computational Platforms
HPC CLUSTER (reference platform): CNR-IREA cluster nodes
AMAZON WEB SERVICES (AWS) Elastic Compute Cloud (EC2): m2.4xlarge instance & single EBS volume storage; c3.8xlarge instance & 2 EBS volumes RAID storage

Platform            Processor            Cores                  RAM      Network                NFS storage bandwidth
CNR-IREA cluster    Intel Xeon E5-2670   16 available, 8 used   384 GB   Infiniband (56 Gb/s)   300 MB/s
AWS m2.4xlarge      Intel Xeon X5550     8                      68 GB    1 Gb/s                 128 MB/s
AWS c3.8xlarge      Intel Xeon E5-2680   32 available, 8 used   60 GB    10 Gb/s                256 MB/s

I/O performance: medium (m2.4xlarge), high (c3.8xlarge)
Parallel Performance Analysis:
HPC Cluster VS Cloud performances
P-SBAS processing times
Configurations: CNR-IREA cluster; AWS m2.4xlarge instance & single EBS volume storage configuration; AWS c3.8xlarge instance & 2 EBS volumes RAID storage configuration

Nodes   CNR-IREA cluster   m2.4xlarge config.   c3.8xlarge config.
1       40:55 h            53:05 h              39:55 h
2       22:42 h            30:00 h              21:30 h
4       13:43 h            17:35 h              12:40 h
8       9:12 h             11:43 h              8:00 h
16      6:55 h             -                    5:55 h

[Figure: two plots of time (hours) versus nodes number, annotated "very good scalability!" and "bottleneck: I/O bandwidth"]
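As a rough illustration of how the reported processing times compare, the following Python sketch converts the hh:mm values into decimal hours and computes each AWS configuration's elapsed time relative to the CNR-IREA cluster. The names `TIMES`, `to_hours` and `relative_to_cluster` are illustrative assumptions and not part of the P-SBAS software; only the timing values come from the slide.

```python
# Elapsed times (hh:mm) copied from the P-SBAS processing times table.
# TIMES, to_hours and relative_to_cluster are hypothetical names for this sketch.
TIMES = {
    "CNR-IREA cluster": {1: "40:55", 2: "22:42", 4: "13:43", 8: "9:12", 16: "6:55"},
    "m2.4xlarge": {1: "53:05", 2: "30:00", 4: "17:35", 8: "11:43"},
    "c3.8xlarge": {1: "39:55", 2: "21:30", 4: "12:40", 8: "8:00", 16: "5:55"},
}


def to_hours(hhmm: str) -> float:
    """Convert an 'hh:mm' string into decimal hours."""
    hours, minutes = hhmm.split(":")
    return int(hours) + int(minutes) / 60.0


def relative_to_cluster(config: str) -> dict:
    """Ratio of a cloud configuration's time to the cluster time, per node count."""
    reference = TIMES["CNR-IREA cluster"]
    return {
        nodes: to_hours(t) / to_hours(reference[nodes])
        for nodes, t in TIMES[config].items()
    }


for config in ("m2.4xlarge", "c3.8xlarge"):
    for nodes, ratio in relative_to_cluster(config).items():
        print(f"{config:11s} {nodes:2d} nodes: {ratio:.2f} x cluster time")
```

With these figures, the c3.8xlarge configuration is faster than the cluster at every node count, while the m2.4xlarge configuration is roughly 27% to 32% slower.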
Results
Parallel Performance Analysis:
P-SBAS speedup within AWS Cloud
Speedup: $S_N = \frac{T_1}{T_N}$, where
$N$: number of computing nodes
$T_1$: sequential implementation time
$T_N$: parallel implementation time
[Figure: speedup versus number of computing nodes (1 to 16) for AWS c3.8xlarge instances & 2 EBS volumes RAID storage configuration; curves: ideal speedup, Amdahl's law, experimental speedup]
Amdahl's law: $S_N = \frac{1}{f_S + \frac{1 - f_S}{N}}$, with $0 \le f_S \le 1$
$f_S$: fraction of the algorithm that has to be executed sequentially
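The speedup and Amdahl's law definitions can be tried out numerically. The sketch below computes the experimental speedup of the c3.8xlarge configuration on 16 nodes from the reported times, then inverts Amdahl's law to obtain an effective sequential fraction. This is an assumption-laden illustration: the slides do not state the value of f_S used for the plotted Amdahl curve, and the fraction obtained here also absorbs effects such as the I/O bottleneck. All function names are hypothetical.

```python
def speedup(t1: float, tn: float) -> float:
    """Experimental speedup S_N = T_1 / T_N."""
    return t1 / tn


def amdahl_speedup(f_s: float, n: int) -> float:
    """Speedup predicted by Amdahl's law for a sequential fraction f_s."""
    return 1.0 / (f_s + (1.0 - f_s) / n)


def sequential_fraction(s_n: float, n: int) -> float:
    """Invert Amdahl's law: the f_s that reproduces a measured speedup on n > 1 nodes."""
    return (1.0 / s_n - 1.0 / n) / (1.0 - 1.0 / n)


# c3.8xlarge times reported on the slides: 39:55 h on 1 node, 5:55 h on 16 nodes.
t1 = 39 + 55 / 60
t16 = 5 + 55 / 60

s16 = speedup(t1, t16)
f_s = sequential_fraction(s16, 16)
print(f"S_16 = {s16:.2f}, effective f_S = {f_s:.3f}")
```

Under these assumptions the measured speedup on 16 nodes is roughly 6.7, which corresponds to an effective sequential fraction of about 0.09.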
P-SBAS Cost Analysis within AWS
Results
P-SBAS processing costs on AWS

Nodes   m2.4xlarge costs* (USD)   c3.8xlarge costs* (USD)
1       87.8                      113.4
2       85                        97.6
4       87.2                      114.2
8       110.6                     134.2
16      -                         188.2

Time/Cost Tradeoff on AWS Cloud
[Figure: costs (USD) versus P-SBAS processing elapsed times (hours), points labeled by number of nodes; series: c3.8xlarge instances & 2 EBS volumes storage configuration, m2.4xlarge & single EBS volume storage configuration]
*The represented costs include both
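One way to read the time/cost tradeoff is to list the runs for which no other run is both faster and cheaper. The following Python sketch does this for the reported times and costs; the dictionary `RUNS` and the function `pareto_front` are hypothetical names introduced here, and the selection rule is an illustration rather than the authors' analysis.

```python
# (configuration, nodes): (elapsed hours, cost in USD), values taken from the slides.
RUNS = {
    ("m2.4xlarge", 1): (53 + 5 / 60, 87.8),
    ("m2.4xlarge", 2): (30.0, 85.0),
    ("m2.4xlarge", 4): (17 + 35 / 60, 87.2),
    ("m2.4xlarge", 8): (11 + 43 / 60, 110.6),
    ("c3.8xlarge", 1): (39 + 55 / 60, 113.4),
    ("c3.8xlarge", 2): (21.5, 97.6),
    ("c3.8xlarge", 4): (12 + 40 / 60, 114.2),
    ("c3.8xlarge", 8): (8.0, 134.2),
    ("c3.8xlarge", 16): (5 + 55 / 60, 188.2),
}


def pareto_front(runs: dict) -> list:
    """Return the runs for which no other run is both faster and cheaper."""
    front = []
    best_cost = float("inf")
    # Scan from fastest to slowest; keep a run only if it is cheaper
    # than every faster run seen so far.
    for key, (hours, cost) in sorted(runs.items(), key=lambda kv: kv[1][0]):
        if cost < best_cost:
            front.append(key)
            best_cost = cost
    return front


for config, nodes in pareto_front(RUNS):
    hours, cost = RUNS[(config, nodes)]
    print(f"{config} x {nodes}: {hours:.1f} h, {cost:.1f} USD")
```

With the reported values, the cheapest run is m2.4xlarge on 2 nodes (85 USD, 30 h) and the fastest is c3.8xlarge on 16 nodes (188.2 USD, just under 6 h).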
Results
4 ENVISAT frames
172 SAR images
200x200 km
32 c3.8xlarge AWS instances
< 17 hours 853 USD
Time interval: 2004 - 2010
Southern California case study
Frame   SAR images per dataset   Times (hours)   Costs (USD)
1       47                       16.7            242
2       44                       15.5            227
3       43                       11              192
4       38                       10.5            192
TOT     172                      16.7            853
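The totals of the case study can be checked with a few lines of Python. The sketch assumes that the four frames were processed concurrently, so that the overall elapsed time is that of the slowest frame; this matches the reported total of 16.7 hours but is not stated explicitly on the slide. The variable names are illustrative.

```python
# (frame, SAR images, elapsed hours, cost in USD) from the case-study table.
FRAMES = [
    (1, 47, 16.7, 242),
    (2, 44, 15.5, 227),
    (3, 43, 11.0, 192),
    (4, 38, 10.5, 192),
]

total_images = sum(images for _, images, _, _ in FRAMES)
total_cost = sum(cost for _, _, _, cost in FRAMES)
# Assumption: frames run concurrently, so wall-clock time is the slowest frame.
elapsed_hours = max(hours for _, _, hours, _ in FRAMES)
cost_per_image = total_cost / total_images

print(f"{total_images} images, {elapsed_hours} h, {total_cost} USD "
      f"({cost_per_image:.2f} USD per image)")
```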
Future goals
≈ 130 frames ≈ 3500 images ≈ 1000 Nodes
Future goal: DInSAR analysis from regional to national scale
[Figure: ENVISAT coverage over Italy, 2002-2010 (only ascending orbit)]
[Figure: ENVISAT coverage over California and Nevada, 2002-2010 (only ascending orbit)]
Preliminary results on the advanced DFS* based P-SBAS
Future goals
*Distributed File System
dataset: 64 COSMO-SkyMed images over the Napoli Bay area
computing platform: CNR-IREA cluster

Nodes   NFS based P-SBAS   DFS based P-SBAS with reduced sequential processing
1       200 hours          200 hours
2       117 hours          105.5 hours
4       73 hours           59 hours
8       50 hours           36 hours
16      38.7 hours         24 hours
32      34 hours           18.7 hours

[Figure: two plots of speedup versus nodes number (1 to 32), one per implementation; curves: ideal speedup, Amdahl's law, experimental speedup]
81% speedup increase for 32 nodes!
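The quoted speedup increase can be reproduced from the reported processing times. The Python sketch below computes the experimental speedup of the NFS based and DFS based implementations at each node count and the relative gain at 32 nodes; the list and function names are illustrative assumptions.

```python
NODES = [1, 2, 4, 8, 16, 32]
# Processing times in hours, as reported for the two implementations.
NFS_HOURS = [200, 117, 73, 50, 38.7, 34]
DFS_HOURS = [200, 105.5, 59, 36, 24, 18.7]


def speedups(hours: list) -> list:
    """Speedup S_N = T_1 / T_N for each entry, with T_1 the single-node time."""
    return [hours[0] / t for t in hours]


nfs = speedups(NFS_HOURS)
dfs = speedups(DFS_HOURS)

for n, s_nfs, s_dfs in zip(NODES, nfs, dfs):
    print(f"{n:2d} nodes: NFS speedup {s_nfs:5.2f}, DFS speedup {s_dfs:5.2f}")

# Relative speedup gain of the DFS based version at 32 nodes.
increase = dfs[-1] / nfs[-1] - 1.0
print(f"Speedup increase at 32 nodes: {increase:.1%}")
```

At 32 nodes the speedup rises from about 5.9 (NFS) to about 10.7 (DFS), an increase of roughly 81%, consistent with the figure quoted on the slide.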
Conclusions
The deployment of the P-SBAS algorithm within the Amazon Web Services (AWS) cloud has been presented.
A thorough analysis of the parallel performance of the P-SBAS algorithm within AWS cloud has been carried out.
A study of the P-SBAS processing costs within AWS, for varying cloud resources, has been carried out.
As a case study, the P-SBAS processing of a large ENVISAT SAR dataset (172 images) acquired over the Southern California area has been performed within the AWS cloud.
Preliminary results regarding the parallel performance of the advanced DFS based implementation of the P-SBAS algorithm have been shown.