Interoperating Cloud-based
Virtual Farms
Stefano Bagnasco, Domenico Elia,
Grazia Luparello, Stefano Piano, Sara
Vallero, Massimo Venaruzzo
The STOA-LHC project 1
Improve the robustness and usability of the existing LHC Italian infrastructure
Funded as an Italian “PRIN” (research Project of Relevant National Interest) (See the summary poster in Poster Session B)
● Common effort to ease data and resource access for the
LHC Community
● This talk focuses on the ALICE-related activity:
■ Parallel and interactive analysis solutions (the Virtual Analysis
Facility)
■ Standard access to interactive resources in different local deployments
(e.g. centralised authentication system)
■ Federation among single analysis facilities to optimise distribution and
The STOA-LHC project 2
● Build a uniform environment for “last mile” of analysis:
■ Use familiar interfaces
■ Exploit existing tools
■ Benefit from Cloud Computing technologies locally (isolate applications, elasticity)
■ Use high-level tools for federation (no Cloud federation or bursting)
● Extend the model to allow users outside high-energy
physics to re-use tools and exploit computing infrastructures
the infrastructure
Coming soon:
Catania and Cagliari
Trieste:
● Test deployment
● OpenStack
● 24 cores, 1.2 TB
● 3 Gbps WAN
Padova-Legnaro:
● Test deployment
● OpenStack
● 100 cores, 5 TB
● 10 Gbps WAN
Torino:
● Production Cloud
● OpenNebula
● 1.3k cores, 1.6 PB
● 10 Gbps WAN
Bari:
● PRISMA testbed
● OpenStack
● 600 cores, 110 TB
● 10 Gbps WAN
the strategy
● Don’t write new tools!
■ Use existing tools and features
■ Exploit good GARR networking between sites
■ Explore Cloud Computing technologies
Workload management
● The Virtual Analysis Facility
■ Presented at CHEP2013 (see next slide)
■ Based on PROOF for interactive analysis
Data access
key component: the VAF
From “A ground-up approach to High-Throughput Cloud Computing in High-Energy Physics”: The Virtual Analysis Facility
• Configured via a web interface: cernvm-online.cern.ch
• Entire cluster launched with a single command
• User interacts only by submitting jobs
• Elastic Cluster as a Service: elasticity is embedded, no external tools
• PoD and dynamic workers: run PROOF on top of it as a special case
Components: PROOF+PoD, CernVM, HTCondor, elastiq
What is the VAF?
• A cluster of CernVM virtual machines: one head node, many workers
• Running the HTCondor job scheduler
• Capable of growing and shrinking based on usage with elastiq
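The grow-and-shrink behaviour listed above can be pictured with a toy scaling rule. This is a minimal sketch and not elastiq's actual algorithm: the name `plan_scaling`, the limits, and the one-job-per-worker assumption are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    waiting_jobs: int   # jobs queued in the scheduler, not yet running
    idle_workers: int   # workers currently running no job
    total_workers: int  # all workers currently in the cluster


def plan_scaling(state: ClusterState, min_workers: int = 0,
                 max_workers: int = 100) -> int:
    """Return the change in worker count: positive = start VMs, negative = stop VMs.

    Hypothetical rule (not elastiq's real logic): assume one job per worker,
    start one worker per waiting job that no idle worker can absorb, and
    stop idle workers when nothing is waiting.
    """
    if state.waiting_jobs > state.idle_workers:
        wanted = state.waiting_jobs - state.idle_workers
        room = max_workers - state.total_workers
        return max(0, min(wanted, room))
    if state.waiting_jobs == 0 and state.idle_workers > 0:
        removable = state.total_workers - min_workers
        return -max(0, min(state.idle_workers, removable))
    return 0
```

Returning a signed delta keeps the decision separate from the cloud API calls that would actually start or stop virtual machines.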
Dario Berzano’s talk @ CHEP2013
ongoing activity summary
Activities:
● Benchmarking activities at all sites
■ Common analysis task and data-set
● Tests on local data storage access (Trieste)
● Application monitoring with the ElasticSearch
ecosystem (Torino, Padova)
■ See Sara Vallero’s talk on Monday
● Production use at the Torino site:
■ in operation since November 2013
■ 60 TB of dedicated storage (GlusterFS, Xrootd)
■ up to ~100 workers
■ mainly analysis on ntuples (TSelector)
● Data federation (Bari and all sites)
■ Check the poster in Poster Session A
Workers deploy time
● If new VMs need to be instantiated, worker deploy time ranges from 2.5 min to 3.5 min
● If VMs are already available, deploy time ranges from 16 s to 3 min
● The “golden number” of 30 workers (see later) is reached in 2.5 min in the first case and 25 s in the second
Optimal number
Wall-time for different analysis steps
QAMultistrange:
• event selection
• re-vertexing
• QAMultistrange analysis
• Simple pT spectrum analysis
Data sample:
• LHC10h (PbPb)
• run 139510
• ∼ 226k events
Results:
● For this type of analysis and number of events, ∼ 30 workers is the optimal number
● Wall-time is comparable for low and high CPU-intensive
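One way to read an optimal worker count off a wall-time scan is to take the smallest count beyond which adding workers no longer shortens the wall-time appreciably. The sketch below assumes this criterion, which is not stated in the talk. The function name, the 5% threshold and the sample numbers are illustrative placeholders, not measurements.

```python
def optimal_workers(wall_times: dict[int, float], min_gain: float = 0.05) -> int:
    """Pick the smallest worker count after which the relative wall-time
    gain from the next tested count drops below `min_gain`.

    `wall_times` maps number of workers -> wall-time in seconds.
    The criterion itself is an assumption made for illustration.
    """
    counts = sorted(wall_times)
    if not counts:
        raise ValueError("no measurements given")
    for current, following in zip(counts, counts[1:]):
        gain = (wall_times[current] - wall_times[following]) / wall_times[current]
        if gain < min_gain:
            return current
    # Wall-time kept improving over the whole scan: return the largest count.
    return counts[-1]


# Placeholder numbers, NOT results from the talk.
scan = {10: 300.0, 20: 180.0, 30: 140.0, 40: 138.0, 50: 139.0}
print(optimal_workers(scan))
```

A plateau-based rule like this depends on the threshold chosen, so it should be checked against the actual wall-time plot rather than trusted blindly.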
Interoperating Cloud-based Virtual Farms - 12
CHEP2015| Okinawa, Japan — Apr 18, 2015
the storage federation blueprint
● Figure 1 schematically illustrates the XRootD configuration for a VAF Data Federation developed in Bari using the VMs provided by the PRISMA OpenStack infrastructure
● The XRootD hierarchy tested includes a local redirector (Manager) for each VAF site, in front of a number of servers that each carry a block storage device, and a global Italian redirector (Meta-Manager) located in Bari
● Block storage devices for each server currently range from 10 to 100 TB
● The datasets for each analysis campaign are expected to be staged locally by system administrators from the AliEn Catalogue: authentication issues and
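The two-level hierarchy described above (local Manager per site, global Meta-Manager) can be pictured with a toy lookup model. This is a simplified sketch, not the XRootD protocol or its API: the classes, the site and server names, and the "local first, then federation" order are assumptions made for illustration.

```python
from typing import Optional


class Redirector:
    """Toy stand-in for a site's local redirector (Manager)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.servers: dict[str, set[str]] = {}  # server name -> files it holds

    def add_server(self, server: str, files: set[str]) -> None:
        self.servers[server] = set(files)

    def locate(self, path: str) -> Optional[str]:
        """Return the name of a local server holding `path`, or None."""
        for server, files in sorted(self.servers.items()):
            if path in files:
                return server
        return None


class MetaManager:
    """Toy global redirector that asks each site's local redirector in turn."""

    def __init__(self, sites: list[Redirector]) -> None:
        self.sites = sites

    def locate(self, path: str) -> Optional[tuple[str, str]]:
        for site in self.sites:
            server = site.locate(path)
            if server is not None:
                return site.name, server
        return None


def open_file(path: str, local: Redirector,
              meta: MetaManager) -> Optional[tuple[str, str]]:
    """Try the local site first; fall back to the federation-wide lookup."""
    server = local.locate(path)
    if server is not None:
        return local.name, server
    return meta.locate(path)


# Hypothetical sites and file paths, for illustration only.
torino = Redirector("torino")
torino.add_server("to-srv1", {"/data/a.root"})
bari = Redirector("bari")
bari.add_server("ba-srv1", {"/data/b.root"})
meta = MetaManager([torino, bari])
print(open_file("/data/b.root", torino, meta))
```

In this model a job running in Torino that asks for a file held only in Bari is sent to the Bari server through the Meta-Manager, which is the behaviour the federation is meant to provide.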
Work in progress
● Bari Meta-manager deployed
● Ongoing tests on a
Distributed Storage and Data Federation
● Distribute and share data using a single XRootD Italian redirector
■ This is an ongoing task!
● Two steps of a test analysis:
1. 75% I/O intensive and 25% CPU intensive
2. 17% I/O intensive and 83% CPU intensive
● Plot the ratio between wall time of jobs accessing files via XROOTD-IT and locally
1: I/O intensive analysis
2: CPU intensive analysis
Results:
● Difference within 10-20% at most, even for I/O intensive jobs
● This encourages further development of the VAF data federation using this XRootD option
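The quantity plotted above is a plain ratio of wall-times. As a minimal sketch, it can be computed as follows; the function names are hypothetical and the numbers in the usage lines are placeholders, not measurements from the talk.

```python
def wall_time_ratio(remote_s: float, local_s: float) -> float:
    """Ratio of wall-time with federated (remote) access to wall-time with local access."""
    if local_s <= 0:
        raise ValueError("local wall-time must be positive")
    return remote_s / local_s


def penalty_percent(remote_s: float, local_s: float) -> float:
    """Relative slowdown of remote access in percent (negative means faster)."""
    return (wall_time_ratio(remote_s, local_s) - 1.0) * 100.0


# Placeholder wall-times in seconds, for illustration only.
print(wall_time_ratio(115.0, 100.0))
print(round(penalty_percent(115.0, 100.0), 1))
```

A ratio of 1.0 means federated access costs nothing extra; the 10-20% difference quoted above corresponds to ratios between 1.1 and 1.2.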
VAF monitoring with the ELK stack
● Collect monitoring and accounting data from both IaaS and application
● Investigation of the ELK stack to handle heterogeneous and unstructured data sources
● Possible solution for Monitoring-as-a-Service providing a uniform, extendable monitoring platform to applications
Diagram: VAF, TProofMon, SenderSQL, MySQL DB, HTTP, ELK stack
● Dedicated DB tables
● Also accounting (INFN Grid services)
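Elasticsearch stores JSON documents, so an application-level monitoring record has to be serialised as JSON before it is shipped. The sketch below shows only that step, with no network calls. Every field name is a hypothetical choice, and the talk does not describe the schema actually used.

```python
import json
from datetime import datetime, timezone


def build_event(site: str, user: str, workers: int, wall_time_s: float,
                timestamp: datetime) -> str:
    """Serialise one monitoring record as a JSON document.

    All field names are hypothetical; a real deployment would follow
    whatever schema its collectors and dashboards agree on.
    """
    record = {
        "@timestamp": timestamp.astimezone(timezone.utc).isoformat(),
        "site": site,
        "user": user,
        "workers": workers,
        "wall_time_s": wall_time_s,
    }
    return json.dumps(record, sort_keys=True)


# Placeholder values, for illustration only.
ts = datetime(2015, 4, 18, 9, 0, tzinfo=timezone.utc)
print(build_event("torino", "some_user", 30, 150.0, ts))
```

Keeping the record flat and timestamped in UTC makes it easy to aggregate entries from different sites in the same dashboard.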
provisional conclusions and outlook
● The VAF model works well and can be easily
adapted to different use cases
■ Just need to package an end-to-end toolkit suited to different communities
■ E.g. without PROOF or PoD or other specific tools
● This needs to include a working accounting
system
■ The ELK stack can be used to build a flexible system:
■ …to provide accounting information…
■ …and Monitoring-as-a-Service for applications
● The Data Federation model is also feasible
■ Small performance penalty balanced by flexibility and “deduplication”
thanks!
The present work is supported by the Istituto Nazionale di Fisica Nucleare (INFN) of Italy and is partially funded
under contract 20108T4XTM of Programmi di Ricerca Scientifica di Rilevante Interesse Nazionale (PRIN, Italy).