Interoperating Cloud-based Virtual Farms

(1)

Stefano Bagnasco, Domenico Elia, Grazia Luparello, Stefano Piano, Sara Vallero, Massimo Venaruzzo

(2)

The STOA-LHC project 1

Improve the robustness and usability of the existing LHC Italian infrastructure

Funded as an Italian “PRIN” (research Project of Relevant National Interest) (See the summary poster in Poster Session B)

●  Common effort to ease data and resource access for the LHC Community
●  This talk focuses on the ALICE-related activity:
■  Parallel and interactive analysis solutions (the Virtual Analysis Facility)
■  Standard access to interactive resources in different local deployments (e.g. centralised authentication system)
■  Federation among single analysis facilities to optimise distribution and

(3)

The STOA-LHC project 2

●  Build a uniform environment for the “last mile” of analysis:
■  Use familiar interfaces
■  Exploit existing tools
■  Benefit from Cloud Computing technologies locally (application isolation, elasticity)
■  Use high-level tools for federation (no Cloud federation or bursting)
●  Extend the model to allow users outside high-energy physics to re-use tools and exploit computing infrastructures

(4)

the infrastructure

Coming soon: Catania and Cagliari

Trieste:
●  Test deployment
●  OpenStack
●  24 cores, 1.2 TB
●  3 Gbps WAN

Padova-Legnaro:
●  Test deployment
●  OpenStack
●  100 cores, 5 TB
●  10 Gbps WAN

Torino:
●  Production Cloud
●  OpenNebula
●  1.3k cores, 1.6 PB
●  10 Gbps WAN

Bari:
●  PRISMA testbed
●  OpenStack
●  600 cores, 110 TB
●  10 Gbps WAN

(5)

the strategy

●  Don’t write new tools!

■  Use existing tools and features

■  Exploit good GARR networking between sites

■  Explore Cloud Computing technologies

Workload management

●  The Virtual Analysis Facility

■  Presented at CHEP2013 (see next slide)
■  Based on PROOF for interactive analysis

Data access

(6)

key component: the VAF

From Dario Berzano’s talk @ CHEP2013, “A ground-up approach to High-Throughput Cloud Computing in High-Energy Physics”:

The Virtual Analysis Facility
• Configured via a web interface: cernvm-online.cern.ch
• Entire cluster launched with a single command
• User interacts only by submitting jobs
• Elastic Cluster as a Service: elasticity is embedded, no external tools
• PoD and dynamic workers: run PROOF on top of it as a special case

What is the VAF?
• A cluster of CernVM virtual machines: one head node, many workers
• Running the HTCondor job scheduler
• Capable of growing and shrinking based on the usage with elastiq

[Diagram labels: PROOF+PoD, CernVM, HTCondor, elastiq]
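As a rough illustration of the grow-and-shrink behaviour described above, the following Python sketch shows one way a scaling decision could be derived from the state of the batch queue. It is not elastiq's code: the `ClusterState` fields, the `scaling_decision` function and its parameters (`jobs_per_worker`, `min_workers`, `max_workers`) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Snapshot of a hypothetical batch cluster (all fields are illustrative)."""
    waiting_jobs: int    # jobs queued but not yet running
    idle_workers: int    # workers currently running no job
    total_workers: int   # workers currently alive


def scaling_decision(state, jobs_per_worker=4, min_workers=0, max_workers=100):
    """Return how many workers to add (positive) or remove (negative).

    Assumption: each worker can take `jobs_per_worker` jobs at once.
    """
    if state.waiting_jobs > 0:
        # Workers needed to absorb the queue (ceiling division).
        wanted = -(-state.waiting_jobs // jobs_per_worker)
        # Idle workers can take part of the queue without new VMs.
        wanted = max(wanted - state.idle_workers, 0)
        # Never grow beyond the configured quota.
        room = max_workers - state.total_workers
        return max(min(wanted, room), 0)
    # Empty queue: release idle workers, but keep the configured minimum.
    removable = state.total_workers - min_workers
    return -max(min(state.idle_workers, removable), 0)
```

A periodic loop would call `scaling_decision` with a fresh snapshot and then ask the cloud controller to start or stop the corresponding number of virtual machines; that part depends on the local cloud API and is left out here.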

(7)

key component: the VAF

Dario Berzano’s talk @ CHEP2013

(8)

ongoing activity summary

Activities:

●  Benchmarking activities at all sites
■  Common analysis task and data-set
●  Tests on local data storage access (Trieste)
●  Application monitoring with the ElasticSearch ecosystem (Torino, Padova)
■  See Sara Vallero’s talk on Monday
●  Production use at the Torino site:
■  in operation since November 2013
■  60 TB of dedicated storage (GlusterFS, XRootD)
■  up to ~100 workers
■  mainly analysis on ntuples (TSelector)
●  Data federation (Bari and all sites)
■  Check the poster in Poster Session A

(9)

Worker deploy time

●  If new VMs need to be instantiated, worker deploy time ranges from 2.5 min to 3.5 min
●  If VMs are already available, deploy time ranges from 16 s to 3 min
●  The “golden number” of 30 workers (see later) is reached in 2.5 min in the first case and in 25 s in the second

[Plot annotation: Optimal number]

(10)

Wall-time for different analysis steps

QAMultistrange:
• event selection
• re-vertexing
• QAMultistrange analysis
• Simple pT spectrum analysis

Data sample:
• LHC10h (PbPb)
• run 139510
• ∼ 226k events

(11)

Wall-time for different analysis steps

Results:

●  For this type of analysis and number of events, ∼ 30 workers is the optimal number
●  Wall-time is comparable for low and high CPU-intensive

(12)

CHEP2015 | Okinawa, Japan | Apr 18, 2015

the storage federation blueprint

Ø  Figure 1 schematically illustrates the XRootD configuration for a VAF Data

Federation developed in Bari using the VMs provided by the PRISMA

Openstack Infrastructure;

Ø  The XRootD hierarchy tested includes a local redirector (Manager) for each VAF site - in which a number of servers, each carrying a block storage device, is provided – and a global Italian redirector (Meta-Manager) located in Bari; Ø  Block storage devices for each server currently range from 10 to 100 TB; Ø  The datasets for each analysis campaign are expected to be staged locally by

system administrators from the AliEn Catalogue: authentication issues and

(13)

the storage federation blueprint

Work in progress

●  Bari Meta-manager deployed
●  Ongoing tests on a

(14)

Distributed Storage and Data Federation

●  Distribute and share data using a single Italian XRootD redirector
■  This is an ongoing task!
●  Two steps of a test analysis:
1.  75% I/O intensive and 25% CPU intensive
2.  17% I/O intensive and 83% CPU intensive
●  Plot the ratio between the wall time of jobs accessing files via XROOTD-IT and locally

[Plot labels: 1: I/O intensive analysis; 2: CPU intensive analysis]

Results:
●  Difference within 10-20% at most, even for I/O intensive jobs
●  These results encourage further development of the VAF data federation with this XRootD setup
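The federation layout behind these measurements (one local redirector per site, one Italian meta-manager in front of them) can be pictured with a small Python model. This is only a toy sketch of the lookup path, not XRootD's real protocol, which also involves caching and client redirection. The class names, node names (`xrootd-it`, `torino-manager`, and so on) and file paths are all invented for the example.

```python
class Server:
    """A data server holding files on its block storage device."""

    def __init__(self, name, files):
        self.name = name
        self.files = set(files)


class Redirector:
    """A node that forwards a lookup to its children.

    Children are servers for a site-level manager, or other
    redirectors for the national meta-manager.
    """

    def __init__(self, name, children):
        self.name = name
        self.children = list(children)

    def locate(self, path):
        """Return the chain of node names leading to `path`, or None."""
        for child in self.children:
            if isinstance(child, Server):
                if path in child.files:
                    return [self.name, child.name]
            else:
                found = child.locate(path)
                if found is not None:
                    return [self.name] + found
        return None


def build_example_federation():
    """Build a two-site federation with hypothetical names and files."""
    torino = Redirector("torino-manager",
                        [Server("torino-srv1", ["/data/a.root"])])
    trieste = Redirector("trieste-manager",
                         [Server("trieste-srv1", ["/data/b.root"])])
    return Redirector("xrootd-it", [torino, trieste])
```

With this model, `build_example_federation().locate("/data/b.root")` walks from the meta-manager through the Trieste manager to the server that holds the file, which is the extra hop a federated read pays compared with local access.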

(15)

VAF monitoring with the ELK stack

●  Collect monitoring and accounting data from both IaaS and application
●  Investigation of the ELK stack to handle heterogeneous and unstructured data sources
●  Possible solution for Monitoring-as-a-Service providing a uniform, extendable monitoring platform to applications
●  Also accounting: INFN Grid services
●  Dedicated DB tables

[Diagram labels: VAF, TProofMon, SenderSQL, MySQL DB, HTTP, ELK stack]
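To make the idea of feeding application records into the ELK stack over HTTP more concrete, here is a minimal Python sketch that prepares one monitoring record as a JSON document for Elasticsearch. It is a simplification under several assumptions: it targets the `POST /<index>/_doc` endpoint of recent Elasticsearch versions and bypasses Logstash, and the URL, the index name `vaf-monitoring` and the record fields are hypothetical. The request is only built, not sent.

```python
import json
import urllib.request


def build_index_request(record,
                        base_url="http://localhost:9200",
                        index="vaf-monitoring"):
    """Build (but do not send) a POST request that would index one record.

    `base_url` and `index` are placeholders for a local test instance.
    """
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical record describing one analysis session.
example_record = {"site": "Torino", "workers": 30, "wall_time_s": 150}
example_request = build_index_request(example_record)
```

Passing `example_request` to `urllib.request.urlopen` would submit the document to a running Elasticsearch instance. Because the record is free-form JSON, different applications can send different fields, which fits the heterogeneous data sources mentioned above.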

(16)

provisional conclusions and outlook

●  The VAF model works well and can be easily adapted to different use cases
■  Just need to package an end-to-end toolkit suited to different communities
■  E.g. without PROOF or PoD or other specific tools
●  This needs to include a working accounting system
■  The ELK stack can be used to build a flexible system:
■  …to provide accounting information…
■  …and Monitoring-as-a-Service for applications
●  The Data Federation model is also feasible
■  Small performance penalty balanced by flexibility and “deduplication”

(17)

thanks!

The present work is supported by the Istituto Nazionale di Fisica Nucleare (INFN) of Italy and is partially funded under contract 20108T4XTM of Programmi di Ricerca Scientifica di Rilevante Interesse Nazionale (PRIN, Italy).