Software, Computing and Analysis Models at CDF and D0

(1)

Software, Computing and Analysis Models at CDF and D0

Donatella Lucchesi
CDF experiment, INFN-Padova

III Workshop Italiano sulla fisica di Atlas e CMS

Outline
• Introduction
• CDF and D0 Computing Model
• GRID Migration
• Summary

(2)

Introduction: a thought

“A complex system that works is found to have evolved from a simple system that worked… A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system.”

G. Booch, OO Analysis & Design, 2nd ed., p. 13

(3)

Introduction: Some Numbers

                                      CDF          D0
Raw data size* (Kbytes/event)         150          250-300
Reco data size (Kbytes/event)         120          200
User format (Kbytes/event)            25-180       20-40
Reco time** (GHz-sec/event)           5 (10)       50 (120)
User analysis time (GHz-sec/event)    1 (3)        1
Peak data rate (Hz)                   130 (360)    50 (100)
Persistent data format                RootIO       TMB, Root

*Raw event size depends upon trigger type and luminosity
**Reconstruction time depends upon raw data size
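As a rough cross-check of the table, the raw-data bandwidth and the reconstruction CPU needed to keep up at peak rate follow from simple products. The sketch below uses the first (non-parenthesized) figures and the lower bound of the D0 raw size; the function names and the convention 1 MB = 1000 Kbytes are assumptions made here, not part of the talk.

```python
# Figures copied from the table above (first, non-parenthesized values).
# For D0 the raw size is the lower bound of the quoted 250-300 Kbytes range.
TABLE = {
    "CDF": {"raw_kb": 150, "reco_ghz_sec": 5, "peak_hz": 130},
    "D0": {"raw_kb": 250, "reco_ghz_sec": 50, "peak_hz": 50},
}


def raw_bandwidth_mb_s(raw_kb: float, peak_hz: float) -> float:
    """Raw data bandwidth in MB/s, assuming 1 MB = 1000 Kbytes."""
    return raw_kb * peak_hz / 1000.0


def reco_cpu_ghz(reco_ghz_sec: float, peak_hz: float) -> float:
    """CPU (in GHz) needed to reconstruct events as fast as they arrive."""
    return reco_ghz_sec * peak_hz


if __name__ == "__main__":
    for name, row in TABLE.items():
        bw = raw_bandwidth_mb_s(row["raw_kb"], row["peak_hz"])
        cpu = reco_cpu_ghz(row["reco_ghz_sec"], row["peak_hz"])
        print(f"{name}: {bw:.1f} MB/s raw, {cpu:.0f} GHz to reconstruct at peak rate")
```

With these inputs CDF comes out at 19.5 MB/s and 650 GHz, and D0 at 12.5 MB/s and 2500 GHz.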

(4)

Introduction: Integrated Luminosity

[Figure: integrated luminosity plot; the only surviving label is “Base”]

(5)

CDF & D0 Computing Model: Data Production flow

[Diagram labels: CDF and D0 detectors (7 MHz beam crossing, 0.75 million channels); Level 3 Trigger (CDF ~100 Hz, D0 ~50 Hz); Robotic Tape Storage; Production Farm; Disk cache; Data Handling Services (DH)]

(6)

Data Handling Services

SAM: Sequential Access via Metadata
CDF: used dCache, then moved to SAM.
D0: has used SAM for several years.

SAM provides:
• Local and wide area data transport
• Batch adapter support (PBS, Condor, LSF, ...)
• Caching layer
• Comprehensive metadata to describe collider and Monte Carlo data
  o Simple tools to define user datasets
• File tracking information
  o Location, delivery and “consumption” status
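To make the idea of metadata-defined datasets and consumption tracking concrete, here is a toy catalog. It is a minimal sketch only: the class, method and metadata key names are hypothetical and do not correspond to the real SAM interface.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    name: str
    metadata: dict                      # e.g. {"tier": "reco", "run": 1}
    locations: list = field(default_factory=list)
    consumed: bool = False              # simplified, single-consumer status


class Catalog:
    """Toy metadata catalog (hypothetical, not the SAM API)."""

    def __init__(self):
        self.files = {}

    def declare(self, name, metadata, location):
        """Register a file with its metadata and one storage location."""
        self.files[name] = FileRecord(name, dict(metadata), [location])

    def define_dataset(self, **query):
        """Return sorted names of files whose metadata match every query key."""
        return sorted(
            name for name, rec in self.files.items()
            if all(rec.metadata.get(k) == v for k, v in query.items())
        )

    def next_file(self, dataset):
        """Hand out the next unconsumed file of a dataset, or None when done."""
        for name in dataset:
            rec = self.files[name]
            if not rec.consumed:
                rec.consumed = True
                return rec
        return None
```

A job would call `define_dataset(tier="reco")` once and then loop on `next_file` until it returns `None`, so each file is delivered exactly once.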

(7)

Production Farms

D0:
Data: compute capacity 1550 GHz ⇒ ~2390 GHz soon
Efficiency ~80%, ~24 M events/week
MC: 14 M events/month ⇒ 2 M events if data are being reprocessed

CDF:
New SAM-based farm, standard CDF code
Data: compute capacity 1200 GHz
~78 M events/week
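An order-of-magnitude check of the farm throughput can be made by dividing the available CPU by the reconstruction time per event. The sketch below assumes continuous running and takes the reconstruction times from the earlier "Some Numbers" slide (50 GHz-sec/event for D0, 5 for CDF); the function name and the 100% efficiency used for CDF are assumptions, since no CDF efficiency is quoted.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604800


def events_per_week(capacity_ghz, ghz_sec_per_event, efficiency=1.0):
    """Events a farm can reconstruct in one week of continuous running."""
    return capacity_ghz * efficiency * SECONDS_PER_WEEK / ghz_sec_per_event


if __name__ == "__main__":
    # D0: 1550 GHz now, ~2390 GHz soon, 80% efficiency, 50 GHz-sec/event.
    print(f"D0 now:  {events_per_week(1550, 50, 0.8) / 1e6:.1f} M events/week")
    print(f"D0 soon: {events_per_week(2390, 50, 0.8) / 1e6:.1f} M events/week")
    # CDF: 1200 GHz; no efficiency is quoted, so 100% is assumed here.
    print(f"CDF:     {events_per_week(1200, 5) / 1e6:.1f} M events/week")
```

The estimates (about 15, 23 and 145 M events/week) are of the same order as the quoted rates but do not reproduce them exactly, which is unsurprising given rounded inputs and the dependence of reconstruction time on raw data size.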

(8)

CDF & D0 Computing Model: Analysis Data flow

[Diagram labels: Robotic Tape Storage; Disk cache; Central Analysis System; Remote Analysis System (CDF = CAFs, D0 = Remote Farms); User Desktop; Data Handling (DH) on the links between them]

(9)

Central Analysis System

CDF: CAF
• Primary analysis platform
  o User analysis: Ntuple creation and analysis
  o Semi-coordinated activities: secondary and tertiary datasets, Monte Carlo production
• User experience:
  o Monitoring
  o Control: hold, resume job, copy output to any machine
  o Quasi-interactive feature: look at a log file on a worker node

D0: CAB
• CPU-intensive analysis jobs
• Direct analysis
• Fixing and skimming
• Group-organized translation to ROOT format

D0: clued0

(10)

Remote Analysis System

D0
• Farms based upon SAMGrid = SAM + JIM
• SAM: file storage, delivery and metadata cataloging, analysis bookkeeping.
• JIM (Job Information Monitoring): job manager (D0-specific installations: SAM station, DB proxy servers, job manager)
• 10 remote sites in operation: CCIN2P3 (Lyon), CMS-FNAL, FZU (Prague), GridKa (Karlsruhe), Imperial, OSCER (Oklahoma), SPRACE (Sao Paulo), UTA (Texas, Arlington), WestGrid (Canada), Wisconsin
• Monte Carlo production and data reprocessing

CDF
• dCAF: a replica of CAF (specific CDF installation)

(11)

CDF Submission GUI

User’s experience:
• Select site
• Specify dataset
• Startup script
• Output location
• Press “submit”

User’s context is tarballed and sent to the execution site.
Same interface used for GRID submission.
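The "tarball the user's context and ship it" step can be illustrated with standard-library tools. This is a minimal sketch under stated assumptions: the function name `package_job`, the job description fields and the JSON sidecar file are hypothetical and only mirror the GUI fields listed above; they are not the actual CAF submission code.

```python
import json
import tarfile
from pathlib import Path


def package_job(work_dir, site, dataset, startup_script, output_location, tarball):
    """Tar up the user's working directory and write a job description.

    `tarball` should live outside `work_dir`, otherwise the archive
    would try to include itself.
    """
    work_dir = Path(work_dir)
    if not (work_dir / startup_script).is_file():
        raise FileNotFoundError(f"startup script {startup_script!r} not in {work_dir}")

    tarball = Path(tarball)
    with tarfile.open(tarball, "w:gz") as tar:
        # Store paths relative to the working directory so the archive
        # unpacks cleanly on the execution site.
        tar.add(work_dir, arcname=".")

    job = {
        "site": site,
        "dataset": dataset,
        "startup_script": startup_script,
        "output_location": output_location,
        "tarball": tarball.name,
    }
    # Illustrative sidecar file describing the job next to the tarball.
    Path(str(tarball) + ".json").write_text(json.dumps(job, indent=2))
    return job
```

Keeping the job description separate from the tarball means the same package could, in principle, be handed to different back ends, which matches the slide's point that one interface serves both dedicated and GRID submission.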

(12)

CDF & D0 Migration to GRID

Reasons to move to a grid computing model
• Need to expand resources: luminosity is expected to increase by a factor of 8
• Resources are in dedicated pools, with limited expansion, and need to be maintained
• Resources at large:
  o in Italy, access to ~3X the dedicated resources
  o ~30 THz in USA for GRID

CDF has three projects:
• GlideCAF: MC production on GRID, data processing at T1
• gLiteCAF: MC production on GRID
• OSGCAF: MC production on GRID (USA)

D0:

(13)

CDF Migration to GRID: GlideCAF

[Diagram labels: Dedicated resources; Dynamic resources; Condor protocol; Gatekeeper]

(14)

CDF Migration to GRID: gLiteCAF

[Diagram labels: Secure connection via Kerberos; Head node ≅ User Interface; Resource Broker; Gatekeeper (x2)]

(15)

D0 Migration to GRID: SAMGrid

[Diagram labels: SAMGrid; Forwarding Node; LCG; LCG Cluster (x2); SAM Station; Job Flow]

(16)

Backup

(17)

Introduction: Data volume

[Figure: data volume vs. time]

• Total 1.2 PB
• Estimated volume of about 5 PB by 2009
• Data logging rate triples from 2004 to 2006
• Event rate quadruples due to increased compression
• Computing problem is not static: it becomes more difficult with time
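The two rate figures above imply a change in average event size. The sketch below assumes that "data logging rate" means bytes per second and "event rate" means events per second, so that data rate = event rate × event size; that reading, and the function name, are assumptions.

```python
def relative_event_size(data_rate_factor, event_rate_factor):
    """Factor by which the average logged event size changes.

    Data rate = event rate x event size, so size factor = data / event.
    """
    return data_rate_factor / event_rate_factor


# Data logging rate x3, event rate x4 (figures from the slide above).
print(relative_event_size(3, 4))  # 0.75
```

Under that reading, the logged events shrink to about three quarters of their earlier size, which is the effect attributed to compression.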
