
Forschungszentrum Karlsruhe

in der Helmholtz-Gemeinschaft

GridKa: Roles and Status

Forschungszentrum Karlsruhe GmbH

Institute for Scientific Computing

P.O. Box 3640

D-76021 Karlsruhe, Germany

Holger Marten

History

10/2000: First ideas about a German Regional Centre for LHC Computing; planning and cost estimates

05/2001: Start of a BaBar Tier-B with the Universities of Bochum, Dresden and Rostock

07/2001: German HEP communities send “Requirements for a Regional Data and Computing Centre in Germany (RDCCG)”; more planning and cost estimates

12/2001: Launching committee establishes RDCCG (later renamed “Grid Computing Centre Karlsruhe, GridKa”)

04/2002: First prototype

High Energy Physics experiments served by GridKa

[Figure: experiment logos in two groups, "LHC experiments" and "non-LHC experiments". Surviving labels: Atlas; (SLAC, USA); (CERN); (FNAL, USA), twice.]

• Committed to Grid Computing

• Have real data already today

Other sciences later

GridKa Project Organization

[Figure: organization chart. Bodies: Overview Board, Technical Advisory Board (TAB), Project Leader. Labels attached to the boards: BMBF, Physics Committees, HEP Experiments, LCG, FZK Management, Head FZK Comp. Centre, Chairman of TAB, Project Leader, DESY. Experiments: Alice, Atlas, CMS, LHCb, BaBar, CDF, D0, Compass. Tasks: Planning, Development, Technical realization.]

German Users of GridKa

22 institutions

44 user groups

350 scientists

Aachen (4)● Bielefeld (2)● Bochum (2)● Bonn (3)● Darmstadt (1)▲ Dortmund (1)● Dresden (2)● Erlangen (1)● Frankfurt (1)● Freiburg (2)● Hamburg (1)▲ Heidelberg (1)▲(6)● Karlsruhe (2)● Mainz (3)● Mannheim (1)● München (1)●(5)▲ Münster (1)● Rostock (1)● Siegen (1)● Wuppertal (2)● ▲

GridKa in the network of international Tier-1 centres

France: IN2P3, Lyon

Germany: Forschungszentrum Karlsruhe

Italy: CNAF, Bologna

Japan: ICEPP, University of Tokyo

Spain: PIC, Barcelona

Switzerland: CERN, Geneva

Taiwan: Academia Sinica, Taipei

UK: Rutherford Laboratory, Chilton

USA: Fermi Laboratory, Batavia, IL

USA: BNL

The fifth LHC subproject

[Figure: "The global LHC Computing Centre". Tier 0: Tier 0 Centre at CERN (experiment labels: CMS, ATLAS, LHCb). Tier 1: Germany (FZK), USA (Fermi, BNL), UK (RAL), France (IN2P3), Italy (CNAF), CERN Tier 1, ... Tier 2: Uni-CCs and Lab-CCs (Lab x, Lab y, Lab z, Lab i, Uni a to Uni e). Tier 3: institute computers. Tier 4: desktop. Also shown: Working Groups, Virtual Organizations.]

LHC Computing Model (simplified!!)

Tier-0 – the accelerator centre

– Filter → raw data

– Reconstruction → summary data (ESD)

– Record raw data and ESD

– Distribute raw and ESD to Tier-1

Tier-1 –

– Permanent storage and management of raw, ESD, calibration data, meta-data, analysis data and databases → grid-enabled data service

– Data-heavy analysis

– Re-processing raw → ESD

– National, regional support

“online” to data acquisition process

-- high availability (24h x 7d)

-- managed mass storage

-- long-term commitment

-- resources: 50% of “average Tier-1”

[Figure: diagram of centres. Labels: RAL, IN2P3, BNL, FZK, CNAF, PIC, ICEPP, FNAL; USC, NIKHEF, Krakow, CIEMAT, Rome, Taipei, TRIUMF, CSCS, Legnaro, UB, IFCA, IC, MSU, Prague, Budapest, Cambridge, Santiago, Weizmann. Legend: Tier-1, Tier-2, small centres, desktops, portables. Source: Les Robertson, GDB, May 2004]

GridKa School 2004, September 20-23, 2004, Karlsruhe, Germany

Tier-2 –

– Well-managed disk storage – grid-enabled

– Simulation

– End-user analysis – batch and interactive

– High performance parallel analysis (PROOF?)

Each Tier-2 is associated with a Tier-1 that

– Serves as the primary data source

– Takes responsibility for long-term storage and management of all of the data generated at the Tier-2 (grid-enabled mass storage)

– May also provide other support services (grid expertise, software distribution, maintenance, …)

CERN will not provide these services for Tier-2s except by special arrangement

Source: Les Robertson, GDB, May 2004

GridKa planned resources

[Figure: planned CPU, disk and tape resources for 2002 to 2009, spanning LCG Phase I, Phase II and Phase III. Axes: TByte and kSI95.]

Distribution of planned resources at GridKa

[Figure: three charts (CPU, Disk, Tape) showing the LHC and non-LHC share of planned resources, 0% to 100%, for 2002 to 2009. Status: Jan-2004]

Significant contributions to non-LHC !!

• BaBar Tier-A

• D0, CDF Regional Centre


GridKa Environment

[Figure labels: IWR 441,442; Tape Storage; Main building]

Worker Nodes & Test beds

Production environment

97x dual PIII, 1.26 GHz: 97 kSI2000; 1 GB RAM, 40 GB HD

64x dual PIV, 2.2 GHz: 102 kSI2000; 1 GB RAM, 40 GB HD

72x dual PIV, 2.667 GHz: 130 kSI2000; 1 GB RAM, 40 GB HD

267x dual PIV, 3.06 GHz: 534 kSI2000; 1 GB RAM, 40/80 GB HD

36x dual Opteron 246: 90 kSI2000; 2 GB RAM, 80 GB HD

Σ 536 nodes, 1072 CPUs, 953 kSI2000

installed with RH7.3, LCG 2.2.0 (except for Opterons)
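
The totals quoted for the production environment can be cross-checked from the per-class figures. The following is a minimal sketch: the dictionary layout and the names `WORKER_NODE_CLASSES` and `totals` are illustrative assumptions, and only the numbers come from the slide.

```python
# Per-class figures from the slide: (node count, CPUs per node, kSI2000 of the class).
# The dictionary layout and the key strings are illustrative, not from the original.
WORKER_NODE_CLASSES = {
    "dual PIII 1.26 GHz": (97, 2, 97),
    "dual PIV 2.2 GHz": (64, 2, 102),
    "dual PIV 2.667 GHz": (72, 2, 130),
    "dual PIV 3.06 GHz": (267, 2, 534),
    "dual Opteron 246": (36, 2, 90),
}


def totals(classes):
    """Return (nodes, CPUs, kSI2000) summed over all worker-node classes."""
    nodes = sum(count for count, _, _ in classes.values())
    cpus = sum(count * per_node for count, per_node, _ in classes.values())
    ksi2000 = sum(capacity for _, _, capacity in classes.values())
    return nodes, cpus, ksi2000


print(totals(WORKER_NODE_CLASSES))  # (536, 1072, 953)
```

The result agrees with the Σ line: 536 nodes, 1072 CPUs, 953 kSI2000.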

Test environment

additional 30 machines in several test beds

Next OS

PBSPro fair share according to requirements (1-Oct-2004)

experiment   percentage   share    kSI2000
Alice        13.2         14 300   143
Atlas        13.9         15 000   150
CMS          12.9         14 000   140
LHCb          5.2          5 600    56
BaBar        19.4         21 000   210
CDF           4.6          5 000    50
Dzero        26.2         28 300   283
Compass       4.6          5 000    50

The default (test) queue is not handled by the fair share. These 20-30 CPUs are kept free for test jobs.

45% LHC, 55% nLHC
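
The percentage column is consistent with each experiment's kSI2000 figure divided by the sum over all experiments. The sketch below reproduces that arithmetic. It does not show how PBSPro is configured, and the names `REQUESTED_KSI2000`, `LHC_EXPERIMENTS` and `fair_share_percentages` are assumptions made for illustration.

```python
# kSI2000 figures per experiment, taken from the fair-share slide.
REQUESTED_KSI2000 = {
    "Alice": 143, "Atlas": 150, "CMS": 140, "LHCb": 56,
    "BaBar": 210, "CDF": 50, "Dzero": 283, "Compass": 50,
}
LHC_EXPERIMENTS = {"Alice", "Atlas", "CMS", "LHCb"}


def fair_share_percentages(requests):
    """Each experiment's share as a percentage of the total, to one decimal."""
    total = sum(requests.values())
    return {name: round(100 * value / total, 1) for name, value in requests.items()}


shares = fair_share_percentages(REQUESTED_KSI2000)
total = sum(REQUESTED_KSI2000.values())
lhc = sum(v for name, v in REQUESTED_KSI2000.items() if name in LHC_EXPERIMENTS)

print(shares["Dzero"], shares["BaBar"])  # 26.2 19.4
print(round(100 * lhc / total))          # 45
```

The share column on the slide (for example 14 300 for Alice) is 100 times the kSI2000 figure, so the same ratios result from either column.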

Disk Space available for HEP experiments: 202 TB (Oct 04)

[Figure: bar chart of disk space in TByte (axis 0 to 60) per experiment: ALICE, ATLAS, CMS, LHCb, BaBar, CDF, D0, Compass]

29 % LHC, 71 % nLHC

Online Storage I

about 40 TB stored in NAS (better: DAS)

dual CPU, 16 EIDE disks, 3Ware controller

Experience

hardware cheap, but not very reliable

RAID software & management messages not always useful

good throughput for a few simultaneous jobs, but doesn’t scale to a few hundred simultaneous file accesses

Workarounds

disk mirroring

“management software” (“managed disks”): file copies on multiple boxes

Online Storage: I/O Design with NAS (DAS)

[Figure labels: Compute nodes; TCP/IP/NFS; storage boxes for Alice, Atlas; Expansion]

~ 30 MB/s r/w bottleneck disk access
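
To illustrate why a bottleneck of about 30 MB/s does not scale to a few hundred simultaneous file accesses, here is a rough back-of-the-envelope sketch. It assumes the available throughput is split evenly among concurrent jobs and ignores seek overhead, which would only make matters worse. The function name and the job counts are illustrative and not taken from the slides.

```python
def per_job_throughput_mb_s(box_throughput_mb_s: float, concurrent_jobs: int) -> float:
    """Throughput seen by one job if the box bandwidth is shared evenly (assumption)."""
    if concurrent_jobs < 1:
        raise ValueError("concurrent_jobs must be at least 1")
    return box_throughput_mb_s / concurrent_jobs


BOTTLENECK_MB_S = 30.0  # approximate figure from the slide

for jobs in (1, 5, 300):
    rate = per_job_throughput_mb_s(BOTTLENECK_MB_S, jobs)
    print(f"{jobs:4d} jobs: {rate:6.2f} MB/s per job")
```

Under this even-split assumption, 300 jobs reading from the same box would each see roughly 0.1 MB/s.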

Online Storage II

about 160 TB stored in a SAN

SCSI disks (10k rpm) with redundant controllers

parallel file system on a file server cluster, exported via NFS to the worker nodes (WNs)

Online Storage: Scalable I/O Design

[Figure labels: Compute nodes; TCP/IP/NFS; file server cluster; SAN/SCSI Fibre Channel; RAID 5 storage; Expansion; Alice, Atlas; striping + parallel file system]


Advantages

high availability through multiple redundant servers

load balancing via automounter program map

Experience

many teething problems (bugs, learning how to configure, ...)

CPU/wall-clock ratio close to 1 in some applications
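
The slides do not show the automounter program map itself. As a rough illustration of the idea, an autofs program (executable) map is called with the lookup key as its argument and prints the mount entry for that key, so it can pick a different file server on each lookup. In the sketch below the server names, the export path and the random selection policy are all hypothetical assumptions, not GridKa's actual configuration.

```python
import random
import sys

# Hypothetical file servers and export root; not taken from the slides.
FILE_SERVERS = ["fs01.example.org", "fs02.example.org", "fs03.example.org"]
EXPORT_ROOT = "/export"


def map_entry(key, servers=FILE_SERVERS, rng=random):
    """Build an autofs map entry for `key`, choosing one server at random (assumed policy)."""
    server = rng.choice(servers)
    return f"-fstype=nfs,rw {server}:{EXPORT_ROOT}/{key}"


# When installed as a program map, the automounter passes the key as the first argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(map_entry(sys.argv[1]))
```

Because every lookup may return a different server, mounts from many worker nodes spread across the file server cluster. A real map would probably also have to skip servers that are down.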

Why are we telling you all this?

Because we need your experience and feedback as users!

Tape Space available for HEP experiments: 374 TB (Oct 04)

[Figure: bar chart of tape space in TByte (axis 0 to 120) per experiment: ALICE, ATLAS, CMS, LHCb, BaBar, CDF, D0, Compass]

27 % LHC, 73 % nLHC

Tape Storage

tape library IBM 3584 LTO Ultrium

8 drives LTO-1, 4 drives LTO-2

375 TB native (uncompressed)

Tivoli Storage Manager (TSM) for Backup and Archive

installation of dCache in progress

- tape backend interfaced to Tivoli Storage Manager

- installation with 1 head and 3 pool nodes currently tested by CMS & CDF

other

- SAM station caches for D0 and CDF

- JIM (Job information management) station for D0

- tape connection via scripts (D0)

- CORBA Naming service (for CDF)

GridKa – Plan for WAN connectivity

[Figure: timeline 2001 to 2008 of planned WAN capacity with steps 34 Mbps, 155 Mbps, 2 Gbps, 10 Gbps, 20 Gbps. Annotations: "Start discussion with Dante !", "Start 10 Gbps"]

Sept 2004: DFN upgraded the capacity from Karlsruhe to Géant to 10 Gbps; tests have been started!

Routing (full 10 Gbps): GridKa – DFN (Karlsruhe) – DFN (Frankfurt) – Géant (Frankfurt) – Géant (Milano) – Géant (Geneva) – CERN


Further services & sources of information

GGUS (Global Grid User Support) www.ggus.org

User information

www.gridka.de → GridKa Info

- user registration

- Globus installation

- batch system PBS

- backup & archive

- getting a certificate from GermanGrid CA

- listserver / mailing lists

- monitoring status with Ganglia

www.gridka.de → HEP experiments

- experiment specific information

www.ggus.org

- FAQ

- Documentation

- ...

Tools


Final remarks

Europe on the way to e-science

EU-Project EGEE

April 2004 to March 2006

32 million euros for personnel

70 partner institutes in 27 countries, organized in 9 federations

applications: LHC grid, Biomed, ...

[Figure: map with legend Operations Management Centre (OMC), Core Infrastructure Centre (CIC), Regional Operations Centre (ROC); map label: Russia]

“Provide distributed European research communities with a common market of computing, offering round-the-clock access to major computing resources, independent of geographic location, ..”

Status of LCG / EGEE

http://goc.grid-support.ac.uk/lcg2

Last but not least

We want to help

- our users on our systems

- support/discuss cluster installations at other institutes

- support/discuss middleware installations at other centres

- create a German Grid Infrastructure

We will continue the balancing act between

- testing & Data Challenges

- production with real data

and...


We appreciate the continuous interest and support by the Federal Ministry of Education and Research, BMBF.
