Managing managed storage

(1)

CERN Disk Server operations

HEPiX 2004 / BNL

(2)

Outline

Which are our “Data Services”?

Disk server hardware @ CERN

Management tools

(3)

A lot of hardware

Disk storage

350 “storage in a box” Linux disk servers

6700 disks

550 TeraBytes of raw disk space

Tape storage

2 robotic installations

each with 5 STK 9310 silos
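As a rough sanity check on the disk figures above, the fleet averages can be derived directly from the three quoted numbers. This is only a sketch: it assumes 1 TB = 1000 GB, and the function name is illustrative.

```python
# Figures quoted on the slide.
SERVERS = 350
DISKS = 6700
RAW_TB = 550


def fleet_averages(servers, disks, raw_tb):
    """Return (disks per server, raw TB per server, raw GB per disk).

    Assumes decimal units (1 TB = 1000 GB); the slide does not say which
    convention was used.
    """
    return disks / servers, raw_tb / servers, raw_tb * 1000 / disks


disks_per_server, tb_per_server, gb_per_disk = fleet_averages(SERVERS, DISKS, RAW_TB)
print(f"{disks_per_server:.1f} disks/server, "
      f"{tb_per_server:.2f} TB/server, "
      f"{gb_per_disk:.0f} GB/disk")
```

The average of roughly 1.6 TB per server sits inside the 1 – 2.5 TB range quoted for the individual server generations later in the talk.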

(4)

Many applications

CASTOR: 200 servers (55%)

Oracle: 40 (11%)

CDR: 20 (6%)

AFS scratch: 10 (3%)

R&D (14%): dCache, LHC@home, … LCG, OpenLab, EGEE, data challenges

Repair/spare: 40 (11%)

A very heterogeneous environment!

(5)

Players

Many teams involved:

Application owners / users

Service managers

System administrators team

Suppliers

Software often not redundant…

need to minimize downtime!

(6)

“Storage in a box”

13 different hardware configurations:

8 – 26 IDE disks, hot-swappable trays

2 – 4 3-Ware RAID controllers

2 CPUs

2 – 3 power supplies

GigE network card

(7)

hardware interventions

55 interventions since Sep 1

disk replacements (70%)

trays, cables, fans, PSU

33% involve (un)scheduled downtime

Older hardware harder to maintain

One supplier out of business

Incidents to spice up life…

(8)

Disk replacement

Head instabilities: 10 months before case agreed

4 weeks to execute

1224 disks exchanged (= 18%), and the cages as well

[Chart: % broken mirrors per month, Dec-03 to Sep-04 (y-axis 0.0% to 4.5%); annotations: “1224 disks replaced”, “Jumbo servers”]

(9)

65 Jumbos

1 – 1.5 TB raw disk space

3-Ware 6800 controllers

600 MHz PIII

No PXE

Becoming hard to maintain

Many still under warranty

(10)

175 4U servers

4U (5U) rack mounted

1 – 1.5 TB

2 * 3-Ware 7000 series

currently upgrading firmware

2 * 1 GHz PIII’s

No PXE (yet)

Various maintenance issues

(11)

115 8U servers

8U rack mounted

2 – 2.5 TB

3 – 4 * 3-Ware 7500(6)-8

2 * 2.4 GHz Xeon

Well controlled, well maintained, well behaved

(12)

Diskserver evolution

[Chart: gross capacity and usable capacity in GB (0 to 3000) and price per usable GB in CHF/GB (0 to 45), by tender date, Oct-00 to Apr-03]

(13)

That was then…

HW RAID1

Ext2 filesystems, many of them

13 different kernels!

RedHat 6.1/6.2, 7.2/7.3, 2.1ES

Need for automation + standardization

ELFms toolsuite

Quattor – installation + configuration

LEMON – performance + exception monitoring

(14)

…this is now

RedHat 7.3, preparing for SLC3

Oracle: RHEL 2.1, preparing RHEL 3

kernel has old 3-Ware driver

HW RAID5 + hot spare disk

Up to 50% more usable space

On 3-Ware 7000 controller with up-to-date firmware

SW RAID0 + XFS

Improved performance expected

iozone benchmark

Old XFS version
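The “up to 50% more usable space” figure for HW RAID5 + hot spare can be reproduced with simple arithmetic. The sketch below assumes one array per controller, equal-sized disks, and up to 8 disks per controller (an assumption read from the “-8” suffix of the controller model on the 8U slide); the function names and the 100 GB disk size are illustrative only.

```python
def usable_raid1(n_disks, disk_gb):
    """RAID1: disks are mirrored in pairs, so half the disks hold copies."""
    return (n_disks // 2) * disk_gb


def usable_raid5_hot_spare(n_disks, disk_gb):
    """RAID5 + hot spare: one disk idle as spare, one disk's worth of parity."""
    if n_disks < 4:
        raise ValueError("need at least 3 disks for RAID5 plus 1 hot spare")
    return (n_disks - 2) * disk_gb


def gain(n_disks, disk_gb=100):
    """Relative usable-space gain of RAID5 + hot spare over RAID1."""
    r1 = usable_raid1(n_disks, disk_gb)
    r5 = usable_raid5_hot_spare(n_disks, disk_gb)
    return (r5 - r1) / r1


for n in (4, 6, 8):
    print(f"{n} disks per controller: {gain(n):+.0%}")
```

Under these assumptions the gain grows with the number of disks per array and reaches 50% at 8 disks, which matches the “up to” wording on the slide.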

(15)

Updating the toolbox

SMART – to predict disk failure

daily and weekly self-tests, on every disk

IPMI v1.5

HW monitoring and event control

Power control, resets

lm_sensors – temperature monitoring
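One way to get daily and weekly SMART self-tests on every disk is to let the smartmontools daemon (smartd) schedule them. The sketch below generates one smartd.conf line per disk behind a 3-Ware controller. The talk does not say how the tests were actually scheduled, so the approach itself is an assumption, as are the device names (`/dev/tweN`), the test times, and the helper name `smartd_lines`.

```python
def smartd_lines(n_controllers, ports_per_controller,
                 short_hour=2, long_weekday=6, long_hour=3):
    """Build one smartd.conf line per disk behind 3ware controllers.

    -d 3ware,N  addresses the disk on port N of the controller.
    -a          enables smartd's default set of checks.
    -s REGEXP   schedules self-tests; the pattern is T/MM/DD/d/HH where
                T is S (short) or L (long) and d is the weekday (1 = Monday).
    The default hours and weekday are illustrative assumptions.
    """
    schedule = (f"(S/../.././{short_hour:02d}"
                f"|L/../../{long_weekday}/{long_hour:02d})")
    lines = []
    for ctrl in range(n_controllers):
        for port in range(ports_per_controller):
            lines.append(f"/dev/twe{ctrl} -d 3ware,{port} -a -s {schedule}")
    return lines


# Example: a server with 2 controllers of 8 ports each.
for line in smartd_lines(2, 8)[:3]:
    print(line)
```

With the defaults, every disk gets a short self-test daily at 02:00 and a long self-test on Saturdays at 03:00; in practice the start times would probably be staggered so that all disks in a box are not tested at once.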

(16)
(17)

This is now

Quattorized + Lemonized

Rely on Operator and SysAdmin teams

Operated in same way as PC farms

Getting more out of suppliers

(18)

What’s next?

New hardware

360 TB “SATA in a box”, 2 different suppliers

140 TB FC attached external SATA disk arrays

New software

SLC3, RHEL 3

New CASTOR stager

New challenges

Oracle SAN setup

ALICE data challenge

(19)

Conclusions

A lot of work has been done to

Stabilize Hardware and Software

Automate + hand over basic operations

Integrate into standard work flows

Get more out of available hardware

(20)

Useful links

“Standing on the shoulders of giants”

Tim Smith CHEP 2004

http://indico.cern.ch/contributionDisplay.py?contribId=374&sessionId=10&confId=0

Helge Meinhard CHEP 2004

http://indico.cern.ch/contributionDisplay.py?contribId=325&sessionId=10&confId=0

Peter Kelemen CERN IT “After C5”

http://cern.ch/Peter.Kelemen/talk/2004/C5/diskserver

Jan Iven HEPiX 2004 Edinburgh

http://hepwww.rl.ac.uk/hepix/nesc/iven.pdf
