Data Information and Management System (DIMS)
The New DIMS Hardware
Wilhelm Wildegger, DFD-IT; Jens Pollex, DFD-BN
4 May 2007
Outline
DIMS Overview
Reasons for Renewal of DIMS Hardware
New Servers
Disk Storage Area Network (Disk SAN) and Storage System
Tape SAN and Tape Library
New Archive Tape Drives and Media
Data Migration to New Media
[Figure: DIMS overview for the NZ and OP sites. Components shown: Order Management (orders, reports), Ingestion System (level 0 / raw data), Product Library (archive, inventory), Processing System, Post-Processing, Production Control (request trees), Operating Tool (requests), Online/Offline Product Generation & Delivery, User Information Services (incl. Pickup Point), EOWEB.]
Reasons for Renewal
old hardware (servers, disk systems, tape drives) approaching end of service life
old archive tape media / drives are starting to have read problems
it is necessary to copy the data to new media for long-term archiving
old hardware not powerful enough to support new projects like TerraSAR-X with respect to data throughput and disk / tape capacity
New Servers
old servers were Sun E6500, E3500, E450
new servers are Sun Fire V4900, Sun Fire V890 and Sun Fire V490
V4900: 8 CPUs at 1.8 GHz each, 32 GByte main memory
V890: 4 or 8 CPUs at 1.8 GHz each, 16 or 32 GByte main memory
V490: 2 or 4 CPUs at 1.8 GHz each, 8 or 16 GByte main memory
approx. 10 times the CPU power of the old servers
approx. 10 times the main memory of the old servers
all servers connected to the network with 1 Gbit/s Ethernet, archive server with 2 × 1 Gbit/s Ethernet
Solaris 10
SAM-FS 4.5.33
Servers (Oberpfaffenhofen & Neustrelitz)
[Figure: DIMS servers at both sites: DIMS UIS Server, DIMS Operations Server, Production Control Server, Ordering Control Server, Post Processing Server, Online/Offline Product Gen. Server with CD/DVD Writing System, Product Library / Inventory Server, Archive Server.]
Disk SAN
servers only have disks for operating system and COTS SW installation
all data partitions are located on one central storage system per site (OP and NZ)
Sun/HDS StorEdge 9985: 25 TByte in OP, 13 TByte in NZ
interconnection between servers and storage system via 2 SAN switches (Brocade 4100) and 2 Gbit/s Fibre Channel links (“Disk SAN”)
redundancy / high availability
RAID 5 and RAID 6 used
every server is connected to both SAN switches using different host bus adapters (HBAs), with one or several parallel links per switch
storage system is connected to each switch with 8 links
even if one SAN switch, one HBA, one link or one controller of the SAN storage system fails, all partitions remain visible, although performance is reduced
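The redundancy argument above can be checked with a small model. This is a minimal sketch, not a description of the actual configuration: the component names (`hba_a`, `switch_1`, `ctrl_0`, ...) are hypothetical, each server is assumed to have exactly one HBA per switch, the storage system is assumed to have two controllers, and each switch is assumed to reach both. Individual links are not modelled separately. The sketch enumerates the paths from one server to the storage system and confirms that any single component failure leaves at least one path intact.

```python
from itertools import product

# Hypothetical component names; the real DIMS naming is not given in the slides.
HBAS = {"hba_a": "switch_1", "hba_b": "switch_2"}  # each HBA is cabled to one switch
SWITCHES = ["switch_1", "switch_2"]
CONTROLLERS = ["ctrl_0", "ctrl_1"]  # assumption: two storage controllers


def paths():
    """All server-to-storage paths as (hba, switch, controller) tuples."""
    return [(hba, sw, ctrl)
            for (hba, sw), ctrl in product(HBAS.items(), CONTROLLERS)]


def surviving_paths(failed):
    """Paths that do not pass through the failed component."""
    return [p for p in paths() if failed not in p]


def survives_any_single_failure():
    """True if every single-component failure leaves at least one path."""
    components = list(HBAS) + SWITCHES + CONTROLLERS
    return all(surviving_paths(c) for c in components)


if __name__ == "__main__":
    print(len(paths()), "paths with all components healthy")
    for component in list(HBAS) + SWITCHES + CONTROLLERS:
        remaining = len(surviving_paths(component))
        print(f"{component} failed: {remaining} paths remain")
    print("single-failure tolerant:", survives_any_single_failure())
```

In this model every single failure halves the number of usable paths, which matches the slide's remark that the partitions stay visible while performance is reduced.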
[Figure: Disk SAN. The servers (DIMS UIS Server, DIMS Operations Server, Production Control Server, Ordering Control Server, Post Processing Server, Online/Offline Product Gen. Server, Product Library / Inventory Server, Archive Server) are connected via two FC switches to the SAN Storage System.]
Tape SAN and 2nd Copy Tape Library
with the old hardware both primary and secondary copies were written in the same tape library (AML/2 in OP, AML/J in NZ); secondary copy tapes were off-loaded once full
now secondary copy tapes are written in a dedicated “secondary copy” tape library (Quantum / ADIC i2000); this tape library is located at some distance from the primary copy tape library, in another building
interconnection between servers and tape drives within the tape libraries is via 2 SAN switches (Brocade 4100) and 2 Gbit/s Fibre Channel links (“Tape SAN”)
redundancy / high availability
DIMS archive server connected to both Tape SAN switches with several links using several HBAs (host bus adapters)
2 extra Tape SAN switches (Brocade 4100) are co-located with the secondary copy library
half of the tape drives are connected to one pair of switches, the other half to the other pair of switches
Tape SAN and Tape Library
[Figure: Tape SAN and tape libraries. Labels: the DIMS servers, SAN Storage System, Archive Server, FC switches, AMU, Primary Copy Robot Library (9940B, 2* LTO-2), Secondary Copy Robot Library (remote, LTO-3), 2* 5* Single Mode Fibers.]
New Archive Tape Drives and Media
Purpose | Type | Native capacity | Read / write speed | Connection
old 1st copy | DLT4000 | 20 GByte | 2 MByte/s | SCSI
old 1st copy | DLT7000 | 35 GByte | 3 MByte/s | SCSI
old 1st copy | Sony AIT-2 | 50 GByte | 5 MByte/s | SCSI
old 1st copy | Sony MOD | 5 GByte | 1 MByte/s | SCSI
old 2nd copy | DLT4000 | 20 GByte | 2 MByte/s | SCSI
old 2nd copy | DLT7000 | 35 GByte | 3 MByte/s | SCSI
new 1st copy (where necessary) | SAN disk system (disk archiving) | as much as necessary | 80 MByte/s | SAN / FC
new 2nd/1st copy | StorageTek 9940B | 200 GByte | 30 MByte/s | SAN / FC
new 2nd/3rd copy | IBM LTO-3 | 400 GByte | up to 80 MByte/s | SAN / FC
Library capacity with 9940B media (OP / NZ): > 11,000 slots, ~2.2 PByte / 540 (1500) slots, ~110 (300) TByte
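The library capacities quoted above follow from the slot counts and the native 9940B cartridge capacity. The following is a small sketch of that arithmetic, assuming decimal units (1 GByte = 10^9 bytes) and one cartridge per slot; the function name is illustrative only.

```python
GBYTE = 10**9
TBYTE = 10**12
PBYTE = 10**15

# Native capacity per 9940B cartridge, taken from the table.
NATIVE_9940B = 200 * GBYTE


def library_capacity_bytes(slots, per_cartridge=NATIVE_9940B):
    """Native capacity of a library with one cartridge in every slot."""
    return slots * per_cartridge


if __name__ == "__main__":
    print("OP:", library_capacity_bytes(11_000) / PBYTE, "PByte")
    print("NZ:", library_capacity_bytes(540) / TBYTE, "TByte")
    print("NZ (value in parentheses):", library_capacity_bytes(1500) / TBYTE, "TByte")
```

With these assumptions 540 slots give 108 TByte, which the slide rounds to ~110 TByte.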
Data on Old Tape Drives/Media and Migration to New Media
new data (archived from Oct. 2006) are written to new media
old data (archived until Sept. 2006) are still on old media (and operationally accessible)
to enable reading the old data, half of the old tape drives are still connected to the new servers
to preserve the old data, they will be migrated (re-archived) to new media
ca. 100 TByte in OP
ca. 45 TByte in NZ
migration in OP started April 2007
Migration to New Media in Neustrelitz
Migration was started in December 2006
Split into two steps: DIMS and ‘old world’
Using FMRT (HMK) in 19 separate streams (one stream per new file system)
ca. 45 TByte
Migration of first and second copy at the same time
Using 4 AIT-2 drives for reading
Cache for migration: 7 GByte
Max. 2 streams in parallel
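To get a feeling for the duration of such a migration, the figures above can be combined into a rough lower bound on the pure read time. This is only a sketch under stated assumptions: decimal units, all four AIT-2 drives streaming continuously at the 5 MByte/s listed for AIT-2 in the drive table, and no time for mounts, positioning, retries or writing. How many drives actually read concurrently is not stated in the slides, so the drive count is a parameter.

```python
TBYTE = 10**12
MBYTE = 10**6
SECONDS_PER_DAY = 86_400


def min_read_days(volume_bytes, drives, rate_bytes_per_s):
    """Lower bound on read time with all drives streaming without interruption."""
    seconds = volume_bytes / (drives * rate_bytes_per_s)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    days = min_read_days(45 * TBYTE, drives=4, rate_bytes_per_s=5 * MBYTE)
    print(f"at least {days:.1f} days of pure reading")
```

Under these idealised assumptions the 45 TByte in NZ need about 26 days of uninterrupted reading; real migrations take longer.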
High Availability (1)
Sun Cluster
vital functions for DIMS run on the DIMS Operations server
e.g. DNS slave, Werum Corba Name Service
if one of these services fails, all DIMS functions are inoperable
to avoid losing vital functions due to a DIMS Operations server failure, the DIMS Operations server and the Production Control server are configured as a Sun Cluster, i.e. if the DIMS Operations server fails, the Production Control server provides the DIMS vital functions
switching to the “backup” server and starting the services is fully automatic (no human intervention) and takes a few minutes
(Production Control functions are not taken over by the DIMS Operations server if the Production Control server fails)
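The failover described above is one-directional, which is easy to overlook. The sketch below only models that rule and is not Sun Cluster configuration; the function names and the node labels `"operations"` and `"production_control"` are hypothetical.

```python
def vital_services_host(operations_up, production_control_up):
    """Node that runs the DIMS vital services, or None if both nodes are down."""
    if operations_up:
        return "operations"
    if production_control_up:
        # Automatic takeover by the cluster partner.
        return "production_control"
    return None


def production_control_available(production_control_up):
    """Production Control has no takeover partner in this model."""
    return production_control_up


if __name__ == "__main__":
    for ops_up in (True, False):
        for pc_up in (True, False):
            print(f"operations up={ops_up}, production control up={pc_up}: "
                  f"vital services on {vital_services_host(ops_up, pc_up)}, "
                  f"production control available={production_control_available(pc_up)}")
```

The vital services survive the loss of either node alone, whereas Production Control is lost whenever its own server is down.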
[Figure: High Availability. Labels: Sun Cluster, the DIMS servers, High Availability / Upgrade Server, SAN Storage System 25 TByte, AMU, FC switches, Primary Copy Robot Library, Secondary Copy Robot Library (remote), drives 13* 9940B, 2* LTO-2, 8* LTO-3.]
High Availability (2)
High Availability Server
DIMS functions must have an availability of at least 98% per month, i.e. max. 14 h of unavailability per month (TerraSAR-X requirement)
hardware problems or upgrade procedures of OS or COTS SW might last longer than 14 hours
therefore there is a “High Availability” server (HA server); it can serve any DIMS function (including archive)
thanks to the Disk SAN it is possible to mount the disk partitions / file systems of the failed server on the HA server (prerequisite: all data being modified reside on the SAN storage system and not on local disk)
thanks to the Tape SAN it is possible to attach the tape drives of the failed archive server to the HA server (except, of course, the old SCSI-connected drives)
no physical change of any connection is necessary
switchover is performed manually (with the help of a written procedure)
bringing up a “normal” DIMS service on the HA server takes ca. 20 min
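The 14 h figure is simply the downtime budget implied by 98% availability. A minimal sketch of the arithmetic follows, assuming a 30-day month (the slides do not state the month length used); the function names are illustrative.

```python
def allowed_downtime_hours(availability, days_in_month=30):
    """Maximum downtime per month for a required availability fraction."""
    return (1.0 - availability) * days_in_month * 24


def fits_budget(outage_hours, availability=0.98, days_in_month=30):
    """True if a single outage of the given length stays within the monthly budget."""
    return outage_hours <= allowed_downtime_hours(availability, days_in_month)


if __name__ == "__main__":
    print(f"budget: {allowed_downtime_hours(0.98):.1f} h per 30-day month")
    print("20 min switchover fits:", fits_budget(20 / 60))
    print("24 h repair fits:", fits_budget(24))
```

For a 30-day month the budget is 14.4 h, which the slide rounds down to 14 h. A switchover of about 20 minutes uses only a small part of it, while a repair lasting a full day would exceed it.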
Administration Network and SunRay Network
Administration Network and ALOM (Advanced Lights Out Management)
no console terminals connected to the servers
all servers are equipped with a service processor for “advanced lights out management”
administration network allows access to the ALOM service processors
console of each server can be viewed from any workstation / PC
allows issuing of all commands, monitoring the boot process and even switching the server on and off
Fibre Channel switches, storage system and secondary copy library are also connected to the administration net to allow remote administration
access to the administration net is controlled by the C-AF firewall plus a password on every service processor
[Figure: Administration Network and SunRay Network. Labels: C-AF Production Network, CAF Firewall, Administration Network, SunRay Server, SunRay Network, personal workstation, "only Oberpfaffenhofen", the DIMS servers, High Availability / Upgrade Server, SAN Storage System, AMU, FC switches, Primary Copy Robot Library, Secondary Copy Robot Library (remote), drives 9940B, 2* LTO-2, LTO-3.]
The New DIMS Hardware with Processing Systems (Oberpfaffenhofen)
[Figure: DIMS hardware with processing systems in Oberpfaffenhofen. Processing systems shown: MSG, ERS-2 GOME L2/L3, SRTM X-SAR, Terra/Aqua MODIS, AIR OS, TerraSAR-X TVSP, TerraSAR-X TMSP, among others. Other labels: the DIMS servers, High Availability / Upgrade Server, SAN Storage System 25 TByte, AMU, FC switches, Primary Copy Robot Library, Secondary Copy Robot Library (remote), drives 13* 9940B, 2* LTO-2, 8* LTO-3.]
The New DIMS Hardware with Processing Systems (Neustrelitz)
[Figure: DIMS hardware with processing systems in Neustrelitz. Processing systems shown: ENVISAT MERIS, Champ, BIRD, GRACE, TerraSAR-X TMSP, among others. Other labels: C-AF Production Network, CAF Firewall, DIMS Operations Server, Production Control Server, Post Processing Server, High Availability / Upgrade Server, Product Library / Inventory Server, Archive Server, SAN Storage System 13 TByte, AMU, FC switches, Primary Copy Robot Library, Secondary Copy Robot Library (remote), drives 10* 9940B, 6* LTO-3.]
DIMS in Oberpfaffenhofen, Neustrelitz and Birlinghoven
[Figure: network connection between the sites. Oberpfaffenhofen (OP) and Neustrelitz (NZ) each show a DLR backbone switch / router, VPN + WAN router, packet shaper, CAF Firewall, 1G & 100M Ethernet switches, CAF Infrastructure Net, CAF DMZ, protected DLR LAN and the CAF Production Net with DIMS servers; OP also shows other institutes. Link labels: 155 Mbit/s (twice), 1 Gbit/s, VPN router, VPN tunnels over X-WIN, DLR OP to DLR NZ VPN tunnel.]
Status
all new hardware operational (including all connections and all COTS software)
almost all DIMS services already migrated to new hardware:
WERUM Corba Name Service (OP and NZ)
Product Library (incl. Archive) (OP and NZ)
Production Control (OP and NZ)
User Information Services Interface / Loading (UI / UL) (OP)
Order Control (OP)
Online / Offline Product Generation (OPG) (OP)
EOWEB “backend,” EOWEB UI/UL (on new HW from the very beginning) (OP)
EOWEB “frontend” (Bih)
Post-Processing (NZ)
Still to be migrated
Post-Processing (OP)