
Factors rebuilding a degraded RAID

Whitepaper

Abstract


Table of Contents

1. PRODUCT FAMILIES COVERED BY THIS DOCUMENT
2. WHAT HAPPENS IF ONE OR MORE DISKS IN A RAID ARRAY FAIL?
   2.1 DISK FAILURE IN A RAID 1 OR RAID 1+0
   2.2 DISK FAILURE IN A RAID 3 OR RAID 3+0 (DEDICATED PARITY RAID)
   2.3 DISK FAILURE IN A RAID 5 OR RAID 6 (DISTRIBUTED PARITY RAID)
   2.4 DISK FAILURE IN A RAID 5 WITH HOT-SPARE DISK
3. THE KEY FACTORS INFLUENCING THE REBUILD TIME
   3.1 THE COMPUTING POWER OF THE RAID CONTROLLER
       3.1.1 The computing power bottleneck
       3.1.2 Recommended configurations for EonStor DS G6 and G7 storage systems
   3.2 THE WRITE PERFORMANCE OF A SINGLE DRIVE IN A RAID ARRAY
       3.2.1 NL-SAS vs. SATA III hard drive
       3.2.2 SATA vs. SATA III hard drive with Native Command Queuing (NCQ) enabled
       3.2.3 The drive bottleneck
   3.3 THE CAPACITY OF A FAILED HARD DRIVE
4. ESTIMATING THE REBUILD TIME
   4.1 REBUILD TIME, IF WRITE THROUGHPUT IS THE BOTTLENECK
   4.2 REBUILD TIME, IF COMPUTING POWER IS THE BOTTLENECK
5. APPENDIX
   5.1 INFORTREND WEB LINKS
   5.2 DRIVE VENDOR LINKS


1. Product families covered by this document

This document applies to the following Infortrend storage system families:

EonStor DS G6 Family

EonStor DS G7 Family


2. What happens if one or more disks in a RAID array fail?

Single disks or single-level striped arrays (RAID 0) lose all data in case of a disk failure. You have to replace the failed disk and then restore the lost data from a backup before you can continue operation.

Opting for a mirrored array (RAID 1 or RAID 1+0) or a parity array (RAID 5/6 or RAID 5+0/6+0) solves this problem, despite lower performance and a higher TCA/TCO.

There is an extremely small chance that two disks fail simultaneously, but if data protection and integrity are the highest priority and cost and performance are minor concerns, consider using RAID 6 or RAID 6+0 to be prepared for that eventuality.

2.1 Disk failure in a RAID 1 or RAID 1+0

The failed disk first has to be replaced by a new one. The RAID controller then writes all data from the mirror to the newly installed disk (the rebuild), which can be regarded as a disk copy from the mirror to the new disk. There is no need to restore a previously made backup with possibly outdated data. The RAID controller copies the data in the background, so operation continues.

Figure 1: RAID 0, disk failure


2.2 Disk failure in a RAID 3 or RAID 3+0 (Dedicated Parity RAID)

If one disk fails, the data can be rebuilt by reading all remaining disks (all but the failed one) and writing the rebuilt data to the newly replaced disk; writing to this single disk is enough to rebuild the array. (A RAID is "degraded" if one or more of its disks have failed.) Two cases have to be distinguished. If the dedicated parity disk fails, the rebuild is a matter of recalculating the parity information by reading all remaining data disks and writing the parity to the new dedicated disk. If a data disk fails, the data has to be rebuilt from the remaining data and the parity information. This is the most time-consuming part of rebuilding a degraded RAID.
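The reconstruction itself is a plain XOR across the surviving members of a stripe. The following minimal Python sketch (hypothetical block values, one stripe of a 4-drive array) illustrates how a lost data block is recovered from the remaining data blocks and the parity block:

# Minimal sketch of XOR-based parity reconstruction for one stripe,
# as used by dedicated-parity (RAID 3) and distributed-parity (RAID 5) arrays.
# Block contents and sizes are made up for illustration.

def xor_blocks(blocks):
    """XOR a list of equally sized byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# One stripe on a 4-drive array: three data blocks and one parity block.
d0 = bytes([0x11, 0x22, 0x33, 0x44])
d1 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
d2 = bytes([0x01, 0x02, 0x03, 0x04])
parity = xor_blocks([d0, d1, d2])      # written when the stripe is stored

# The drive holding d1 fails: rebuild it from the surviving blocks + parity.
rebuilt_d1 = xor_blocks([d0, d2, parity])
assert rebuilt_d1 == d1                # XOR recovers the lost block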

2.3 Disk failure in a RAID 5 or RAID 6 (Distributed Parity RAID)

If a disk fails, the data can be rebuilt by reading all remaining disks, rebuilding the data, recalculating the parity information, and writing the data and parity information to the new disk. This is time-consuming. The rebuild time depends on the drive capacity and the number of drives in the RAID array or the RAID sub-arrays (RAID 5+0 or RAID 6+0), and further on the computing power of the controller.

Figure 3: RAID 3, disk failure


2.4 Disk failure in a RAID 5 with hot-spare disk

When a RAID array is not protected by a hot-spare disk, the failed disk has to be removed and replaced by a new one. The controller detects the new disk and starts rebuilding the RAID array.

Using a hot-spare disk avoids this manual replacement procedure: in case of a disk failure, the hot-spare disk is automatically incorporated into the RAID array and takes over the role of the failed disk.

3. The key factors influencing the rebuild time

There are three main factors that affect the rebuild time:

• The computing power of the RAID controller
• The write performance of a single drive in the RAID array
• The capacity of the failed drive

3.1 The computing power of the RAID controller

The computing power of a RAID controller is the maximum throughput per second at which it can process XOR (parity) calculations. As shown in the table below, an EonStor DS G7 storage system reduces the rebuild time to about 62% of that of an EonStor DS G6 storage system in a RAID 5 configuration with 15 hard drives.

Model  Computing power  Disk write performance(*1)     Configuration (RAID 5)  Rebuild time  Time saving
G6     850 MB/s         NL-SAS Toshiba 1TB (122 MB/s)  7 HDDs                  2 hr 35 min
                                                       15 HDDs                 4 hr 54 min
G7     1350 MB/s        NL-SAS Toshiba 1TB (122 MB/s)  7 HDDs                  2 hr 35 min
                                                       15 HDDs                 3 hr 5 min    38%

Table 1: Computing power of the RAID controller



3.1.1 The computing power bottleneck

The computing power of a RAID controller is limited (see Table 1). The workload of the RAID controller increases as more drives are added to a RAID array / RAID sub-array. Once the total READ throughput of the remaining drives exceeds the controller's computing power, the rebuild time increases significantly.

Example:

1. Configuration:

EonStor DS G6 computing power = 850 MB/s
EonStor DS G7 computing power = 1350 MB/s

RAID 5 array = 6 x SAS 6G 400 GB SSD w/ 253.04 MB/s max. seq. WRITE throughput

2. Calculation:

Max. total READ throughput = (drive # − failed drive #) × max. WRITE throughput
Max. total READ throughput = (6 − 1) × 253.04 MB/s = 1265.20 MB/s

During the rebuild, the controller has to process (XOR) the data at the total READ throughput.

3. Result:

EonStor DS G6: 850 MB/s < 1265.20 MB/s → the controller's computing power is the bottleneck.

EonStor DS G7: 1350 MB/s > 1265.20 MB/s → the drives, not the controller, limit the rebuild.
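This comparison is easy to automate. A minimal Python sketch of the check (the function is illustrative, not part of any Infortrend tool):

# Which resource limits a rebuild? The controller must XOR the combined
# READ stream of all surviving drives, so compare that stream to its
# computing power (all rates in MB/s).

def rebuild_bottleneck(computing_power, drive_write, drives, failed=1):
    total_read = (drives - failed) * drive_write  # MB/s the controller must process
    if computing_power < total_read:
        return "controller computing power"
    return "single-drive write throughput"

# Values from the example above (6 x SAS SSD, 253.04 MB/s seq. write):
print(rebuild_bottleneck(850, 253.04, 6))   # G6 -> controller computing power
print(rebuild_bottleneck(1350, 253.04, 6))  # G7 -> single-drive write throughput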

3.1.2 Recommended configurations for EonStor DS G6 and G7 storage systems

1. EonStor DS G6:

We recommend using 8 or fewer hard drives in a RAID array / sub-array for a suitable rebuild time. If you want to use more than 8 hard drives in a RAID 5 or RAID 6, we recommend using RAID 5+0 or RAID 6+0 and balancing the hard drive number among the RAID sub-arrays (≤ 8 hard drives per sub-array).

Figure 6: ESDS G6 Number of HDD recommendation


Figure 7: ESDS G7 Number of HDD recommendation

2. EonStor DS G7:

A suitable rebuild time is achieved with 16 or fewer hard drives in a RAID array or RAID sub-array. If you want to use a RAID 5 with more than 10 hard drives, we recommend using RAID 5+0. Even though the difference in rebuild time between 10 and 16 hard drives is quite small, we suggest using fewer than 10 drives in a RAID 5 array or RAID 5+0 sub-array to minimize the risk of a second hard drive failure. With RAID 6 we recommend 16 or fewer hard drives, as this RAID level protects against two drive failures; consider RAID 6+0 when using more than 16 hard drives.

3.2 The write performance of a single drive in a RAID array

The maximum write performance of a single drive in the RAID array determines the maximum throughput per second that can be written to the replacement or hot-spare disk during a rebuild.

3.2.1 NL-SAS vs. SATA III hard drive

The table below shows the time saving on the same EonStor DS G7 storage system in RAID 5, once with NL-SAS hard drives and once with SATA III hard drives. Using NL-SAS hard drives reduces the rebuild time to 43% of the time needed with SATA III hard drives.

Model  Computing power  Write performance(*1)               Configuration (RAID 5)  Rebuild time  Time saving
G7     1350 MB/s        SATA Hitachi 2TB (43 MB/s w/o NCQ)  7 HDDs                  13 hr 30 min
G7     1350 MB/s        NL-SAS Toshiba 2TB (122 MB/s)       7 HDDs                  5 hr 50 min   57%

Table 2: Write performance of NL-SAS vs. SATA hard drive


3.2.2 SATA vs. SATA III hard drive with Native Command Queuing (NCQ) enabled

A RAID controller that supports Native Command Queuing (NCQ) for SATA hard drives can reduce the rebuild time significantly compared to a RAID controller without NCQ support. The table shows an EonStor DS G6 storage system without NCQ support compared to an EonStor DS G7 storage system with NCQ enabled. The computing power is not relevant in this case, as the rebuild time with 7 hard drives is the same on EonStor DS G6 and G7 (see 3.1).

Model  Computing power  Write performance(*1)               Configuration (RAID 5)  Rebuild time  Time saving
G6     850 MB/s         SATA Hitachi 2TB (43 MB/s w/o NCQ)  7 HDDs                  13 hr 30 min
G7     1350 MB/s        SATA Hitachi 2TB (130 MB/s w/ NCQ)  7 HDDs                  4 hr 54 min   64%

Table 3: Write performance of SATA hard drive w/o NCQ vs. w/ NCQ

3.2.3 The drive bottleneck

In general, the maximum READ throughput of a single hard drive is higher than its maximum WRITE throughput.

NL-SAS                  Max. seq. READ throughput  Max. seq. WRITE throughput(*1)
Toshiba 1TB MK1001TRKB  147 MB/s                   122 MB/s

Table 4: READ throughput vs. WRITE throughput

During the rebuild process the write throughput matches the read throughput: for every block read from each remaining disk per second, one rebuilt block is written to the hot-spare disk per second. The WRITE throughput of the single target disk therefore caps the rate at which each remaining disk can be read.

Example:

1. Configuration:

A.) EonStor DS G7 w/ 7 x NL-SAS 6G HDD (1TB), max. seq. WRITE throughput 122 MB/s
B.) EonStor DS G7 w/ 7 x SATA III HDD (1TB), max. seq. WRITE throughput 43 MB/s


2. Calculation:

Max. total READ throughput = (drive # − failed drive #) × max. WRITE throughput
A.) Max. total READ throughput = (7 − 1) × 122 MB/s = 732 MB/s
B.) Max. total READ throughput = (7 − 1) × 43 MB/s = 258 MB/s

3. Result:

A.) 1350 MB/s > 732 MB/s
B.) 1350 MB/s > 258 MB/s

In both cases the controller's computing power exceeds the total READ throughput, so the max. WRITE throughput of the single target drive is the bottleneck.

Configuration  Drive                  Max. write throughput(*1)  Rebuild time  Test environment
A.)            NL-SAS 6G Toshiba 1TB  122 MB/s                   2 hr 09 min   EonStor DS G7, RAID 5, LD: 1, hard drives: 7, hot-spare: 1
B.)            SATA III Hitachi 1TB   43 MB/s                    6 hr 09 min   (same)

Table 5: WRITE throughput SATA III hard drive vs. NL-SAS hard drive

3.3 The capacity of a failed hard drive

The capacity of the failed hard drive is another key factor affecting the rebuild time: the higher the capacity of the failed hard drive, the longer the rebuild takes. Using a hard drive with a smaller capacity (e.g. a 1 TB instead of a 2 TB hard drive) can cut the rebuild time by more than half (see Table 6).

Drive type  Model        Configuration (RAID 5)  Rebuild time  Time saving
NL-SAS      Toshiba 2TB  G7, 7 HDDs              5 hr 49 min
NL-SAS      Toshiba 1TB  G7, 7 HDDs              2 hr 35 min   56%

Table 6: Capacity of a failed hard drive



4. Estimating the rebuild time

The rebuild time can be estimated using the following formulas.

4.1 Rebuild time, if write throughput is the bottleneck

The following formula is suitable for configurations within the following drive counts per RAID array / sub-array:

                   EonStor DS G7  EonStor DS G6
SAS 6G SSD         ≤ 6            ≤ 3
NL-SAS HDD         ≤ 11           ≤ 7
SATA II / III HDD  ≤ 11           ≤ 7

Table 7: Suitable configurations

Rebuild time = DC ÷ DW

DC = drive capacity in MB (#GB × 1000³ ÷ 1024²)
DW = drive WRITE throughput (*1)

Example:

1. Configuration:

Drive                  Max. write throughput  Test environment
NL-SAS 6G Toshiba 1TB  122 MB/s               EonStor DS G7, RAID 5, LD: 1, hard drives: 7, hot-spare: 1

Table 8: Rebuild time example configuration

2. Calculation:

Rebuild Time in minutes = (953,674 ÷ 122) ÷ 60

3. Result:

EonStor DS G7, Rebuild Time = 130 min.

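The same estimate, as a minimal Python sketch (the function name is illustrative; the GB-to-MB conversion follows the definition of DC above):

# Rebuild time when the single target drive's WRITE throughput is the
# bottleneck: Rebuild time = DC / DW.

def rebuild_minutes_drive_bound(capacity_gb, drive_write_mb_s):
    dc_mb = capacity_gb * 1000**3 / 1024**2   # e.g. 1000 GB -> 953,674 MB
    return dc_mb / drive_write_mb_s / 60      # seconds / 60 -> minutes

# Example from Table 8: 1 TB NL-SAS drive at 122 MB/s on an EonStor DS G7.
print(round(rebuild_minutes_drive_bound(1000, 122)))  # -> 130 minutes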


4.2 Rebuild time, if computing power is the bottleneck

In all other cases the rebuild time can be estimated by using the following formula.

Rebuild time = DC ÷ (CP ÷ (DN − DF))

DC = drive capacity in MB
DN = number of drives
DF = number of failed drives
CP = computing power (EonStor DS G6 = 850 MB/s, EonStor DS G7 = 1,350 MB/s)

Example:

1. Configuration:

Configuration  EonStor model  Max. computing power  Test environment
A.)            G7             1350 MB/s             Hitachi SSD 400GB, RAID 5, LD: 1, drives: 7, hot-spare: 1
B.)            G6             850 MB/s              (same)

Table 9: Rebuild time example configuration

2. Calculation:

A.) Rebuild Time in minutes = (381,470 ÷ (1350 ÷ (7 − 1))) ÷ 60
B.) Rebuild Time in minutes = (381,470 ÷ (850 ÷ (7 − 1))) ÷ 60

3. Result:

A.) EonStor DS G7, Rebuild Time = 28 min.
B.) EonStor DS G6, Rebuild Time = 45 min.
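Both formulas can be combined into a single estimator that applies the smaller of the two possible rebuild rates. A minimal Python sketch (the function name is illustrative; the 253.04 MB/s SSD write throughput is assumed from the example in 3.1.1):

# Effective rebuild rate = min(single-drive WRITE throughput,
#                              computing power / surviving drives).

def rebuild_minutes(capacity_gb, drive_write_mb_s, cp_mb_s, drives, failed=1):
    dc_mb = capacity_gb * 1000**3 / 1024**2       # labeled GB -> MB
    per_drive_cp = cp_mb_s / (drives - failed)    # CP / (DN - DF)
    rate = min(drive_write_mb_s, per_drive_cp)    # bottleneck rate in MB/s
    return dc_mb / rate / 60

# Example from Table 9: 400 GB SSDs, 7 drives, RAID 5, one failed drive.
print(round(rebuild_minutes(400, 253.04, 1350, 7)))  # G7 -> 28 minutes
print(round(rebuild_minutes(400, 253.04, 850, 7)))   # G6 -> 45 minutes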


5. Appendix

5.1 Infortrend Web links

Infortrend Home:

http://www.infortrend.com

Infortrend Support and Resources:

http://www.infortrend.com/global/Support/Support

EonStor DS overview:

http://www.infortrend.com/global/products/families/ESDS

EonStor DS G7 overview:

http://www.infortrend.com/Event2011/2011_Global/201112_ESDS_G7/ESDS_G7.html

5.2 Drive Vendor links

Links to the drive vendors whose drives were used for testing:
