
6.5 Placement, Storage Efficiency, and Reliability

6.5.3 Triple Parity Codes

Figure 6.8: MTTDL of a (2, 5)-MDS code vs. the number of nodes for mean time to node failure 1/λ = 30000 h and mean time to read all contents of a node during rebuild 1/µ = 30 h. [Plot: MTTDL (in days) vs. number of nodes on log-log axes, with curves for clustered and declustered placement under deterministic and exponential rebuild.]

Triple parity (l, m)-MDS codes correspond to the case where m − l = 3. When l = 1, this corresponds to four-way replication. Plugging l = m − 3 in (6.67) and (6.114), we obtain the MTTDL values for triple parity codes for clustered and declustered placement schemes, respectively:

\[
\mathrm{MTTDL}_{\text{clus.}} \approx \frac{\mu^3}{n\lambda^4}\,\frac{6}{(m-1)(m-2)(m-3)}\,\frac{M_1^3(G_\mu)}{M_3(G_\mu)}, \quad \text{for } m = 4, \dots, n \text{ (triple parity codes)}. \tag{6.128}
\]

\[
\mathrm{MTTDL}_{\text{declus.}} \approx \frac{(n-1)^2(n-2)\,\mu^3}{n\lambda^4}\,\frac{6}{(m-1)^2(m-2)^4}\,\frac{M_1^3\!\left(G_{\frac{n-1}{m-2}\mu}\right)}{M_3\!\left(G_{\frac{n-1}{m-2}\mu}\right)}, \quad \text{for } m = 4, \dots, n \text{ (triple parity codes)}. \tag{6.129}
\]

From (6.128) and (6.129), it is observed that the MTTDL of triple parity codes under both placement schemes is directly proportional to the fourth power of the mean time to node failure, 1/λ, and inversely proportional to the cube of the mean time to read all contents of a node during rebuild, 1/µ.
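As a numerical illustration of (6.128) and (6.129), the following Python sketch evaluates both estimates. It reads the prefactor as µ^3/(nλ^4), which matches the stated proportionality to the fourth power of 1/λ and, for clustered placement, to 1/n. The function names and the `moment_ratio` argument (the ratio M1^3/M3, equal to one for deterministic rebuild times) are illustrative and not part of the original analysis.

```python
def mttdl_clustered(n, m, lam, mu, moment_ratio=1.0):
    """Estimate (6.128): clustered placement, triple parity code of length m."""
    if not 4 <= m <= n:
        raise ValueError("triple parity codes require 4 <= m <= n")
    prefactor = mu**3 / (n * lam**4)
    return prefactor * 6.0 / ((m - 1) * (m - 2) * (m - 3)) * moment_ratio


def mttdl_declustered(n, m, lam, mu, moment_ratio=1.0):
    """Estimate (6.129): declustered placement, triple parity code of length m."""
    if not 4 <= m <= n:
        raise ValueError("triple parity codes require 4 <= m <= n")
    prefactor = (n - 1) ** 2 * (n - 2) * mu**3 / (n * lam**4)
    return prefactor * 6.0 / ((m - 1) ** 2 * (m - 2) ** 4) * moment_ratio


# Parameters of Figure 6.8: (2, 5)-MDS code, 1/lambda = 30000 h, 1/mu = 30 h.
lam, mu = 1.0 / 30000.0, 1.0 / 30.0
for n in (10, 100, 1000):
    clus_days = mttdl_clustered(n, 5, lam, mu) / 24.0
    declus_days = mttdl_declustered(n, 5, lam, mu) / 24.0
    print(f"n={n}: clustered {clus_days:.3e} days, declustered {declus_days:.3e} days")
```

Rates are expressed per hour, so the results are in hours and are divided by 24 to match the unit used in the figures.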

As was the case in double parity codes, the MTTDL depends on the rebuild distribution. For deterministic rebuild times, the ratios $M_1^3(G_\mu)/M_3(G_\mu)$ and $M_1^3\big(G_{\frac{n-1}{m-2}\mu}\big)/M_3\big(G_{\frac{n-1}{m-2}\mu}\big)$ become one. However, for random rebuild times, these ratios are upper-bounded by one by Jensen's inequality. As an example, if the rebuild time distribution is exponential, these ratios are equal to 1/6, and therefore

\[
\mathrm{MTTDL}_{\text{clus.}} \approx \frac{\mu^3}{n\lambda^4}\,\frac{1}{(m-1)(m-2)(m-3)}, \quad \text{for } m = 4, \dots, n \text{ (triple parity codes, exponential rebuild times)}. \tag{6.130}
\]

\[
\mathrm{MTTDL}_{\text{declus.}} \approx \frac{(n-1)^2(n-2)\,\mu^3}{n\lambda^4}\,\frac{1}{(m-1)^2(m-2)^4}, \quad \text{for } m = 4, \dots, n \text{ (triple parity codes, exponential rebuild times)}. \tag{6.131}
\]

Comparing (6.131) with (6.129), it is observed that the rebuild time distribution scales down the MTTDL, but leaves the behavior with respect to the number of nodes, n, unaffected. This can be seen in Figure 6.8, which plots the MTTDL of a system using a (2, 5)-MDS code against the number of nodes in the system for clustered and declustered placements, as well as for deterministic and exponential rebuild times. Also, as in the case of double parity codes, the difference in MTTDL between the two schemes can be significant, depending on the number of nodes, n, in the system. This is because, as seen from (6.128) and (6.129), the MTTDL of clustered placement is inversely proportional to n, whereas the MTTDL of declustered placement is roughly proportional to the square of n. Because Figure 6.8 uses a log-log scale, the lines corresponding to clustered placement have a slope of −1, whereas the lines corresponding to declustered placement have a slope of roughly 2.
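The value 1/6 used in (6.130) and (6.131) can be checked directly from the moments of the exponential distribution. The worked computation below assumes that G_µ denotes an exponential rebuild time distribution with mean 1/µ and that M_k denotes its k-th raw moment, consistent with the notation above.

```latex
\begin{align*}
M_k(G_\mu) &= \int_0^\infty t^k \, \mu e^{-\mu t} \, dt = \frac{k!}{\mu^k}, \\
\frac{M_1^3(G_\mu)}{M_3(G_\mu)} &= \frac{(1/\mu)^3}{3!/\mu^3} = \frac{1}{6}.
\end{align*}
```

The ratio does not depend on the rate, so the same value is obtained when µ is replaced by (n − 1)µ/(m − 2), as in the declustered case.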

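For a symmetric placement scheme with spread factor k > m, the MTTDL of triple parity codes follows from (6.116) by substituting l = m − 3:

\[
\mathrm{MTTDL}(k) \approx \frac{(k-1)^2(k-2)\,\mu^3}{n\lambda^4}\,\frac{6}{(m-1)^2(m-2)^4}\,\frac{M_1^3\!\left(G_{\frac{k-1}{m-2}\mu}\right)}{M_3\!\left(G_{\frac{k-1}{m-2}\mu}\right)}. \tag{6.132}
\]

Spread factor k = m corresponds to the clustered placement scheme, and so its MTTDL is given by (6.128):

\[
\mathrm{MTTDL}_{\text{clus.}} \approx \frac{\mu^3}{n\lambda^4}\,\frac{6}{(m-1)(m-2)(m-3)}\,\frac{M_1^3(G_\mu)}{M_3(G_\mu)}.
\]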

Figure 6.9: MTTDL of triple parity codes as a function of codeword length m and spread factor k for a system with number of nodes n = 40, 1/λ = 30000 h, and 1/µ = 30 h. [Plot: MTTDL (in days) vs. spread factor (4 to 40) and codeword length (4 to 20).]

It is observed from (6.132) that, as in the case of double parity codes, increasing the spread factor k improves the MTTDL. The improvement, however, is much larger because the MTTDL is roughly proportional to the cube of k. The reason for this is the same as in the case of double parity codes: because codewords are spread over a larger number of nodes, the amount of most-exposed data at each successive exposure level decreases rapidly, thereby reducing the chances of data loss. Figure 6.9 shows how the MTTDL varies as a function of both the codeword length m and the spread factor k for triple parity codes, for a given number of nodes, n. In Figure 6.9, four-way replicated systems correspond to the case where the codeword length is 4, clustered placement corresponds to the cases where the spread factor is equal to the codeword length, and declustered placement corresponds to the case where the spread factor is equal to the number of nodes. It is observed that increasing the spread factor increases the MTTDL significantly, and that increasing the codeword length decreases the MTTDL significantly.
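To get a feel for the size of this improvement, the sketch below compares two spread factors using only the leading factor (k − 1)^2 (k − 2) of (6.132). It assumes that the remaining factors do not depend on k, as is the case when the moment ratio is a constant (for example, for deterministic or exponential rebuild times); the function name is illustrative.

```python
def spread_factor_gain(k_from, k_to):
    """Approximate ratio MTTDL(k_to) / MTTDL(k_from) for triple parity codes.

    Only the factor (k - 1)^2 (k - 2) is kept; everything else is assumed
    to be independent of the spread factor k.
    """
    if k_from < 5 or k_to < 5:
        raise ValueError("spread factor must satisfy k > m >= 4")

    def leading_factor(k):
        return (k - 1) ** 2 * (k - 2)

    return leading_factor(k_to) / leading_factor(k_from)


# Doubling the spread factor from 20 to 40.
gain = spread_factor_gain(20, 40)
print(f"MTTDL gain when k goes from 20 to 40: about {gain:.2f}x")
```

The result is somewhat larger than 2^3 = 8, since (k − 1)^2 (k − 2) only approaches k^3 for large k.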

The storage efficiency of a system using a triple parity code with codeword length m is equal to (m − 3)/m, as each set of m − 3 user data blocks requires storing m codeword blocks in the system. Therefore, Figure 6.9 is easily transformed into Figure 6.10 to show the MTTDL as a function of storage efficiency, (m − 3)/m, and spread factor, k.

Figure 6.10: MTTDL of triple parity codes as a function of storage efficiency (m − 3)/m and spread factor k for a system with number of nodes n = 40, 1/λ = 30000 h, and 1/µ = 30 h. [Plot: MTTDL (in days) vs. spread factor (4 to 40) and storage efficiency (0.25 to 0.85).]
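The mapping from codeword length to storage efficiency that turns one figure into the other can be sketched as follows. The function name and the `parities` argument are illustrative; the codeword lengths are the tick values of the codeword-length axis.

```python
def storage_efficiency(m, parities=3):
    """Fraction of stored blocks holding user data for an (m - parities, m)-MDS code."""
    if m <= parities:
        raise ValueError("codeword length must exceed the number of parity blocks")
    return (m - parities) / m


# Codeword lengths shown on the axis of the codeword-length plot.
for m in (4, 8, 12, 16, 20):
    print(f"m = {m:2d}: storage efficiency = {storage_efficiency(m):.4f}")
```

Four-way replication (m = 4) gives the lowest efficiency, 1/4, and the efficiency approaches one as m grows, at the cost of a lower MTTDL.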

7 Reliability Simulations

Event-driven simulations are used to verify the theoretical estimates of MTTDL of replication-based systems for two placement schemes, namely, clustered and declustered, under various rebuild and failure time distributions, and under network rebuild bandwidth constraints. The simulations are more involved than the theoretical analysis, as they do not make any of the approximations made in theory. Despite this, the simulations match the theoretical estimates for a wide range of parameters, including the parameters generally observed in practice, thereby validating the applicability of the reliability analysis to real-world storage systems. A detailed description of the simulation method used and a comparison of simulation results and theory for a variety of storage system models are presented in this chapter.

7.1 Simulation Method

The storage system is simulated using an event-driven simulation with three types of events that drive the simulation time forward: (a) failure events, (b) rebuild-complete events, and (c) node-restore events. The state of the system is maintained by the following variables: time, the simulated time; nActiveNodes, the number of active (surviving) nodes in the system; failTimes, the times of next failure of each active node, generated according to the distribution Fλ; failedNodes, the indices of all failed nodes; exposureLevel, the exposure level; and dataExposure = (D0, …, Dr), a vector of length (r + 1), where De is the amount of user data that has lost e replicas, e = 0, …, r. The values of these variables are updated at each event, and when Dr > 0, data loss is said to have occurred and the simulation ends.
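The event loop described above can be sketched in a much simplified form. The following is not the simulator used in this chapter: it assumes a single cluster of r nodes with r-way replication, exponential failure times, a deterministic rebuild time of 1/µ per node, one rebuild at a time in order of failure, and it merges the rebuild-complete and node-restore events into one. The function name is hypothetical, and the exposure level is tracked simply as the number of failed nodes rather than through a dataExposure vector.

```python
import random


def simulate_cluster_ttdl(r, lam, mu, rng):
    """Return one simulated time to data loss for a single cluster of r nodes."""
    time = 0.0
    # Next failure time of each active node (plays the role of failTimes).
    fail_times = {i: rng.expovariate(lam) for i in range(r)}
    failed_nodes = []    # failed node indices, oldest failure first
    rebuild_done = None  # completion time of the rebuild in progress, if any

    while True:
        next_node = min(fail_times, key=fail_times.get)
        next_fail = fail_times[next_node]

        if rebuild_done is not None and rebuild_done <= next_fail:
            # Rebuild-complete and node-restore, merged into one event.
            time = rebuild_done
            restored = failed_nodes.pop(0)
            fail_times[restored] = time + rng.expovariate(lam)
            rebuild_done = time + 1.0 / mu if failed_nodes else None
        else:
            # Failure event: the exposure level increases by one.
            time = next_fail
            del fail_times[next_node]
            failed_nodes.append(next_node)
            if len(failed_nodes) == r:
                return time  # all r replicas lost: data loss
            if rebuild_done is None:
                rebuild_done = time + 1.0 / mu


# Illustrative parameters only, chosen so that data loss occurs quickly.
rng = random.Random(1)
print(simulate_cluster_ttdl(r=2, lam=0.1, mu=1.0, rng=rng))
```

With realistic values such as 1/λ = 30000 h and 1/µ = 30 h, data loss is so rare that a single run of this naive loop would have to process a very large number of events.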

For each set of parameters, the simulation is run 100 times, and the MTTDL and its 95% confidence intervals are computed. Whereas for declustered placement the simulation is run for n nodes, for clustered placement the simulations are run for only one cluster, that is, r nodes, and the obtained MTTDL of the cluster is divided by n/r to obtain the MTTDL of the system. This is because clusters are independent of each other and the number of clusters is n/r.
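The post-processing of the simulation runs can be sketched as follows. The text does not state how the confidence intervals are computed, so the normal approximation used here (half-width 1.96 s/√N) is an assumption, and both function names are illustrative.

```python
import math
import statistics


def mttdl_with_ci(samples, z=1.96):
    """Sample mean of the times to data loss and an approximate 95% confidence interval."""
    mean = statistics.mean(samples)
    # Normal approximation (an assumption): half-width = z * s / sqrt(N).
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, (mean - half_width, mean + half_width)


def system_mttdl_from_cluster(cluster_mttdl, n, r):
    """Scale the MTTDL of one cluster of r nodes to a system of n/r independent clusters."""
    return cluster_mttdl / (n / r)


# Made-up times to data loss (in hours) standing in for simulation output.
runs = [1.2e6, 0.8e6, 1.5e6, 0.9e6, 1.1e6]
mean, (low, high) = mttdl_with_ci(runs)
print(f"cluster MTTDL = {mean:.3e} h, 95% CI = ({low:.3e}, {high:.3e})")
print(f"system MTTDL for n = 40, r = 4: {system_mttdl_from_cluster(mean, 40, 4):.3e} h")
```

The division by n/r reflects that the first data loss among n/r independent clusters occurs, on average, n/r times sooner than in a single cluster when times to data loss are approximately exponential.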