Code Construction - Rebuilding Any Single-Node Erasure

Rebuilding Any Single-Node Erasure

5.3 Code Construction



 P_jⁱ¹

1 · · · P_jⁱ¹ .. t

. ...

P_jⁱ^t

1 · · · P_jⁱ_t^t







(5.3)

Its determinantdet(G)is a polynomial with unknowns λ_j₁, . . . , λ_j_t. All terms have highest degree r^k(i₁+ · · · +i_t). One term with highest degree is∏^ts=1λⁱ^s^r

js with non-zero coefficient1 or−1. So det(G)is a non-zero polynomial. Up to now we only showed one possible case of erasures. For anyr erasures, we can find the corresponding det(G)as a non-zero polynomial. The product of all these polynomials is again a non-zero polynomial. Hence by [Alo99] for a large enough field there exist assignments of{λ_j}such that the value of the polynomial is not 0. Then for any case of r erasures, the corresponding matrixG is invertible, and the code is MDS.

5.3 Code Construction

The code in Construction 4.24 has optimal rebuilding for systematic nodes. However, in order to rebuild a parity node, one has to access all the information elements. In this section we construct MDS array codes with optimal rebuilding ratio for rebuilding both the systematic and the parity nodes. The code hask−1 systematic nodes and r parities nodes, for any k, r.

Consider the permutation f_j = f_j¹ in Construction 5.1. It is clear that f_j is a permutation of orderr, i.e., f_i^r is the identity permutation. For i ∈ [0, r−1], define X_i as the set of vectors of weighti, namely, X_i = {v ∈_Z^k_r : v· (1, . . . , 1) =i}. X₀is a subgroup ofZ^k_r andX_i = X₀+ie_k is its coset, wheree_k = (0, . . . , 0, 1). Assume the elements inX_iare ordered,i∈ [0, r−1], and the ordering is

X₀ = (v₁, . . . , v_rk−1),

X_i = (v₁+ie_k, . . . , v_rk−1+ie_k).

Since the ordering of the elements in each node does not matter, we can reorder them as(X₀, X₁, . . . , X_r−1), with each X_i ordered as above. We are going to write our generator matrix of the code in this new ordering. One can check that f_j(X_i) =X_i+1, where the subscript is added modr. So the

matrixP_j corresponding to f_j can be written as

Next we assign coefficients inP_j. Whenr =2, 3, modify the 1 entries of p_iintoc if its correspond-ing columnl satisfies l·_∑^j_t₌₁e_t = 0. Here c is a primitive element in F3, F4. Whenr ≥_{4, modify} 1 entries into λ_j.

In the following, we will use block matrices the same as single elements. When referring to row or column indices, we mean block row or column indices. We refer to p_jas a small block, and the corresponding block row or column as a small block row or column. AndP_j is called a big block with big block row or column. Moreover, we assume the elements in each column are in order (X₀, . . . , X_r₋₁). And define Aⁱ_j by cyclically shifting the rows and columns of A⁰_j to the right and bottom byi

posi-ܫ

Figure 5.2: Parity matricesAⁱforr=2 (left) and r =3 (right) parities. When the first parity node is erased, the underlined elements are accessed from systematic nodes. The remaining unknown elements are recovered by the shaded elements from parity nodes.

tions: rowx. Construct the code as follows. Let the first k−1 nodes be systematic, and the last r nodes be parities. Parityi is defined by Aⁱ₁, . . . , Aⁱ_k₋₁. The generator matrix of the code is

ܽ_଴ǡଵ ܽ_଴ǡଶ ܽ_଴ǡଵ൅ ܽ_଴ǡଶ ܽ_ହǡଵ൅ ܽ_ସǡଵ൅ ܽ_ଷǡଶ൅ ܽ_ଶǡଶ

ܽ_ଵǡଵ ܽ_ଵǡଶ ܽ_ହǡଵ൅ ʹܽ_ସǡଵ൅ ܽ_ଷǡଶ൅ ʹܽ_ଶǡଶ ܽ_ଵǡଵ൅ ܽ_ଵǡଶ

ܽଶǡଵ ܽଶǡଶ ܽ଺ǡଵ൅ ʹܽ଻ǡଵ൅ ʹܽ଴ǡଶ൅ ܽଵǡଶ ܽଶǡଵ൅ ܽଶǡଶ

ܽ_ଷǡଵ ܽ_ଷǡଶ ܽ_ଷǡଵ൅ ܽ_ଷǡଶ ܽ_଺ǡଵ൅ ܽ_଻ǡଵ൅ ʹܽ_଴ǡଶ൅ ʹܽ_ଵǡଶ

ܽ_ସǡଵ ܽ_ସǡଶ ʹܽ_଴ǡଵ൅ ܽ_ଵǡଵ൅ ʹܽ_଺ǡଶ൅ ܽ_଻ǡଶ ܽ_ସǡଵ൅ ܽ_ସǡଶ

ܽ_ହǡଵ ܽ_ହǡଶ ܽ_ହǡଵ൅ ܽ_ହǡଶ ʹܽ_଴ǡଵ൅ ʹܽ_ଵǡଵ൅ ʹܽ_଺ǡଶ൅ ʹܽ_଻ǡଶ

ܽ_଺ǡଵ ܽ_଺ǡଶ ܽ_଺ǡଵ൅ ܽ_଺ǡଶ ʹܽ_ଷǡଵ൅ ʹܽ_ଶǡଵ൅ ܽ_ହǡଶ൅ ܽ_ସǡଶ

ܽ_଻ǡଵ ܽ_଻ǡଶ ʹܽ_ଷǡଵ൅ ܽ_ଶǡଵ൅ ܽ_ହǡଶ൅ ʹܽ_ସǡଶ ܽ_଻ǡଵ൅ ܽ_଻ǡଶ

Figure 5.3: An MDS array code with two systematic and two parity nodes by Construction 5.3. The finite field used isF₃. The shaded elements are accessed to rebuild the first parity node.

Sometimes we will omit the subscriptj when it is not important, and the superscript is computed modr.

Example 5.4 For two and three parities, the matricesAⁱ are shown in Figure5.2. Whenr =_{2, as} finite fieldF₃is used, we can take α = 2 6=1. Coefficient α = 2 is multiplied to only the second diagonal in A⁰. Whenr = 3, finite field F₄is used and we choose some α 6=0, 1. We multiply α to one diagonal block in eachAⁱ. It can be seen that A¹, A²are simply shifted versions of A⁰. An example of a code with2 parities is shown in Figure 5.3.

It can be seen from Construction 5.3 and Figure 5.3 that this code is not optimal update. In fact each information element appears2r−1 times in the parities.

Next we show that the code in Construction 5.3 has optimal ratio. We first observe that in Aⁱ, thex-th row (x6=i) is

i x

x in Aⁱ · · · p^x⁻ⁱ · · · β pⁱ⁻^x · · ·

where the values above are the column indices and omitted blocks are all zero. Here β = α if x−i < ^r₂ orx−i = ₂^r, i < ₂^r_{, and β} = 1 otherwise. Therefore, suppose i⁰−i< ^r₂ ori⁰−i =

2, i< ^r₂, then thei⁰-th small block row inAⁱand thei-th small block row in Aⁱ⁰are the same except for the coefficients:





i i⁰

i⁰inAⁱ · · · pⁱ⁰⁻ⁱ · · · α pⁱ⁻ⁱ

0 · · · i in Aⁱ⁰ · · · pⁱ⁰⁻ⁱ · · · pⁱ⁻ⁱ⁰ · · ·



. (5.5)

Theorem 5.5 The code has optimal ratio1/r for rebuilding any node. More specifically, when the systematic nodee_i is erased,i∈ [1, k−1], we only need to access rowsY_i = {v ∈ _Z^k_r : v·e_i = 0}. When the parity i is erased, i ∈ [0, r−1], we only need to access rows X_i = {v ∈ _Z^k_r : v· (1, 1,· · · , 1) =i}.

Proof: Systematic rebuilding: W.l.o.g. assume column e₁ is erased. Access equations Y= {v∈_Z^k_r : v·e₁=0}from each parity. We will show that all the unknowns(x₀, . . . , x_rk−1)in columne₁are solvable from these equations. First notice thatY is a subgroup ofZ_r^k, andY−te_k = Y for any t∈ [0, r−1]. So any index inZ^k_rcan be written as one of the following three cases:

l, l−te₁, l+t(e₁−e_k),

wherel ∈ Y and 1 ≤ t ≤ b^r₂c. So we need to show that an unknown element indexed by these three cases is solvable.

For anyl∈Y, assume l∈Y∩X_i⁰ for somei⁰. Thenx_lis contained in equation

x_l

because of thei⁰-th small row block[· · ·I· · · ]in Aⁱ₁⁰. Notice thatl+ (i−i⁰)e_k ∈ Y∩X_i for all i∈ [0, r−1]. In (5.5) consider rowl in Aⁱand rowl+ (i−i⁰)e_kinAⁱ⁰, and writet=i⁰−i≤ b₂^rc. Then we have equations

bx_l₋_te₁+αcx_l₊_t₍_e₁₋_e_k₎ = g bx_l₋_te₁+cx_l₊_t₍_e₁₋_e_k₎ = h

for some coefficients α 6= _{0, 1, b, c} 6= 0 and g, h. These equations are obviously independent.

Hence all unknowns are solvable.

Next we show that the fraction of elements accessed in the remaining columns is 1/r. For a parity nodeAⁱ, only rowsY are accessed, which is a fraction of 1/r. The corresponding columns inAⁱ of theses equations are accessed from the systematic nodes. For a surviving systematic node j ∈ [2, k−1]and parity i, by definition of pⁱ_j, rowsY in Aⁱ_j are mapped to columns Y⁰ = Y+ i(e_k−e_j) +se_k for some s. However, Y⁰ is a coset ofY and since i(e_k−e_j) +se_k ∈ Y, we have Y⁰ =Y. Thus only elements with indices Y are accessed from each node.

Parity rebuilding: Since the parities are all symmetric, w.l.o.g. suppose the 0-th parity is erased. Access X₀ from each node, which is the set of vectors of weight0. Need to show this is sufficient to recover

A= [A⁰₁, A⁰₂, . . . , A⁰_k₋₁],

where A⁰_j is defined in Construction 5.3. Since X₀ is sent from the systematic nodes, the 0-th column in each A⁰_j is known, and we can remove them from the equations. By (5.5), from parityi⁰ we can access row

[β⁰pⁱ₁⁰· · ·p⁻₁ⁱ⁰· · ·β⁰pⁱ₂⁰· · ·p⁻₂ⁱ⁰· · · ·β⁰pⁱ_k⁰· · ·p⁻_kⁱ⁰· · · ],

where the underlined elements are known from the systematic nodes and can be treated as 0. Here β⁰is1 or α. Multiplying this row by β, we can rebuild the i⁰-th row ofA:

[pⁱ₁⁰· · ·β p⁻ⁱ

1 · · ·pⁱ₂⁰· · ·β p⁻ⁱ

2 · · · ·pⁱ_k⁰· · ·β p⁻ⁱ

0 k · · · ],

where ββ⁰ = α, and again the underlined elements are known and treated as 0. So far we have rebuilt rowi⁰inA, with i⁰ =1, 2, . . . , r−1. The 0-th row in A is

[I· · ·I· · · ·I· · · ]

and can be rebuilt from the systematic nodes directly. Thus the erased node is rebuilt by accessing X₀, which is1/r of the elements.

It can be seen from the above proof that the rebuilding of any single erasure can be easily implemented. If a systematic node is eared, we only need to solve at most two linear equations at a time, and the computation can be done in parallel. If a parity is erased the rebuilding is even simpler:

we only need to subtract/add information elements, and multiply by β. Also, the rebuilding above is different from Construction 4.24, where only one linear equation is solved at a time.

Example 5.6 Consider the code with two or three parities in Figure 5.2. When the first parity node is erased, one can accessX₀ from the systematic nodes, and the underlined elements are known.

Then access the shaded elements from the surviving parity nodes. It is easy to see that the first parity can be rebuilt from the accessed elements.

For the specific example of Figure5.3, when the first systematic node is erased, one can access

rowsY₁ = {v : v·e₁=0} = {0, 1, 2, 3}from all the surviving nodes. When the first parity node is erased, one can access rowsX₀ = {0, 3, 5, 6}from all the remaining nodes (the shaded elements).

Then it is easy to check that in both cases it is sufficient to rebuild the erased column.

Next we show the construction is indeed an MDS code. We prove this by reducing this problem to the fact that Construction 5.1 is MDS. First we make an observation on the small blocks.

Lemma 5.7 Construction 5.1 is MDS iff anyt×t sub-block matrix of

H⁰ =







p⁰₁ · · · p⁰_k ... ... p^r₁⁻¹ · · · p^r_k⁻¹







k×k

is invertible, for allt∈ [1, r].

Proof: First define at×t sub-block matrix of H⁰:







p⁰₁ · · · p⁰_t ... ... p^t₁⁻¹ · · · p^t_t⁻¹





 .

We showed in the appendix that Construction 5.1 is MDS iff anyG in (5.3) is invertible. W.l.o.g.

suppose{i₁, . . . , i_t} = {0, . . . , t−1},{j₁, . . . , j_t} = {1, . . . , t}. By (5.4),G can be rewritten as

where each big block is composed ofr×r small blocks. We can see that the shaded small blocks are the only non-zero blocks in their corresponding rows and columns, and they form the submatrixH.

ThereforeG being invertible is equivalent to H and the remaining submatrix both being invertible.

Moreover the remaining submatrix has a similar form as G and we can again find t rows and t columns corresponding toH. Continue this we get

det(G) 6=0⇔ (det(H))^r 6=0⇔det(H) 6=0.

The same conclusion holds for any submatrix ofH⁰. Thus completes the proof.

The method of taking out sub-block matrices to compute the determinant as above is also used in the proof of the following theorem, which shows that Construction 5.3 is indeed an MDS code.

Theorem 5.8 If the coefficients in the linear combinations of the parities are chosen such that Construction5.1 is MDS, then Construction 5.3 is also MDS.

Proof: Similar to the proof of Theorem 5.2 in the appendix, Construction 5.3 being MDS

means any of the following matrix is invertible: columnx ∈ I. Only small block rows x in each big block are non-zero. Thus we can take out this t×t sub-block matrix: at the remaining submatrix. Again, we can take out another small block column and row from I from each big block, and it is invertible by Lemma 5.7. Continue this process, we are left with only columns and rows of I in each big block. For all i, i⁰ ∈ I, 1 ≤ i⁰−i < ^r₂ ori⁰−i = ₂^r, i < ₂^r, consider rowi⁰ in Aⁱ and rowi in Aⁱ⁰. They are shown in (5.5). One can do row operations and keep the invertibility of the matrix, and get



Proceed this for alli, i⁰ ∈ I, we are left with block diagonal matrix in each big block and the matrix left is of sizet²×t². Taking out thei₁-th column and row in each big block, we have the following

which is invertible by Lemma 5.7. Similarly, we can take out thei2-th column and row, and so on, and each submatrix is again invertible. Thus, any matrix A is invertible and Construction 5.3 is

MDS.

For example, one can easily check that the code in Figure 5.3 is able to recover the information from any two nodes. Therefore it is an MDS code.

Theorem 5.8 says that once we have an MDS code in Construction 5.1, we can use its coeffi-cients and design a new code by Construction 5.3. And the new code is guaranteed to be an MDS code.

5.4 Summary

In this chapter, we presented constructions of MDS array codes that achieve the optimal rebuilding ratio1/r, where r is the number of redundancy nodes. The new codes are constructed based on our previous Construction 4.24 and improve the efficiency of the rebuilding access.

Now we mention a couple of open problems. First, in our construction each information element is contained in2r−1 parity elements. This means if we update this element, we need to update 2r−1 times in the parities. But an optimal-update code will require only r updates. So are there codes that achieve optimal rebuilding ratio and also optimal update?

Besides, if there are k−1 systematic nodes and r parity nodes, then our code has r^k rows.

Namely, the code length is limited. Given the number of rows, are there codes that are longer? For example, whenr=2, we know a construction with r^k rows andk systematic nodes:

A⁰_j =



 I 0 p_j I



, A¹_j =



 I p_j 0 I



.

HereA⁰_j, A¹_j are the matrices that generate the parities, and we can take allj∈ [1, k]. This code has one more information column than Construction 5.3, and also achieves optimal ratio. On the other hand, however, given r^k rows, it can be proven that any systematic and linear code with optimal ratio has no more thank+1 systematic nodes. Thus the code length k−1 can be improved by at most2 nodes.

Finally, using Construction 4.24 one is able to rebuild any e, 1 ≤ e ≤ r, systematic erasures with an access ratio ofe/r. However, it is an open problem to construct a code that can rebuild any e erasures with optimal access.

Chapter 6

In document Coding for Information Storage (Page 89-99)