Problem Settings and Properties - Coding for Information Storage

In this section, we prove some useful properties related to MDS array codes with optimal update.

Let A = (a_i,j) be an array of size p×k over a finite field F, where i ∈ [0, p−1], j ∈ [_{0, k}−1], and each of its entries is an information element. LetR = {_R₀_{, R}₁_{, ..., R}_p₋₁}andZ =

{Z₀, Z₁, ..., Z_p₋₁}be two sets such thatR_l, Z_l are subsets of elements in A for all l ∈ [0, p−1]. Then for all l ∈ [0, p−1], define the row/zigzag parity element as r_l = _∑_a_∈_R_lαaa and z_l =

∑a∈Z_lβ_aa, for some sets of coefficients{α_a},{β_a} ⊆F. We call R and Z as the sets that generate the parity columns.

An MDS array code over Fq withr parities is said to be optimal-update if in the change of any information element onlyr+1 elements are changed in the array. It is easy to see that r+1 changes is the minimum possible number because if an information element appears onlyr times in the array, then deleting at mostr columns will result in an unrecoverable r-erasure pattern and will contradict the MDS property. A small finite-field size is desirable because we can update a small amount of information at a time if needed, and also get low computational complexity. Therefore we assume that the code is optimal-update, while we try to use the smallest possible finite filed.

When r = 2, only 3 elements in the code are updated when an information element is updated.

Under this assumption, the following theorem characterizes the setsR and Z.

Theorem 4.7 For a(k+2, k)MDS code with optimal update, the setsR and Z are partitions of A into p equally sized sets of size k, where each set in R or Z contains exactly one element from each column.

Proof: Since the code is a (_k+_{2, k}) MDS code, each information element should appear at least once in each parity column C_k, C_k+1. However, since the code has optimal update, each element appears exactly once in each parity column.

Let X ∈ R, note that if X contains two entries of A from the systematic column C_i, i ∈ [_{0, k}−₁], then rebuilding is impossible if columnsC_iandC_k₊₁are erased. ThusX contains at most one entry from each column, therefore|X| ≤k. However each element of A appears exactly once in each parity column, thus if|X| < k, X ∈ R, there is Y ∈ R, with|Y| > k, which leads to a contradiction. Therefore, |X| = k for all X ∈ R. As each information element appears exactly once in the first parity column,R= {R₀, . . . , Rp−1}is a partition ofA into p equally sized sets of sizek. Similar proof holds for the sets Z= {Z₀, . . . , Z_p−1}.

By the above theorem, for the j-th systematic column (a_0,j, . . . , a_p₋_1,j)^T, its p elements are contained inp distinct sets R_l,l∈ [0, p−1]. In other words, the membership of thej-th column’s elements in the sets {R_l}defines a permutationg_j : [0, p−1] → [0, p−1], such thatg_j(i) = l iffa_i,j ∈ R_l. Similarly, we can define a permutation f_j corresponding to the second parity column, where f_j(i) =l iff a_i,j ∈ Z_l. For example, in Figure 4.2 each systematic column corresponds to a

permutation of the four symbols.

Observing that there is no significance in the elements’ ordering in each column, w.l.o.g. we can assume that the first parity column contains the sum of each row of A and g_j’s correspond to identity permutations, i.e.,r_i =_∑^k_j₌⁻₀¹α_i,ja_i,j for some coefficients{α_i,j}.

First we show that any set of zigzag sets Z = {Z₀, ..., Z_p−1}defines a(k+2, k)MDS array code over a fieldF large enough.

Theorem 4.8 LetA = (a_i,j)be an array of sizep×k and the zigzag sets be Z = {Z₀, ..., Z_p−1}, then there exists a(k+2, k)MDS array code forA with Z as its zigzag sets over the fieldF of size greater thanp(k−1) +1.

In order to prove Theorem 4.8, we use the well-known Combinatorial Nullstellensatz by Alon [Alo99]:

Theorem 4.9 (Combinatorial Nullstellensatz) [Alo99, Th 1.2] LetF be an arbitrary field, and let f = f(x₁, ..., x_q)be a polynomial inF[x₁, ..., x_q]. Suppose the degree of f is deg(f) = _∑^q_i₌₁t_i, where eacht_iis a nonnegative integer, and suppose the coefficient of∏^q_i=1x^t_iⁱin f is nonzero. Then, ifS₁, ..., S_nare subsets ofF with|S_i| >t_i, there ares₁∈ S₁, s₂ ∈S₂, ..., s_q∈S_qso that

f(s₁, ..., s_q) 6=0.

Proof: [Proof of Theorem 4.8] Assume the information of A is given in a column vector W of length pk, where column i ∈ [0, k−1]of A is in the row set[(ip,(i+1)p−1]ofW. Each systematic nodei, i∈ [0, k−1], can be represented asQ_iW where Q_i = [0_p_×_pi, I_p×p, 0_p_×_p₍_k₋_i₋₁₎]. Moreover defineQ_k = [Ip×p, Ip×p, ..., Ip×p], Q_k+1 = [x₀P₀, x₁P₁, ..., x_k−1P_k−1]where theP_i’s are permutation matrices (not necessarily distinct) of size p×p, and the x_i’s are variables, such that C_k = Q_kW, C_k+1 = Q_k+1W. The permutation matrix P_i = (p⁽_l,mⁱ⁾)is defined as p⁽_l,mⁱ⁾ = 1 if and only if a_m,i ∈ Z_l. In order to show that there exists such MDS code, it is sufficient to show that there is an assignment for the intermediates {x_i}in the fieldF, such that for any set of integers {s₁, s₂, ..., s_k} ⊆ [0, k+1] the matrix Q = [Q^T_s₁, Q^T_s₁, ..., Q^T_s_k] is of full rank. It is easy to see that if the parity column C_k+1 is erased, i.e., k+1 /∈ {s₁, s2, ..., s_k} then Q is of full rank. If k /∈ {s₁, s₂, ..., s_k}andk+1 ∈ {s₁, s₂, ..., s_q}then Q is of full rank if none of the x_i’s equals to zero. The last case is when bothk, k+1∈ {s₁, s2, ..., s_k}, i.e., there are0≤i<j≤k−1 such that

i, j /∈ {s₁, s₂, ..., s_k}. It is easy to see that in that case Q is of full rank if and only if the submatrix

B_i,j =





x_iP_i x_jP_j Ip×p Ip×p





is of full rank. This is equivalent todet(B_i,j) 6=0. Note that deg(det(B_i,j)) = p and the coefficient ofx_i^pisdet(P_i) ∈ {_1,−₁}. Define the polynomial

T=T(x0, x₁, ..., x_k−1) =

∏

0≤i<j≤k−1

det(B_i,j),

and the result follows if there are elements a₀, a₁, .., a_k₋₁ ∈ F such that T(a₀, a₁, ..., a_k₋₁) 6= 0.

T is of degree p(₂^k)and the coefficient of∏^k_i=⁻0¹x_i^p⁽^k⁻¹⁻ⁱ⁾ is∏^k_i=⁻0¹det(P_i)^k⁻¹⁻ⁱ 6= 0. Set for any i, S_i =_F\0 in Theorem 4.9, and the result follows.

The Theorem 4.8 states that there exist coefficients such that the code is MDS, and thus we will focus first on finding proper zigzag permutations{f_j}. The idea behind choosing the zigzag sets is as follows: assume a systematic column(a_0,j, a_1,j, ..., a_p−1,j)^T is erased. Each element a_i,j is rebuilt either by row or by zigzag. The setS = {S₀, S₁, ..., S_p−1}is called a rebuilding set for column(a0, a₁, ..., ap−1)^T if for eachi, S_i ∈ R∪Z and a_i ∈ S_i. In order to minimize the number of accesses to rebuild the erased column, we need to minimize the size of

| ∪_i^p₌⁻₀¹S_i|, (4.6)

which is equivalent to maximizing the number of intersections between the sets{S_i}_i^p₌⁻₀¹. More specifically, the intersections between the row sets inS and the zigzag sets in S.

For a (k+2, k) MDS code C with p rows define the rebuilding ratio R(C) as the average fraction of accesses in the surviving systematic and parity nodes while rebuilding one systematic node, i.e.,

R(C) = ^∑^j^min^S⁰^,...,S^p⁻¹^rebuilds^j| ∪_i^p₌⁻₀¹S_i| p(k+₁)k .

Notice that in the two parity nodes, we access p elements because each erased element must be rebuilt either by row or by zigzag, however∪_i^p₌⁻₀¹S_i contains p elements from the erased column.

Thus the above expression is exactly the rebuilding ratio. Define the ratio function for all(k+2, k)

MDS codes withp rows as

R(k) =min

C R(C),

which is the minimal average portion of the array needed to be accessed in order to rebuild one erased column. By (4.1), we know thatR(k) ≥1/2. For example, the code in Figure 4.4 achieves the lower bound of ratio1/2, and therefore R(3) = 1/2. Moreover, we will see in Corollary 4.17 thatR(k)is almost1/2 for all k and p=2^m, wherem is large enough.

So far we have discussed the characteristics of an arbitrary MDS array code with optimal update.

Next, let us look at our code in Construction 4.1.

Recall that by Theorem 4.8 this code can be an MDS code over a field large enough. The ratio of the constructed code will be proportional to the size of the union of the elements in the rebuilding set in (4.6). The following theorem gives the ratio for Construction 4.1 and can be easily derived from Lemma 4.4 part (i). Recall that given vectorsv₀, . . . , v_k₋₁, we write f_i = f_v_i andX_i =X_v_i. Theorem 4.10 The code described in Construction 4.1 and generated by the vectorsv₀, v₁, ..., v_k₋₁ is a(k+2, k)MDS array code with ratio

R= ¹ 2+ ^∑

k−1

i=0 ∑j6=i|f_i(X_i) ∩f_j(X_i)|

2^mk(k+₁) ^. ^(4.7)

Note that different orthogonal sets of permutations can generate equivalent codes, hence we define equivalence of two sets of orthogonal permutations as follows. LetF = {f₁, f₂, . . . , f_k−1, f₀} be an orthogonal set of permutations over integers[0, p−1], associated with subsetsX₁, X₂, . . . , X_k−1, X₀. And letΣ = {σ₁, σ₂, . . . , σ_k−1, σ₀}be another orthogonal set over[0, p−1]associated with subsetsY₁, Y₂, . . . , Y_k₋₁, Y₀. ThenF andΣ are said to be equivalent if there exist permutations g, h such that∀i∈ [0, k−1],

h f_ig=σ_i, g⁻¹(X_i) =Y_i.

Note that multiplyingg on the right is the same as permuting the rows of the systematic nodes, and multiplyingh on the left permutes the rows of the second parity node. Therefore, codes constructed usingF orΣ are essentially the same.

In particular, let us assume that the permutations are over integers [0, 2^m−1], and the set of permutationsΣ and the subsets Yi’s are the same as in Theorem 4.3: σi = f_e_i,Y_i = {x ∈ [0, 2^m−

1]: x·e_i =0}, andY₀ = {x∈ [0, 2^m−1]: x· (1, 1, . . . , 1) =0}. Next we show the optimal code in Theorem 4.3 is optimal in size, namely, it has the maximum number of columns given the number of rows. In addition any optimal-update, optimal-access code with maximum size is equivalent to the construction using standard-basis vectors.

Theorem 4.11 LetF be an orthogonal set of permutations over the integers[0, 2^m−1], (i) the size ofF is at most m+1;

(ii) if|F| =m+1 then it is equivalent toΣ defined by the standard basis and zero vector.

Proof: We will prove it by induction onm. For m = 0 there is nothing to prove. (i) We first show that |F| = k ≤ m+1. It is trivial to see that for any permutations g, h on [0, 2^m − 1], the set hFg = {h f₀g, h f₁g, ..., h f_k₋₁g} is also a set of orthogonal permutations with sets g⁻¹(X0), g⁻¹(X₁), ..., g⁻¹(X_k−1). Thus w.l.o.g. we can assume that f0is the identity permutation andX₀= [0, 2^m⁻¹−1]. From the orthogonality we get that

∪^k_i₌⁻₁¹f_i(X₀) =X₀ = [2^m⁻¹, 2^m−1].

We claim that for anyi6=0,|X_i∩X₀| = ^|^X₂⁰^| = 2^m⁻². Assume the contrary, thus if |X_i∩X₀| >

2^m⁻², then for any distincti, j∈ [1, k−1]we get that

f_j(X_i∩X₀), f_i(X_i∩X₀) ⊆X₀, (4.8)

|f_j(X_i∩X₀)| = |f_i(X_i∩X₀)| >2^m⁻²= |X₀|

2 . (4.9)

From equations (4.8) and (4.9) we conclude that f_j(X_i∩X₀) ∩f_i(X_i∩X₀) 6= ∅, which contradicts the orthogonality property. If|X_i∩X0| > 2^m⁻² the contradiction follows by a similar reasoning.

Define the set of permutations F^∗ = {f_i^∗}^k_i₌⁻₁¹ over the set of integers[0, 2^m⁻¹−1]by f_i^∗(x) = f_i(_x) −₂^m⁻¹, which is a set of orthogonal permutations with setsX_i^∗ = {_X_i∩X0}, i =_{1, ..., k}−1.

By inductionk−1≤m and the result follows.

(ii) Next we show that if |F| = m+1 then it is equivalent toΣ associated with {Y_i}. Let F= {f₁, f₂, . . . , f_m, f₀}. Take two permutationsg⁰, h⁰such that

g⁰⁻¹(X₁) =Y₁

andh⁰f₁g⁰(Y₁) =Y₁. Define f_i⁰ =h⁰f_ig⁰ for alli∈ [0, m]. Then

f₁⁰(Y₁) =Y₁, f₁⁰(Y₁) =Y₁.

The new set of permutations{f_i⁰}^m_i₌₀ is also orthogonal with subsets{g⁰⁻¹(X_i)}_i^m₌₀, so f_i⁰(Y₁) ∩ f₁⁰(Y₁) =∅. Hence for all i6=1

f_i⁰(Y₁) =Y₁ = [0, 2^m⁻¹−1],

f_i⁰(Y₁) =Y₁ = [₂^m⁻¹_{, 2}^m−₁]_.

By similar argument of part (i), we know{f₂⁰, . . . , f_m⁰, f₀⁰}restricted toY₁(or toY₁) is an orthogonal set of permutations, associated with subsetsY₁∩g⁰⁻¹(X_i)(or withY₁∩g⁰⁻¹(X_i), respectively), i6=1. By the induction hypothesis, there exist permutations p, q over Y1such that fori6=₁

σ_i = p f_i⁰q,

q⁻¹(Y₁∩g⁰⁻¹(X_i)) = Y₁∩Y_i, (4.10)

where σ_i, f_i⁰ are restricted toY₁. Similarly, there exist permutationsr, s over Y₁such that fori6=1

σ_i = r f_i⁰s,

s⁻¹(Y₁∩g⁰⁻¹(X_i)) = Y₁∩Y_i, (4.11)

where σi, f_i⁰ are restricted toY₁. Define permutation g⁰⁰ over[0, 2^m−1]as the union of q and s:

g⁰⁰(x) =q(x)ifx∈Y₁, andg⁰⁰(x) =s(x)ifx∈Y₁. Also defineh⁰⁰over[_{0, 2}^m−₁]as the union ofp and r. So g⁰⁰, h⁰⁰mapY₁(orY₁) to itself. We will show that {f_i}^m_i₌₀is equivalent to Σ using g=g⁰g⁰⁰andh =h⁰⁰h⁰. Fori6=1, this is obvious from (4.10)(4.11). For i =_{1, we have}

g⁻¹(X₁) =g⁰⁰⁻¹g⁰⁻¹(X₁) =g⁰⁰⁻¹(Y₁) =Y₁.

We know σ_i = h f_ig, for i 6= 1. Let f = h f₁g and we will show f = σ₁. By orthogonality f(Y_i) ∩σ_i(Y_i) =_{∅ for i}6=1. It is easy to see that for i∈ [2, m]_{, σ}_i(Y_i) =Y_i. Hence fori∈ [2, m]

f(Y_i) =Y_i, f(Y_i) =Y_i. (4.12)

standard basis duplication of standard basis constant weight vectors

# sys. nodes m+1 s(m+1) O(m^c)

ratio ¹₂ ¹₂+ ^s⁻¹

2s(m+1)+2 ≈ ¹₂+ ¹

2(m+1)

1 2 +_2m^c²

field size 3 s+₂ ₂^c+₁

Figure 4.5: Comparison among codes constructed by the standard basis and zero vector, by s-duplication of standard basis and zero vector, and by constant weight vectors. The number of systematic nodes, the rebuilding ratio, and the finite-field size are listed. We assume that all the codes have2 parities and 2^m rows. For the duplication code, the rebuilding ratio is obtained when the number of copiess is large. For the constant weight code, the weight of each vector is equal to c, which is an odd number and relatively small compared to m.

Moreover, by construction f(Y₁) =h⁰⁰f₁⁰g⁰⁰(Y₁) =h⁰⁰f₁⁰(Y₁) =h⁰⁰(Y₁) =Y₁, so

f(Y₁) =Y₁, f(Y₁) =Y₁. (4.13)

Any integerx ∈ [0, 2^m−1]can be written as the intersection ofY_iorY_i, for alli∈ [m], depending on its binary representation. For example,x=1 means{x} = ∩^m_i₌⁻₁¹Y_i∩Y_m. For another example ifx=_{0 then}{x} = ∩^m_i₌₁Y_i, and f({₀}) = f(∩^m_i₌₁Y_i) = ∩^m_i₌₂Y_i∩Y₁ = {₂^m⁻¹}by (4.12)(4.13) and since f is a bijection. Thus f(0) =2^m⁻¹. By a similar argument, f(x) =2^m⁻¹+x for all x and

f = σ₁. Thus the proof is completed.

Note that by similar reasoning we can show that if |F| = m, it is equivalent to {σ₁, . . . , σm} defined by the standard basis. Part (ii) in the above theorem says that if we consider codes with optimal update, optimal access, and optimal size, then they are equivalent to the standard-basis construction. In this sense, Theorem 4.3 gives the unique code. Moreover, if we find the smallest finite field for one code (as in Construction 4.5), there does not exist a code using a smaller field.

Part (i) of the above theorem implies that the number of rows has to be exponential in the number of columns in any systematic code with optimal ratio and optimal update. Notice that the code in Theorem 4.3 achieves the maximum possible number of columns,m+1. An exponential number of rows can be practical in some storage systems, since they are composed of dozens of nodes (disks) each of which has size in an order of gigabytes. However, a code may corresponds to only a small portion of each disk and we will need the flexibility of the array size. The following example shows a code of flatter array size with a cost of a small increase in the ratio.

Example 4.12 LetT = {v ∈ _F^m₂ : kvk₁ = 3}be the set of vectors with weight 3 and lengthm.

Notice that|T| = (^m₃). Construct the codeC by T according to Construction 4.1. Given v ∈ T,

|{u∈ T :|v\u| =3}| = (^m⁻₃³), which is the number of vectors with 1’s in different positions than v. Similarly, |{u ∈ T : |v\u| = 2}| = 3(^m⁻₂³)and|{u ∈ T : |v\u| = 1}| = 3(m−3). By Theorem4.10 and Lemma 4.4, for largem the ratio is

1 2 + ²

m−1(^m₃)3(^m⁻₂³) 2^m(^m₃)((^m₃) +1) ≈ ¹

2 + ⁹ 2m.

Note that this code reaches the lower bound of the ratio asm tends to infinity, and has O(m³) columns. More discussions on increasing the number of columns is presented in the next section.

In document Coding for Information Storage (Page 50-58)