Multi-layer Structure of Data Center Based on Steiner Triple System

Available at http://www.Jofcis.com

Jianfei ZHANG¹, Zhiyi FANG¹, S. Q. ZHENG², Guannan QU¹,∗

¹College of Computer Science and Technology, Jilin University, Changchun 130012, China

²Department of Computer Science, University of Texas at Dallas, Richardson TX 75083, USA

Abstract

Topology is a primary factor in the performance of a data center (DC). Many existing designs draw on graph theory or block design theory. In this paper, we develop an innovative DC topology named Multi-layer Steiner Triple System (MSTS). We prove that MSTS scales faster than doubly exponentially while keeping a short diameter. This property allows MSTS to be built from a large number of commercial connection devices without sacrificing communication performance.

Keywords: Data Center; Network Topology; Steiner Triple System; Scalable

1 Introduction

In recent years, as the Internet has grown in popularity, online application services such as online games, e-business, online video and social networking services (SNS) have become a hot spot. All of these services are built around a core of data. Enterprises build their own data centers (DCs) to store and process huge amounts of data, and as the data grows, the data centers have to be extended. The topology of a data center is the key to meeting this demand. A well-designed topology connects as many computation units as possible, with high scalability and a short diameter. These requirements make the design of DC topologies challenging.

Many topologies have been proposed. A bus interconnection network [1] is a network in which buses provide the communication medium between computation units. Designing a bus network reduces to solving a hypergraph problem. Since it is a point-to-point network, its diameter increases rapidly as it scales. The hypercube [2] C_n is a multi-dimensional structure with 2^n computation units, where n is both the number of dimensions and the diameter of the interconnection network. In C_n, however, each node connects to n other nodes and therefore needs n communication ports, so the hypercube is not suitable for large-scale networks. The complete graph K_n contains n computation units that are all linked to each other, so the distance between any pair of computation units is exactly one. However, every computation unit needs n − 1 communication ports, and for large n the n(n − 1)/2 connection wires of K_n are hard to implement. The n-star structure also faces a scalability issue: S_n [3, 4] connects n! computation units by n!(n − 1)/2 switch devices. DCell [5, 6] is a novel data center topology constructed with mini-switches. It scales doubly exponentially as the node degree increases while keeping a small diameter.

∗Corresponding author.

Email address: [email protected] (Guannan QU).

June 1, 2013

In this paper, we present an innovative topology called Multi-layer Steiner Triple System (MSTS). We prove that the number of computation units connected by MSTS grows even faster than doubly exponentially, while the diameter of an MSTS_l is only 2^{l+1} − 1. Compared with DCell at the same diameter, MSTS can connect 2^l times as many computation units. This improves communication performance significantly. Moreover, the computation units can be connected by low-end switches, which largely reduces the implementation cost.

2 Preliminaries

2.1 Block design

Combinatorial design [7] is the study of selecting subsets from a finite set so that specific conditions are met; the conditions may involve set membership, set intersections and so on. Block design is an important part of combinatorial design.

A design is defined on a finite set X = {x_1, x_2, ⋯, x_v}, where each element x_i is called a treatment or variety. The treatments are arranged into a non-empty collection of subsets of X, written B = {B_1, B_2, ⋯, B_b}, where each B_i is called a block. The number of blocks in which a treatment appears is called the valence of that treatment. The number of blocks in which a given pair of treatments appears together is called the covalence of that pair.

A regular design is a block design that satisfies two conditions: (I) all blocks have the same size k, k ≥ 2; (II) all treatments have the same valence, so that every treatment appears in the same number r of blocks, where r is called the replication number.

Counting the treatment-block incidences in two ways, these definitions give

bk = vr, (1)

where v is the number of treatments, b is the number of blocks, r is the replication number and k is the block size.

A block that contains all v treatments is called complete. If a design has at least one incomplete block, the design is called incomplete. In an incomplete block design, the covalence of two different treatments x and y is written λ_{xy}. A block design is balanced if every pair of treatments has the same covalence, that is, every pair appears together in the same number λ of blocks; λ is called the index of the design.

The most frequently studied block design is the balanced incomplete block design (BIBD). It is represented as a five-tuple (v, b, r, k, λ), and in a (v, b, r, k, λ)-BIBD we have

r(k − 1) = λ(v − 1). (2)

2.2 Steiner triple system

A Steiner system is a type of block design with λ = 1. A Steiner system, denoted by S(t, k, v), is based on a set V of cardinality v together with a collection B of k-element subsets of V such that each t-element subset of V is contained in exactly one member of B. In terms of hypergraphs, each element of V is a vertex and each k-subset in B is a hyperedge, so S(t, k, v) is a uniform hypergraph with v nodes in which each edge contains k nodes.

S(2, 3, v) is the special Steiner system with t = 2 and k = 3. We use STS(v) to denote a Steiner triple system of cardinality v. An STS(v) is built on a set of v elements. The elements are arranged into b blocks of 3 elements each, and every two elements appear together in exactly one of the b blocks.

A set of v elements has C(v, 2) 2-subsets and C(v, 3) 3-subsets. Every 3-subset contains C(3, 2) = 3 2-subsets. Since each 2-subset must be covered exactly once, a solution to S(2, 3, v) has b = C(v, 2)/C(3, 2) 3-subsets, that is,

b = v(v − 1)/6. (3)

Let r be the number of 3-subsets in which an element appears. According to (1) with k = 3, we have rv = 3b, i.e.

r = 3b/v. (4)

Substituting Equation (3) into Equation (4), we have

r = (v − 1)/2, (5)

and

v = 2r + 1. (6)

The following is an example of STS(9) on the set V = {1, 2, 3, 4, 5, 6, 7, 8, 9}:

STS(9) = {{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {1, 8, 9}, {2, 4, 6}, {2, 5, 8}, {2, 7, 9}, {3, 4, 9}, {3, 5, 7}, {3, 6, 8}, {4, 7, 8}, {5, 6, 9}}.

Here STS(9) is based on a set of 9 elements. The elements are arranged into b = 12 blocks, each element appears in r = 4 blocks, and every pair of elements appears together in exactly one block.
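The defining property of the STS(9) example can be checked mechanically. The following is a minimal Python sketch; the helper names `is_steiner_triple_system` and `replication_numbers` are illustrative and not part of the paper. It confirms that every pair of elements lies in exactly one block and that b and r agree with Equations (3) and (5).

```python
from collections import Counter
from itertools import combinations

# The STS(9) blocks listed in the text.
STS9 = [
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {1, 8, 9},
    {2, 4, 6}, {2, 5, 8}, {2, 7, 9}, {3, 4, 9},
    {3, 5, 7}, {3, 6, 8}, {4, 7, 8}, {5, 6, 9},
]


def is_steiner_triple_system(points, blocks):
    """Return True if every block has 3 points taken from `points` and
    every pair of points lies in exactly one block."""
    if any(len(block) != 3 or not block <= points for block in blocks):
        return False
    pair_count = Counter(
        frozenset(pair)
        for block in blocks
        for pair in combinations(sorted(block), 2)
    )
    return all(
        pair_count[frozenset(pair)] == 1
        for pair in combinations(sorted(points), 2)
    )


def replication_numbers(points, blocks):
    """Map each point to the number of blocks that contain it."""
    return {p: sum(p in block for block in blocks) for p in points}


points = set(range(1, 10))
v, b = len(points), len(STS9)
print(is_steiner_triple_system(points, STS9))                    # pair coverage
print(b == v * (v - 1) // 6)                                      # Equation (3)
print(set(replication_numbers(points, STS9).values()) == {(v - 1) // 2})  # Equation (5)
```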

The necessary and sufficient condition [8] for the existence of STS(n) is

n ≡ 1 or 3 (mod 6). (7)

We say that a positive integer n is admissible if and only if n is congruent to 1 or 3 mod 6.
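As a small illustration of condition (7), the admissible orders can be enumerated directly. The function name `is_admissible` is an assumption made for this sketch.

```python
def is_admissible(n: int) -> bool:
    """Condition (7): n is admissible iff n is congruent to 1 or 3 mod 6."""
    return n > 0 and n % 6 in (1, 3)


# Admissible orders below 32.
admissible_orders = [n for n in range(1, 32) if is_admissible(n)]
print(admissible_orders)
```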

The hypergraph representation of STS(v) is a regular, uniform and linear hypergraph.

An STS(v) hypergraph connects v vertices, each of degree (v − 1)/2. It has v(v − 1)/6 hyperedges of rank 3, and any two different nodes of the hypergraph are joined by exactly one common edge.

3 Multi-layer Structure Based on STS

In this article, we take the hypernetwork modeled by STS(n) as the basic interconnection network and recursively construct a multi-layer interconnection hypernetwork from STS hypernetworks, denoted by MSTS. MSTS offers high scalability and short communication paths.

3.1 MSTS structure

The computation units in MSTS have multiple network ports and communicate with each other via low-end commercial connection devices, such as small-scale switches. The whole hypernetwork is built recursively: a high-level MSTS_l is constructed from low-level MSTS_{l−1}s. An MSTS_0 consists of n computation units and one n-port switch, with all computation units connected directly to the switch. MSTS_0 is the basic block for building larger MSTS structures.

Considering an MSTS_0 as a single unit, we construct the level-1 MSTS_1 from v MSTS_0s. In an MSTS_1, the connections among the MSTS_0s follow a solution of STS(v). Each MSTS_0 acts as one element of the STS(v), and since each of its n computation units supplies one level-1 link, each element appears in r = n 3-subsets. By Equation (6), an MSTS_1 therefore consists of v = 2r + 1 MSTS_0s. The value 2r + 1 must of course satisfy condition (7); we prove this later. According to Equation (3), all 2r + 1 MSTS_0s are connected via v(v − 1)/6 3-port switches. We label the MSTS_0s with i, where i ∈ [1, v].

For every 3-subset {i, j, k} in the solution of STS(v), we pick from each of the MSTS_0s i, j and k one computation unit that is not yet connected to any level-1 switch, and connect the three units by one 3-port switch.
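The wiring rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: units are named by a hypothetical pair (MSTS_0 label, local index), the function name `wire_level1` is invented for the sketch, and the free unit chosen in each MSTS_0 is simply the next unused index. The triples are those of an STS(7), matching n = 3 and v = 2n + 1 = 7.

```python
from collections import defaultdict

# Triples of an STS(7) on the labels 1..7 (the n = 3 case).
STS7 = [(1, 2, 4), (1, 3, 7), (1, 5, 6), (2, 3, 5),
        (2, 6, 7), (3, 4, 6), (4, 5, 7)]


def wire_level1(triples, n):
    """Assign one unused unit of each MSTS_0 in a triple to a 3-port switch.

    Units are named (group, index) with index in 0..n-1. Returns a list of
    switches, each a tuple of three units.
    """
    next_free = defaultdict(int)  # group label -> index of next unwired unit
    switches = []
    for triple in triples:
        ports = []
        for group in triple:
            if next_free[group] >= n:
                raise ValueError(f"MSTS_0 {group} has no free unit left")
            ports.append((group, next_free[group]))
            next_free[group] += 1
        switches.append(tuple(ports))
    return switches


switches = wire_level1(STS7, n=3)
wired_units = [unit for switch in switches for unit in switch]
for switch in switches:
    print(switch)
```

With n = 3 every one of the 21 units ends up on exactly one level-1 switch, and the number of switches equals v(v − 1)/6 = 7.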

[Figure 1: seven MSTS_0s labeled 1 to 7, joined by seven 3-port switches corresponding to the triples (1,2,4), (1,3,7), (1,5,6), (2,3,5), (2,6,7), (3,4,6), (4,5,7).]

Fig. 1: An MSTS_1 hypernetwork with n = 3.

An example of MSTS_1 with n = 3 is shown in Fig. 1. In MSTS_1, we treat each MSTS_0 as a virtual unit, and the MSTS_0s form a hypernetwork according to the solution of STS(v). The level-2 and higher MSTS_l are constructed in the same way. If the MSTS_{l−1} has been constructed and each MSTS_{l−1} has r_{l−1} computation units, we can build an MSTS_l from v_l = 2r_{l−1} + 1 MSTS_{l−1}s. As in the construction of MSTS_1, we treat each MSTS_{l−1} as a virtual unit, and the MSTS_{l−1}s form a solution of STS(v_l). Let r_l denote the total number of computation units in MSTS_l. We have

v_l = 2r_{l−1} + 1, (8)

and

r_l = v_l × r_{l−1}, (9)

for l ≥ 1. The basic block MSTS_0 is the exception, with v_0 = 1 and r_0 = n, where n is the number of computation units in MSTS_0.
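Equations (8) and (9) can be iterated directly. The sketch below uses a hypothetical helper `msts_sizes` (not from the paper) and evaluates the recurrences for n = 6 and n = 9 up to level 3.

```python
def msts_sizes(n: int, levels: int):
    """Return [(v_l, r_l)] for l = 0..levels using Equations (8) and (9)."""
    sizes = [(1, n)]                   # level 0: v_0 = 1, r_0 = n
    for _ in range(levels):
        r_prev = sizes[-1][1]
        v = 2 * r_prev + 1             # Equation (8)
        sizes.append((v, v * r_prev))  # Equation (9)
    return sizes


for n in (6, 9):
    print(n, [r for _, r in msts_sizes(n, 3)])
# 6 [6, 78, 12246, 299941278]
# 9 [9, 171, 58653, 6880407471]
```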

The number v_i of virtual nodes MSTS_{i−1} differs from level to level. The following theorem ensures that v_i satisfies condition (7) for every i ≥ 1.

Theorem 1 If n ≡ 0 or 1 (mod 3), then STS(v_l) exists for l ≥ 1.

Proof By the necessary and sufficient condition for the existence of an STS(v), a solution of STS(v) exists if and only if v ≡ 1 or 3 (mod 6). Let n be the number of ports of the connection device in MSTS_0. If n ≡ 0 or 1 (mod 3), then r_0 = n ≡ 0 or 1 (mod 3).

We consider the following two cases.

Case I: suppose r_0 = 3m, m = 1, 2, ⋯.

According to (6), we have v_1 = 2r_0 + 1, thus v_1 = 6m + 1, which satisfies condition (7). According to Equation (9), r_1 = v_1 × r_0 = 3(6m² + m), so r_1 ≡ 0 (mod 3). Let m_1 = 6m² + m; then r_1 = 3m_1, and v_2 = 6m_1 + 1 satisfies condition (7). In the same way, for i ≥ 1, v_i = 6m_{i−1} + 1 satisfies condition (7), where m_0 = m and m_{i−1} = 6m_{i−2}² + m_{i−2} for i ≥ 2.

Case II: suppose r_0 = 3m + 1, m = 1, 2, ⋯.

According to Equation (6), we have v_1 = 2r_0 + 1, thus v_1 = 6m + 3, which satisfies condition (7). From Equation (9), r_1 = v_1 × r_0 = 3(6m² + 5m + 1). Let m_1 = 6m² + 5m + 1; then r_1 = 3m_1 and v_2 = 6m_1 + 1 satisfies condition (7). From here the argument is the same as in Case I, so v_i satisfies condition (7) for all i ≥ 1.

Combining the two cases, Theorem 1 is proved.
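Theorem 1 can also be spot-checked numerically. The sketch below is only a sanity check on a finite range, not a substitute for the proof; the helper names `orders` and `all_admissible` are assumptions of this example. It also shows why the hypothesis matters: for n ≡ 2 (mod 3), v_1 = 2n + 1 ≡ 5 (mod 6), which already fails condition (7).

```python
def orders(n: int, levels: int):
    """Return [v_1, ..., v_levels] for an MSTS whose level-0 block has n units."""
    result, r = [], n
    for _ in range(levels):
        v = 2 * r + 1      # Equation (8)
        result.append(v)
        r = v * r          # Equation (9)
    return result


def all_admissible(n: int, levels: int) -> bool:
    """True if every v_l up to `levels` is congruent to 1 or 3 mod 6."""
    return all(v % 6 in (1, 3) for v in orders(n, levels))


good = [n for n in range(1, 40) if n % 3 in (0, 1)]
bad = [n for n in range(1, 40) if n % 3 == 2]
print(all(all_admissible(n, 6) for n in good))   # hypothesis of Theorem 1 holds
print(any(all_admissible(n, 1) for n in bad))    # n ≡ 2 (mod 3) fails at level 1
```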

3.2 Properties of MSTS

First, we investigate how many computation units an MSTS_l can include.

Theorem 2 2^l (n + 1/4)^{2^l} − 1/4 < r_l < 2^{2^l − 1} (n + 1)^{2^l} − 1 for l ≥ 1.

Proof

From Equations (9) and (8), we have

r_l = r_{l−1} × (2r_{l−1} + 1).

Thus, we get

r_l = 2(r_{l−1} + 1/4)² − 1/8 > 2(r_{l−1} + 1/4)² − 1/4,

that is,

r_l + 1/4 > 2(r_{l−1} + 1/4)².

Applying this inequality iteratively down to r_0 = n gives r_l + 1/4 > 2^{2^l − 1} (n + 1/4)^{2^l}, and since 2^l − 1 ≥ l, we get

r_l > 2^l (n + 1/4)^{2^l} − 1/4. (10)

For the upper bound,

r_l = 2(r_{l−1} + 1)² − 3r_{l−1} − 2 < 2(r_{l−1} + 1)² − 1. (11)

Equation (11) can be simplified to

r_l + 1 < 2(r_{l−1} + 1)².

Applying it iteratively down to r_0 = n, the factor 2 is squared at every step, so we get

r_l < 2^{2^l − 1} (n + 1)^{2^l} − 1. (12)

From Equation (10) and Equation (12), Theorem 2 is proved.
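The two one-step inequalities used in the proof, r_l + 1/4 > 2(r_{l−1} + 1/4)² and r_l + 1 < 2(r_{l−1} + 1)², can be unrolled and checked with exact arithmetic. Unrolling either one l times produces the coefficient 2^(2^l − 1), which coincides with 2^l only at l = 1. The following is a minimal sketch; the function names `r_level` and `unrolled_bounds` are illustrative and not taken from the paper.

```python
from fractions import Fraction


def r_level(n: int, l: int) -> int:
    """Number of computation units r_l, from r_l = r_{l-1} * (2 r_{l-1} + 1)."""
    r = n
    for _ in range(l):
        r = r * (2 * r + 1)
    return r


def unrolled_bounds(n: int, l: int):
    """Bounds obtained by unrolling the one-step inequalities down to r_0 = n.

    lower = 2^(2^l - 1) * (n + 1/4)^(2^l) - 1/4
    upper = 2^(2^l - 1) * (n + 1)^(2^l) - 1
    """
    exponent = 2 ** l
    coeff = 2 ** (exponent - 1)
    lower = coeff * (n + Fraction(1, 4)) ** exponent - Fraction(1, 4)
    upper = coeff * (n + 1) ** exponent - 1
    return lower, upper


for n in (6, 9):
    for l in (1, 2, 3):
        lower, upper = unrolled_bounds(n, l)
        r = r_level(n, l)
        print(n, l, lower < r < upper, 2 ** l * n ** (2 ** l) < r)
```

The last printed flag checks the weaker estimate r_l > 2^l n^(2^l), which is the form used in the diameter comparison.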

Theorem 2 shows that the number of computation units in an MSTS_l grows even faster than doubly exponentially. Table 1 gives some examples. The first column of Table 1 lists n, the number of computation units in MSTS_0. The other three columns correspond to l = 1, 2 and 3, where l is the number of layers of the whole MSTS structure.

Each remaining cell gives the number of computation units connected by an MSTS with the corresponding n and l. For instance, when n = 6 and l = 3, the number of computation units in an MSTS_3 is more than 299 million. With the same n, each added layer multiplies the number of computation units by a factor v_l that itself grows rapidly: for n = 6, by 157 from l = 1 to l = 2 and by 24,493 from l = 2 to l = 3. Not only does the number increase, but the rate of increase accelerates. Down each column, a small increase in n greatly enlarges the number of computation units at every layer. Although every computation unit needs more than one communication port, the number of links per computation unit remains small relative to the size of the network.

Table 1: Number of computation units in MSTS_l

n    l = 1    l = 2     l = 3
6    78       12,246    299,941,278
9    171      58,653    6,880,407,471

The diameter of a network is defined as the length of the longest shortest path over all pairs of nodes. The communication path between any pair of computation units in a data center is therefore no longer than the diameter; that is, the diameter is an upper bound on the path length. A small diameter provides efficient communication. The following theorem gives the diameter of the MSTS_l network.

Theorem 3 The diameter of an l-level MSTS_l is 2^{l+1} − 1.

Proof

For 0 ≤ i < l, any two different MSTS_i s within the same MSTS_{i+1} are connected by one connection device. Without loss of generality, let s and d be the source and destination computation units, where s and d belong to different MSTS_{l−1}s. Let n_1 and n_2 denote the intermediate computation units that lie in the same MSTS_{l−1} as s and d respectively and share the connection device joining the two MSTS_{l−1}s. In the longest-path case, s ≠ n_1 and d ≠ n_2. The problem then reduces to finding the sub-paths from s to n_1 and from n_2 to d, each inside an MSTS_{l−1}, plus the single hop from n_1 to n_2. The length is

Len_l = 2 Len_{l−1} + 1. (13)

Solving Equation (13) recursively with Len_0 = 1 (two units of an MSTS_0 communicate through their common switch), we get

Len_l = 2^{l+1} − 1.

Theorem 3 is proved.
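The level-1 case of Theorem 3 can be checked by brute force. The sketch below rests on assumptions that are not spelled out in the paper: path length is counted in switch traversals (two units sharing a switch are one hop apart, so Len_0 = 1), the unit picked for each triple is the next unused one, and the helper names are invented for this example. It builds an MSTS_1 with n = 3 from the triples of an STS(7) and measures the diameter by breadth-first search.

```python
from collections import deque
from itertools import combinations

# Triples of an STS(7) on the labels 1..7 (n = 3, v = 7).
STS7 = [(1, 2, 4), (1, 3, 7), (1, 5, 6), (2, 3, 5),
        (2, 6, 7), (3, 4, 6), (4, 5, 7)]


def diameter_formula(l: int) -> int:
    """Unroll Len_l = 2 * Len_{l-1} + 1 from the assumed base Len_0 = 1."""
    length = 1
    for _ in range(l):
        length = 2 * length + 1
    return length


def build_msts1(triples, n):
    """Adjacency of an MSTS_1: units sharing a switch are one hop apart."""
    groups = sorted({g for t in triples for g in t})
    adj = {(g, i): set() for g in groups for i in range(n)}

    def link(units):
        for a, b in combinations(units, 2):
            adj[a].add(b)
            adj[b].add(a)

    for g in groups:                      # level-0 switch of each MSTS_0
        link([(g, i) for i in range(n)])
    used = {g: 0 for g in groups}
    for t in triples:                     # one 3-port switch per triple
        ports = [(g, used[g]) for g in t]
        for g in t:
            used[g] += 1
        link(ports)
    return adj


def diameter(adj):
    """Longest shortest path (in hops) over all pairs, via BFS from each node."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best


print(diameter(build_msts1(STS7, 3)), diameter_formula(1))
```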

Diameter is an important factor in designing a data center topology. From Theorem 2, we have r_l > 2^l n^{2^l}, so that 2^l < log_n r_l − log_n 2^l < log_n r_l. The diameter of MSTS is therefore less than 2 log_n r_l − 1. As a comparison, a fat-tree structure has a diameter of 2 log_2 N, where N is the number of nodes. Let D_ft and D_msts be the diameters of the fat-tree and MSTS respectively. With the same number of nodes, r_l = N, we have D_ft/D_msts ≃ log_2 n. According to Theorem 1, n = 3m or 3m + 1 for m = 1, 2, 3, ⋯, thus

[D_ft/D_msts] ≃ log_2 3m or [D_ft/D_msts]′ ≃ log_2 (3m + 1).

A hypercube structure has a diameter of log_2 N, where N is the number of nodes. In the same way, we have the similar result

[D_hc/D_msts] ≃ (log_2 3m)/2 or [D_hc/D_msts]′ ≃ (log_2 (3m + 1))/2,

where D_hc is the diameter of a hypercube.

Table 2: Diameter ratio

m    3m    [D_ft/D_msts]    [D_hc/D_msts]    3m + 1    [D_ft/D_msts]′    [D_hc/D_msts]′
1    3     log_2 3          (log_2 3)/2      4         log_2 4           (log_2 4)/2
2    6     log_2 6          (log_2 6)/2      7         log_2 7           (log_2 7)/2
3    9     log_2 9          (log_2 9)/2      10        log_2 10          (log_2 10)/2

Table 2 shows that, whatever the value of n, the diameter of a fat-tree connecting the same number of nodes is larger than that of MSTS by a factor of about log_2 n. The same holds for the hypercube when n > 4. Compared with a DCell structure, whose diameter is 2^{k+1} − 1 where k is the number of levels of the DCell, MSTS can connect 2^l times as many computation units with the same diameter. In other words, the diameter of an MSTS is shorter than that of a DCell of the same scale.
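For a numerical feel of the ratios, the approximate expressions log_2 n and (log_2 n)/2 can be evaluated for the n values that appear in Table 2. This sketch only evaluates those two formulas; the function name `diameter_ratios` is an assumption of the example.

```python
import math


def diameter_ratios(n: int):
    """Approximate ratios from the text for level-0 size n:
    D_ft / D_msts ~ log2(n) and D_hc / D_msts ~ log2(n) / 2."""
    fat_tree = math.log2(n)
    return fat_tree, fat_tree / 2


for m in (1, 2, 3):
    for n in (3 * m, 3 * m + 1):
        ft, hc = diameter_ratios(n)
        print(f"n={n:2d}  D_ft/D_msts~{ft:.3f}  D_hc/D_msts~{hc:.3f}")
```

The hypercube ratio exceeds 1 exactly when log_2 n > 2, that is, when n > 4.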

4 Concluding Remarks

In this paper, we presented an innovative data center network topology called Multi-layer Steiner Triple System (MSTS). The construction of MSTS is based on the theory of Steiner triple systems, a type of block design. In MSTS, computation units are connected by low-end commercial connection devices. Level 0 consists of one n-port connection device and n computation units. At each higher level, an MSTS_l consists of MSTS_{l−1}s organized according to a solution of an STS. Each computation unit has one level-i link, which connects it with two other computation units in different MSTS_{i−1}s. We proved that MSTS has outstanding scalability and short communication paths. By selecting n carefully, a valid construction of MSTS is guaranteed for any number of layers. The number of computation units grows faster than doubly exponentially, and the diameter of an MSTS is also competitive.

References

[1] J-C. Bermond, F.O. Ergincan, Bus Interconnection Networks, Discrete Applied Mathematics, issue 68, pp. 1 - 15, 1996.

[2] Y. Saad, M.H. Schultz, Topological Properties of Hypercubes, IEEE Trans. on Computers, issue 7, vol. 37, pp. 867-872, 1988.

[3] S.G. Akl, T. Wolff, Efficient sorting on the star graph interconnection network, Telecommunication System, issue 10, pp. 3-20, 1998.

[4] D.K. Saikia, R.K. Sen, Two ranking schemes for efficient computation on the star interconnection network, IEEE Trans. on Parallel and Distributed Systems, vol. 7, no. 4, pp. 321-327, 1996.

[5] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, S. Liu, DCell: A Scalable and Fault-Tolerant Network Structure for Data Centers, Proc. of the ACM SIGCOMM 2008 conference on Data communication, August 17 - 22, 2008.

[6] M. Kliegl, J. Lee, J. Li, X. Zhang, C. Guo, D. Rincon, Generalized DCell Structure for Load-Balanced data center networks, Proc. of Infocom 2010, 2010.

[7] W.D. Wallis, Combinatorial Designs, CRC Press, 1988.

[8] Stephen J. Hartley, Aaron H. Konstam, Using Genetic Algorithms to Generate Steiner Triple Systems, Proc. of the 21st Annual ACM Computer Science Conference, 1993.