
CONFIGURATION MANAGEMENT TECHNOLOGY

FOR LARGE-SCALE SIMULATIONS

Atsuji Sekiguchi, Kuniaki Shimada,

Yuji Wada

Cloud Computing Research Center

FUJITSU LABORATORIES LTD.

Kawasaki, Kanagawa 211-8588 Japan

Akio Ooba, Ryouji Yoshimi,

Akiko Matsumoto

Service Management Middleware Division

FUJITSU LIMITED

Kawasaki, Kanagawa 211-8588 Japan

ABSTRACT

Large-scale simulation-based experiments are a key value-creation technique as well as a big hurdle for M&S experts because of their long execution times. To overcome this hurdle, the experts might utilize cloud computing services for their simulation runs, yet they then confront the challenge of properly setting up services with which they are not familiar. In large-scale ICT (Information and Communication Technology) systems, such as cloud computing, a reduction of operation management workloads and stabilization of the systems are requested. Configuration management can reduce the workloads and the improper settings of design by describing the rules for the design and verifying the configurations. However, it is too hard for non-experts to describe the rules correctly. We thus developed configuration management technology that does not require rules written by operation managers. The configurations have relationships between hardware and software that correspond to the tree structure of the ICT system. Our technology uses the relationships as consolidation rules that should be satisfied by the configurations. We implemented a prototype of our technology and applied it to real systems. As a result, 94 percent (117,632/125,286) of the configurations converged under an environment of servers with uniform configurations, and 94 percent of the workloads were also reduced. With our reduced setting requirements, we expect that cloud computing services will embrace a broader range of users, such as M&S experts running large-scale simulations.

1 INTRODUCTION

Recently, ICT systems have become larger and more complicated with the spread of cloud computing data centers (DCs), server virtualization technology, and other factors. As a result, operations management of systems has become more complicated and workloads for operations management have increased. However, operations management workloads must be reduced because customers of the systems request a reduction in the costs of operations management. On the other hand, stabilization of systems is requested as a social obligation because ICT systems are widely used in social infrastructure. Thus, achieving reductions in operations management workloads and stabilization is a problem in the operations management of ICT systems.

ICT systems might cause large-scale problems and stop the supply of services because of a few improper settings (AWS Team 2011, Brown 2004). The configuration is a set of pairs of parameters and values that determine the behavior of devices and software, such as the servers, networks, and storage components of the ICT system. Parameters include, for example, information to identify a device on the network (e.g., the IP address: Internet Protocol address) and information to cooperate with other devices (e.g., the DNS: Domain Name System setting). A value can be a number or a string (e.g., in the case of the host name, "server1"). An improper setting is the assignment of a wrong value to a parameter.

Moreover, in large-scale ICT systems, operations management workloads for the configurations also increase. In general, the number of configurations increases in proportion to the number of devices and software components of the ICT system. Thus, the workloads of design and verification become larger.


Configuration management is needed for the following two reasons: 1) a reduction is required in the workloads of operations management for such configurations, and 2) stabilization is required through a reduction in improper settings. In this paper, we define configuration management as the process of design, verification, and deployment of configurations for running the devices and software of the ICT systems correctly. Configuration management is necessary for the initial construction and for changes (the addition, modification, or removal of anything that could have an effect on ICT services, as defined in ITIL (Lacy and Macfarlane 2007)) in the systems. Usually, a system changes many times during its lifecycle. In both the initial construction and changes in the systems, improper settings can occur in the design, verification, and deployment.

Configuration management is not only a task of ICT professionals such as operation managers. Users who use cloud computing resources for large-scale calculations such as simulation might also manage a large number of configurations on many servers. For instance, HLA (High Level Architecture, IEEE 1516) is a distributed architecture for distributed simulations. HLA can control simulators that run on many servers, but it does not support the proper configuration of those servers. Users who set up HLA on cloud computing resources need to cope with configuration management for their many servers.

To reduce the workloads of design and verification and to reduce the number of improper settings, several studies report that the configurations should satisfy rules that automate the design and verification (Eilam et al. 2006; Hagen and Kemper 2010). In these studies, the rules are supposed to be described by the operation managers. However, it is difficult for operation managers (especially non-experts) to describe and maintain these rules. Because mistakes in describing the rules, that is, bugs, are easily generated, improper settings cannot be easily reduced.

To reduce the number of improper settings in cloud computing resources, several studies report that the behavior of the ICT system can be verified by simulation of cloud computing resources (Joshi, Gunawi, and Sen 2011; Calheiros et al. 2011). These approaches may detect improper settings. However, they require additional workloads, because they need a large number of test cases to detect improper settings.

We developed configuration management technology that does not require rules written by operation managers. We focused on the following feature: when the ICT system is viewed as a tree structure of servers, racks, and DCs, the nodes correspond to devices and software applications, and two parameters of the nodes in a subtree or in two subtrees may have the same value (e.g., the value of the network address parameter on all servers in rack A). Based on this feature, we classified the rules of the relationships between parameters and values. Using these rules, operation managers associate the rules with the configurations. Our technology manages the parameters by consolidating two or more parameters with the same name and value based on the associated rules. As a result, the number of configurations managed by operation managers can be reduced. The workloads of design and verification can also be reduced. Moreover, because improper settings can be discovered by checking whether the rules are satisfied by the parameters and the values, system stability can be improved.

The rest of this paper is composed as follows. First, we discuss related work in section 2. Next, we introduce our developed technology in section 3. In section 4, we explain an experiment that uses this technology in two large-scale environments. Finally, we end with the conclusion and future work.

2 RELATED WORK

We explain the two stages and three components of configuration management and the improper settings that occur in these stages. Then, we describe other research studies.

Configuration management is requested for the following two stages of ICT systems: 1) initial construction, which is executed only once at the start of the life cycle of the ICT system, and 2) change, which is executed at the addition, modification, or removal of anything on the system in the life cycle of the ICT system.

Configuration management consists of the following three components: A) design, which identifies the requirements of a system change and defines the solution that meets the requirements; B) verification, which confirms the correctness of the configurations; and C) deployment, which applies the configurations to the devices. The improper settings that cause problems may occur in each component. For instance, in component A, the configurations may not be designed properly (mistakes in design). In component B, the improper settings of the configurations may not be detected (mistakes in verification). In component C, the configurations may be deployed to the devices incorrectly (mistakes in deployment).

ITIL (Information Technology Infrastructure Library) is the name given to best practices for operations management (Lacy and Macfarlane 2007). Component A corresponds to "change management" in ITIL. Components B and C correspond to "release and deployment management" in ITIL. However, ITIL does not describe a specific method to reduce the workloads and the number of improper settings for design and verification.


With regard to components A and B, some approaches to reducing the workloads for design and verification are formal methods and policy-based management (VDM 2012; Alloy 2012; Eilam et al. 2006; Hagen and Kemper 2010). In these approaches, operation managers describe the rules with declarative languages in advance. These rules consist of conditions that create the configurations and constraints that check the configurations. Based on the rules, a large number of configurations can be created and verified automatically. Thus, operation managers can reduce the workloads and the number of improper settings for design and verification. However, it is difficult for operation managers (especially non-experts) to describe and maintain these rules. Therefore, mistakes in the description of the rules (bugs) are easily generated. Bugs in the rules for component A generate improper settings, and bugs in the rules for component B prevent the detection of improper settings because the configuration cannot be properly verified.

With regard to component C, there are many approaches, such as deployment automation (CFEngine 2012; Puppet 2012; Opscode 2012; BMC 2012). These approaches deploy the configurations to devices automatically. Because no human operation is involved, they can prevent mistakes in deployment caused by human error. However, they cannot solve the problems of components A and B.

Some approaches define models to measure configuration complexity and the workloads (Brown, Keller, and Hellerstein 2005; Diao et al. 2007). However, these approaches do not describe a specific method to reduce the workloads and the number of improper settings.

3 DEVELOPED TECHNOLOGY

First, we describe the features of the configurations. Next, we analyze the ICT systems and the configurations. Then, we explain the approach of our technology. At the end of this section, we analytically evaluate our approach.

3.1 Features of the configurations

A large-scale ICT system has a large number of devices and software applications, such as servers, networks, storage, and middleware. The devices and software applications have configurations. The configurations consist of a set of pairs of parameters and values, which determine the behavior of the devices or software applications as described in section 1. There are various parameters such as the host name, IP address, and timeout setting. A parameter must be defined with a value (e.g., Timeout=120; 120 is the value).

Hereafter, for simplicity, we will focus on servers as the target devices in this paper. On a server, the pair of a parameter and a value is listed in a configuration file. One configuration file contains one or more parameters and corresponding values (Figure 1-(A)). The same parameters might be listed in different configuration files. To distinguish these parameters, we define an item as the pair of "the file path of the configuration file where the parameter is listed" and "the parameter" (Figure 1-(B)). This uses the fact that the file paths of two different configuration files are always different.

Figure 1: A server configuration structure. (The figure shows a configuration file, identified by its file path, e.g., /etc/httpd/conf/httpd.conf, containing parameters, e.g., Timeout, and their values, e.g., 120; an item is the pair of the file path and the parameter.)

Because a device or a software application has many items, the number of items for the entire system is substantial. The number is expressible as follows. Let N denote the number of servers, and let M_i denote the number of configuration files on a server s_i (i = 1 to N). Moreover, let p_ij denote the number of parameters in a configuration file f_ij on s_i (j = 1 to M_i). Then the total number of items on all servers, n_p, is

    n_p = Σ_{i=1}^{N} Σ_{j=1}^{M_i} p_{ij}.

Therefore, n_p is enormous when N and the M_i are large, as in a large-scale ICT system such as cloud computing.
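
To make these definitions concrete, the following minimal Python sketch (ours, not the prototype described later) represents an item as the pair of a file path and a parameter and counts n_p over a set of servers; the example file path and parameter follow Figure 1, and the server names are hypothetical.

    # Illustrative sketch (not the authors' implementation): an item is the pair
    # (file path, parameter); a server holds items mapped to values.
    from typing import Dict, Tuple

    Item = Tuple[str, str]          # (file_path, parameter)
    ServerConfig = Dict[Item, str]  # item -> value

    # Hypothetical example data for two servers.
    server1: ServerConfig = {
        ("/etc/httpd/conf/httpd.conf", "Timeout"): "120",
        ("/etc/sysconfig/network", "GATEWAY"): "192.168.0.1",
    }
    server2: ServerConfig = {
        ("/etc/httpd/conf/httpd.conf", "Timeout"): "120",
        ("/etc/sysconfig/network", "GATEWAY"): "192.168.0.1",
    }
    servers = [server1, server2]

    # n_p: the total number of items over all servers (sum of p_ij over i and j).
    n_p = sum(len(cfg) for cfg in servers)
    print(n_p)  # -> 4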

Hitherto, all n_p items have had to be designed, verified, and deployed at every stage of change, as shown in Figure 2. The workloads for design and verification grow in proportion to the number of items. On the other hand, a few improper settings may cause large-scale problems and stop the supply of services.

Figure 2: A flow of configuration management.

3.2 An analysis of the ICT system and the configurations

The ICT system has a tree structure that consists of a hierarchy such as servers, racks, and DCs (Figure 3). In Figure 3, a node of the tree stands for a server, a rack, or a DC. An edge of the tree stands for a relationship of inclusion; if an upper node and a lower node are connected by an edge, the upper node contains the lower node. For instance, a server is consolidated in a rack, and a rack is consolidated in a DC. Because a leaf node corresponds to a server, the items exist in every leaf node.

In the tree structure of the ICT system, two servers A and B in an arbitrary subtree may have items that have the same file path and parameter. When the values of these items are the same, we define "the items are the same in the subtree". Some items in a subtree must be the same because of the system design, such as the networking. For instance, the gateway addresses of servers in a rack are the same, and the netmask values of servers in a DC are the same.

Figure 3: A tree structure of a large-scale ICT system.

Two servers A (in an arbitrary subtree t1) and B (in another subtree t2) may have items that have the same file path and parameter. When the values of these items are the same, we define "the items are the same in the subtrees t1 and t2". Some items in subtrees must be the same because of the system design, such as the location of DCs. For instance, the time-zone values of servers in one DC and in another DC are the same.

In contrast, some items should have different values in a subtree. Two servers A and B in an arbitrary subtree may have items that have the same file path and parameter. When the values of these items are different from each other, we define "the items are different in the subtree". Some items in a subtree must be different because of the system design, such as the networking. For instance, the IP addresses of servers in a rack are different.

Two servers A (in an arbitrary subtree t1) and B (in another subtree t2) may have items that have the same file path and parameter. When the values of these items are different from each other, we define "the items are different between subtrees t1 and t2". For instance, the domain name of servers in one DC is different from that of another DC.

Generalizing the above, we can classify the items of the system using the following two axes, 1) Commonality and 2) Comparison Targets.

1) Commonality: the same/different value

   a) Same: the values must be the same. For example, the gateway setting of the servers.

   b) Different: the values must be different from the others. For example, the IP addresses of the servers.

2) Comparison Targets: in a subtree / in two subtrees

   a) In a subtree: comparison targets are certain servers in a subtree. For example, two servers in a rack.

   b) In two subtrees: comparison targets are certain servers in two subtrees; one of the targets is in a subtree, and the other is in another subtree. For example, a server in a rack and another server in another rack.

Consequently, the items can be classified into four types (1a-2a, 1b-2a, 1a-2b, and 1b-2b) by the two types of Commonality and the two types of Comparison Targets (Table 1).

Table 1: A classification of configurations.

                                          Commonality
                                      Same        Different
  Comparison Target  In a subtree     1a-2a       1b-2a
                     In two subtrees  1a-2b       1b-2b
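
As an illustration (ours, not the paper's tool), the four types of Table 1 can be encoded directly; the class and field names below are hypothetical.

    # Illustrative encoding of the four rule types of Table 1 (our sketch).
    from dataclasses import dataclass
    from enum import Enum
    from typing import Tuple

    class Commonality(Enum):
        SAME = "1a"         # values must be the same
        DIFFERENT = "1b"    # values must differ from each other

    class ComparisonTarget(Enum):
        IN_A_SUBTREE = "2a"     # servers under one target node
        IN_TWO_SUBTREES = "2b"  # servers under two target nodes

    @dataclass
    class Rule:
        commonality: Commonality
        target: ComparisonTarget
        target_nodes: Tuple[str, ...]  # root node(s) of the subtree(s) the rule applies to

    # Example: IP addresses must differ within the subtree rooted at rack "rackA".
    ip_rule = Rule(Commonality.DIFFERENT, ComparisonTarget.IN_A_SUBTREE, ("rackA",))
    print(ip_rule.commonality.value + "-" + ip_rule.target.value)  # -> 1b-2a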

3.3 Approach

Based on the above analysis, we developed a configuration management technology that supports the

work of A) design and B) verification in the stages of initial construction and changes.

The point of our technology is to consolidate items that have the same file path, parameter, and value in a subtree and to manage the consolidated items. As a result, the workloads can be reduced by decreasing the number of items to be managed, and the improper settings can be reduced by checking rules such as "values are the same/different".

We show the flow of the configuration management using our technology in Figure 4. The configuration management consists of the following five components: 1) preparation, 2) creation of a tree structure of the configurations, 3) design, 4) verification, and 5) creation and deployment of the configurations. 1) - 5) correspond to 1) - 5) in Figure 4.

Figure 4: A flow of configuration management by this technology.



1) Preparation

This is the work of the initial construction stage. The input is the items and Table 1. We regard the four

classifications of Table 1 as rules of verification. In this paper, the rules mean constraints that should be

satisfied by the values of the items. Based on the analysis of section 3.2, we treat an item as the same for

all of the servers.

The operation managers select a rule from Table 1 for each item. For each item, they also select the root node of the subtree that is the target of the rule. For example, if an item i should have the same value in a subtree that has the root node nr, the rule of i is 1a-2a (the same value in a subtree) and the target node is nr.

The output is the result of the classification of the rule of each item.

2) Creation of a tree structure of the configurations

This is the work of the stage of initial construction. The input is a set of pairs of the item and value and

the output of 1). Our technology associates the set of pairs of the item and value with the nodes of the tree

structure of the ICT system by using the rules. We explain how to associate it by the following four cases.

Case 1: If, on all servers in a subtree t1, items have the same file path, parameter, and value, the items are classified under rule 1a-2a. We create a new item i1 that has the same file path, parameter, and value as these items; the item i1 stands for them. We associate the item i1 and the rule 1a-2a with the target node, which is the root node n1 of t1. The information is shown as No. 1 in Table 2. In this case, as many items as there are servers are consolidated into the single item i1.

Case 2: If, on all servers in two subtrees t2 and t3, items have the same file path, parameter, and value, the items are classified under rule 1a-2b. We create a new item i2 that has the same file path, parameter, and value as these items; the item i2 stands for them. We associate the item i2 and the rule 1a-2b with the target nodes, which are the root nodes n2 and n3 of t2 and t3, respectively. The information is shown as No. 2 in Table 2. In this case, as many items as there are servers are consolidated into the single item i2.

Table 2: An example of information of associated item, rule, and nodes.

  No.   Item   Rule    Node(s)
  1     i1     1a-2a   n1
  2     i2     1a-2b   n2, n3
  3     i3     1b-2a   n4
  4     i4     1b-2b   n5

Case 3: If, on the servers in a subtree t3, there are items that have the same file path and parameter but whose values differ from each other, these items are classified under rule 1b-2a. We associate these items and the rule 1b-2a with each leaf node. For instance, for the item i3 among these items, we associate the item i3 and the rule 1b-2a with the target node, which is the leaf node (server) n4 that has i3. The information is shown as No. 3 in Table 2.

Case 4: If, on the servers in two subtrees t2 and t3, there are items that have the same file path and parameter but whose values differ from each other, these items are classified under rule 1b-2b. We associate these items and the rule 1b-2b with each leaf node. For instance, for the item i4 among these items, we associate the item i4 and the rule 1b-2b with the target node, which is the leaf node (server) n5 that has i4. The information is shown as No. 4 in Table 2.

The output is the tree structure associated with the sets of pairs of the items and the rules.
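
A minimal sketch of the consolidation of Case 1 (rule 1a-2a) is shown below; it is our own illustration with hypothetical file paths and server names, not the prototype implementation.

    # Sketch (ours) of rule 1a-2a consolidation: if every server in a subtree has
    # the same value for an item, represent it by one item attached to the root.
    from typing import Dict, Tuple

    Item = Tuple[str, str]  # (file_path, parameter)

    def consolidate_same_in_subtree(
        server_configs: Dict[str, Dict[Item, str]],  # server name -> item -> value
    ) -> Tuple[Dict[Item, str], Dict[str, Dict[Item, str]]]:
        """Return (items attached to the subtree root, items left on each server)."""
        consolidated: Dict[Item, str] = {}
        remaining = {server: dict(cfg) for server, cfg in server_configs.items()}
        all_items = set()
        for cfg in server_configs.values():
            all_items.update(cfg.keys())
        for item in all_items:
            values = {cfg.get(item) for cfg in server_configs.values()}
            if len(values) == 1 and None not in values:
                consolidated[item] = values.pop()  # one item stands for all servers
                for cfg in remaining.values():
                    cfg.pop(item, None)
        return consolidated, remaining

    # Hypothetical rack with two servers: GATEWAY consolidates, IPADDR stays per server.
    configs = {
        "server1": {("/etc/sysconfig/network", "GATEWAY"): "192.168.0.1",
                    ("/etc/sysconfig/network-scripts/ifcfg-eth0", "IPADDR"): "192.168.0.11"},
        "server2": {("/etc/sysconfig/network", "GATEWAY"): "192.168.0.1",
                    ("/etc/sysconfig/network-scripts/ifcfg-eth0", "IPADDR"): "192.168.0.12"},
    }
    root_items, leaf_items = consolidate_same_in_subtree(configs)
    print(root_items)   # gateway item associated with the rack (root) node
    print(leaf_items)   # per-server IPADDR items remain on the leaves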

3) Design

This is the work of the initial construction stage and the change stage. The input is the output of 2) or 4).

The operation managers design and change the values of each item. In the case of the consolidated item,

because it stands for many items, the operation managers can change the values on many servers by

changing one value of the item.


The output is similar to 2), but the values may be changed.

4) Verification

This is the work of the stages of initial construction and changes. The input is the output of 3). This technology verifies the configurations by checking whether each value, which may have been changed by the operation managers in 3), satisfies the rule for the item; that is, it checks that the rules described in section 3.2 are satisfied by the values of the items. If a rule is not satisfied, the operation managers are notified of the error. They can remove the error by repeating 3) and 4).

For instance, in the case of an item classified under rule 1b-2a, the node associated with the rule is the root node of a subtree, and the values of the item across the subtree should be different from each other. If items that have the same value exist, we regard the rule as having been violated.

The output is a verification result of each item.
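
As an illustration of this check (a minimal sketch under our own naming, not the experimental tool itself), the "Same" and "Different" rules reduce to the following comparisons over an item's values within the rule's scope.

    # Illustrative verification sketch (ours): check "same" and "different" rules
    # against the values of one item across the servers in the rule's scope.
    from typing import List

    def verify_same(values: List[str]) -> bool:
        """Rule 1a (Same): all servers in scope must have an identical value."""
        return len(set(values)) <= 1

    def verify_different(values: List[str]) -> bool:
        """Rule 1b (Different): no two servers in scope may share a value."""
        return len(set(values)) == len(values)

    # Hypothetical values of two items on the servers of a subtree.
    gateway_values = ["192.168.0.1", "192.168.0.1", "192.168.0.1"]
    ip_values = ["192.168.0.11", "192.168.0.12", "192.168.0.12"]  # duplicate -> error

    print(verify_same(gateway_values))   # True: rule satisfied
    print(verify_different(ip_values))   # False: operation managers are notified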

5) Creation and deployment of the configuration

This is the work of the initial construction stage and the change stage. The input is the output of 3). This

technology makes the configurations from the tree structure associated with the set of the items and the

rules, and deploys them to the servers.

As we explained in 2), the items that are classified under the rule 1a-2a or 1a-2b are associated with the internal nodes of the tree structure. The other items that are classified under the rule 1b-2a or 1b-2b are associated with the leaf nodes. Therefore, the items of a server can be made by collecting items associated with nodes through the tree structure from the leaf node (server) to the root node.

Thus, for an item that belongs to a "Same" rule (cf. 1a-2a and 1a-2b in Table 1), the operation managers can consistently deploy the set of items to many servers simply by changing one consolidated item.

The output is the configuration files for each server.
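
The collection of items along the path from a leaf to the root can be sketched as follows; this is our illustration with a hypothetical tree and node names, not the prototype's code.

    # Sketch (ours) of step 5: a server's configuration is assembled by collecting
    # the items associated with every node on the path from the leaf to the root.
    from typing import Dict, Optional, Tuple

    Item = Tuple[str, str]  # (file_path, parameter)

    parent: Dict[str, Optional[str]] = {   # hypothetical tree: server -> rack -> DC -> root
        "server1": "rackA", "rackA": "dc1", "dc1": "root", "root": None,
    }
    node_items: Dict[str, Dict[Item, str]] = {  # items associated with each node
        "root":    {("/etc/sysconfig/clock", "ZONE"): "Asia/Tokyo"},        # 1a-2a at the root
        "rackA":   {("/etc/sysconfig/network", "GATEWAY"): "192.168.0.1"},  # 1a-2a at the rack
        "server1": {("/etc/sysconfig/network-scripts/ifcfg-eth0", "IPADDR"): "192.168.0.11"},
    }

    def build_config(leaf: str) -> Dict[Item, str]:
        config: Dict[Item, str] = {}
        node: Optional[str] = leaf
        while node is not None:
            # Items nearer the leaf take precedence if the same item appears twice.
            for item, value in node_items.get(node, {}).items():
                config.setdefault(item, value)
            node = parent[node]
        return config

    print(build_config("server1"))  # union of leaf, rack, and root items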

3.4 Evaluation

We qualitatively evaluate the design and verification workloads that can be reduced using this technology. We assume that the workloads are proportional to the number of managed items, and we compare the number of items under this technology with the number of items under the existing method. The total number of parameters under the existing method is n_p, as described in section 3.1.

Here, we define the total number of independent items n_t. Independent items are items whose file paths and parameters differ from each other. Hence, n_t < n_p.

In the work of the preparation (cf. section 3.3-1), the operation managers have to specify the rules for n_t items when using our technology. Because this work does not exist under the existing method, it increases the operation managers' workloads compared with the existing method.

On the other hand, in the stage of a change, the existing method required design and verification of n_p items. With this technology, only the n_c (n_t ≤ n_c ≤ n_p) consolidated items have to be designed and verified (cf. the right side of Figure 4). As the number of consolidated items increases, n_c approaches n_t (and the effect of the reduction of items rises). In the best case, n_c = n_t. For instance, if a parameter of a hundred servers (n_p = 100) is consolidated into only one item, n_c = n_t = 1; that is, the number of items becomes 1/100. Moreover, the design and verification are repeated at each stage of change.

Therefore, even if the workloads of the preparation are considered, the total workloads, including the workloads of the preparation, design, and verification, can be reduced when there are many consolidated items.
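
As a rough illustrative tally (ours, not stated in the paper), assume as above that workloads are proportional to the number of handled items. Over k change stages,

    W_existing ≈ k * n_p        W_proposed ≈ n_t + k * n_c

since n_t items are handled once in the preparation and n_c items at each change. With the example above (n_p = 100, n_c = n_t = 1) and, say, k = 10 changes, this gives roughly 1,000 handled items under the existing method versus 1 + 10 = 11 with consolidation.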

4 EXPERIMENT

We conducted an experiment to confirm the reduction of the workloads and the detection of the improper settings by our technology. We explain the experiment and the results in the following order: 4.1 Experiment method, 4.2 Experimental environments, 4.3 Evaluation method, 4.4 Result, and 4.5 Consideration.


4.1 Experiment method

We implemented an experimental tool that executes 1) – 5) described in section 3.3. We evaluated this

tool in two environments described in 4.2.

In this experiment, we first picked up all of the items whose values had been changed from the defaults in the environments. In addition, we calculated the total number of items n_p and the total number of independent items n_t. The independent items are items whose file paths and parameters differ from each other. As the rules for the items, the operation managers selected from the following ten practical rules.

- The items have the same or different values in a subtree that has one of the following nodes as its root node: 1. the root node of the tree structure of the ICT system, 2. an internal node of the layer of DCs, and 3. an internal node of the layer of racks. Three target nodes times two types of rules gives six rules.

- The items have the same or different values in two subtrees that have the following nodes as their root nodes: 1. two internal nodes of the layer of DCs, and 2. two internal nodes of the layer of racks. Two target nodes times two types of rules gives four rules.
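
For concreteness, the following sketch (ours; the scope labels are informal descriptions, not the tool's syntax) enumerates the ten rule combinations described above.

    # Sketch (ours) enumerating the ten practical rules: three single-subtree scopes
    # and two two-subtree scopes, each combined with "same" and "different".
    single_subtree_scopes = ["system root node", "a DC-layer node", "a rack-layer node"]
    two_subtree_scopes = ["two DC-layer nodes", "two rack-layer nodes"]
    commonalities = ["same", "different"]

    rules = [(c, "in a subtree", s) for s in single_subtree_scopes for c in commonalities]
    rules += [(c, "in two subtrees", s) for s in two_subtree_scopes for c in commonalities]

    for rule in rules:
        print(rule)
    print(len(rules))  # -> 10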

4.2 Experimental environments

The data of the two environments (A and B) are shown in Table 3. Environment A is an in-house system that offers servers of uniform configurations for software development and tests. About a hundred servers belong to one group. The configurations of the servers in a group are almost the same (e.g., language, gateway, and time zone). However, a small number of configurations differ (e.g., IP address and host name). Therefore, the configurations can ideally be consolidated to 1/100. The total number of servers is 1,020; n_p is 132,940 and n_t is 603.

Table 3: Information of the experimental environments.

  Environment   Type                                     Number
  A             The total number of servers              1,020
                The total number of parameters [n_p]     132,940
                The total number of items [n_t]          603
  B             The total number of servers              292
                The total number of parameters [n_p]     18,096
                The total number of items [n_t]          3,016

In contrast, environment B is an in-house system that consists of the typical three layers of a web server, an application server, and a database. It has six networks. In each network, the servers have individual configurations. However, the configurations of each network are similar. Thus, the configurations can ideally be consolidated to 1/6. The total number of servers is 292; n_p is 18,096 and n_t is 3,016.

4.3 Evaluation method

We evaluated the effect of the reduction of the workloads and improper settings. First, we define an index X [%] to evaluate the effect of the reduction of the workloads. Here, X is the reduction rate of the number of parameters. We define X as X = (1 – n_c / n_p) * 100, where n_c is the number of consolidated items and n_p is the total number of parameters. Thus, the larger X is, the more effective the reduction of the workloads.

As described in section 3, the items that have the same file path, parameter, and value are consolidated into one item by creating a tree structure of the configurations. For instance, items of a hundred servers are consolidated into one if the items have the same file path, parameter, and value. In this case, X = (1 – 1/100) * 100 = 99 [%].

Next, we confirmed the effect of the reduction of the improper settings by detecting improper settings through verification of the rules against the values of the items. If this technology can detect improper settings, it is useful for the verification of the configurations.

As described in section 3, we verify the configurations as follows. We compared the value of each item with the rule that had been selected for it by the operation managers. If an item violated its rule, we counted it as an error. The errors show the level of the gap between the operation managers' classification of the rules and the real settings.

4.4 Result

We show the experiment results in Table 4. The total number of consolidated items (n_c) is 7,654 and 11,658 in environments A and B, respectively. The number of reduced items (n_r, where n_r = n_p – n_c) is 125,286 and 6,438. The reduction rate X is 94% and 36%. The number of error items (n_e) is 47 and 174 in environments A and B, respectively.

Table 4: Results.

  Environment   Item                     Number
  A             Converged Items [n_c]    7,654
                Reduced Items [n_r]      125,286
                Reduced Rate [X]         94%
                Error Items [n_e]        47
  B             Converged Items [n_c]    11,658
                Reduced Items [n_r]      6,438
                Reduced Rate [X]         36%
                Error Items [n_e]        174
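
The reported figures can be cross-checked from Tables 3 and 4 with the definitions of section 4.3; the short sketch below (ours) recomputes n_r and X for both environments.

    # Cross-check (ours) of Tables 3 and 4: n_r = n_p - n_c and X = (1 - n_c/n_p) * 100.
    environments = {
        "A": {"n_p": 132940, "n_c": 7654},
        "B": {"n_p": 18096,  "n_c": 11658},
    }
    for name, e in environments.items():
        n_r = e["n_p"] - e["n_c"]
        x = (1 - e["n_c"] / e["n_p"]) * 100
        print(name, n_r, round(x))  # -> A 125286 94 / B 6438 36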

4.5 Consideration

The effect of the reduction of the workloads: The two results of X reflect the uniformity of the environments. X is high in a highly uniform environment such as A. Therefore, the reduction rate of the operations management workloads becomes high in highly uniform environments. Because the uniformity of a cloud computing environment tends to be high, a large contribution to the reduction rate of the workloads can be expected in cloud computing environments.

The effect of the reduction of the improper settings: We examined the errors carefully. The errors consist of improper settings and improper selections of the rules by the operation managers. We confirmed improper settings in both environments A and B. These improper settings were not known to the operation managers. Both the improper settings and the improper selections of the rules can be reduced by reviewing the configurations and the rules after detecting such errors. Thus, a contribution to the reduction of the improper settings can be expected in ICT systems.

5 CONCLUSION AND FUTURE WORK

We focused on the characteristics of the configurations that have relationships (such as the same or different values) between two devices or software applications that correspond to two nodes in a tree structure of the ICT system (such as servers, racks, and DCs). By using the relationships as consolidation and verification rules that should be satisfied by the configurations, we developed a configuration management technology that does not require rules written by operation managers. This technology reduces the workloads for design by consolidating the same configurations and decreasing the number of managed configurations. Moreover, improper settings are discovered by verifying the configurations based on the relationships. We evaluated this technology in two environments with different levels of uniformity. We confirmed the reduction rate of configurations and the ability to detect improper settings. As a result, we showed that the reduction rate of the configurations is very high (94%) in uniform environments such as cloud computing. Moreover, we showed that improper settings could be found by our technology.

We plan to cope with the following three future tasks. 1) Automation of rule selection for items: In this paper, the operation managers selected the rules for each item based on operational experience (heuristics). However, because the number of items is substantial, this work requires large workloads during preparation. Moreover, the operation managers may make a mistake in selecting the rule for an item. It is therefore preferable to investigate the items' values based on the rules and to select the suitable rule automatically. 2) Extraction of rules: In this paper, we prepared the rules beforehand. However, this approach cannot handle unknown rules. It is preferable to extract the rules by analyzing various data. By analyzing relationships between many items or correlations with other data such as trouble information and traffic patterns, rules for avoiding problems and optimizing performance may be obtained. 3) Additional support for users of large-scale computing: Our approach in this paper helps operation managers who need to manage large-scale computing resources such as cloud computing. They are not only ICT professionals but also application users who need to set up configurations for their purpose (such as simulation). To help such users, it is necessary to support operations such as installation of software and middleware, observation of resources to detect troubles, restoration of resources from troubles, and deployment of application data (e.g., simulation data) to resources. Such support will enable users such as M&S experts to build and run large-scale calculations on cloud computing resources easily.

REFERENCES

Alloy. 2012. Alloy: A Language & Tool for Relational Models. Accessed June 15. http://alloy.mit.edu/alloy/.

BMC. 2012. BladeLogic Automation Suite. Accessed June 15. http://www.bmc.com/products/product-listing/bladelogic-automation-suite.html.

Brown, A. B. 2004. "Oops! Coping with Human Error in IT Systems". Queue 2:34-41.

Brown, A. B., A. Keller, and J. L. Hellerstein. 2005. "A Model of Configuration Complexity and its Application to a Change Management System". In Proceedings of the 9th IFIP/IEEE International Symposium:631-644.

CFEngine. 2012. CFEngine. Accessed June 15. http://cfengine.com/.

Calheiros, R., R. Ranjan, A. Beloglazov, C. De Rose, and R. Buyya. 2011. "CloudSim: A Toolkit for Modeling and Simulation of Cloud Computing Environments and Evaluation of Resource Provisioning Algorithms". Software: Practice and Experience 41:23-50.

Diao, Y., A. Keller, S. Parekh, and V. V. Marinov. 2007. "Predicting Labor Cost through IT Management Complexity Metrics". In Proceedings of the 10th IFIP/IEEE International Symposium:274-283.

Eilam, T., M. H. Kalantar, A. V. Konstantinou, G. Pacifici, and J. Pershing. 2006. "Managing the Configuration Complexity of Distributed Applications in Internet Data Centers". IEEE Communications Magazine 44:166-177.

Hagen, S., and A. Kemper. 2010. "Model-Based Planning for State-Related Changes to Infrastructure and Software as a Service Instances in Large Data Centers". In Proceedings of the 2010 IEEE 3rd International Conference:11-18.

Joshi, P., H. S. Gunawi, and K. Sen. 2011. "PreFail: A Programmable Tool for Multiple-Failure Injection". In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications:171-188.

Lacy, S., and I. Macfarlane. 2007. ITIL Version 3 Service Transition. Norwich: Stationery Office.

Opscode. 2012. Chef. Accessed June 15. http://www.opscode.com/chef/.

PuppetLabs. 2012. Puppet. Accessed June 15. http://puppetlabs.com/.

The AWS Team. 2011. Summary of the Amazon EC2 and Amazon RDS Service Disruption in the US East Region. Accessed June 15. http://aws.amazon.com/message/65648/.

VDM. 2012. VDM (The Vienna Development Method). Accessed June 15. http://www.vdmportal.org/twiki/bin/view.

AUTHOR BIOGRAPHIES

ATSUJI SEKIGUCHI is a researcher at Dept. of Cloud Computing Research Center, Fujitsu Laboratories

Limited. His research focuses on the operations management of cloud computing. His email address is

sekia@jp.fujitsu.com.

KUNIAKI SHIMADA is a researcher at Dept. of Cloud Computing Research Center, Fujitsu Laboratories

Limited. His research focuses on the operations management of cloud computing. His email address is

shimada.k@jp.fujitsu.com.

YUJI WADA is a chief researcher at Dept. of Cloud Computing Research Center, Fujitsu Laboratories

Limited. His research focuses on the operations management of cloud computing. His email address is

wada.yuuji@jp.fujitsu.com.

AKIO OOBA is a developer at Dept. of Service Management Middleware Division, Fujitsu Limited. His

research and development focus on the operations management middleware. His email address is

ooba.akio@jp.fujitsu.com.

RYOUJI YOSHIMI is a developer at Dept. of Service Management Middleware Division, Fujitsu

Limited. His research and development focus on the operations management middleware. His email

address is yoshimi.ryouji@jp.fujitsu.com.

AKIKO MATSUMOTO is a developer at Dept. of Service Management Middleware Division, Fujitsu

Limited. Her research and development focus on the operations management middleware. Her email

address is matumoto.akiko@jp.fujitsu.com.
