Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

(1)

Improving Data Grids Performance by using Modified

Dynamic Hierarchical Replication Strategy

N. Mansouri* and Gh. Dastghaibyfard** (C.A.)

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources that enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years creates multiple copies of file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strategy, called Modified Dynamic Hierarchical Replication (MDHR) is proposed. This strategy is an enhanced version of Dynamic Hierarchical Replication (DHR). However, replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for the replication replacement task. MDHR replaces replicas based on the last time the replica was requested, number of access, and size of replica. It selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue, the distance between nodes and CPU process capability. Simulation results utilizing the OptorSim show MDHR achieves better performance overall than other strategies in terms of job execution time, effective network usage and storage usage.

Keywords: Access Pattern, Data Grid, Data Replication, Simulation.

1 Introduction1

Millions of files will be generated from scientific applications and researchers around the world need access to the large data [1-6]. It is difficult and inefficient to store such huge amounts of data using a centralized storage. Grid technology is the best solution to this kind of problem. The main objective of the Grid Project is to provide sharing of computing and storage resources by users located in different part of the world. Grid can be divided as two parts, Computational Grid and Data Grid. Computational Grids are used for computationally intensive applications that require small amounts of data. But, Data Grids deals with the applications that require studying and analyzing massive data sets [7-10]. Replication technique is one of the major factors affecting the performance of Data Grids by replicating data in geographically distributed data

Iranian Journal of Electrical & Electronic Engineering, 2014. Paper first received 20 June 2012 and in revised form 6 Oct. 2013. * The Author is with the Department of Computer Science, Shahid Bahonar University of Kerman, 22 Bahman Bolvar, Kerman, Iran. ** The Author is with the Department of Computer Science and Engineering, College of Electrical & Computer Engineering, Shiraz University, Molla Sadra Avenue, Shiraz, Iran.

E-mails: [email protected], [email protected] and [email protected].

stores. There are three key issues in all the data replication algorithms as follows:

• Replica selection: process of selecting replica among other copies that are spread across the Grid.

• Replica placement: process of selecting a Grid site to place the replica.

• Replica management: the process of creating or deleting replicas in Data Grid.

Meanwhile, even though the memory and storage size of new computers are ever increasing, they are still not keeping up with the request of storing large number of data. The major challenge is a decision problem i.e. how many replicas should be created and where replicas should be stored. Hence methods needed to create replicas that increase availability without using unnecessary storage and bandwidth. In this work a novel dynamic data replication strategy, called Modified Dynamic Hierarchical Replication (MDHR) algorithm is proposed. This strategy is an enhanced version of Dynamic Hierarchical Replication strategy. According to the previous works, although DHR makes some improvements in some metrics of performance like mean job time, it shows some deficiencies. Replica selection and replica replacement strategies in DHR strategy are not very efficient. But MDHR algorithm

(2)

28 Iranian Journal of Electrical & Electronic Engineering, Vol. 10, No. 1, March 2014 deletes files in two steps when free space is not enough

for the new replica: First, it deletes those files with minimum time for transferring (i.e. only files that are exist in local LAN and local region). Second, if space is still insufficient then it uses three important factors into replacement decision: the last time the replica was requested, number of access, and file size of replica. The number of requests and the last time the replica was requested define an indication of the probability of requesting the replica again. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue, the distance between nodes and CPU process capability. The proposed algorithm is implemented by using a Data Grid simulator, OptorSim developed by European Data Grid project. The simulation results show that our proposed algorithm has better performance in comparison with other algorithms in terms of job execution time, effective network usage and storage resource usage.

The rest of the paper is organized as follows: Section 2 presents some examples motivating Data Grid use. In section 3 the classification of data replication strategies is briefly explained. Section 4 explains some advantages of data replication strategies. Section 5 gives an overview of pervious work on data replication in Data Grid. Section 6 presents the novel data replication strategy. We show and analyze the simulation results in section 7. Finally, section 8 concludes the paper and suggests some directions for future work.

2 Motivation

Nowadays large volume data are generated in many fields like scientific experiments and engineering applications. The distributed analysis of these amount data and their dissemination among researchers located over a wide geographical area needs important techniques such as Data Grid.

For example, the high-energy physics experiment requires a lot of analyses on huge amounts of data sets. CERN is the world’s largest particle physics center [11]. The LHC (Large Hadron Collider) at CERN will start to work in 2007. The volume of data sets produced by the LHC is about 10 PB a year. CERN has used the Grid technique to solve this challenging huge data storage and computing problem.

Climate models are important to find climate changes and they are improved after today’s models are completely analyzed [12]. Climate models require large computing capability and there are only a few sites world-wide that are appropriate for executing these models. Climate scientists are distributed all over the world and they test the model data. Now, model analysis is done by transferring the data of interest from

the computer modeling site to the climate scientist’s organization for different post-simulation analysis tasks. The effective data distribution method is necessary to the climate science when the data volume is large.

The Biomedical application field [13] wants to use the Grid to help the archiving of biological objects and medical images in distributed databases, communications between hospitals and medical organization and to present distributed access to the data. Hence, data movement is essential. The Earth observation application area investigates the nature of the planet’s surface and atmosphere. One application of Grid technique is for the analyzing and validation of ozone profiles. The processed data sets is scattered to other places worldwide. Use cases for Earth observation science applications are explained in more detail in Ref. [14].

The Human Genome Project [15] produces detailed genetic and physical maps of the human genome. The project requires advanced means of making novel scientific information widely available to researchers so that the results may be used for the public good. Data Grid project is funded by the European Union [16]. A Data Grid consists of a group of geographically distributed computational and storage resources placed in various locations, and enables users to share data and other resources [17-20].

3 Classification of Data Replication Strategies Generally, replication algorithms are either static or dynamic as shown in Fig. 1. In static approaches the created replica will exist in the same place till user deletes it manually or its duration is expired. The disadvantages of the static replication strategies are that they cannot adapt to changes in user behavior and they are not appropriate for huge amount of data and large number of users. Of course static replication methods have some advantages like: they do not have the overhead of dynamic algorithms and job scheduling is done quickly [21, 22]. On the other hand, dynamic strategies create and delete replicas according to the changes in Grid environments, i.e. users’ file access pattern [23-27]. As Data Grids are dynamic environments and the requirements of users are variable during the time, dynamic replication is more appropriate for these systems [28]. But many transfers of huge amount of data that are a consequence of dynamic algorithm can lead to a strain on the network’s resources. So, inessential replication should be avoided. A dynamic replication scheme may be implemented either in a centralized or in a distributed approach. These methods also have some drawbacks such as; the overload of central decision center further grows if the nodes in a Data Grid enter and leave frequently. In case of the decentralized manner, further synchronization is involved making the task hard.

(3)

Fig. 1 Types o

4 Advanta Availabil improve ava system can other site. In Reliability the probabi completely, a Scalabilit provided by dependent on Grid than rep

Adaptabi changeable a time. The da the informat provide bette Performa Grid systems a user can ac

5 Related W Foster an replica strate Replication, Replica and also introduc • Tem are shor • Geo rece requ • Spa acce near These s patterns: firs data access w finally data a geographical indicate that replica strate the best in th different sch

Rahman e selection by Nearest Neig

of replication sc

ges of Data R lity: All the r ailability. Whe resort to rep

this manner t ty: The more t ility that us and hence a sy ty: This is a k a replication n architecture plication strate ility: The na and users can ata replication tion of the G er results. ance: One wa s is to store re ccess the data

Works nd Ranganath

egies: No Re Plain Cachin Fast Spread) ced three types mporal locality much possi rtly.

ographical lo ently by a uested by adja atial locality: essed file are r future. trategies eva st, access patt with a small d access with a l locality. T t different ac egies. Cascadi

he simulation eduling and re et al. [29] pro y using a sim

ghbor (KNN) chemes.

Replication St replication st en a fail happ plicated data the availability the number of ser’s request

ystem is more key paramete n strategy. Sc e model selec

egy.

ature of the n enter and qu n strategy mu Grid environm

ay to speed u eplicas at mult from the near

an [28], prop plica, Best C ng, Caching for multi-tier s of localities, y: The files a ible to be

ocality: The client are p acent clients, t

The related likely to be

aluated with tern with no degree of temp

small degree The results ccess pattern ing and Fast S

s. Also, the a eplication stra oposed an algo mple techniqu ). The KNN

trategies trategies want pen in any sit

that is stored y increases. f replicas mor

will be d e reliable. er which must

alability is m cted for the D

e Grid is v uit a Grid at st be adaptive ment in order

up access in d tiple locations rest site.

posed six disti Client, Cascad plus Cascad Data Grid. T , namely: accessed recen requested ag files acces probably to too.

files to recen requested in

different d locality. Seco poral locality of temporal of simulati needs differ Spread perform authors combi ategies. orithm for rep ue called the rule chooses t to e, a d at re is done t be more Data very any e to r to data s, so inct ding ding hey ntly gain ssed be ntly the data ond, and and ions rent med ned lica e k- the bes log the pre tec fro stra gua pra cac dis the rep site per tim sim alg rep and the dat a r the thr nod sys his do res ban bec int int ‘im sca a pro det dat als env the can inc on stra Pop per cha pop var beh dis

st replica for gs. They also e transfer tim ediction error chniques. Acco om a site whic

In [30] the ategy that ha arantee and c actical manne ching strategy stributed syste eir distributed plicas, each G e. This can rformance be me and bandw

mulation demo gorithm signif plication strate Tang et al. [3 d Aggregate B e average dat

ta grid. The m replica to nod e file’s acces reshold. SBU

de, but ABU stem. With storical access the same un sults show tha

ndwidth con cause its aggre

Andronikou eroperable ne o account the mportance’ of alable and the Grid environ oposed QoS-termines the ta request, co so places of vironment acc e overhead th

n handle the creasing or de the number a

Lee et al. [34 ategy for a pular File R riodically com anges of user pular files to riation. They haviors, inclu stributions, to

r a file by us suggested a p me between r as reported

ordingly, one h has minimu authors pre as a provabl can be implem

er. They als y, which can em such as Da d strategy is th Grid site keeps

n dramatical ecause transfe

width consum onstrated that ficantly outpe egy under vari 32] presented Bottom-Up (A a access resp main idea of th des close to i

ss rate is h uses the fil aggregates th ABU, a no s records to its ntil these rec at ABU impro sumption bet egation capab et al. [3 ew data repli infrastructura f the data.

strategies can nment to pro -aware dyna number of r ntent importa f the new r cording to th at the replica e dynamicity

ecreasing the and the geogra 4] presented a star-topology Replicate Fir mputes file acc rs’ access beh o suitable clu considered s uding Zipf-lik evaluate PFR

sing previous predictive wa sites and d by using Neu site can reque um transfer tim esented a dat

e theoretical mented in a di

o proposed n be easily a ata Grids. The hat when ther s track of its c lly enhance erring large-s ming [31]. Th

the distribute erforms a pop ious network Simple Botto ABU) strategie ponse time fo

he two strateg ts requesting igher than a e access hist he file access ode transmits s top tiers, and cords reach t oves job respo

tter than tho bility.

3] proposed ication strateg al constraints

The presente n be easily im ovide fast ex amic replicat eplicas requir ance and requ

replicas with he network ba ation techniqu of the Grid set of data r aphy of the da an adaptive da y Data Grid rst algorithm cess popularit haviors, and th usters/sites to everal types o ke, geometric,

RF. The simu

s file transfer ay to estimate decreased the ural Network est the replica me.

ta replication performance istributed and a distributed adopted in a e key point of re are several closest replica Data Grid sized files is he results of ed replication pular existing parameters. om-Up (SBU)

es to improve or a multi-tier gies is to store clients when a pre-defined tory for each history for a s aggregated d the top tiers the root. The onse time and ose of SBU

d a set of gies that take as well as the ed system is mplemented on xecution. The tion strategy red based on uested QoS. It hin the Grid andwidth and ue presents. It d system by replicas based ata demands.

ata replication d, called the m (PFRF). It ty to track the hen replicates adapt to the of file access and uniform ulation results r e e k a n e d d a f l a d s f n g ) e r e n d h a d s e d U f e e s n e y n t d d t y d n e t e s e s m s

(4)

30 demonstrate turnaround ti Saadat et replication s Dynamic Da (PDDRA). P and pre-repli prediction is of the Grid s files, it will show that t execution tim replications, Taheri et optimization Bee Colony operations computationa storage elem independent, (i.e., makesp heterogeneou Three tailor-m are applied t compare it w JDS-BC’s scenarios. JD making beha its objectives optimizing th Mansouri Dynamic Hi store replica has been acc sites. It also best replica proposed rep replica loca considering storage and show, it has other strateg comparativel Park et a based Repli access time avoiding net into several r the regions regions. So region, its fe two deficien within the re in all the req strategy has of storage el extension of

that PFRF ime and bandw

t al. [35] pre trategy which ata Replicatio PDDRA predic icates them be done based o sites. So when get them lo this strategy me, effective

hit ratio and p al. [36] prop strategy, call (JDS-BC). JD

to efficien al elements ments in a , and in many pan and transfe us systems made test Gri o evaluate the with other str

superiority DS-BC also p avior, where i s (e.g., transf he other one ( i and Dastg erarchical Re a in suitable s

cessed most, i decreases ac when differ plica selectio ation for th the replica data transfer less job execu gies especiall

ly small storag al. [38] prese cation (BHR by maximizin twork conges regions, wher is lower tha if the require etching time w ncies, first it egion and seco quested sites n

good perform lement is sma f BHR [38] st

Iran F can reduc

width consum esented a new h is called Pr on Algorithm cts future requ efore needs ar

on the past fi n a Grid site ocally. The si improves in e network us

percentage of osed a new B led Job Data S

DS-BC has tw ntly schedu

and replicat system so y cases confli er time of all d

are concurre ids varying fro e performance rategies. Resu under diffe presented a b

it occasionall fer time) to o e.g., makespa ghaibyfard [3 eplication (DH

sites where th instead of stor ccess latency rent sites hol on strategy c he users’ ru requests that time. The si ution time in y when the ge size. ented a Band R) which dec

ng network-le stions. They d re network ban an the bandw ed file is pla will be less. B

terminates, ond replicated not the approp mance only w

all. Modified trategy which

nian Journal o ce average mption.

w dynamic d re-fetching ba m in Data Gr uires of Grid s re requested. T

ile access hist requests a se imulation res n terms of sage, number storage filled Bee Colony ba Scheduling us wo collaborat ule jobs o te data sets

that the t icting, objecti datafiles) of s ently decreas om small to la e of JDS-BC ults showed t erent operat alanced decis ly relaxes one obtain more fr an).

37] presented HR) strategy t he particular ring file in m by selecting ld replicas. T chooses the b unning jobs t waiting in

imulation res comparison w

Grid sites h

dwidth Hierar creases the d

evel locality divided the s ndwidth betw width within aced in the sa BHR strategy if replica ex d files are pla priate sites. B when the capa BHR [39] is h replicates a

of Electrical & job data ased rids sites This tory t of ults job r of d. ased sing ting onto on two ives uch sed. arge and that ting sion e of rom d a that file many the The best by the ults with have rchy data and sites ween the ame has xists aced BHR city s an file tha nea stru [40 tha am has Sim Th (Le the res site 6 and Fig lev thr com reg to (sit eac 6 DH req siz can ban req tha rep Fig

& Electronic E at has been ac

ar future. A replicatio ucture and a 0]. They cons at has three le mong the cand

s the highes milarly, it use his leads to a b east Recently eir algorithm spectively. Be e) with most o

Proposed Dy In this sectio d then the DH

The Grid top g. 2. The hie vels. First le rough internet mprises LAN gion that has m

the first leve tes) within e ch other with

6.2 Dynamic In previous w HR strategy quested file si ze, file will

ndidate replic ndwidth to the quests. It also at has the high plica.

g. 2 Grid topolo

Engineering, V ccessed most

on algorithm scheduling a sidered a hier

evels. In thei didate replicas st bandwidth es the same t better perform Used) method selects the b est region (LA of the requeste

ynamic Repli on, first netw HR and MDHR

6.1 Network pology of sim erarchical net evel are Reg t i.e. have low N’s (local ar

moderately hig el. Finally the

ach the LAN a high bandw

Hierarchical work, we prop first checks ize is greater t be accessed cas selects the

e requester no places replica hest number o

ogy in the simul

Vol. 10, No. 1 and it may al

for a 3-lev algorithm are rarchical netw r replica sele s they selected h to the re technique for mance compari d. For efficien best region, L AN, site) is a r

ed files.

ication Algor work structure R algorithms a

k Structure mulated platfor

twork structu gions that ar w bandwidth. rea network)

gher bandwid e third level a N’s, that are width.

l Replication posed DHR al replica feasi than SE (Stor remotely. It e one that ha ode and has lea

as in best site of access for t

lation.

, March 2014 lso be used in

vel hierarchy e proposed in work structure ection method d the one that equested file. file deletion. ing with LRU nt scheduling, LAN and site region (LAN,

rithm e is described

are presented.

rm is given in ure has three re connected Second level within each dth comparing are the nodes connected to

Algorithm lgorithm [37]. ibility. If the rage Element) t among the as the highest ast number of s i.e. best site that particular 4 n y n e d t . . U , e , d n e d l h g s o . e ) e t f e r

(5)

If the ava requested fil files should minimum tim exist in local

6.3 Modi

MDHR a 1. Replica Se of file, there the best repli location is th factors are us • Stor The stora have a key r calculated by 1 F T Storage = StorageSp FileSize repr • Wai Each stor time and the time. So, one the storage q equation:

∑

=

n i i

T

T 0 2

where n is nu • Dist D(x,y) rep and y. Comp route comm information c the first time

• CPU

CPU proc

CP

C= ×F U where F is th

1 2 T = +T T

( , )

F x y =w

This func a weighted c The proper empirically. equations interrelations

ailable space in le size, it rep

be deleted. me for transfe l LAN and loc

ified Dynamic Alg algorithm has election: Whe is a significan ica. Replica s he best for th sed to choose rage access la age media spe ole in the ave y the followin

( )

( /

FileSize MB eSpeed MB Speed shows t

resents size of iting time in th rage media h e storage can p

e has to wait f queue. T2 can

umber of requ tance between presents netw puted using th mand. To can be stored e.

U process cap cess capability

PU

he CPU freque

1 2

w × +T w ×D

ction can be ta ombination of weights (w1,

We exam and selecte ships between

n best SE is g plicates the fi It deletes t ferring (i.e. on

cal region).

c Hierarchica gorithm

been describe en different si

nt benefit real election decid he Grid users

a best replica atency eed and size o erage response

g equation:

/Sec)

the storage m f file.

he storage que as some requ perform only for all the pre n be defined b

uests waiting i n nodes work distance

he number of h reduce the

when a replic

pability y is defined by

ency, UCPU is

3

( , )

D x y +w ×

ailored, becau f the three for w2, w3) hav

mined differe ed the b n the above pa

greater or equa ile. If not, so those files w nly files that

al Replication

ed in three pa ites have repli lized by select des which rep . Four import a:

of requests qu e time. T1 can

media speed

eue

uests at the sa one request a evious request

by the follow

in the queue.

between node hops with a tr cost, dista ca is checked

y equation

CPU usage.

C

×

se it is defined rmer metrics. ve been obtai ent weights est values.T arameters are al to ome with are n arts: icas ting lica tant ueue n be (1) and ame at a ts in wing (2)

es x race ance for (3) (4) (5) d as ned in The not ver tec red una exi in rep file sea the rep Oth in tha not pla loc sel rep file firs BS sel the job sto for bec wh exi Oth the rep ava del Fig

ry simple. In chnique for de

The goal of duce file trans available req istence of the

the same LAN plicas and sele

e doesn’t ex arches within e same region plicas and se herwise it gen other regions at has minimum

2. Replica p t available in t ace. Accordin cality the repl lect the BSE, D plica access) o

e in the region st storage ele SE. If more t

lected random e requested sit b execution tim

Assume MD oring replica R

r replica R, th cause S7 has hich is shown

3. Replica M ists in the B herwise if the e file is access plication does ailable in the leted using the

• Genera both av LAN. N list till

g. 3 Replica pla

future work etermining we replica manag sfer time for quested files file in local L N, then it wil ects a replica w xist in the s

the local regi n, then it will elects a replic nerates a list o

, and from th m F. placement: W

the local stora ng to the te lica is placed DHR creates of all SE’s tha n. Now the re

ment (SE) of than one SE mly. Therefore

tes. Hence, sto me can decrea DHR wants R. Now list1 sh

hen MDHR s the highest in Fig. 3. Management:

BSE, the se e file is availa sed remotely. s not exist a

same LAN, o e following ru ate a LRU sor vailable in th Now start del

space is availa

cement strategy

k, we plan to ights. ger (RM) in e

each job. RM of each job LAN. If the f ll create a list with the minim same LAN,

on. If the file l create a list ca with the of replicas that his list it selec

When a reques age, replicatio emporal and

in the best si a sorted list ( at requested t eplica will be f the above so

is candidate , replica is no orage cost as ase.

to find the hows the sorte selects site S7 number of re

If enough s elected file i able in the loc Now, if enou and requested one or more fi ules:

rted list of rep e site as well leting files fro able for replic

y.

use Taguchi

each site is to M for locating controls the file duplicated t of candidate mum F. If the then MDHR duplicated in of candidate minimum F. t are available cts the replica

ted replica is on should take geographical ite (BSE). To by number of that particular placed in the orted list, i.e. e one can be t placed in all well as mean

best site for ed list created 7 from LAN3 equests for R

storage space is replicated. cal LAN, then ugh space for d file is not iles should be

plicas that are l as the local om the above ca. i o g e d e e R n e . e a s e l o f r e . e l n r d 3 R e . n r t e e l e

(6)

32 • If

prev rand sort the • If sp

(tha dele som loca The cost fact num be c acce requ prob Obv with num valu

1

RV=w× +S

where the va users, S is fi replica, CT i time of repl would be r describes DH

7 Experim In this s configuration described res

OptorSim a real Data strategies, as several parts

• Com

com • Stor in D • Res

allo sele • Rep data acce • Rep

imp

The Op configuration

space is sti vious step for domly. In ot ted list of repl

site as well as pace is still in at contains one eted. In this me valuable fil al region and erefore, such d t of transfer.

ors (the last ti mber of access considered ins ess and the uested defin bability of r viously, it is m h large size, mber of repli

ue (RV) is calc

2 3

w NA w

+ × +

alue of w1, w2

ile size, NA is s the current t lica. The rep eplaced with HRS strategy.

ments

section, simul n, experimen spectively.

7.1 Sim m was develop a Grid for ev s shown in Fig

:

mputing El mputational res rage Element Data Grid.

ource Broker cates each jo ected schedulin plica Manager

a transferring essing the Rep plica Optim plements the re

7.2 Simu ptorSim tool

n files.

Iran ill insufficien r each LAN i ther words, g licas that are s the local reg nsufficient, a g

e or more rep step using L es that may no d may be ne deletions will

In this step ime the replic s, and file siz stead of LRU last time t ne an indi requesting the more importan because it ica replaceme culated by equ

3

1

CT LA

×

− and w3 can be

s the number time and LA i plica with lea h the new re

lation tool, s nt results and

ulation Tool ped to simulat valuating var g. 5 [41, 42].T

lement (CE source in Data

(SE): represe

(RB): gets th ob to proper s

ng policy. r (RM): at ea

and provides plica Catalog. mizer (RO):

eplication algo

ulation Input works bas

nian Journal o nt, then rep in current reg generate a L both available ion.

group of repli plicas) need to LRU may de

ot be availabl eeded in futu l result in a h p three valua ca was reques

e of replica) w U. The number

the replica w ication of

e replica ag nt to replace f can reduce ent. The rep uation

e assigned by of access to is the last requ ast value of eplica. Figure

simulation inp d discussion

te the structure rious replicat The OptorSim

E): represe a Grid. ents data resou

he user’s job site based on

ach site contr a mechanism

within R orithm.

t

sed on sev

of Electrical & peat

gion LRU e in

icas o be elete e in ure. high able ted, will r of was the ain. files the lica

( 6 )

the the uest RV e 4

put, are

e of tion has

ents

urce

and the

rols m for

RM

eral

Fig

& Electronic E • Parame

simulat configu to be ru maximu strategi • Grid co

topolog availab and nu sizes. • Job co

about s job, the

• Bandw

backgro

g. 4 MDHR algo

Engineering, V eter configur

tion paramete uration file su un, delays bet um queue size ies, access pat onfiguration gy, i.e., the lin ble network b umber of CEs

onfiguration f simulated jobs e probability e idth configur ound network

orithm.

Vol. 10, No. 1 ration file: rs are set in t uch as total nu tween each job e, the choice tterns for the j file: contains nks between g

bandwidth b and SEs, as

file: describes s, the files ne each job runs,

ration file: k traffic.

, March 2014 The basic the parameter umber of jobs b submission, of replication job, etc.

the network grid sites, the etween sites, well as their

s information eeded by each

etc.

specifies the

4 c

r s , n

k e , r

n h

e

(7)

Fig. 5 OptorSi

Fig. 6 Various

As menti during execu requested is important acc are selected file), random distribution), one direction the direction Gaussian (fil These file ac

There ar each region master files consists of th topology of o 11 SEs. The GB and the s Bandwidth in job types, an (each is 2 GB

im architecture

s file access pat

ioned above, ution. The or

determined cess patterns a in the order s m (files are , random walk n away from

n will be les are request ccess patterns

7.3 Con re three regio

has an aver are distribut he original fil our simulated e storage capa storage capac n each level is nd each job typ B) for executio .

tterns.

jobs require rder in which

by the acce are as follow: stated in the j e accessed u k unitary (file the previous random) and ted in a Gauss are shown in

nfiguration ons in our co

rage two LA ted to CERN le and cannot platform incl acity of the m

ity of all othe s given in Tab pe on average on.

s to access f h those files

ss pattern. F Sequential (f ob configurat using a rand es are selected file request d random w sian distributio

Fig. 6.

onfiguration ANs. Initially N. A master

t be deleted. T ludes 10 CEs master site is 3

er sites is 40 G ble 1. There ar e requires 16 f

files are Four files tion dom d in and walk on).

and all file The and 300 GB. re 6 files

Ta

val we

stra Ra Ra alg

ble 1 Bandwidt

Parameter Inter LAN b Intra LAN b Intra Regio

ble 2 General j

Parameter Number of Number of Number of Size of sin Job delay (

Table 2 spec lues used in o e assumed that

We evaluated ategies in fou andom Access andom Walk gorithms have • In Lea always the job space f storage • In Leas replicat is exec new re elemen

• Bandw

stores bandwi likely to • The M

files wi the high • 3-Leve

conside has thr factor f

• Dynam

(DHR) best sit for tha access conside the stor

th configuration

r

bandwidth (lev bandwidth (lev on bandwidth (le

ob configuratio

r f jobs f jobs types f file access per ngle file (GB)

(ms)

cifies the simu our study. To t the data is re

7.4 Simulati d the perform ur types of ac s, Random W k Gaussian

been used for ast Frequentl replication ta b is executing for the new re e element is de st Recently U

tion takes plac cuting. If ther

plica, least ac nt is deleted.

idth Hierarchy the replicas idth and repl o be requested Modified BHR

ithin the regio hest access. l Hierarchic ers a hierarch ree levels. B for replica sele mic Hierarchi

places replic te that has the at particular r latency by se ering the repli rage and data n.

Value vel 3) 1000 vel 2) 100

evel 1) 10

on.

Valu 2000 6 r jobs 16

2 2500

ulation parame simplify the ead-only.

on Results mance of MDH

ccess patterns Walk Unitary

Access. Six r evaluation, n ly Used (LF akes place in t

g. If there is eplica, the old

eleted. sed (LRU) str ce in the site w re is not enou

ccessed file i

y based Repli in a site tha licates those d soon within R algorithm r on in a site w

cal Algorith hical network andwidth is ection and del ical Replicat cas in appropr e highest num replica. It als electing the be ica requests th transfer time.

e (Mbps)

ue

eters and their requirements,

HR and the six s: Sequential, y Access and x replication namely: FU) strategy the site where s not enough dest file in the

rategy always where the job ugh space for in the storage

ication (BHR) at has a high files that are the region. replicates the where file has

hm (3LHA) structure that an important letion. tion strategy riate sites i.e. mber of access so minimizes est replica by hat waiting in r ,

x , d n

y e h e

s b r e

) h e

e s

) t t

y . s s y n

(8)

34 Iranian Journal of Electrical & Electronic Engineering, Vol. 10, No. 1, March 2014 Fig. 7 Mean job execution time for various replication

algorithms.

The comparison result is shown in Fig. 7. The experiment results show that MDHR has the lowest value of mean job execution time in all the experiments and all of file access patterns. Obviously, the No Replication strategy has the worst performance as all the files requested by jobs have to be transferred from CERN. In this simulation LRU and LFU have almost the same execution time. BHR algorithm improves data access time by avoiding network congestions. The 3LHA performs better than BHR because it considers the differences between intra-LAN and inter-LAN communication. DHR mean job execution time is faster than Modified BHR. MDHR strategy has the best performance since it will not delete those file that have a high transferring time. One of the important parameters that reduce the Grid site’s job execution time is having their needed files locally stored on their storage element. It replicates files wisely and does not delete valuable files which results in preserving the valuable replicas. As in Random access patterns comprising Random, Unitary random walk and Gaussian random walk, a certain set of files is more likely to be requested by Grid sites, so a large percentage of requested files have been replicated before. Therefore, MDHR strategy and also all the other strategies have more improvement for random file access patterns.

Figure 8 displays the mean job time based on changing number of jobs for seven algorithms. It is clear that as the job number increases, MDHR is able to process the jobs in the lowest mean time in comparison with other methods. It is similar to a real Grid environment where a lot of jobs should be executed.

Data replication takes time and consumes network bandwidth. However, performing no replication has been demonstrated to be ineffective compared to even the simplest replication strategy. So, a good balance must be discovered, where any replication is in the interest of reducing future network traffic. Effective Network Usage (ENU) is used to estimate the efficiency the network resource usage. ENU (Eenu) is given from

[43]:

rfa fa

enu

lfa

N N E

N

+

= (8)

where Nrfa is the number of access times that CE reads a

file from a remote site, Nfa is the total number of file

replication operation, and Nlfa is the number of times

that CE reads a file locally.

The effective network usage ranges from 0 to 1. A lower value represents that the network bandwidth is used more efficiently. Figure 9 shows the comparison of the Effective Network Usage of the seven replication strategies for the sequential access pattern. The ENU of MDHR is lower about 58% compared to the LRU strategy. The main reason is that Grid sites will have their needed files present at the time of need, hence the total number of replications will decrease and total number of local accesses increase. The MDHR is optimized to minimize the bandwidth consumption and thus decrease the network traffic.

Figure 10 shows the mean job time of replication algorithms for varying inter region bandwidth. When we set narrow bandwidth on the inter-region link, our strategy outperforms other strategy considerably.

Fig. 8 Mean job time based on varying number of jobs.

Fig. 9 Effective network usage with sequential access pattern generator.

Fig. 10 Mean job time with varying bandwidth.

(9)

8 Conclusion

Data replication is a key method used to improve the performance of data access in distributed systems. In this paper, we propose a novel data replication algorithm for a three level hierarchical structure with limited storage space to improve system performance. MDHR replaces replicas based on the last time the replica was requested, number of access, and file size of replica. Therefore, sites will have their required files locally at the time of need and this will decrease response time, access latency, bandwidth consumption and increase system performance considerably. It also improves access latency by selecting the best replica when various sites hold replicas. The proposed replica selection selects the best replica location from among the many replicas based on response time that can be determined by considering the data transfer time, the storage access latency, the replica requests that waiting in the storage queue, the distance between nodes and CPU process capability. To evaluate the efficiency of our policy, we used the Grid simulator OptorSim that is configured to represent a real world Data Grid test bed. We compared MDHR to seven of existing algorithms, No replication, LRU, LFU, BHR, Modified BHR, 3LHA and MDHR for different file access patterns. The evaluation shows that MDHR outperforms the other algorithms and improves Mean Job Time and Effective Network Usage under all of the access patterns, especially under the different random file access patterns. Also, we concluded that MDHR can be effectively utilized when hierarchy of bandwidth appears apparently.

In the future, we plan to test our simulation results on real Data Grid. We also try to investigate dynamic replica maintenance issues such as replica consistency. In the longer-term, we would like to consider the set of QoS factors taken into account for dynamic replication, including both service provider and client-related requirements.

References

[1] A. Folling, C. Grimme, J. Lepping and A. Papaspyrou, “Robust load delegation in service Grid environments”, IEEE Transactions on Parallel and Distributed Systems, Vol. 21, pp. 1304-1316, 2010.

[2] O. Sonmez, H. Mohamed and D. Epema, “On the benefit of processor co-allocation in multi cluster Grid systems”, IEEE Transactions on Parallel and Distributed Systems, Vol. 21, pp. 778-789, 2010.

[3] H. Li, “Realistic workload modeling and its performance impacts in large-scale E-science Grids”, IEEE Transactions on Parallel and Distributed Systems, Vol. 21, pp. 480-493, 2010. [4] M. Sedghi and M. Aliakbar-Golkar, “Distribution

Network Expansion Using Hybrid SA/TS Algorithm”, Iranian Journal of Electrical &

Electronic Engineering, Vol. 5, No. 2, pp. 122-130, 2009.

[5] M. Sharma and K. P. Vittal, “A Heuristic Approach to distributed Generation Source Allocation for Electrical Power Distribution Systems”, Iranian Journal of Electrical & Electronic Engineering, Vol. 6, No. 4, pp. 224-231, 2010.

[6] M. R. Aghamohammadi, “Static Security Constrained Generation scheduling Using Sensitivity Characteristics of Neural Network”, Iranian Journal of Electrical & Electronic Engineering, Vol. 4, No. 3, pp. 104-114, 2008. [7] GriPhyN: The Grid physics network project, 12

July 2010. http://www.griphyn.org, [Accessed: January 2014].

[8] R. S. Chang and M. S. Hu, “A resource discovery tree using bitmap for Grids”, Future Generation Computer Systems, Vol. 26, pp. 29-37, 2010. [9] J. Wu, X. Xu, P. Zhang and C. Liu, “A novel

multi-agent reinforcement learning approach for job scheduling in Grid Computing”, Future Generation Computer Systems, Vol. 27, pp. 430-439, 2011.

[10] S. Ebadi and L. M. Khanli, “A new distributed and hierarchical mechanism for service discovery in a Grid environment”, Future Generation Computer Systems, Vol. 27, pp. 836-842, 2011. [11] CERN. http://public.web.cern.ch/public/,

[Accessed: January 2014].

[12] N. T. Thuy, T. T. Anh, D. D. Thanh, D. T. Tung, N. T. Kien and T. T. Giang, “Construction of a Data Grid for meteorology in Vietnam”, Proceedings of the 2007 Int. Conf. on Grid Computing & Applications, pp. 186-191, 2007. [13] J. Montagnat and H. Duque, “Medical data

storage and retrieval on the Data Grid”, Technical Report DataGrid-10-TED-345148-0-1, the European Data Grid Project, 2002.

[14] Work Package 9, WP9.4 Use Case, Technical Report DataGrid-09-TED-0101-1-1, the European Data Grid Project, 2002.

[15] Human Genome Project: http://

www.nhgri.nih.gov/

,

[Accessed July 2011]. [16] The Data Grid project.

http://eu-datagrid.web.cern.ch/eu-datagrid/, [Accessed January 2014].

[17] S. Figueira and T. Trieu, “Data replication and the storage capacity of Data Grids”, Berlin, Heidelberg: Springer-Verlag, pp. 567-75, 2008. [18] D. G. Cameron, A. P. Millar, C. Nicholson, R.

Carvajal-Schiaffino, K. Stockinger and F. Zini, “Analysis of scheduling and replica optimization strategies for Data Grids using Optorsim”, Journal of Grid Computing, Vol. 2, No. 1, pp. 57-69, 2004.

[19] H. Zhong, Z. Zhang and X. Zhang, “A Dynamic replica management strategy based on Data

(10)

36 Iranian Journal of Electrical & Electronic Engineering, Vol. 10, No. 1, March 2014 Grid”, in 2010 Ninth Int. Conf. on Grid and

Cloud Computing, pp.18-22, 2010.

[20] H. Lamehamedi, B. Szymanski, Z. Shentu and E. Deelman, “Data replication strategies in Grid environments”, in Proceedings of the fifth int. conf. on Algorithms and Architectures for Parallel Processing, pp. 378-383, 2002.

[21] O. Tatebe, Y. Morita, S. Matsuoka, N. Soda and S. Sekiguchi, “Grid datafarm architecture for petascale data intensive computing”, Proc. of the 2nd IEEE/ACM Int. Symp. on Cluster Computing and the Grid (CCGrid), pp. 102-110, 2002. [22] A. Chervenak, E. Deelman, I. Foster, L. Guy, W.

Hoschek, A. Iamnitchi, C. Kesselman, P. Kunszt, M. Ripeanu, B. Schwartzkopf, H. Stockinger, K. Stockinger and B. Tierney, “A framework for constructing scalable replica location services”, in Proc. of Super Computing, ACM/IEEE Conf., pp. 1-17, 2002.

[23] I. Foster and K. Ranganathan, “Design and evaluation of dynamic replication strategies a high performance Data Grid”, in Proceedings of Int. Conf. on Computing in High Energy and Nuclear Physics, China, 2001.

[24] U. Cibej, B. Slivnik and B. Robic, “The complexity of static data replication in Data Grids”, Parallel Computing, Vol. 31, No. 8, pp. 900-912, 2005.

[25] Y. Yuan, Y. Wu, G. Yang and F. Yu, “Dynamic data replication based on local optimization principle in Data Grid”, in Proceedings of GCC, pp. 815-822, 2007.

[26] H. Lamehamedi, Z. Shentu, B. Szymanski and E. Deelman, “Simulation of dynamic data replication strategies in Data Grids”, in Parallel and Distributed Processing Symposium, 2003. [27] B. D. Lee and J. B. Weissman, “Dynamic replica

management in the service Grid”, 10th IEEE Int. Symposium on High Performance Distributed Computing Proc., pp. 433-434, 2001.

[28] K. Ranganathan and I. Foster, “Identifying dynamic replication strategies for a high performance Data Grid”, in Proceedings of the Second International Workshop on Grid Computing, pp. 75-86, 2001.

[29] R. M. Rahman, K. Barker and R. Alhajj, “Replica selection strategies in Data Grid”, Journal of Parallel and Distributed Computing, Vol. 68, pp. 1561-1574, 2008.

[30] D. T. Nukarapu, B. Tang, L. Wang and S. Lu, “Data replication in data intensive scientific applications with performance guarantee”, IEEE Transactions on Parallel and Distributed Systems, Vol. 22, No. 8, pp. 1299-1306, 2011. [31] A. Chervenak, R. Schuler, M. Ripeanu, M. A.

Amer, S. Bharathi, I. Foster and C. Kesselman, “The Globus replica location service: design and experience”, IEEE Trans. on Parallel and

Distributed Systems, Vol. 20, pp. 1260-1272, 2009.

[32] M. Tang, B. S. Lee, C. K. Yao and X. Y. Tang, “Dynamic replication algorithm for the multi-tier Data Grid”, Future Generation Computer Systems, Vol. 21, No. 5, pp. 775-790, 2005.

[33] V. Andronikou, K. Mamouras, K. Tserpes, D. Kyriazis and T. Varvarigou, “Dynamic QoS-aware data replication in Grid environments based on data “importance”, Future Generation Computer Systems, Vol. 28, No. 3, pp. 544-553, 2012.

[34] M. C. Lee, F. Y. Leu, and Y. Chen, “PFRF: An adaptive data replication algorithm based on startopology Data Grids”, Future Generation Computer Systems, Vol. 28, No. 7, pp. 1045-1057, 2012.

[35] N. Saadat and A. M. Rahmani, “PDDRA: A new pre-fetching based dynamic data replication algorithm in Data Grids”, Future Generation Computer Systems, Vol. 28, No. 4, pp. 666-681, 2012.

[36] J. Taheri, Y. C. Lee, A. Y. Zomaya and H. J. Siegel, “A Bee Colony based optimization approach for simultaneous job scheduling and data replication in Grid environments”, Computers & Operations Research, Vol. 40, No. 6, pp. 1564-1578, 2013.

[37] N. Mansouri and G. H. Dastghaibyfard, “A dynamic replica management strategy in Data Grid”, Journal of Network and Computer Applications, Vol. 35, No. 4, pp. 1297-1303, 2012.

[38] S. -M. Park, J. -H. Kim, Y. -B. Go and W. -S. Yoon, “Dynamic Grid replication strategy based on internet hierarchy”, Int. Workshop on Grid and Cooperative Computing, in: Lecture Note in Computer Science, Vol. 1001, pp. 1324-1331, 2003.

[39] K. Sashi and A. Thanamani, “Dynamic replication in a Data Grid using a modified BHR region based algorithm”, Future Generation Computer Systems, Vol. 27, No. 2, pp. 202-210, 2011.

[40] A. Horri, R.Sepahv and and Gh. Dastghaibyfard, “A hierarchical scheduling and replication strategy”, International Journal of Computer Science and Network Security, Vol. 8, No. 8, pp. 30-35, 2008.

[41] D. G. Cameron, A. P. Millar, C. Nicholson, R. Carvajal-Schiaffino, F. Zini and K. Stockinger, “Optorsim: A simulation tool for scheduling and replica optimization in Data Grids”, in Int. Conf. for Computing in High Energy and Nuclear Physics (CHEP 2004), 2004.

[42] OptorSim-A Replica Optimizer Simulation: http://edg-wp2.web.cern.ch/edgwp2/optimization/ optorsim.html, [Accessed January 2014].

(11)

[43] W. H. K. St Dynam OptorS on Gri

Science from 2009. Her re Distributed Sy

Bell, D. G. C tockinger an mic Grid

Sim”, in Proc id Computing,

Najme of Com Univer M.Sc. Depart Engine Compu Univer B.Sc. Shahid Bahon esearch interes ystems, and Gri

Cameron, L. C nd F. Zini,

Replication c. of the ACM/

, 2002.

e Mansouri is mputer Science

rsity of Kerman in software tment of Com eering, College uter Engin rsity (Iran). S

(Honor Stude ar University o sts include Pa d Computing.

apozza, P. Mi “Simulation Strategies M/IEEE Worksh

currently a fac at Shahid Baho n. She received e engineering mputer Science

e of Electeric neering, Sh She received ent) in Comp of Kerman, Iran arallel Process

illar of in hop

culty onar d her at e & cal& hiraz her puter n in sing,

Eng Shi Par Tec

gineering, Coll iraz University rallel Algorith chnology.

Gholamho received Computer Engineerin Departmen University Oklahoma respectivel professor Departmen lege of Electric y. His curren hms, Grid C

ossein D his M.Sc. a Science fro ng and Com nt, College of y of Oklaho a USA in 19 ly. He is curren

of Compute nt of Compute cal & Compute nt research int Computing and

Dastghaibyfard and Ph.D. in om Electrical mputer Science f Engineering, oma, Norman 79 and 1990, ntly an assistant er Science at er Science and er Engineering, terests include d Information d n

l e , n , t t d , e n