
In this chapter, HDD has been identified as the main technology used in today's data centers, mainly due to its low cost and high areal density. From an energy perspective, cold storage has been the topic of most recent research on energy-efficient storage in cloud environments.

Further, Cloud simulation has been identified as a cost-effective solution to perform experiments in a controllable, stable and repeatable way. Numerous Cloud simulators have been compared. Owing to its popularity, availability and extensibility, CloudSim has been chosen for this thesis work. The analysis revealed a lack of storage modeling that has not yet been overcome.

Last, background on CloudSim operation has been presented to prepare the introduction of the CloudSimDisk module in the next chapter.

3 CLOUDSIMDISK: ENERGY-AWARE STORAGE SIMULATION IN CLOUDSIM

This chapter presents CloudSimDisk, a module for energy-aware storage simulation in the CloudSim simulator. The first part explains the objectives of the module and its requirements. Next, the main concepts of CloudSimDisk are described, such as the HDD model, the HDD power model, the data cloudlet and the data center persistent storage. Then, the execution flow and the package diagram are explained. Finally, energy awareness and scalability of CloudSimDisk are discussed.

3.1 Module Requirements

CloudSimDisk has been developed according to different requirements, which provide several advantages to the module and explain the architectural design choices. The main requirement was to respect the architecture and the core processing of CloudSim. CloudSimDisk is a module for CloudSim, so it has to operate in the same way as CloudSim. Similar design choices also reduce the learning curve for CloudSim users who want to adopt CloudSimDisk, and encourage participation in and contributions to the future development of the module.

Another important requirement is the scalability of the module. HDD technology is complex to model due to the electromechanical nature of the devices. Additionally, the technology is evolving rapidly. Hence, CloudSimDisk has to be developed so that more characteristics and more features can be implemented later. This capability is an argument in favor of adopting CloudSimDisk.

An additional requirement was to consider at first only the main parameters of the HDD technology, and to provide energy consumption results based on this simplified model. Because this work had to be completed within strict deadlines, some development decisions such as this one had to be made.

3.2 CloudSimDisk Module

This section introduces the CloudSimDisk module. First, the Hard Disk Drive (HDD) model and the associated HDD power model are presented. Then, the data cloudlet object, or storage task, is explained in detail. Finally, the data center persistent storage is defined.

3.2.1 HDD Model

As explained in Chapter 1, HDDs are still the most widely used storage technology in Cloud computing environments. Unfortunately, CloudSim provides only one HDD model, reused from the GridSim simulator [76], which is barely scalable and contains some mistakes in its algorithm [78]. To overcome this barrier, the CloudSimDisk module implements a new HDD model.

According to [87] [88] [89], the main characteristics affecting the overall HDD performance are the mechanical components, namely the combination of the read/write head transversal movement and the platter rotational movement. Additionally, the internal data transfer rate, often called the sustained rate, has been identified as a bottleneck of the overall data transfer rate of an HDD [90]. More recently, [91] proposed an HDD model based on 23 input parameters which achieves between 91% and 96.5% accuracy. Figure 14 shows a diagram of the model parameters used in their implementation, organized by functional category. Each parameter is described in detail, in order of importance: the first is the position time, "the sum of the seek time and the rotational latency", and the second is the transfer time, "the time required to transfer one sector of data to or from the media", namely the Internal Data Transfer Time.
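As a rough illustration of how these two quantities combine, the sketch below estimates the time needed to serve one request as the position time plus the transfer time. The class name, the method names, the numeric values and the assumption of a single positioning operation followed by one sequential transfer are all hypothetical; this is not the model of [91] nor the CloudSimDisk implementation.

```java
/**
 * Illustrative sketch only: the class name, method names and numeric values
 * are assumptions for this example, not CloudSimDisk code.
 */
final class HddTimingSketch {

    /** Average rotation latency in seconds: half the time of one full revolution. */
    static double avgRotationLatency(double rpm) {
        return 0.5 * (60.0 / rpm);
    }

    /** Position time in seconds: seek time plus rotational latency. */
    static double positionTime(double seekTime, double rotationLatency) {
        return seekTime + rotationLatency;
    }

    /** Transfer time in seconds for fileSizeMB at a sustained rate given in MB/s. */
    static double transferTime(double fileSizeMB, double rateMBps) {
        return fileSizeMB / rateMBps;
    }

    /** Assumed total: one positioning operation followed by one sequential transfer. */
    static double transactionTime(double seekTime, double rpm,
                                  double fileSizeMB, double rateMBps) {
        return positionTime(seekTime, avgRotationLatency(rpm))
                + transferTime(fileSizeMB, rateMBps);
    }

    public static void main(String[] args) {
        // Hypothetical drive: 8.5 ms average seek, 7200 RPM, 150 MB/s sustained rate.
        double seconds = transactionTime(0.0085, 7200.0, 150.0, 150.0);
        System.out.printf("Estimated transaction time: %.4f s%n", seconds);
    }
}
```

For large files the transfer term dominates, which is consistent with the sustained rate being a bottleneck of the overall transfer rate.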

A new package, namely cloudsimdisk.models.hdd, has been created, and contains classes modeling HDD storage components. Each model implements one method, namely getCharacteristic(int key). In this method, the parameter key is an integer corresponding to a specific characteristic of the HDD. To ensure consistency between different HDD models, all the classes extend one common abstract class, which declares the getCharacteristic(int key) method. Thereby, the parameter key corresponds to the same HDD characteristic in each model.

However, it is not convenient for developers or users to work with key numbers. Hence, the getCharacteristic(int key) method has been declared as protected and is not meant to be called directly by users. Instead, the common abstract class implements a getter for each HDD characteristic (getCapacity(), getAvgSeekTime(), etc.). The getCharacteristic(int key) method is used only internally to retrieve the required characteristic. As a result, the methods accessed by users are semantically understandable. Table 3 lists the methods declared in HDD models to retrieve HDD characteristics.

Table 3. CloudSimDisk HDD characteristics.

KEY METHOD DESCRIPTION

0 getManufacturerName() The name of the manufacturer (e.g. Seagate Technology, Toshiba, Western Digital).

1 getModelNumber() The unique manufacturer reference (e.g. ST4000DM000).

2 getCapacity() The capacity of the HDD in megabytes (MB).

3 getAvgRotationLatency() The average rotation latency of the disk, in seconds (s), defined as half the time it takes for the disk to make one full revolution. It depends directly on the disk rotation speed in revolutions per minute (RPM).

4 getAvgSeekTime() The average seek time of the disk, defined as the average time needed to move the read/write head from track x to track y. It also corresponds to one-third of the longest possible seek time (moving from the outermost track to the innermost track), assuming a uniform distribution of requests [92].

5 getMaxInternalDataTransferRate() The maximum internal data transfer rate, defined as the rate at which data is physically transferred from the disk to the internal buffer, also called Sustained Data Rate or Sustained Transfer Rate.
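The key-based design described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class names HddModelDemo, HddModel and ExampleHdd, the Object return type of getCharacteristic(int key) and all drive values are hypothetical, and only the getter names and key numbers are taken from Table 3.

```java
/** Illustrative sketch only; not the actual cloudsimdisk.models.hdd classes. */
final class HddModelDemo {

    /** Common abstract class: one protected key-based lookup, one getter per key. */
    static abstract class HddModel {

        /** Returns the characteristic for a key of Table 3 (return type is an assumption). */
        protected abstract Object getCharacteristic(int key);

        public String getManufacturerName() { return (String) getCharacteristic(0); }
        public String getModelNumber() { return (String) getCharacteristic(1); }
        public double getCapacity() { return (Double) getCharacteristic(2); }
        public double getAvgRotationLatency() { return (Double) getCharacteristic(3); }
        public double getAvgSeekTime() { return (Double) getCharacteristic(4); }
        public double getMaxInternalDataTransferRate() { return (Double) getCharacteristic(5); }
    }

    /** Hypothetical drive; every value below is invented for the example. */
    static final class ExampleHdd extends HddModel {
        @Override
        protected Object getCharacteristic(int key) {
            switch (key) {
                case 0: return "ExampleVendor";      // manufacturer name
                case 1: return "EX-0001";            // model number
                case 2: return 4_000_000.0;          // capacity in MB
                case 3: return 0.5 * 60.0 / 7200.0;  // half a revolution at 7200 RPM, in s
                case 4: return 0.0085;               // average seek time in s
                case 5: return 180.0;                // sustained rate in MB/s
                default: throw new IllegalArgumentException("Unknown key: " + key);
            }
        }
    }

    public static void main(String[] args) {
        HddModel hdd = new ExampleHdd();
        // Callers use the named getters and never need to know the key numbers.
        System.out.println(hdd.getModelNumber() + ": " + hdd.getCapacity() + " MB");
    }
}
```

Because every model maps the same key to the same characteristic, adding a new drive in this sketch only requires a new subclass with its own switch table.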

3.2.2 HDD Power Model

For the toolkit 3.0, Anton Beloglazov added a power package to CloudSim, based on his publication from the previous year [72]. This implementation provides the necessary algorithms for modeling and simulating energy-aware computational resources, i.e. Hosts and Virtual Machines. However, it does not provide energy awareness for the storage component.

Thus, similarly to 3.2.1, the package cloudsimdisk.power.models.hdd has been created in accordance with the power package in place. Inside, the abstract class PowerModelHdd.java implements semantically understandable getters to retrieve the power data of a specific HDD in a particular operating mode. Table 4 lists the operating power modes declared in HDD power models.

Table 4. CloudSimDisk HDD power mode.

KEY MODE DESCRIPTION

0 Active The disk is handling a request.

1 Idle The disk is spinning but there is no activity on it.
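To show how the two modes of Table 4 could feed an energy estimate, the sketch below multiplies the power of each mode by the time spent in it. The class names, the getter names getPowerActive() and getPowerIdle(), the getPowerData(int key) lookup and the wattage values are assumptions made for this example; only the mode keys 0 (Active) and 1 (Idle) come from the table.

```java
/** Illustrative sketch only; not the actual cloudsimdisk.power.models.hdd classes. */
final class HddPowerDemo {

    /** Abstract power model with one named getter per operating mode. */
    static abstract class PowerModelHddSketch {

        /** Returns the power in watts for a mode key of Table 4. */
        protected abstract double getPowerData(int key);

        public double getPowerActive() { return getPowerData(0); }
        public double getPowerIdle() { return getPowerData(1); }
    }

    /** Hypothetical drive: the wattages are invented for the example. */
    static final class ExamplePowerModel extends PowerModelHddSketch {
        @Override
        protected double getPowerData(int key) {
            switch (key) {
                case 0: return 8.0; // Active, in W
                case 1: return 5.0; // Idle, in W
                default: throw new IllegalArgumentException("Unknown mode: " + key);
            }
        }
    }

    /** Energy in joules: power of each mode times the seconds spent in that mode. */
    static double energyJoules(PowerModelHddSketch model,
                               double activeSeconds, double idleSeconds) {
        return model.getPowerActive() * activeSeconds
                + model.getPowerIdle() * idleSeconds;
    }

    public static void main(String[] args) {
        PowerModelHddSketch model = new ExamplePowerModel();
        // 2 s handling requests and 10 s spinning without activity.
        System.out.println(energyJoules(model, 2.0, 10.0) + " J");
    }
}
```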

3.2.3 Data Cloudlet

As explained in 2.3.1, CloudSimDisk models a request with a Cloudlet component. However, the CloudSim implementation of this component interacts mainly with the Host's CPU hardware element. No examples of interactions with storage elements are provided and no results are printed out. Thus, an extension of the CloudSim Cloudlet is proposed by CloudSimDisk. The default Cloudlet constructor with eight parameters has been reused. Additionally, two new parameters have been defined:

• requiredFiles: a list of filenames that need to be retrieved by the cloudlet. These requested files have to be stored on the persistent storage of the Datacenter before the cloudlet is executed.

• dataFiles: a list of files that need to be stored by the cloudlet. These new files will be added to the persistent storage of the Datacenter during the cloudlet processing.

Note that requiredFiles is already implemented in CloudSim v3.0.3, but the constructor parameter that sets this variable is called fileList. However, this list is not a list of File objects, but a list of Strings corresponding to filenames. To make matters even more confusing, the new parameter dataFiles implemented in CloudSimDisk is a list of File objects. Thus, to avoid ambiguity, the fileList parameter has not been reused by CloudSimDisk. Instead, requiredFiles and dataFiles are parameters of the new Cloudlet's constructor (see Figure 15).

Figure 15. CloudSimDisk Cloudlet constructor.
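The type distinction between the two parameters can be made concrete with a simplified stand-in. The sketch below is not the constructor of Figure 15: the eight default Cloudlet parameters are omitted, and the classes DataCloudletSketch and DataFile are hypothetical placeholders for the CloudSimDisk Cloudlet and the CloudSim File class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in for a cloudlet carrying storage work; not CloudSimDisk code. */
final class DataCloudletSketch {

    /** Hypothetical placeholder for a file object with a name and a size in MB. */
    static final class DataFile {
        final String name;
        final int sizeMB;

        DataFile(String name, int sizeMB) {
            this.name = name;
            this.sizeMB = sizeMB;
        }
    }

    final int cloudletId;
    final List<String> requiredFiles; // filenames to retrieve from persistent storage
    final List<DataFile> dataFiles;   // file objects to add to persistent storage

    DataCloudletSketch(int cloudletId, List<String> requiredFiles, List<DataFile> dataFiles) {
        this.cloudletId = cloudletId;
        this.requiredFiles = copyOrEmpty(requiredFiles);
        this.dataFiles = copyOrEmpty(dataFiles);
    }

    /** Defensive copy; a null argument is treated as "no files". */
    private static <T> List<T> copyOrEmpty(List<T> source) {
        return source == null ? new ArrayList<T>() : new ArrayList<T>(source);
    }

    public static void main(String[] args) {
        DataCloudletSketch cloudlet = new DataCloudletSketch(
                1,
                Arrays.asList("input.dat"),
                Arrays.asList(new DataFile("output.dat", 250)));
        System.out.println("Cloudlet " + cloudlet.cloudletId
                + " reads " + cloudlet.requiredFiles.size()
                + " file(s) and writes " + cloudlet.dataFiles.size() + " file(s)");
    }
}
```

Keeping the two lists under distinct names and distinct element types means a filename can no longer be passed where a file object is expected, which is the confusion the renaming is meant to avoid.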

3.2.4 Data Center Persistent Storage

In CloudSim, one parameter of the data center entity is a list of Storage elements. This list models the data center persistent storage. Unfortunately, CloudSim does not provide any example of how to interact with this component.

CloudSimDisk's aim is to provide a module for storage modeling and simulation in CloudSim. Thus, CloudSimDisk extends the CloudSim data center model. Methods have been deleted, overridden and created in order to interact only with the data center persistent storage. As a result, the data center model implements all the algorithms necessary to process the requiredFiles and dataFiles of a Cloudlet when one is received.
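A minimal sketch of this processing step is given below, assuming a persistent storage modeled as a list of disks that each map a filename to a size in MB. The class and method names are hypothetical, and the policy of writing every new file to the first disk without any capacity check is a simplification chosen for the example, not the CloudSimDisk algorithm.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of a data center handling the storage part of a cloudlet. */
final class PersistentStorageSketch {

    /** Hypothetical disk: maps each stored filename to its size in MB. */
    static final class Disk {
        final Map<String, Integer> files = new LinkedHashMap<String, Integer>();
    }

    /** Returns true if any disk of the persistent storage holds the file. */
    static boolean contains(List<Disk> storage, String fileName) {
        for (Disk disk : storage) {
            if (disk.files.containsKey(fileName)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Retrieves the required files, then stores the new data files.
     * Assumptions: the storage holds at least one disk, and new files are
     * written to the first disk with no capacity check.
     *
     * @return the names of the files that were retrieved
     */
    static List<String> process(List<Disk> storage, List<String> requiredFiles,
                                Map<String, Integer> dataFiles) {
        List<String> retrieved = new ArrayList<String>();
        for (String name : requiredFiles) {
            // Required files must already be on the persistent storage.
            if (!contains(storage, name)) {
                throw new IllegalStateException("Required file not stored: " + name);
            }
            retrieved.add(name);
        }
        storage.get(0).files.putAll(dataFiles);
        return retrieved;
    }

    public static void main(String[] args) {
        Disk disk = new Disk();
        disk.files.put("input.dat", 100);
        List<Disk> storage = new ArrayList<Disk>();
        storage.add(disk);

        List<String> required = new ArrayList<String>();
        required.add("input.dat");
        Map<String, Integer> newFiles = new LinkedHashMap<String, Integer>();
        newFiles.put("output.dat", 250);

        process(storage, required, newFiles);
        System.out.println("Files now stored: " + disk.files.keySet());
    }
}
```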
