
4.3 Attribute Ontology

4.3.1 Generic Attribute Ontology

In the system ontology described in Section 4.2, a class named Attribute has been introduced. Individuals from the class SystemComponent are related to individuals from Attribute via the property hasAttribute, which can also be read as “characterized by”. The attribute ontology refines the Attribute class, both in terms of semantics (what the attribute expresses) and syntax (how the attribute value is constructed).

Domain, Purpose and Scope. The domain of the ontology is the set of non-functional attributes characterizing the components of distributed, service-oriented systems. Its purpose is to provide a framework for classifying such attributes, describing their structure, and specifying whether they are used in the deployment selection optimization process.

Building the Ontology. The ontology building consists of the following three steps.

Integration. The integration has been discussed in Chapter 3, Section 3.3.3.

Capture. The identified classes and properties are represented in Figures 4.6 and 4.7 and described in detail below.

Coding. The attribute ontology has been implemented in Protégé in the same OWL file as the system ontology presented above. The full OWL file is available in Appendix A.2.

Evaluation. In the same way as for the system ontology, the model has been checked with the Protégé built-in reasoner HermiT 1.3.4 and the applicability is studied on the TAG system, as described in Section 4.3.2.

Documentation. Design considerations are detailed in the description of the classes and properties below.

The following base concepts have been identified as classes in the ontology:

AttributeMetaData groups the metadata used to identify attributes. For example, an individual from the class Attribute should have a name and a unique identifier inside a system, so that it can be unambiguously referred to.

AttributeValueDataType is a class grouping all possible data types that an attribute value can assume. Examples include integer, float, boolean, and string.

AttributeValueType groups the kinds of values that attributes can assume, i.e., absolute, relative, or rule-based. The class AbsoluteValueType covers absolute values that can, for example, be defined and set, measured, or estimated. A RelativeValueType describes a QoS value that is defined on a scale instead of as an absolute value, such as “low-medium-high”. For example, for the Amazon Elastic Compute Cloud (Amazon EC2) service, instance types are defined in the following categories: small, large, and extra large [4]. Moreover, attributes related to privacy and security can hardly be expressed in absolute values, but they can be rated, for instance, as low, medium, and high. Attributes resulting from user rankings also usually fall into this category. RuleValueType is a value type for non-numerical attributes, for example regulatory attributes expressing compliance with a certain law. In such a case, the value cannot be a simple number, but it can be expressed as a machine-understandable rule.

AttributeValueUnit groups the possible units an attribute value can assume. Examples include “bits per second” and “requests per day”.

AttributeDimension defines dimensions along which an attribute is computed. Common dimensions include time and size, but others, as well as combined dimensions, can be defined. The class TimeDimension refers to the dimension of a computed attribute that has a time aspect, i.e., the value is implicitly or explicitly computed per time interval. An example of such an attribute is the availability, which is defined as the relative uptime in percent over a considered time period. A SizeDimension refers to the dimension of a computed attribute that has a size aspect, i.e., the value is computed based on size information. An example of such an attribute is the reliability, which is defined for a certain number of invocations and is computed as the number of successful invocations divided by the number of all invocations, also denoted in percent. CombinedDimension refers to a computed attribute that contains time- and size-based information, or any combination of other information. Examples include values for performance, e.g., in tasks per time unit or average data blocks per time unit.

AttributeValueAcquisitionType defines the way an attribute value is determined or computed. A DefinedType is set and normally not subject to changes. These attributes are usually attached to resources. Examples include CPU count and total RAM available. A MeasuredType is a constantly changing value that can be measured at a given point in time. Examples include CPU load and network load as well as free RAM and free disk space. Different techniques can be used for publishing the measured values. For example, polling allows gathering attribute values of system components by regularly requesting them. This can be achieved by using either a push-based or a pull-based notification mechanism. In the push-based case, the components publish their attribute values to a central instance, often called a registry. In the pull-based case, the central instance collects the attribute values by querying the components. With snapshot-based publishing, the attribute values are collected upon request, usually when the values are needed for the optimization of a request. This technique allows taking into account the state of a component at request time, e.g., in terms of load. In an application scenario, the presented techniques can be combined, depending on which is most appropriate for a given attribute. An EstimatedType is a basic value that is estimated instead of measured, for instance by analyzing execution logs. Examples include CPU and network load. Such attribute values can be based on historical information, i.e., execution logs are mined to derive attribute values. In the case of a prediction, known values are extrapolated. For example, if the amount of data touched by a given query is known, it can be used to predict the execution time and the resource demand of a deployment for that particular query.

AttributeDynamicType includes StaticType, QuasiStaticType, and DynamicType. It specifies whether a considered attribute value is subject to no change (static), very rare changes (quasi-static), or frequent changes (dynamic). As an example of the quasi-static case, when a server is upgraded and gets more RAM, this single metric changes, but the resource remains the same, i.e., the values are usually static, but not necessarily throughout the whole lifetime of the component.
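The publishing techniques described for MeasuredType can be illustrated with a minimal Python sketch. The Registry class, its method names, and the probe callables are hypothetical and serve only to contrast the push-based, pull-based, and snapshot-based cases; they are not part of the ontology or of any particular system.

```python
from typing import Callable, Dict

# Assumption: a probe is any callable returning the current value of one
# measured attribute of a component, e.g. its CPU load.
Probe = Callable[[], float]


class Registry:
    """Hypothetical central instance that stores measured attribute values."""

    def __init__(self) -> None:
        self.values: Dict[str, float] = {}   # last known value per component
        self.probes: Dict[str, Probe] = {}   # how to query each component

    def register(self, component: str, probe: Probe) -> None:
        """Make a component queryable by the registry."""
        self.probes[component] = probe

    def publish(self, component: str, value: float) -> None:
        """Push-based: the component sends its value to the registry."""
        self.values[component] = value

    def poll_all(self) -> None:
        """Pull-based: the registry queries every registered component."""
        for component, probe in self.probes.items():
            self.values[component] = probe()

    def snapshot(self, component: str) -> float:
        """Snapshot-based: read the value at request time, ignoring stored values."""
        return self.probes[component]()
```

In this sketch, a stored value can become stale between two calls to poll_all, whereas snapshot always reflects the state of the component at request time, at the cost of one query per request.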

The defined properties are the following: hasValueDataType, hasValueType, hasDynamicType, hasValueAcquisitionType, hasDimension, hasValueUnit and hasMetaData. The domain of all the properties is the class Attribute (functional attributes can also have such characteristics, although they are not studied here); the ranges are indicated by the respective property names. All these properties are functional, i.e., each attribute individual can be linked to only one individual of the range class. Intuitively, each attribute has exactly one data type, one value type, one dynamic type, etc. Each attribute individual must be related to exactly one individual via each of these properties, i.e., an attribute must have specified metadata, a specified dimension, acquisition type, etc. If not all of those relationships are defined, the attribute is not sufficiently specified to be included in the deployment selection optimization process.
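The completeness requirement can be sketched as a simple check in Python. This is an illustration only, not the OWL encoding used in the ontology: the Attribute dataclass, its field names, the helper functions, and the example values are assumptions. Each single-valued field stands for one functional property, and a value of None stands for a missing relationship.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Attribute:
    """Hypothetical attribute individual; one field per functional property."""
    meta_data: Optional[str] = None               # hasMetaData
    value_data_type: Optional[str] = None         # hasValueDataType
    value_type: Optional[str] = None              # hasValueType
    value_unit: Optional[str] = None              # hasValueUnit
    dimension: Optional[str] = None               # hasDimension
    value_acquisition_type: Optional[str] = None  # hasValueAcquisitionType
    dynamic_type: Optional[str] = None            # hasDynamicType


def missing_properties(attribute: Attribute) -> List[str]:
    """Return the names of all properties that are not yet specified."""
    return [f.name for f in fields(attribute) if getattr(attribute, f.name) is None]


def is_fully_specified(attribute: Attribute) -> bool:
    """True only if every property is set, i.e. the attribute can be optimized on."""
    return not missing_properties(attribute)


# Illustrative values only; the classification of this attribute is an assumption.
cpu_count = Attribute(
    meta_data="cpu_count",
    value_data_type="integer",
    value_type="AbsoluteValueType",
    value_unit="cores",
    dimension="SizeDimension",
    value_acquisition_type="DefinedType",
    dynamic_type="QuasiStaticType",
)
```

An attribute for which missing_properties returns a non-empty list would be excluded from the deployment selection optimization process.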

The following sub-classes of NonFunctionalAttribute have been identified, based on the state of the art presented in Chapter 3 (Section 3.3.2):

PerformanceAttribute: specifies a measure of the performance of a component, typically along a time dimension, e.g., the estimated execution time of a task on a component.

CostAttribute: specifies the costs associated with the execution of a task.

AvailabilityAttribute: specifies the availability of a component (resource, deployment, link, provider), i.e., its relative uptime.

ReliabilityAttribute: provides a measure of how reliable a resource, deployment, link or provider is, usually expressed as the percentage of successful requests compared to all requests.

ReputationAttribute: quantifies the reputation of a resource, deployment or provider.

LoadAttribute: specifies the current load (at time t) on a resource, deployment or link that influences the execution performance of incoming requests.

UtilizationAttribute: specifies the degree of utilization of a system component.

LatencyAttribute: specifies the time delay between the submission of a request and its execution on a resource, due for instance to the network.

BandwidthAttribute: specifies the available bandwidth, most commonly on a link component.

ThroughputAttribute: specifies the available throughput, usually of a network link, as the difference between the theoretically available bandwidth and the current load.

SecurityAttribute: specifies a measure for the security level provided (or ensured) by a system component.

AccessibilityAttribute: provides a measure for the level of accessibility of a system component, e.g., whether it has access restrictions.

RegulatoryAttribute: specifies the compliance of a deployment with certain regulatory measures. For instance, data locality can be an issue: data stored in the United States is subject to different data protection rules than data stored in Europe.
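Three of the attributes above are defined by simple arithmetic: availability as relative uptime, reliability as the share of successful requests, and throughput as the difference between bandwidth and load. The following Python sketch spells these definitions out. The function names, parameter names, and units are assumptions chosen for illustration, and flooring the throughput at zero is an added assumption that the text does not state.

```python
def availability_percent(uptime_seconds: float, period_seconds: float) -> float:
    """Relative uptime over a considered time period, in percent."""
    if period_seconds <= 0:
        raise ValueError("the considered period must be positive")
    return 100.0 * uptime_seconds / period_seconds


def reliability_percent(successful_requests: int, total_requests: int) -> float:
    """Successful requests compared to all requests, in percent."""
    if total_requests <= 0:
        raise ValueError("at least one request is required")
    return 100.0 * successful_requests / total_requests


def available_throughput(bandwidth_bps: float, current_load_bps: float) -> float:
    """Theoretically available bandwidth minus current load.

    Assumption: the result is floored at zero for an overloaded link.
    """
    return max(0.0, bandwidth_bps - current_load_bps)
```

For example, a component that was up for 990 of 1000 seconds has an availability of 99 percent, and a link with 100 Mbit/s bandwidth carrying 30 Mbit/s of load has 70 Mbit/s of available throughput.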

In a service selection optimization process, often not all QoS attributes are equally important. Weights and priorities can be assigned to them. Weights and priority assignments are characteristics of an optimization problem definition, and are thus not part of the attribute ontology.