Temporal network management model: Concepts and implementation issues

Theodore K. Apostolopoulos*, Victoria C. Daskalou

Department of Informatics, Athens University of Economics and Business, 76 Patission Street, 104 34 Athens, Greece

Received 15 March 1996; accepted 2 August 1996

Abstract

Networks are growing in scale and becoming more complex and heterogeneous. Considerable research effort is therefore focused on network management, since a network management system is now a vital component of any network installation. In this paper, we propose an extended network management information model that includes the time dimension. This view leads us to adopt a temporal data model as the underlying information model. We present the services provided by this model and the architecture supporting them. We have also implemented a prototype incorporating the proposed network management view, and we describe it in this paper. © 1997 Elsevier Science B.V.

Keywords: Networks; Time dimension; Temporal data model

1. Introduction

Computer and telecommunication networks are becoming one of the driving forces for the progress of economic and social life. These networks are not only growing in scale but are becoming more complex and heterogeneous as they support multivendor applications on a variety of underlying transmission facilities. Such expansion and complexity have induced growing challenges to efficiently manage the network elements according to the network providers’ and users’ objectives and expectations. As a result, a lot of research effort has gone into solving problems arising in this area and establishing standards that could be used across a broad spectrum of product types (e.g. hosts, routers, bridges, switches, telecommunication equipment) in a multivendor environment. In response to the need for standards, two main efforts are underway: one from the International Organization for Standardization (ISO), named OSI Systems Management, and another from the Internet Engineering Task Force (IETF), named the Simple Network Management Protocol (SNMP) family.

The huge size and high complexity of such networks also dictate the use of automated network management systems that help the network engineer to manage the network elements efficiently. The general architecture of a network management system (Fig. 1) is based on a client–server architecture, where the server is called the agent, while the client is the managing process. Each network component has an agent which maintains a local management information base (MIB) and can communicate with the management application residing in the network management station through a network management protocol such as SNMP [1] or CMIP [2]. The MIB is a conceptual representation of the network resources that provides the network manager with the means to observe and control the current behaviour of network elements. The interaction between the management application and the management agents permits the retrieval and/or update of the MIB information in a way that enables the implementation of various network management functions, from monitoring the state of the network to changing the network configuration.

Time is an attribute of most real-world phenomena, and in this paper we address the issue of time as an attribute of network management information. More precisely, we incorporate the temporal dimension in the management information model proposed by the IETF, as described by the SMI [3]. This new temporal network management information model was built upon a temporally ungrouped historical data model (TRDM, as it has been named in [4]), which was proposed in [5]. The core of the proposed management information model is the Temporal Management Information Base (TMIB), a conceptual representation of the diachronic behaviour of network resources. The concept of a temporal network management information model has already been applied to the areas of performance and configuration management, as described in earlier works [6,7].

Computer Communications 20 (1997) 694–708

PII S0140-3664(97)00063-7

* Corresponding author. Tel.: +30 1 8203173; fax: +30 1 8226204; e-mail: [email protected]

The temporal network management model and the TMIB design are presented in detail in Section 2. The architecture of the network management application that implements the proposed model and the network management services are described in Section 3. In Section 4 we present the architecture and the main operations of the prototype we have implemented, while in Section 5 we summarise the main points of our approach and discuss possible applications of the proposed model.

2. The temporal network management information model

2.1. The concepts

Two different information models have been defined in order to represent network management information: one from the ISO [8] and another from the IETF [3]. Both models are centred around the so-called managed objects, which represent an abstraction of network resources and form the management information base (MIB). The information model of the ISO is fully object-oriented, with managed objects being instances of managed object classes. The IETF’s model is less complex: managed objects are classified into two types, scalar objects and table objects. The table objects are two-dimensional arrays of scalar objects and at a given time they consist of multiple row entries. The scalar objects are simple MIB variables which can have at most one instance at a given time.

Managed objects defined on the basis of either of these two models have to provide resources for all functional areas of network management. The requirements of the functional areas differ to a large extent and are not yet well understood [9]. In order to specify the type of management information that is necessary for each of the five functional areas, we propose a generic management information model (Fig. 2). This model classifies the management information contained in the MIB according to the following three criteria:

1. the intended use of the network management information
2. the frequency with which management information variables change their values
3. the temporal dimension of the management information.

The first criterion introduces a concept closely related to the type of functions included in a network management application in relation to the types of data that they need. According to this criterion, the management data can be classified into three types [10]:

• Sensor data: these data represent the current network ‘health’, in terms of its operational state. More precisely, they are the raw information received by the network monitoring and retrieval process.

• Control data: these data capture the current setting of the network’s tuning parameters (for example, the routing table) that are used mostly by control applications.

• Structural data: these data describe the physical and logical construction of the network, that is, information related to the network topology, the configuration of network resources, the user description records, etc.

By taking into consideration the second criterion we define a higher level abstraction of the management information, which adopts two broad classes of objects:

• Quasistatic objects, which describe the current network configuration (e.g. the number of host interfaces, the routing table, etc.); their values do not change very often.

• Dynamic objects, which are related to network events (e.g. the transmission of packets); their values change very often over time.

Fig. 1. Network management architecture.

Fig. 2. The proposed network management perspective.

The third criterion we have adopted is related to the time dimension of the management information required by different applications. For example, a network management application concerning real-time performance monitoring has radically different data requirements from a network management application concerning the collection and usage of long-term statistical data. The main difference lies in the time horizon required by each application. So, we have to consider network management as a concept that incorporates temporal information in relation to the type of network management application. We can distinguish three regions for the time parameter related to network management operations: the real-time region, which is related to real-time operations; the time non-critical region, which is related to operations that are based on information in a given time window; and finally the global time region, which is related to operations that are based on long-term information. Usually, the different time regions reflect the different points of view of the various people involved in a system (for example a network engineer, people with financial responsibilities, etc.). It should be clear that the time parameter has a meaning directly related to the type of application, and not an absolute meaning.
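The three classification criteria can be read as the axes of a small three-dimensional space. The following is a minimal sketch of that reading, not part of the proposed model itself; the class and attribute names (`DataType`, `ChangeRate`, `TimeRegion`, `Subspace`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    """First criterion: intended use of the management information."""
    SENSOR = "sensor data"
    CONTROL = "control data"
    STRUCTURAL = "structural data"


class ChangeRate(Enum):
    """Second criterion: how often the values change."""
    QUASISTATIC = "quasistatic information"
    DYNAMIC = "dynamic information"


class TimeRegion(Enum):
    """Third criterion: time horizon required by the application."""
    REAL_TIME = "real-time"
    TIME_NON_CRITICAL = "time non-critical"
    GLOBAL_TIME = "global time"


@dataclass(frozen=True)
class Subspace:
    """One cell of the classification space (hypothetical helper)."""
    time_region: TimeRegion
    change_rate: ChangeRate
    data_type: DataType

    def label(self) -> str:
        parts = (self.time_region, self.change_rate, self.data_type)
        return " × ".join(p.value for p in parts)


# Illustrative placement of a tool that compares the current network
# utilisation with a user-defined threshold.
threshold_alarm = Subspace(TimeRegion.REAL_TIME, ChangeRate.DYNAMIC, DataType.SENSOR)
print(threshold_alarm.label())
```

Treating each criterion as an enumeration makes the size of the space explicit (3 × 2 × 3 cells) and lets an application declare the cell it works in.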

According to this point of view any network management application has to be defined in a suitable area, described by specific values of the three parameters defined above. For example, taking into account the five OSI functional areas, we can have the following considerations as far as the used subspace is concerned:

• Configuration management: for example, a simple configuration management tool that provides only a central storage for generic network information (e.g. name of devices, system physical location, system administration, etc.), network address assignments and other pertinent data for network elements, uses a subspace defined as global time × quasistatic information × structural data. A more complex configuration management tool, which automatically gathers and stores information from all managed devices, compares the system’s current running configuration with control data stored in the tool and enables the user to change the running configuration of the device, uses a subspace defined as time non-critical × quasistatic information × control data.

• Performance management: for example, a performance management tool that measures the current network utilisation and compares its value with a user-defined threshold in order to generate an alarm uses a subspace defined as real-time × dynamic information × sensor data.

• Fault management: for example, a fault management tool that checks the number of packets received in error for every interface of managed routers in an enterprise network, so as to generate an alarm when the value increases up to a predefined level, uses a subspace defined as real-time × dynamic information × sensor data. A fault management tool that checks the operating and the administrative status of every interface of all managed objects in an enterprise network in order to detect an interface failure uses a subspace defined as real-time × quasistatic information × sensor data.

2.2. The model

The temporal management information model proposed in this paper represents the past, current and future network state in a temporal management information base, TMIB. This new information model uses as a basis the IETF’s model and it extends it in order to include the temporal nature of the management information. The incorporation of the temporal dimension in the network management informational model admits the adoption of a temporal database model. This model is augmented by the appropriate network monitor procedures in order to gather the historical network behaviour and store it in the TMIB, and by the appropriate network executors in order to change the current network state. Moreover, it is augmented by prediction and simulation procedures in order to provide information about the future behaviour of network resources. As a consequence, the TMIB is a temporal database whose design is tailor-made for network management.

The TMIB is the core of a network management system, as it provides the interface between the network administrator and all the functions of network management. One important parameter of the TMIB design is to have the network administrator interact solely with the database, by using a temporal query language, so that from the user’s point of view the TMIB embodies the network diachronically. As a result of this TMIB design, whenever the network administrator wishes to make changes in the network objects, such as changing the routing scheme, the administrator updates the appropriate variables in the database by using a temporal replace statement.

The adoption of a temporal database model is quite a difficult task because over the course of the past decade various relational data models have been proposed in order to include the temporal dimension. Generally, these data models extend the standard relational data model by including a temporal component. This incorporation of the time parameter has taken a number of different forms. Among these we can distinguish two main approaches: the incorporation of the temporal dimension at the tuple level (by timestamping each tuple) or at the attribute-value level (by including time as part of each value). For the first approach, the terms first-normal-form (1NF) relations or temporally ungrouped (TU) data models are also used. For the second approach, the terms non-first-normal-form relations or temporally grouped (TG) data models are also used [4].

Among the temporally ungrouped data models we can distinguish TRDM, a temporal database model presented in [5]. The main advantage of this model is its query language, TQuel (Temporal QUEry Language), which can be used for the manipulation of temporal databases. The TRDM provides for two types of historical relations. One, called an interval relation, is derived from a standard relation through the addition of two temporal attributes, valid-from and valid-to, both of whose domains are sets of times T. The values of the nontemporal attributes of a tuple in such a relation are considered to be valid during the interval of time starting at the valid-from value and ending at, but not including, the valid-to value. This interval thus denotes the lifespan of the tuple. The second type of relation, an event relation, is defined by extending a standard relation by a single temporal attribute valid-at, indicating the time instant at which a specific event took place.
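The two kinds of historical relation can be illustrated with a small data-structure sketch. This is an assumption-laden illustration rather than the TRDM definition: the class names, the `covers` helper and the use of `float("inf")` as the valid-to value of a still-valid tuple are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import Any, Dict

FOREVER = float("inf")  # assumed marker for a tuple that is still valid


@dataclass(frozen=True)
class IntervalTuple:
    """A tuple of an interval relation with a half-open lifespan."""
    values: Dict[str, Any]      # the nontemporal attributes
    valid_from: float
    valid_to: float = FOREVER

    def covers(self, t: float) -> bool:
        # valid_from is included in the lifespan, valid_to is not.
        return self.valid_from <= t < self.valid_to


@dataclass(frozen=True)
class EventTuple:
    """A tuple of an event relation: one instant, no duration."""
    values: Dict[str, Any]
    valid_at: float


route = IntervalTuple({"ipRouteDest": "193.92.156.0"}, valid_from=100, valid_to=120)
print(route.covers(100), route.covers(119), route.covers(120))
```

The half-open convention means that two consecutive tuples for the same object can share a boundary value without both being valid at that instant.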

According to the new temporal management information model proposed in this paper, the TMIB is a collection of temporal table objects that represent a diachronic view of management information. There are two cases. In the first case, a TMIB table consists of a set of explicit columnar management information objects and of two implicit time objects. These time columnar objects, validFrom and validTo, represent the time interval [validFrom, validTo) during which the state of the management information is valid. We call this type of object an interval table. In the second case, a TMIB table consists of a set of explicit columnar management information objects and of one implicit time object, validAt. This time object refers to the instant at which an event described by the explicit columnar objects took place. We call this type of object an event table. The MIB table objects in every group are mapped onto temporal table objects, with their columnar objects representing the explicit non-temporal columnar objects of the derived TMIB table. Moreover, the scalar objects of each group are mapped onto the explicit non-temporal columnar objects of TMIB tables. The values of the implicit time columnar objects, i.e. validFrom, validTo and validAt, represent the time at the manager site. In contrast, values of explicit columnar objects that are derived from MIB objects related to time (for example sysUpTime, ifLastChange, etc. in MIB-II [11], and snmpEventLastTimeSent, etc. in the SNMPv2 M2M MIB [12]) represent the time at the agent site in a highly centralized architecture and/or at the intermediate manager site in a distributed architecture. This mapping is designed to permit the representation in TMIB tables of all the SNMPv1 MIBs and the new SNMPv2 MIBs, such as the SNMPv2 M2M MIB, that include objects related to time at the agent or at the intermediate manager level, which gives an open character to our model.

Taking into consideration the criterion that classifies the network management information into quasistatic and dynamic managed objects we define two types of TMIB tables, quasistatic and dynamic TMIB tables. The non-temporal columnar objects of a quasistatic TMIB table represent only the quasistatic part of the corresponding MIB table or quasistatic scalar objects of MIB groups. Dynamic TMIB tables represent only dynamic managed objects. The non-temporal columnar objects of a dynamic TMIB table represent only the dynamic columnar objects of the corresponding MIB table or only dynamic scalar objects of MIB groups. The explicit columnar objects of a dynamic TMIB table represent delta values of the corresponding MIB objects.
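Storing delta values in a dynamic TMIB table implies subtracting two successive polls. A minimal sketch follows, assuming that the polled objects are 32-bit SNMP counters that wrap to zero and that at most one wrap occurs between polls; the function names, the row layout and the sample values are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: polled objects are 32-bit counters


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Increase of a wrapping counter between two polls.

    Assumes the counter wrapped at most once between the polls.
    """
    return (current - previous) % modulus


def delta_row(host_id, previous, current, valid_from, valid_to):
    """Build one dynamic-table row holding deltas for [valid_from, valid_to)."""
    row = {"hostID": host_id, "validFrom": valid_from, "validTo": valid_to}
    for name, value in current.items():
        row[name] = counter_delta(previous[name], value)
    return row


# Made-up readings of two ip group counters at two successive polls.
previous = {"ipInReceives": 4294967290, "ipOutRequests": 500}
current = {"ipInReceives": 10, "ipOutRequests": 650}
row = delta_row("phaethon", previous, current, valid_from=100, valid_to=160)
print(row)
```

The modulo handles the case where the first counter wrapped between the polls, so the stored delta stays non-negative.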

An example of a quasistatic TMIB table representing the history of a quasistatic part of the ipRouteTable for router ‘pegasus’ is depicted in Fig. 3. For example, the row with values {pegasus, 193.92.156.0, 1, 3,…, 193.92.96.95, …, 100, 120} means that during the past time interval [100, 120) the routing entry for the destination 193.92.156.0 had the values mentioned in this TMIB table row. In this table, the current management information, that is the routing entries that are still valid, has a value equal to ∞ for the validTo columnar object.

Fig. 3. Part of the quasistatic ipRouteTable TMIB table object.
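One way a network monitor could maintain such an interval table is to close the still-valid row and open a new one whenever a poll returns a changed value. The sketch below illustrates this bookkeeping under stated assumptions: rows are plain dictionaries, `float("inf")` stands for the validTo value of a still-valid row, and `record_poll` is a hypothetical helper, not a routine of the prototype.

```python
FOREVER = float("inf")  # stand-in for the validTo value of a still-valid row


def record_poll(table, key, values, now):
    """Store one polled quasistatic entry; return True if the table changed.

    table  -- list of row dictionaries (the interval table)
    key    -- columns identifying the entry, e.g. hostID and ipRouteDest
    values -- polled non-temporal values of that entry
    now    -- poll time at the manager site
    """
    for row in table:
        is_current = row["validTo"] == FOREVER
        if is_current and all(row[k] == v for k, v in key.items()):
            if all(row[k] == v for k, v in values.items()):
                return False          # nothing changed, keep the row open
            row["validTo"] = now      # close the lifespan of the old state
            break
    table.append({**key, **values, "validFrom": now, "validTo": FOREVER})
    return True


table = []
entry = {"hostID": "pegasus", "ipRouteDest": "193.92.156.0"}
record_poll(table, entry, {"ipRouteNextHop": "193.92.96.95"}, now=100)
record_poll(table, entry, {"ipRouteNextHop": "193.92.96.95"}, now=110)  # unchanged
record_poll(table, entry, {"ipRouteNextHop": "193.92.98.9"}, now=120)
for row in table:
    print(row)
```

With this scheme an unchanged poll adds no rows, so the table grows only when the quasistatic state actually changes.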

An example of a TMIB table representing the history of some scalar dynamic objects of the ip group of the host ‘phaethon’ is depicted in Fig. 4. Each row of this table represents the delta values of the dynamic objects for a given managed node during the interval [validFrom, validTo).

An example of a part of the tcpConnTable TMIB object of host ‘dias’ is illustrated in Fig. 5. In this figure we can see the history of different states of the TCP connection in host ‘dias’ between the tcpConnLocalAddress = 193.92.99.60, tcpConnLocalPort = 21 and tcpConnRemAddress = 193.92.99.80, tcpConnRemPort = 3223.

According to the nature of the information included in the two categories of TMIB tables the operations that can be performed in each category are quite different. The information included in quasistatic TMIB tables can be the argument in a retrieve, append, delete or replace database operation, while the information included in dynamic TMIB tables can be an argument only in a retrieve operation. The TMIB model is illustrated in Fig. 6. In the lower layer we can see the network monitors, which use polling procedures in order to gather and store the historical network management information in the quasistatic and dynamic TMIB tables, and the network executors, which are used in order to implement the append, delete and replace database operations in the current network management information included in quasistatic TMIB tables. In the middle layer we can see the dynamic and quasistatic TMIB tables that provide the historical (past and current) management information. Finally, in the upper layer we consider a number of prediction and simulation algorithms that provide information about the future behaviour of network resources.

Finally, we have to consider the issue of choosing the value of the polling interval used by network monitors in order to gather and store the historical management information in dynamic and quasistatic TMIB tables. This choice evidently depends on various factors such as the number of agents, the network delay, the processing time of the polling mechanism, etc. The values that are appropriate for a given configuration have to be determined in practice. In the sequel, we just give some indications and constraints that the choice of the polling interval has to satisfy.

Fig. 4. Part of the dynamic scalar TMIB table object of the ip group.

The polling takes place in a non-blocking way, that is the process first issues the whole number of requests and then it waits for the replies. In this way the network latency does not play a crucial role. Let $T$ be the value of the polling interval in seconds and let $d_i$ be the time required to process the request/response for the $i$th agent. The last quantity depends on the processing power of the workstation on which the polling process is running, on the specific TMIB structure (how many MIB objects are required per request), and on the number of requests/responses processed at the same time. For a given number of agents $n$ let

$$d = \max_{i = 1, 2, \ldots, n} \{ d_i \}$$

Thus by imposing an upper bound $UB_{cpu}$ on the CPU utilisation for the polling process we have that the following must hold:

$$\frac{n \cdot d}{T} < UB_{cpu}$$

Let $RQ$ and $RS$ be the total size in bytes of a request and reply respectively in every poll, and let $C$ be the capacity of a virtual line used by the network management station to communicate with each managed node separately. Obviously, $C$ depends on the topology of the network and indirectly on the number of agents $n$. By imposing an upper bound $UB_{net}$ on the network utilisation dedicated to network management we have that the following must hold:

$$\frac{(RQ + RS) \cdot 8}{T \times C} < UB_{net}$$

Let $NDB$ and $RDB$ be the average number of rows inserted in the TMIB after a poll and the size of each row in bytes, respectively. Imposing an upper bound $UB_{db}$ in bytes on the daily added amount of data in the TMIB, we have that the following must hold:

$$n \cdot \frac{24 \cdot 3600}{T} \cdot NDB \cdot RDB < UB_{db}$$

Note that the type of MIB, quasistatic or dynamic, determines that the value of NDB varies between 0 and 1.

For a given implementation the polling interval must be chosen so as to satisfy all the above constraints. More precisely, for a given configuration and TMIB, each bound must be set not as a single value but as a pair of values, one for quasistatic and one for dynamic objects. Satisfying the above constraints yields the minimum value of the polling interval that can be selected. In practice, the polling interval may be any number greater than or equal to this minimum, depending on the nature of the application and the desired accuracy of network management information.
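The three constraints can each be solved for the polling interval, and the largest of the resulting bounds is the binding one. The sketch below shows this arithmetic; the function name and all numeric inputs are hypothetical, and the capacity C is assumed to be expressed in bits per second, which is what the factor 8 in the network constraint suggests.

```python
def min_polling_interval(n, d, rq, rs, c, ndb, rdb, ub_cpu, ub_net, ub_db):
    """Lower bound (in seconds) that the polling interval T must exceed.

    n       -- number of agents
    d       -- worst-case processing time per agent, in seconds
    rq, rs  -- request and reply sizes per poll, in bytes
    c       -- line capacity, assumed to be in bits per second
    ndb     -- average number of rows inserted per poll
    rdb     -- size of one row, in bytes
    ub_cpu, ub_net -- utilisation bounds as fractions between 0 and 1
    ub_db   -- bound on data added per day, in bytes
    """
    t_cpu = n * d / ub_cpu                      # from n*d/T < UB_cpu
    t_net = (rq + rs) * 8 / (c * ub_net)        # from (RQ+RS)*8/(T*C) < UB_net
    t_db = n * 24 * 3600 * ndb * rdb / ub_db    # from n*(86400/T)*NDB*RDB < UB_db
    return max(t_cpu, t_net, t_db)


# Made-up configuration: 50 agents, 20 ms per agent, a 64 kbit/s line.
t_min = min_polling_interval(
    n=50, d=0.02, rq=100, rs=400, c=64_000,
    ndb=1.0, rdb=100, ub_cpu=0.10, ub_net=0.05, ub_db=50_000_000,
)
print(f"polling interval must exceed about {t_min:.2f} s")
```

In this made-up configuration the CPU constraint is the binding one; with a larger row size or a tighter storage budget the database constraint would dominate instead. Evaluating the function once with the quasistatic bounds and once with the dynamic bounds gives the pair of minimum intervals.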

3. Architecture and services

In this section we will present the architectural model that directly incorporates the proposed network management perspective. We define a network management application object (NMAO) as an object consisting of data and procedures. Each NMAO refers to a specific network element (simple object) or network (aggregate object); it is physically located in a specific host and is totally independent of any other NMAO. It has the responsibility of collecting network management information for all the network components and/or the networks belonging to the area of responsibility of this particular object. Each NMAO maintains a TMIB describing the network state. More precisely, the network state is mapped to a generic temporal database schema through a preselected and user-adjustable use of filters. By using these filters the NMAO can choose the right resolution as far as the time (how often) and/or the space (what information) is concerned.

As we have already seen, the TMIB structure is an extension of the MIB structure. It is implemented through a database approach that adds some new attributes, namely the time attributes, but maintains the MIB semantics. In this way each NMAO database can communicate with other NMAO databases following the common TMIB semantics. The NMAO databases may cooperate among themselves by negotiating what information they should exchange and by deciding how they will exploit the results. In this way, the network management of a generic internetwork may take place through the cooperation among completely autonomous systems. This approach may be implemented by using a distributed database model. The use of a local database in every NMAO saves communication bandwidth in collecting network management data, at the expense of not having available a central global state of the system. The above described concept is depicted in Fig. 7.

Fig. 6. The type of network management information contained in the TMIB.

All the well-known techniques and protocols borrowed from the area of distributed databases may be used in order to achieve a consistent global state of the overall network management application. For example, the two-phase commit protocol (2PC) may be used to modify routing tables in a way that ensures atomicity. Another example concerns the data replication concept found in the distributed database approach. Consider the case of problems arising from temporary network disconnections. They can be minimised by using appropriate network management tools for restoring network connectivity. In order to do so, it is necessary to be able to access the appropriate MIB information. This information can be available through data replication.
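The atomicity argument for routing-table changes can be made concrete with a toy two-phase commit. This is a minimal in-memory sketch of the general protocol idea, not the mechanism of any particular database: `Participant` is a hypothetical stand-in for one NMAO database, and logging, timeouts and recovery after failures are deliberately left out.

```python
class Participant:
    """Hypothetical stand-in for one NMAO database holding routing entries."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # simulates a site that votes "no"
        self.routes = {}
        self._pending = None

    def prepare(self, update):
        """Phase 1: stage the update and vote."""
        if not self.can_commit:
            return False
        self._pending = dict(update)
        return True

    def commit(self):
        """Phase 2: make the staged update visible."""
        self.routes.update(self._pending)
        self._pending = None

    def abort(self):
        """Discard the staged update."""
        self._pending = None


def two_phase_commit(participants, update):
    """Apply `update` at every participant or at none of them."""
    prepared = []
    for p in participants:             # phase 1: collect votes
        if p.prepare(update):
            prepared.append(p)
        else:
            for q in prepared:         # one "no" vote aborts everybody
                q.abort()
            return False
    for p in participants:             # phase 2: all voted "yes"
        p.commit()
    return True


a, b = Participant("nmao-a"), Participant("nmao-b")
ok = two_phase_commit([a, b], {"193.92.156.0": "193.92.98.9"})
print(ok, a.routes == b.routes)
```

If any participant votes against the update, no routing table is modified, which is the atomicity property referred to above.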

By using the aforementioned temporal database model we can implement basic network management services as temporal database operations. In what follows TQuel has been adopted as the data manipulation language. In the sequel, we will consider the ‘retrieve’ operation only and describe the incorporated extensions as far as the time component is concerned. The other TQuel operations behave similarly. A skeleton for the TQuel retrieve statement that we use with our model is the following [5]:

    retrieve target_list
    valid clause
    where clause
    when clause

The target list specifies the specific columnar objects of the retrieved rows and the where clause specifies constraints that apply on the values of the explicit columnar objects in order to restrict the underlying rows that participate in the query. The when clause is the time analogue of the where clause: it specifies a predicate on the implicit time columnar objects of the underlying rows that must be satisfied for those rows to participate in the remainder of the processing of the query. Finally, the valid clause serves the same purpose as the target list, specifying the value of an attribute in the derived table; but in this case, the values of the implicit time columnar objects of the derived row are being specified. Note how easily the network engineer can acquire and use network management information.

In order to present the retrieve statement we give some examples of the queries that the network manager could issue. More precisely, with regard to the representation of the part of the quasistatic TMIB table object in Fig. 3, we present a number of queries in order to discuss the TQuel retrieve statement. First, the two examples of QUERY 1 retrieve the current state of the routing table of the host ‘pegasus’, as a whole or as a specific part. Then, the two examples of QUERY 2 retrieve the state of the routing table (as a whole or a specific part) of the host ‘pegasus’ at a specific past time span. Moreover, the two examples of QUERY 3 retrieve the history of the routing table (as a whole or a specific part) of the host ‘pegasus’, during a past period of time. Finally, the example of QUERY 4 is used to retrieve the time when specific changes to the values of MIB objects were detected by the polling mechanism.

QUERY 1

(a) Which is the current routing table of node ‘pegasus’?

    range of R is ipRouteTable
    retrieve (R.all)
    where R.hostID = ‘pegasus’
    when R overlap present

(b) Which is the last value of the ipRouteNextHop for the ipRouteDest ‘193.92.156.0’ of node ‘pegasus’?

    retrieve (R.ipRouteDest, R.ipRouteNextHop)
    where R.hostID = ‘pegasus’ and R.ipRouteDest = ‘193.92.156.0’
    when R overlap present

QUERY 2

(a) Which was the routing table of node ‘pegasus’ at noon on May 5, 1995?

    retrieve (R.all)
    where R.hostID = ‘pegasus’
    when R overlap |12 PM May 5, 1995|

(b) Which was the value of the ipRouteNextHop for the ipRouteDest ‘193.92.156.0’ of node ‘pegasus’ at noon on May 5, 1995?

    retrieve (R.ipRouteDest, R.ipRouteNextHop)
    where R.hostID = ‘pegasus’ and R.ipRouteDest = ‘193.92.156.0’
    when R overlap |12 PM May 5, 1995|

QUERY 3

(a) Which was the routing table of node ‘pegasus’ during March 1995?

    retrieve (R.all)
    where R.hostID = ‘pegasus’
    when R overlap [March 1995]

(b) Which was the value of the ipRouteNextHop for the ipRouteDest ‘193.92.156.0’ for node ‘pegasus’ during March 1995?

    retrieve (R.ipRouteDest, R.ipRouteNextHop)
    where R.hostID = ‘pegasus’ and R.ipRouteDest = ‘193.92.156.0’
    when R overlap [March 1995]

QUERY 4

When was it detected by the management application that the ipRouteNextHop for ipRouteDest ‘193.92.156.0’ for node ‘pegasus’ changed its value to ‘193.92.98.9’?

    retrieve (R.ipRouteNextHop)
    valid at begin of R
    where R.ipRouteDest = ‘193.92.156.0’
    and R.ipRouteNextHop = ‘193.92.98.9’
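The effect of the where and when clauses in the four queries can be emulated over a few in-memory rows. The sketch below is only a rough analogue written to show how the temporal predicates select rows; it is not TQuel and does not reproduce its full semantics. The rows, the abstract time values and the helper names (`retrieve`, `overlap_instant`, `overlap_interval`) are hypothetical.

```python
FOREVER = float("inf")  # stand-in for the validTo value of a still-valid row

# Hypothetical interval-table rows; times are abstract ticks, not dates.
ROWS = [
    {"hostID": "pegasus", "ipRouteDest": "193.92.156.0",
     "ipRouteNextHop": "193.92.96.95", "validFrom": 100, "validTo": 120},
    {"hostID": "pegasus", "ipRouteDest": "193.92.156.0",
     "ipRouteNextHop": "193.92.98.9", "validFrom": 120, "validTo": FOREVER},
]


def retrieve(rows, columns, where=lambda r: True, when=lambda r: True):
    """Project `columns` from the rows passing both predicates."""
    return [{c: r[c] for c in columns} for r in rows if where(r) and when(r)]


def overlap_instant(t):
    """Rows whose lifespan [validFrom, validTo) contains the instant t."""
    return lambda r: r["validFrom"] <= t < r["validTo"]


def overlap_interval(start, end):
    """Rows whose lifespan intersects the period [start, end)."""
    return lambda r: r["validFrom"] < end and start < r["validTo"]


def route_of_pegasus(r):
    return r["hostID"] == "pegasus" and r["ipRouteDest"] == "193.92.156.0"


NOW = 130  # hypothetical current time, playing the role of "present"

current = retrieve(ROWS, ["ipRouteDest", "ipRouteNextHop"],
                   where=route_of_pegasus, when=overlap_instant(NOW))      # like QUERY 1(b)
past = retrieve(ROWS, ["ipRouteNextHop"],
                where=route_of_pegasus, when=overlap_instant(110))         # like QUERY 2(b)
history = retrieve(ROWS, ["ipRouteNextHop", "validFrom", "validTo"],
                   where=route_of_pegasus, when=overlap_interval(90, 125)) # like QUERY 3(b)
detected = retrieve(ROWS, ["validFrom"],
                    where=lambda r: route_of_pegasus(r)
                    and r["ipRouteNextHop"] == "193.92.98.9")              # like QUERY 4
print(current, past, len(history), detected)
```

Returning validFrom in the last call plays the role of the valid clause in QUERY 4, where the answer is the beginning of the lifespan of the matching row.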

The proposed view of the network management concept, already described above, can be implemented in a layered structure in a control plane as is depicted in Fig. 8. Each NMAO consists of the following layers and elements.

3.1. Layer 1

This includes the SNMP protocol. This layer provides the raw service of transmitting network management information based on the SNMPv1 primitive operations get-request, getnext-request, set-request, get-response and trap. The network management protocol that can be used is not limited to SNMPv1; SNMPv2 [13,14] can also be used in this layer. The key enhancements of SNMPv2 are support for a distributed architecture in which a set of intermediate managers can be managed, new protocol operations such as getbulk-request and inform-request, and specific security services such as privacy, message authentication and access control. CMIP or any other network management protocol can also be used for this type of service.

3.2. Layer 2

This provides three types of service.

Type 1 provides the service of changing/updating the network management information and is based on the SNMP primitive service set-request. This service concerns the operations used for configuration changes in order to maintain the prescribed levels of performance and for activating restoring mechanisms in the event of host failure or malfunction.

Type 2 provides a monitoring service. This service can be further subdivided into a real-time performance monitoring service and a real-time network status monitoring service.

Type 3 provides the service of collecting and storing the MIB information.

Fig. 8. The proposed layered architecture.

On the basis of the services already defined we consider three generic service elements lying on the top of these services. We name them strategic applications service element, control applications service element and monitoring service element. These service elements do their job based on the layer 2 services and they may be distributed or not.

4. Description of the prototype

In order to implement the above described network management concepts, we developed a prototype that includes some of the proposed characteristics. In the sequel, we will describe the main characteristics of the prototype as well as our experience from its usage.

The prototype was initially implemented on a DEC 3100 workstation and has since been transferred to an Alpha workstation. The focus was on MIB-II, but the approach can easily be generalised by adding other MIBs. The network management protocol used was SNMP. In order to acquire network management information we used the polling approach, with a polling period dependent on the type of the requested information and on the number of managed nodes.

Apart from the TMIB implementation which follows the rules already described, we had to define two other tables, namely the networks table and the status table, which have the following description:

• Networks: this table has one row for each network we want to manage. It mainly maintains information about the hosts included in each network by referring to a HostFile with the structure described in RFC 952.

• Status: this table has the structure depicted in Fig. 9. The table has at least one row for each host in the specified network. This table is updated (by inserting new rows) every time the status of the corresponding host changes.
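The update rule of the status table (a new row only when the status of a host changes) can be illustrated with a minimal Python sketch. It is not the prototype's code: the column names `host`, `status`, `valid_from` and `valid_to`, the helper `record_status`, and the closing of the previous row's interval are assumptions made for illustration, since the actual structure is the one depicted in Fig. 9.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StatusRow:
    host: str
    status: str
    valid_from: float           # time at which the status was first observed
    valid_to: Optional[float]   # None while the row is still valid (assumed convention)


def record_status(table: List[StatusRow], host: str, status: str, now: float) -> bool:
    """Insert a row only if the status of `host` changed; return True if a row was added."""
    current = next(
        (r for r in reversed(table) if r.host == host and r.valid_to is None), None
    )
    if current is not None:
        if current.status == status:
            return False            # unchanged status: nothing is stored
        current.valid_to = now      # assumed: the previous interval is closed
    table.append(StatusRow(host, status, now, None))
    return True
```

With this rule, repeated polls that report the same status leave the table untouched, so the table grows with the number of status changes rather than with the number of polls.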

As far as the implementation architecture of the prototype is concerned, the general scheme of the architecture is depicted in Fig. 10. Each NMAO consists of two main processes: the polling process and the process that implements the graphical user interface (GUI).

The polling mechanism is implemented as a background process that periodically polls the SNMP agents of the network and communicates with the GUI process using IPC methods. This allows the polling parameters to be received dynamically and user commands to be obeyed to initiate and terminate the polling procedure. Moreover, the polling process, in order to collect the values of the MIB variables contained in the MIB-like database tables, constructs the SNMP requests by issuing queries to the data dictionary.
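The way the polling process derives its requests from the data dictionary can be sketched as follows. This is only an illustration in Python under stated assumptions: `DATA_DICTIONARY`, `build_requests` and `is_due` are hypothetical names, the dictionary contents are examples, and the actual SNMP exchange and the IPC with the GUI process are left out.

```python
from typing import Dict, List, Tuple

# Hypothetical data dictionary: TMIB table name -> MIB-II columns stored in it.
DATA_DICTIONARY: Dict[str, List[str]] = {
    "ifTable": ["ifIndex", "ifInErrors", "ifOutErrors"],
    "ipRouteTable": ["ipRouteDest", "ipRouteNextHop"],
}


def build_requests(hosts: List[str], tables: List[str]) -> List[Tuple[str, List[str]]]:
    """Return one (host, variable names) request per host and per table."""
    requests = []
    for host in hosts:
        for table in tables:
            # The variable list comes from the dictionary, not from the code.
            requests.append((host, list(DATA_DICTIONARY[table])))
    return requests


def is_due(now: float, last_poll: float, period: float) -> bool:
    """True when at least one polling period has elapsed since the last poll."""
    return now - last_poll >= period
```

Keeping the variable list in the dictionary means that adding a table or a column changes what is polled without changing the polling code.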

Some of the facilities provided by the GUI process are the following:

• Real-time monitoring of the following parameters: operational state of network resources, MIB control and configuration variables.

• Graphical presentation of the performance measures as functions of time, based on historical and current raw performance data.

• Dynamic definition of performance measures.

• Maintenance of raw network management information.

In what follows, we will present the part of the prototype related to the management of the quasistatic MIB variables. The operations included are rather primitive, while management applications may be built on top of these operations.

The ‘quasistatic MIB management’ option of the frame depicted in Fig. 11 allows the user to choose a specific combination of session and network in order to define table objects that contain the network management information. This choice leads to the frame illustrated in Fig. 12, where the tables that are included in this session and network combination are presented.

Fig. 9. The status table.

Fig. 10. The elements of the network management application implementation.

The user can select a table and perform the following operations:

• View: table data presentation according to specific user constraints.

• Delete: deletion of the table contents.

• View Part: table data presentation in a way that enables the user to choose at the same time the fields to be displayed as well as the constraints that have to be fulfilled.

The constraints concern two parameters. First, they concern the host (we can choose one or more hosts). Second, they concern the valid time constraints the user wants. In addition to that, there is a prespecified operation, which we call the LAST operation, that enables us to locate the rows of the table that are still valid. The constraints can be set by using the quasistatic constraints frame depicted in Fig. 13.
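The two kinds of constraints and the LAST operation can be sketched in Python as follows. This is a minimal illustration, not the prototype's implementation: the `Row` layout, the convention that an open `valid_to` (`None`) marks a row that is still valid, and the function names `select` and `last` are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Row:
    host: str
    value: str
    valid_from: float
    valid_to: Optional[float]   # None means the row is still valid (assumed convention)


def select(rows: Iterable[Row], hosts: Set[str], start: float, end: float) -> List[Row]:
    """Rows of the chosen hosts whose valid interval overlaps [start, end)."""
    result = []
    for r in rows:
        row_end = float("inf") if r.valid_to is None else r.valid_to
        if r.host in hosts and r.valid_from < end and row_end > start:
            result.append(r)
    return result


def last(rows: Iterable[Row]) -> List[Row]:
    """The LAST operation: rows that are still valid."""
    return [r for r in rows if r.valid_to is None]
```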

Next, we will present the part of the prototype related to the graphical presentation of the performance measures as functions of time. The operations included are rather primitive. More complex management applications may be built on the top of these primitive operations.

The ‘performance monitoring’ option of the frame illustrated in Fig. 11 leads to the frame depicted in Fig. 14, where the user can choose a specific combination of session identity and network name and a performance measure for presentation in a line graph as a function of time.

The performance indicators included in the prototype are based on dynamic sensor data and they concern host performance indicators, related either to host interfaces or to the TCP/IP implementation. Thus, the ‘input error rate’ is used in order to trace possible transmission medium failures, while the ‘output error rate’ is used to monitor the performance of the interface. For example, in a thin Ethernet network the terminator may be disconnected. This will result in a great number of incoming packets received in error. A high value of the ‘output error rate’ indicates a possible interface malfunction. In addition to that, the ‘throughput’ and the ‘utilization’ of an interface are defined.

The rest of the described performance indicators are related to TCP/IP performance. We can use the ‘reassembly failure probability’, which indicates a possible problem with the values of various TCP/IP parameters or a problem in the network itself. By examining the ‘TCP segments retransmission probability’ we can distinguish between those cases.

Fig. 12. The table list frame.

Fig. 13. The quasistatic constraints frame.

In what follows, we will present some of the performance indicators we have used. Taking into consideration the fact that only the rate at which the dynamic sensor data change constitutes an indicator of the network state, we adopted the following notation:

$$\text{MIB-variable}(t) := \text{value of the MIB variable at time } t$$

$$\Delta(\text{MIB-variable}) := \text{MIB-variable}(t_1) - \text{MIB-variable}(t_0)$$

TCP/IP implementation related performance indicators:

$$\text{Reassembly failure probability} = \frac{\Delta(\text{ipReasmFails})}{\Delta(\text{ipReasmReqds})}$$

$$\text{Fragmentation failure probability} = \frac{\Delta(\text{ipFragFails})}{\Delta(\text{ipFragOKs}) + \Delta(\text{ipFragFails})}$$

$$\text{Network services utilisation} = \frac{\Delta(\text{tcpCurrEstab})}{\text{tcpMaxConn}}$$

$$\text{TCP segments retransmission probability} = \frac{\Delta(\text{tcpRetransSegs})}{\Delta(\text{tcpOutSegs})}$$

$$\text{IP throughput} = \frac{\Delta(\text{ipInReceives}) + \Delta(\text{ipOutRequests}) + \Delta(\text{ipForwDatagrams})^{*}}{\Delta(\text{sysUpTime})}$$

$$\text{UDP throughput} = \frac{\Delta(\text{udpInDatagrams}) + \Delta(\text{udpNoPorts}) + \Delta(\text{udpInErrors}) + \Delta(\text{udpOutDatagrams})}{\Delta(\text{sysUpTime})}$$

* for nodes acting as gateways.

Host interface related performance indicators (on a specific host interface):

$$\text{Throughput: } S = \frac{\Delta(\text{ifInOctets}) + \Delta(\text{ifOutOctets})}{\Delta(\text{sysUpTime})}$$

$$\text{Link utilization} = \frac{S \times 8}{\text{ifSpeed}}$$

$$\text{Input error rate} = \frac{\Delta(\text{ifInErrors})}{\Delta(\text{sysUpTime})}$$

$$\text{Output error rate} = \frac{\Delta(\text{ifOutErrors})}{\Delta(\text{sysUpTime})}$$

These performance indicators can be chosen by the user directly from the performance monitoring frame depicted in Fig. 14.

Fig. 14. The performance monitoring frame.

After selecting a specific performance measure, the user can set two types of constraints on its presentation. The first type of constraint concerns the network node and the interface number (wherever needed, as for the throughput calculation). The second one concerns the valid time boundaries and the step parameter. The frame for setting constraints on performance measures related to the TCP/IP implementation is depicted in Fig. 15.
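As an illustration of how such indicators follow from two successive polls, the interface related measures and the retransmission probability can be computed as in the Python sketch below. It makes several assumptions that are not taken from the prototype: each sample is a plain dictionary of MIB-II values, the elapsed sysUpTime (counted in hundredths of a second in MIB-II) is converted to seconds, counter wrap-around is ignored, and a zero denominator yields 0.0.

```python
from typing import Dict

TICKS_PER_SECOND = 100  # sysUpTime counts hundredths of a second (RFC 1213)

Sample = Dict[str, int]


def delta(s0: Sample, s1: Sample, name: str) -> int:
    """Change of a MIB variable between two polls (counter wrap is ignored)."""
    return s1[name] - s0[name]


def interface_indicators(s0: Sample, s1: Sample, if_speed: int) -> Dict[str, float]:
    # Assumption: elapsed time is converted from TimeTicks to seconds.
    seconds = delta(s0, s1, "sysUpTime") / TICKS_PER_SECOND
    throughput = (delta(s0, s1, "ifInOctets") + delta(s0, s1, "ifOutOctets")) / seconds
    return {
        "throughput": throughput,                        # octets per second
        "link_utilization": throughput * 8 / if_speed,   # ifSpeed is in bits per second
        "input_error_rate": delta(s0, s1, "ifInErrors") / seconds,
        "output_error_rate": delta(s0, s1, "ifOutErrors") / seconds,
    }


def retransmission_probability(s0: Sample, s1: Sample) -> float:
    sent = delta(s0, s1, "tcpOutSegs")
    # Assumed convention: no segments sent in the interval gives 0.0.
    return delta(s0, s1, "tcpRetransSegs") / sent if sent else 0.0
```

For example, two polls taken 30 s apart on a 10 Mbit/s interface that moved 300 000 octets in total give a throughput of 10 000 octets per second and a link utilization of 0.008.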

In addition to the performance indicators presented in the performance monitoring frame, the user can dynamically define a performance indicator of his own choice. By using string parsing the management application will start a polling procedure for the appropriate MIB variables in order to give a graphical presentation of the performance indicator.

Finally, another part of the prototype concerns the existence of database management tools, accessible directly from the network management environment. Using this part, the user can perform database management operations on the raw performance data stored in the data tables (queries, table creation, etc.).
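The dynamic definition of a performance indicator by string parsing can be sketched as follows. The expression syntax used here, with `D(x)` standing for the change of MIB variable x between two polls, is an assumption made for illustration, as are the function names `variables` and `evaluate`; the prototype's actual parser is not described in the text. The sketch relies on Python's standard `ast` module and accepts only arithmetic, numbers, MIB variable names and `D(...)`.

```python
import ast
import operator
from typing import Dict, Set

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def variables(expr: str) -> Set[str]:
    """MIB variable names used in the expression, i.e. the ones that must be polled."""
    tree = ast.parse(expr, mode="eval")
    return {n.id for n in ast.walk(tree) if isinstance(n, ast.Name) and n.id != "D"}


def evaluate(expr: str, s0: Dict[str, float], s1: Dict[str, float]) -> float:
    """Evaluate the indicator from two successive samples s0 (older) and s1 (newer)."""
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id == "D" and len(node.args) == 1
                and isinstance(node.args[0], ast.Name)):
            name = node.args[0].id
            return s1[name] - s0[name]      # D(x): change between the two polls
        if isinstance(node, ast.Name):
            return s1[node.id]              # plain name: latest sampled value
        raise ValueError("unsupported element in indicator expression")
    return ev(ast.parse(expr, mode="eval"))
```

Under these assumptions, `variables("D(tcpRetransSegs) / D(tcpOutSegs)")` yields the two variables to poll, and `evaluate` turns each pair of successive samples into one point of the graph.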

Our prototype has been tested thoroughly as far as its functionality is concerned. So, we have done a number of experiments that show:

• how useful the temporal parameter is in building applications that are based either on quasistatic or dynamic objects. The whole history is directly accessible in a unified way.

• how easy the development of new applications is. New applications are developed by using either database queries or 4GL programming.

Some of these experiments are described in the next section, which gives examples of how the proposed model can be used in network management applications.

As far as the size of the network being managed is concerned, the tests have been done on the university network. More precisely, the number of managed nodes varied from 5 to 10, in order to examine the influence of the network size on the performance of our model. Costs do not seem to vary significantly. Regarding the network utilisation consumed by the network management requests and replies, we observed that it remains below 1% in all cases, while the data volume varies linearly with the number of managed nodes. The last observation does not pose any problem in real applications, because we can define a number of NMAOs, each of which has a TMIB for specific applications, which is of manageable size relative to the computing power of the workstation where the NMAO resides. From our experience every typical midsize workstation can host an NMAO.

As far as the choice of the polling period is concerned, we have tested a number of pairs of values (polling period for quasistatic objects, polling period for dynamic objects). The choice influences first the percentage of network resources used for network management and second the size of the TMIB. For the quasistatic objects, it seems that while the percentage of network resources used for network management grows in inverse proportion to the polling interval, the resulting information stored in the TMIB does not differ significantly. This means that a rather large value is suitable for the polling interval of the quasistatic objects. Typical values lie in the range [30 sec, 3 min].

For the dynamic objects, it seems that small values for the polling interval increase both the percentage of network resources used for network management and the TMIB size, because typically every query results in new TMIB information. But most of the applications that use this type of data, such as performance and fault management applications, are not based on instantaneous data but rather on data averaged over a given period. For most cases values in the range [30 sec, 90 sec] are sufficient. The approximation of the values of various measures is good enough. A finer time resolution (a shorter polling interval) is required only for measurements of burstiness.

As far as the database size is concerned we have observed that this is not a serious problem. Considering a polling interval of 30 sec for both quasistatic and dynamic objects, the database size per managed node is on average as follows: for the dynamic objects almost every request resulted in new rows, thus giving at most 3000 rows per day, each row consisting of about 0.5 Kbytes. For quasistatic objects typically only one out of n requests contributes new rows, where n takes values larger than 4 in most cases. Thus at most about 750 rows per day (worst case) are added, each row consisting of about 1.5 Kbytes. So, for the full MIB-II information we need about 6 Mbytes per day per managed node. Assuming that the lifetime of information which is useful for real-time or time non-critical applications is no more than two days, we find that we need storage space of about 12 Mbytes per managed node. As the network being managed by one NMAO consists of no more than 50 nodes, the required storage is less than the 1 GB which is typically available in any network management workstation. As far as global time applications are concerned, the necessary storage space depends heavily on the specific application. In those cases, off-line analysis takes place, not necessarily in the network management station, and the data are provided from off-line storage.

Fig. 15. The performance monitoring constraints frame.
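The arithmetic behind this estimate can be retraced with a short Python sketch. The function name and parameter names are hypothetical. The figure of 6 Mbytes per day per managed node is taken from the text as an input rather than derived from the row sizes, and one new quasistatic row per 4 requests is used as the worst case.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def storage_estimate(poll_interval_s: int = 30,
                     quasistatic_n: int = 4,
                     mb_per_node_per_day: float = 6.0,
                     lifetime_days: int = 2,
                     nodes: int = 50) -> dict:
    """Retrace the storage estimate for one NMAO under the stated assumptions."""
    polls_per_day = SECONDS_PER_DAY // poll_interval_s
    mb_per_node = mb_per_node_per_day * lifetime_days
    return {
        # Dynamic objects: almost every poll adds a row.
        "dynamic_rows_per_day": polls_per_day,
        # Quasistatic objects: one poll out of n adds a row (worst case n = 4).
        "quasistatic_rows_per_day": polls_per_day // quasistatic_n,
        "mb_per_node": mb_per_node,
        "mb_total": mb_per_node * nodes,
    }
```

With the default values this gives 2880 and 720 rows per day, consistent with the bounds of 3000 and 750 rows quoted above, 12 Mbytes per managed node and 600 Mbytes for 50 nodes, which is below the 1 GB mentioned in the text.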

5. Conclusions

In this paper, we have proposed a new network management information model that directly includes the temporal dimension. In this framework the services provided by this model have been given, while the architecture supporting them has been presented as well. Our view has been supported by the implementation of a prototype related to the proposed model. The main aspects of our approach are:

• The direct incorporation of the time parameter as an attribute of the network management information, in a way that maintains the already existing semantics as far as the network management model is concerned.

• The adoption of the temporal database model for storing the historical network management information.

• The use of temporal database queries for doing network management related operations.

• The fact that we can easily incorporate in our model either future MIBs or private MIBs, that is, our model is open as far as the creation of new MIBs is concerned.

• The layered architecture that we have proposed for the temporal network management information model.

The proposed model can be used to facilitate the development of network management applications. Having as a base the evaluation of network management information in a given time window, we can easily apply various types of requests ranging from a user query to the database to an algorithm detecting event correlation either in time non-critical or global time regions. We conclude our discussion by presenting a few cases for which the proposed model has been used.

We consider that we have to manage an enterprise network that consists of several subnetworks connected to the enterprise backbone network. Suppose that a subnetwork A is connected to the backbone network with a serial leased line and two routers, one named ‘sisyfos’ at the backbone network site and another one named ‘aiolos’ at the subnetwork A site. In this environment users of subnetwork A complain that they have low service quality. In order to solve the problem, we try to check if the line is problematic, towards which direction and for which period of time, so as to inform the public PTT (Post, Telephone and Telegraph) organization that is the leased line provider. For this purpose, we need to specify for which interfaces of routers ‘sisyfos’ and ‘aiolos’ the number of incoming packets discarded due to format errors was greater than a small percentage of the incoming packets delivered to the layer above, and for which period of time this was true. So, we need an interval table with the following dynamic MIB-II objects of the ifTable in the interfaces group as the TMIB table columnar objects: ifIndex, ifInUcastPkts, ifInNUcastPkts and ifInErrors. Part of this table can be as shown in Fig. 16.

Fig. 16. A dynamic TMIB table used to monitor the percentage of incoming packets discarded due to format errors on a network interface.

We consider that, for a given interface, the line connected to that interface has a problem when the number of incoming packets discarded due to format errors is greater than 5% of the incoming packets delivered to the layer above. So, we have to issue the following query in order to identify when the line had a problem and towards which direction:

    range of F is ifTable
    retrieve F.nodeID, F.ifIndex
    valid during F
    where (F.nodeID = ‘sisyfos’ or F.nodeID = ‘aiolos’)
      and F.ifInErrors > 0.05 * (F.ifInUcastPkts + F.ifInNUcastPkts)
    when true
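To make the meaning of this query concrete, the same selection can be written over an in-memory copy of the interval table. The Python sketch below is only an illustration of the query's semantics, not of how the temporal database evaluates it: each row is assumed to be a dictionary holding the columns named in the query plus hypothetical `valid_from` and `valid_to` fields for its valid interval.

```python
from typing import Dict, List, Sequence, Tuple


def lines_with_problem(rows: List[Dict],
                       nodes: Sequence[str] = ("sisyfos", "aiolos"),
                       threshold: float = 0.05) -> List[Tuple]:
    """Return (nodeID, ifIndex, valid_from, valid_to) for rows over the error threshold."""
    result = []
    for r in rows:
        delivered = r["ifInUcastPkts"] + r["ifInNUcastPkts"]
        if r["nodeID"] in nodes and r["ifInErrors"] > threshold * delivered:
            # "valid during F": the answer keeps the valid interval of the row.
            result.append((r["nodeID"], r["ifIndex"], r["valid_from"], r["valid_to"]))
    return result
```

The node on which a row qualifies indicates the direction in which the line has a problem, and the valid interval gives the period.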

In the same environment, suppose we want to make a report on the failures of interfaces of router ‘aiolos’ during a specific day. This report will also include the time when each failure happened at the agent site and at the manager site. For this purpose, we need an interval table with the following quasistatic MIB-II objects of the ifTable in the interfaces group as the TMIB table columnar objects: ifIndex, ifAdminStatus, ifOperStatus and ifLastChange. If ifAdminStatus is up and ifOperStatus is down, the related interface is in failure mode. ifLastChange gives us the time at the agent when the interface changed operational state. A part of this table can be as shown in Fig. 17.

In order to construct such a report, we have to issue the following query:

    range of F is ifTable
    retrieve F.nodeID, F.ifIndex, F.ifLastChange
    valid at begin of F
    where F.nodeID = ‘aiolos’
      and F.ifAdminStatus = 1 and F.ifOperStatus = 2
    when F overlap [June 6, 1996]
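The temporal part of this query (the overlap with a given day and the use of the beginning of the valid interval as the manager-side time) can be illustrated in Python as follows. The row layout with `valid_from` and `valid_to` fields and the names `overlaps` and `failure_report` are assumptions made for this sketch. The values 1 and 2 are the MIB-II encodings of up for ifAdminStatus and down for ifOperStatus.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


def overlaps(start: datetime, end: Optional[datetime],
             win_start: datetime, win_end: datetime) -> bool:
    """True if [start, end) overlaps [win_start, win_end); an open end never expires."""
    end = end or datetime.max
    return start < win_end and end > win_start


def failure_report(rows: List[Dict], node: str, day: datetime) -> List[Tuple]:
    """Rows of `node` in failure mode (admin up = 1, oper down = 2) during `day`."""
    win_start, win_end = day, day + timedelta(days=1)
    report = []
    for r in rows:
        if (r["nodeID"] == node
                and r["ifAdminStatus"] == 1 and r["ifOperStatus"] == 2
                and overlaps(r["valid_from"], r["valid_to"], win_start, win_end)):
            # ifLastChange: time at the agent; valid_from: time seen by the manager.
            report.append((r["nodeID"], r["ifIndex"], r["ifLastChange"], r["valid_from"]))
    return report
```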

In the same environment, the enterprise network is connected to two different network providers through serial lines with a router named ‘pegasus’ which has two serial interfaces. Suppose we want to evaluate the results of a specific routing policy regarding the two service providers. In order to do so, we need to trace the changes of the routing table entries (next hop) for a given set of destinations. For this purpose, we need an interval table with the following quasistatic MIB-II objects of the ipRouteTable in the ip group as the temporal table columnar objects: ipRouteDest, ipRouteIfIndex and ipRouteNextHop. Part of this table can be as shown in Fig. 18.

For example if we want to know the history of routes that the datagrams followed for the destination 193.92.156.0 during last month (June 1996) in order to check the routing strategy related to the two different providers, we could issue the following query:

    range of R is ipRouteTable
    retrieve R.ipRouteIfIndex, R.ipRouteNextHop
    valid during R
    where R.nodeID = ‘pegasus’
      and R.ipRouteDest = ‘193.92.156.0’
    when R overlap [June 1996]

Fig. 18. A quasistatic TMIB table used to evaluate routing policies by monitoring the history of routes that datagrams followed for a given destination.

Theodore K. Apostolopoulos is an Associate Professor at the Department of Informatics at Athens University of Economics and Business. He obtained his diploma in electrical engineering in 1979 and his Ph.D. in informatics in 1983 from the National Technical University of Athens. His research interests include telecommunications, computer networks, distributed systems and databases, performance evaluation of computer and communication systems, and parallel algorithms for the solution of engineering problems. His current research activity focuses on network management, network services, distributed applications and telematics. The research activities in the above areas have been supported by various projects funded by industry and government agencies. He has authored more than 30 publications covering the above scientific areas. He is a member of the IEEE, the IEEE Computer Society and the IEEE Communications Society.

Victoria C. Daskalou received her bachelor's degree in Informatics from the Department of Informatics at Athens University of Economics and Business in 1992, where she has been a Ph.D. student since 1993. Her research interests include network management, databases and intelligent networks. Currently, she participates in three research projects related to the above scientific areas.