Design and Evaluation of Distributed Load-Balancing
for Wireless Networks
G. Nunzi, S. Schuetz, M. Brunner Network Laboratories
NEC Europe Ltd. Heidelberg, Germany
{nunzi, schuetz, brunner}@netlab.nec.de
Abstract—Self-configuration is regarded as a key instrument to increase the efficiency of many management tasks. This work proposes and investigates the application of self-configuring mechanisms to decentralized network management in wireless networks. In particular, the load balancing function has recently been investigated, and in this work we consider a distributed algorithm that can be executed autonomously by each Access Point. The contribution of this work, besides the definition of the algorithm, consists of an extensive evaluation of its performance. In particular, we study how the main issues in distributed self-configuring applications (namely stability, convergence, and overhead) can be controlled.
I. INTRODUCTION
Wireless technologies opened the way to seamless and ubiquitous service delivery. The recent growth of the wireless market enables new ways of communicating. Similar to the computing industry some time ago, this explosive growth results in a confusion of standards and technologies. Some technologies are used in areas they were not designed for, and the market demands new applications. The vision of the Ambient Networks project [1] is to harmonize the jungle of wireless technologies into a single cooperative as well as competitive environment, where the flexibility of the wireless access medium is exploited with the concept of “composition”: wireless networks can compose vertically, between different technologies (e.g. PAN and WLAN), and horizontally, between different administrative domains. The main efforts are directed towards the design of advanced mechanisms to control wireless networks, and self-configuring methods are a way to simplify management tasks in such environments.
Today, the management of wireless networks is based on a canonical centralized approach, where base stations are controlled centrally by a management station or, in many cases, by a dedicated controller: for example, the Radio Network Controller (RNC) in 2G/3G networks or a wireless switch in WLANs.
This architecture responds to the classical vision of service providers, which require a central controller to access the whole network. This approach maximizes the robustness of the management processes, but unfortunately ignores many efficiency aspects. In this paper we investigate a distributed approach to perform configuration tasks in a wireless scenario, where the implementation and simulation scenarios make some assumptions close to the WLAN architecture and mechanisms. The distributed approach can be successfully adopted in wireless networks to gain important benefits. The most relevant is a simplification of management tasks, by delegation of many configuration processes to the base stations; this translates immediately into a reduction of the costs of maintaining a wireless network, since the operation of the centralized management processes is not needed anymore.
In this paper, we concentrate on a particular application of wireless network management, which concerns, in general, Radio Resource Control (RRC). In particular, we consider how a load balancing mechanism can be deployed in a wireless network through a self-configuring application in the base stations. RRC is a fundamental task in wireless networks, because the wireless channel is a precious resource and should be subjected to a continuous optimization process during service provisioning. The delegation of configuration tasks to a self-configuring application can then bring effective benefits in wireless network management.
Despite its promising features, a self-configuring mechanism does not come without technical challenges that must be carefully evaluated: in short, self-configuring nodes make autonomous decisions and the overall behavior of the global network can become unpredictable. The main risk is that the autonomous decisions can lead to a non-optimal configuration or can even produce unstable configurations, leading to oscillations in the management domain under control. Another side effect can be the overhead introduced by the propagation of management information. These effects must be investigated before network providers adopt self-configuring solutions in their operational networks. Similarly, the monitoring of and intervention in self-configuring systems [20] is another tool for improving and checking the correctness of the operation, but it might already be too late to change the device in the network.
In this paper a self-configuring approach for load balancing in wireless networks is designed and evaluated. The main contributions of this work reside, besides the design of the distributed load balancing algorithm, in a step-by-step design and evaluation of the distributed architecture between base stations. We start with a basic algorithm and at each step we first evaluate the behavior of the architecture and then optimize the algorithm. The evaluation is conducted through extensive simulations and numerical results. The results assume some features that exist in WLAN. Therefore, the results are valid for the particular WLAN use case, but the step-by-step approach adopted in this paper offers a common methodology for other technologies as well as for other self-configuring applications in the area of configuration management in distributed networks. The final goal is to provide a methodology to deploy self-configuring applications for configuration tasks that are unaffected by the aforementioned drawbacks of distributed systems.
This document is a byproduct of the Ambient Networks Phase 2 project, partially funded by the European Commission under its Sixth Framework Program. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Ambient Networks project or the European Commission.
The authors are aware of a multitude of solutions for load balancing in WLAN. They are reported in the related work section. The interest of this work focuses on issues related to distributed self-configuration processes in a wider scope than the application mentioned. Therefore the design phase is based on realistic assumptions that do not claim to solve the problem of load balancing in WLANs, but provide a realistic use case for a distributed self-configuring application in general. A different approach to the problem, based on a theoretic model, can be found in [12]. Nevertheless this work dedicates marginal attention to the enforcement of the optimal algorithm into a real deployment scenario, but focuses on the issues around the more general case.
The remainder of this paper is organized as follows. Section II reviews the related work concerning self-configuration in network devices and load balancing techniques in WLANs. Section III states the problem of bringing load balancing into self-configuring base stations. Section IV contains the definition of the algorithm. Section V provides a step-by-step evaluation of the solutions proposed. Section VI concludes with final remarks.
II. RELATED WORK
This paper studies the application of self-configuring algorithms to management tasks in wireless LANs. We summarize the related work in two distinct parts: work related to self-configuration and work related to radio management in WLANs.
A. Related work on self-configuration in networks
Most of the work concerning self-configuration in networks has been done at the service level under the umbrella of autonomic computing. In this field, a first attempt to evaluate the implications of a control system at the service level is presented in [6]. In particular, the authors evaluate the costs that are generated by load balancing in a database system and evaluate the trade-off between the goals of the load balancing algorithm and the costs in the network; the evaluation is performed through extensive simulations with different parameterizations of the control system. Our work basically follows this approach, but it focuses on what happens at the level of network devices. The main difference is that the costs are computed in terms directly affecting the end services to users. This has an impact on the control algorithm itself.
One of the first proposals for self-management in wireless base stations was made in [7]. That work reports the possible applications of self-management in APs, and then analyzes an architecture to propagate information between them. The distributed load balancing presented in this paper is based on the same propagation model, where “local” information is exchanged between neighboring APs to collaborate on a management task.
The work in [8] discusses the adoption of autonomic computing principles to wireless networks. The paper is mainly descriptive, but justifies why autonomic principles can bring advantages into wireless network management. Besides, they show the results for admission control, comparing a centralized against a distributed approach. The results reveal better performance of the distributed approach and the authors indicate as contributing reasons the better coordination and optimization of the distributed scheme.
The work in [9] introduced the concept of distributed management. The paper evaluates only the impact of introducing distribution, based on the concept of management by delegation. This line of work later resulted mainly in hierarchical monitoring. The authors of [10] present a formalism to describe an autonomic model through the use of sets, but do not provide a real application scenario.
B. Related work in load balancing of WLANs
Load balancing in WLANs has recently received a lot of interest. This is mainly motivated by the prospect of integrated access to multimedia services from every type of access network; in this scenario WLANs must offer the same reliability and the same management capabilities as traditional telecom networks.
The work in [11] analyzes the problem of fair sharing of wireless resources among users. As a result, the authors provide an algorithm to perform an optimum association of users to their respective APs. A further study by these authors is presented in [12], where a load balancing algorithm is designed based on the cell breathing technique. The cell breathing technique is used as the reference architecture also in this work, but the work in [12] relies on a centralized architecture, where a Network Operation Center (NOC) controls the whole hotspot. Moreover, the work in [12] mainly concentrates on the mathematical correctness of the algorithm, but gives little space to the impact of the cell breathing technique on the service to end users.
The work in [13] already proposed a framework for congestion control in WLANs, but, again, it relied on a centralized architecture: its aim was the support of business models where user profiles were considered in the load balancing algorithm. A further extension has been proposed in [14] for the support of QoS extensions in WLANs.
III. PROBLEM FORMULATION
Self-configuration of a particular task needs to solve two key problems. First of all, the configuration problem needs to
be investigated and solved with a specific algorithm: in this paper, we address the load balancing functionality. Second, the application of the algorithm to a distributed scenario must be investigated.
The assignment of radio resources is a major task in wireless networks. Generally, a wireless network is deployed through a cycle of a few steps that include the estimation of the traffic load, planning of the wireless area, placement of the base stations and, finally, operation and maintenance during the operational period. In the latter phase, a Radio Resource Control function should maximize the use of wireless resources. In fact, over-provisioning is a practical solution for wired networks, but radio links are precious resources and wasting spare radio resources translates into an immediate loss of revenue, especially if the radio spectrum was acquired at a high price. The problem from an operator's perspective can be formulated as follows: given a certain wireless network topology (outcome of the planning phase), how can we maximize the usage of wireless resources?
The problem of optimizing wireless resources is that users appear in the physical area in an unpredictable manner: real measurements [15][16] reported that in a WLAN area the imbalance of the user distribution is remarkable. In these situations any planning of the wireless network is in vain, whereas a dynamic adjustment of the wireless resources can follow changes in the load distribution more efficiently. We should stress, however, that these situations should be treated as emergency cases: a well-planned wireless network performs well for most of the operational time, but occasionally mobile users create situations not covered in the original planning. A load balancing mechanism is then an efficient instrument to maximize the usage of resources in emergency situations like those just depicted. It is efficient because it can redistribute the load online and without a costly re-planning of the network: adding intelligence in the network is always cheaper than touching physical components. This explains why a lot of work on implementations of load balancing for wireless networks has appeared recently [11][12][13].
On the other hand, a control mechanism always introduces new challenges, and network operators are always skeptical about adopting such solutions in their operational networks. Routing protocols are a classical example of the benefits and potential risks of online resource control [17]. A routing protocol chooses the best route according to the current load of the different trunks. Switching the route through a different trunk alters in turn the load distribution, and this in turn can lead to another assignment, and so on. Similar scenarios are inherent to the behavior of self-managed systems, including a load-balancing mechanism. In fact, such control can potentially create a badly configured network and eventually create instability. These risks must be fully resolved before operators will deploy a local control mechanism in an operational network.
In the following two sub-sections, we outline the scope of this work. First we outline the particular application of load balancing we investigate, and then we outline the potential risks of implementing this management task as a distributed self-configuring application.
A. Goals and control space
There are two goals that can be achieved by an algorithm controlling user associations. The first one is to achieve full load balancing in the network, as proposed in [12]: if two base stations show different load, then this mechanism distributes the load as much as possible among neighboring base stations. The second goal is limited to performing congestion control, as in [13]: load is only transferred from one base station to another if it is near the maximum load admitted for that specific base station. In considering a choice between the two mechanisms, we should keep in mind that load distribution, independently of the goal achieved, does not come without side effects. The main drawback is that a transfer of load implies handovers of users, and therefore the load distribution directly affects the service delivered to users.
Therefore, the load balancing algorithm presented in this work uses the number of users being moved to another base station as a constraint that should be kept as little as possible. In other words, we try to minimize the cost associated with the move of a user. In WLANs, the association time to a new base station is time consuming and prevents the implementation of seamless handovers. In other networks, the costs are in double association due to make-before-break requirements.
As a result, we concentrate in this work on the second goal described above, i.e. the study of a load balancing algorithm for congestion control. We believe that a strict load balancing mechanism has little application in WLANs or similar wireless networks for two reasons. First, WLANs show high dynamicity in the load requests and load balancing cannot easily be achieved. Second, a strict load balancing mechanism requires a higher number of interventions and this is not consistent with the goal of minimizing the impact on service interruption. On the other hand, congestion control has the effect of increasing the acceptance rate of users, which is the primary metric in which we are interested, while minimizing the reassociations of users between APs.
As proposed in [12] and [13], the radio resource control relies on the cell-breathing technique to move the load between APs. It means the control knob is the radio cell coverage area, but it does not have knowledge of the position of the users. In other words, the congestion control can alter only the coverage area of the base stations, but it has no way to guess the effect of the actions to the network. This assumption holds true for inexpensive technologies like WLAN, so that our work can address real scenarios with off-the-shelf APs. Moreover, [12] already highlighted that power control in WLANs can be enforced by affecting only beacon frames: this assures the final effect of moving users from one base station to another one, while limiting the consequences at radio level. This approach is in fact very similar to the case of cellular networks. Even if this feature is not defined by the 802.11 specifications, authors of [12] prove that this can be supported through software extensions to most of the chipset drivers. In our simulation scenarios, the power is controlled by discrete steps. This is conformant with the definitions of the Management Information Base (MIB) provided by the standard [4], where each power step can be mapped to a specific power level expressed in dBm.
B. Challenges in distributed control
In general, the algorithm enforced by each network device performs certain actions in response to events; each action is directed towards achieving certain goals. Such a decision engine introduces a control loop: with respect to the load balancing application, the load is the input variable used to make decisions, and the actions generated in turn affect the current load in the network. The technical difficulties of this mechanism belong in general to the field of control theory [19]. When the control loop has not been well designed or an unexpected case appears in the network, the action might have an unwanted effect; in pathological cases the final service can deteriorate. The uncertainty about the effects of a control loop is the major obstacle to the adoption of an intelligent control mechanism like load balancing. Without a complete study of the potential risks of a load balancing algorithm, providers will not adopt it in operational networks.
Load control through cell-breathing techniques has several sources of uncertainty. The main issue is that positions of users are not known and therefore we cannot predict the effect of the action of transferring load. As a consequence, the action of transferring load can have no effect or the outcome might be even worse.
Additionally, an autonomous decision engine enforces single corrective actions in reaction to events, but we cannot predict the effects of a sequence of actions over a long period. Autonomous decisions can for instance generate oscillations in the assignment of radio resources; users would then experience continuous reassociations and, in turn, bad service.
IV. DEFINITION OF THE ALGORITHM
The algorithm is defined as a set of actions in reaction to certain events: in this way the self-configuring algorithm can be easily deployed inside a policy framework such as the IETF policy framework [18]. In general, we want to activate the load balancing when a new request arrives or allocated resources are released. To achieve a clear definition of the algorithm, we first identify the information domain that is under control and afterwards we provide the list of actions for the algorithm.
A. Information model
The first element to define in a distributed configuration system is the control space, i.e. the set of the information elements under control. The information model is made up of a few events that can be supported by any communication protocol between two access points; a straightforward implementation could use SNMP traps. Nevertheless, the purpose of this work is to integrate this control algorithm with the point-to-point protocol between two access points presented in the architecture of [7]. The information model required for a load balancing application is shown in Figure 1. Two actors are present in the network: the base stations and the users. A base station contains the two parameters under control; these parameters are in practical terms stored as an internal status of the base station. The load represents the observed information and will be the input element of the control algorithm. The power is instead the element controlled to move the load, since it controls the coverage area. We assume that the topology of the network is known to each base station, as described in [1] and implemented in [3]. A table with the neighboring base stations is maintained for synchronization; a local copy of the status of each neighbor is maintained and this copy needs to be updated by refresh messages between base stations.
In the following, we assume the events that are specified in the 802.11 specifications [4] and are normally supported in commercial WLAN APs. A set of users is associated with each access point and each user generates some load. Note that from a management perspective we are not interested in defining how this load is created. The association and disassociation events control the start and the end of the wireless connection of a user with an access point; the drop event occurs when a user leaves an access point; the reassociation event occurs when a user reassociates to another access point. The occurrences of these events are handled by the load balancing algorithm to make decisions.
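The information model above can be sketched as a small set of data types. This is a minimal illustration, not the authors' implementation: the class and field names are hypothetical, the load is assumed to be counted in users, and the +1/-1 accounting per event is an assumption (in particular, a reassociation is counted at the AP that receives the user).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Event(Enum):
    ASSOCIATION = "association"
    DISASSOCIATION = "disassociation"
    REASSOCIATION = "reassociation"
    DROP = "drop"


@dataclass
class NeighborStatus:
    """Local copy of one neighbor's state, updated by refresh messages."""
    load: int = 0
    power_step: int = 0


@dataclass
class AccessPoint:
    name: str
    load: int = 0        # observed input (assumed: number of associated users)
    power_step: int = 0  # controlled output (discrete transmit power step)
    neighbors: Dict[str, NeighborStatus] = field(default_factory=dict)

    def on_event(self, event: Event) -> None:
        """Update the local load; the +1/-1 accounting is an assumption."""
        if event in (Event.ASSOCIATION, Event.REASSOCIATION):
            self.load += 1
        else:  # DISASSOCIATION or DROP
            self.load = max(0, self.load - 1)

    def refresh(self, neighbor: str, load: int, power_step: int) -> None:
        """Store the status carried by a refresh message from a neighbor."""
        self.neighbors[neighbor] = NeighborStatus(load, power_step)
```

Keeping the neighbor table as a local copy makes explicit that every decision is taken on possibly outdated information, which is the property the evaluation later studies.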
B. Algorithm
The algorithm is built using the cases we want to control. Figure 2 shows the different cases that a load balancing mechanism in WLAN should cover. Please note that these cases represent only static conditions and the dynamic behavior in an operative network cannot be predicted in advance. We can then identify three different situations, which are recognized by the algorithm as different load thresholds:
• Maximum Load (ML): This threshold defines the limit of the resources in an AP. If an AP reaches this level, it starts to reject new requests.
• High Load (HL): This threshold defines a level close to ML, so that the load should be proactively transferred to avoid rejections to upcoming requests.
• Light Load (LL): This threshold defines a level far from the ML, so that the AP can safely receive load.
Figure 2 Use cases of actions for load balancing. [Labels in the figure: Case 1, Case 2, Case 3; Max Load (ML), High Load (HL), Light Load (LL); "distributed load"; "go to initial".]
Figure 1 Information model. [Labels in the figure: AccessPoint, Neighbor, Load, Power, User, Service, Mobility, Association, Disassociation, Reassociation, Drop; relations "generates" and "Associates to".]
The algorithm can then be built in two steps. The first step consists in recognizing the use cases described; afterwards specific actions are executed to handle the case discovered. Table 1 shows the list of rules designed by inspection of use cases. The first two rules identify the cases when an AP is under the High Load case (rule #1) and under the Light Load case (rule #2). Once one of these two cases is identified, a new internal event is launched, so that specific rules can be executed to handle it. The case of Maximum Load is not discussed, because it belongs to the Access Control mechanism of the AP that starts rejecting users in conditions of maximum load; such Access Control is generally programmed in the chipset as MAC primitives.
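Rules #1 and #2 can be sketched as a simple threshold check. The threshold values are the ones used later in the simulations (ML = 50, HL = 40, LL = 30 users); the function name and the choice of inclusive comparisons are assumptions made for illustration.

```python
from typing import Optional

MAX_LOAD = 50    # ML: the AP rejects new requests at this level
HIGH_LOAD = 40   # HL: load should be proactively transferred
LIGHT_LOAD = 30  # LL: the AP can safely receive load


def detect_case(event: str, load: int) -> Optional[str]:
    """Return the internal event launched by rule #1 or #2, if any.

    Assumption: thresholds are inclusive (load >= HL, load <= LL).
    """
    if event == "Associate" and load >= HIGH_LOAD:
        return "HeavyLoad"   # rule #1
    if event == "Leave" and load <= LIGHT_LOAD:
        return "LightLoad"   # rule #2
    return None              # no load balancing action is needed
```

Loads strictly between LL and HL trigger nothing, which gives the algorithm a neutral band where the configuration is left untouched.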
Rules 3 to 7 are instead used to handle the two internal events generated by the previous rules. In case of High Load (the HeavyLoad event), we enforce the necessary power adjustments: first we shrink the coverage area of the local AP, provided that full coverage of the area is still guaranteed (rule #3); then we send a request to the neighboring APs to enlarge their coverage areas (rule #4). This request is actually enforced by the receiving AP only if its Light Load condition is satisfied (rule #7). The case of Light Load is handled similarly by rules #5 and #6.
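Rules #3 to #7 of Table 1 can be expressed as an event-condition-action table. The sketch below is one possible encoding, not the authors' implementation: the function name is hypothetical, and how each condition (NoHole, BelowInit, LL) is evaluated is left to the caller, who passes the set of conditions that currently hold.

```python
from typing import List, Optional, Set, Tuple

# (rule number, event, condition or None, action), as listed in Table 1.
RULES: List[Tuple[int, str, Optional[str], str]] = [
    (3, "HeavyLoad", "NoHole", "ShrinkLocal"),
    (4, "HeavyLoad", None, "EnlargeNeighbours"),
    (5, "LightLoad", "BelowInit", "EnlargeLocal"),
    (6, "LightLoad", None, "ReleaseNeighbors"),
    (7, "ReqEnlarge", "LL", "EnlargeLocal"),
]


def actions_for(event: str, true_conditions: Set[str]) -> List[str]:
    """Return, in rule order, the actions triggered by an event.

    A rule fires when its event matches and its condition is either
    absent or contained in the set of conditions that currently hold.
    """
    return [
        action
        for _, rule_event, condition, action in RULES
        if rule_event == event
        and (condition is None or condition in true_conditions)
    ]
```

For example, `actions_for("HeavyLoad", {"NoHole"})` yields the local shrink followed by the request to the neighbors, while a `ReqEnlarge` received by an AP that is not lightly loaded yields no action.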
C. Oscillation avoidance
As the risk of oscillations is inherent, we discuss here how an oscillation avoidance mechanism can be adopted. Common methodologies in the autonomic area involve the use of control theory to smooth the intervention of the actions [18]. We argue here that this approach is not needed when implementing self-configuring approaches in network configuration.
In fact, the final objective of a management task in network devices is to optimize the final Quality of Experience (QoE) of users. Therefore, we are not directly interested in tuning the behavior of some variables in the system; instead the main objective is, during the execution of optimizing algorithms, to avoid problems at the service level.
We enforced an oscillation avoidance mechanism as a set of blocking conditions that stop the enforcement of an action. In this way, the algorithm is stopped when it would interrupt users’ service. As motivated in Section III “Problem Formulation”, we want to use this blocking mechanism to limit the number of reassociations users are subject to. We therefore define a “blocking window” as a time interval in which two opposite actions are blocked. In particular, the blocking conditions are defined as follows:
• Block a “shrink” command if an “enlarge” command falls inside the blocking window.
• Block an “enlarge” command if either an “enlarge” or a “shrink” command falls inside the blocking window.
• Block a “release” command (rule #6) if either an “enlarge” or a “shrink” command falls inside the blocking window.
Given the information model, some assumptions about the network from the WLAN world are taken into account in the design of the algorithm described.
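The three blocking conditions can be sketched as a small guard that remembers recently enforced commands. This is an illustrative sketch under stated assumptions: the class name, the per-AP command history, and the window length passed to the constructor are hypothetical, since the text does not fix them here.

```python
from typing import List, Tuple

# For each command, the recent commands that block it.
BLOCKED_BY = {
    "shrink": {"enlarge"},
    "enlarge": {"enlarge", "shrink"},
    "release": {"enlarge", "shrink"},
}


class BlockingWindow:
    def __init__(self, window: float) -> None:
        self.window = window  # length of the blocking window (assumed unit: minutes)
        self.history: List[Tuple[float, str]] = []  # (time, enforced command)

    def allowed(self, command: str, now: float) -> bool:
        """True if no blocking command falls inside the window ending at now."""
        recent = {c for t, c in self.history if now - t < self.window}
        return not (recent & BLOCKED_BY[command])

    def try_enforce(self, command: str, now: float) -> bool:
        """Record and allow the command, or block it and leave history as is."""
        if not self.allowed(command, now):
            return False
        self.history.append((now, command))
        return True
```

With a window of 5 time units, an "enlarge" at time 0 blocks a "shrink" at time 1, but the same "shrink" is allowed again at time 6.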
V. EVALUATION
The algorithm has been designed through an analysis of static cases, as explained in section IV.B: when one case is identified, the related actions are executed. The fulfillment of this design procedure does not give any guarantee on the correct behavior of the distributed network in a general sense. Correctness rather refers to the following aspects:
• Effectiveness of the algorithm in operative conditions: The algorithm should reduce the number of rejections under long term traffic conditions, which is the primary goal.
• Stability of the configuration: The algorithm should not only optimize the load distribution, but also minimize the number of re-configurations.
• Overhead introduced: a distributed algorithm consumes resources in terms of both processing capabilities on each AP and of traffic wastage due to the messages exchanged, which should be minimized.
Therefore, an evaluation of the dynamic behavior of the algorithm is necessary and we conducted a series of simulations described in the following paragraphs.
A. Simulation Model
Rule  Event       Condition  Action
1     Associate   HL         LaunchHeavyLoad
2     Leave       LL         LaunchLightLoad
3     HeavyLoad   NoHole     ShrinkLocal
4     HeavyLoad   -          EnlargeNeighbours
5     LightLoad   BelowInit  EnlargeLocal
6     LightLoad   -          ReleaseNeighbors
7     ReqEnlarge  LL         EnlargeLocal
Table 1 Rules of the load balancing algorithm.
In the simulation we considered a wireless hotspot of 25 APs, positioned in a grid structure as shown in Figure 3. We consider this setup to be a large hotspot, because we can stimulate high load in different areas of the network and study the effects of the algorithm also on nodes not directly affected by the change of the traffic pattern. Most of the time, users appear evenly distributed over the whole area, but we also consider 4 time windows where users appear unevenly distributed: in the first time window (between minutes 120-140) users appear more concentrated in the upper left area (between APs 1-2-5-9-8), in the second time window the concentration area appears between APs 13-14-21-20, then in the area 6-7-13-14 and finally in the area 12-13-15-16. The entire simulation lasts 420 minutes. The inter-arrival time during normal load conditions is set to 1 minute, while the inter-arrival time during the four time windows is set to 0.5 minutes. In both cases users stay connected for 10 minutes, which is a typical duration for telephone calls. The maximum load for the access points was set to 50 users, which is a tolerable load for WLANs. The values of the thresholds HL and LL have been set to 40 and 30 respectively; these values showed good performance during a set of preliminary simulations. The rationale for this setup is to simulate a large exhibition area, where visitors move in an unpredictable manner. The area is uniformly covered by a set of access points, but special events attract a concentration of users to some parts of it during different periods of time.
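The traffic pattern described above can be sketched as a simple arrival generator. Several points are assumptions, not statements of the paper: inter-arrival times are drawn from an exponential distribution whose mean is the stated value, only the first time window (minutes 120-140) is encoded because the others are not given in the text, and the sketch does not model where in the area a user appears. All names are hypothetical.

```python
import random
from typing import List, Tuple

DURATION = 420.0     # simulated minutes
NORMAL_GAP = 1.0     # inter-arrival time under normal load (minutes)
BUSY_GAP = 0.5       # inter-arrival time inside a time window (minutes)
HOLD_TIME = 10.0     # time a user stays connected (minutes)
WINDOWS = [(120.0, 140.0)]  # only the first window is stated in the text


def mean_gap(t: float) -> float:
    """Mean inter-arrival time at simulation time t."""
    if any(start <= t < end for start, end in WINDOWS):
        return BUSY_GAP
    return NORMAL_GAP


def arrivals(seed: int = 0) -> List[Tuple[float, float]]:
    """Return (arrival, departure) times over the whole simulation.

    Assumption: exponentially distributed gaps around the stated means.
    """
    rng = random.Random(seed)
    t, sessions = 0.0, []
    while True:
        t += rng.expovariate(1.0 / mean_gap(t))
        if t >= DURATION:
            return sessions
        sessions.append((t, t + HOLD_TIME))
```

Using a seeded `random.Random` instance keeps runs reproducible, which helps when comparing the central and distributed variants on the same traffic.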
At the beginning we evaluate the algorithm as if it were implemented in a central controller: we assume that the information is contained in a central repository and there is no collaboration between nodes. At this step, we capture the performance metrics of the central algorithm and we take this as the starting point to evaluate the distributed architecture. In the simulations of the distributed architecture, the information is then distributed in different objects (i.e. the states of the APs) and a propagation algorithm between neighboring APs is applied. The algorithm takes a propagation delay of 500 ms into account: this can be regarded as the worst-case delay of communication between APs. From this stage on, we concentrate on the distributed implementation of the algorithm: we analyze the side effects of the distributed architecture and we then enforce the mechanism to increase its efficiency.
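The effect of the 500 ms propagation delay can be illustrated with a neighbor view that only exposes a status update once the delay has elapsed. This is a sketch under assumptions: the class and method names are hypothetical, time is expressed in seconds, and every message is assumed to experience exactly the worst-case delay.

```python
import heapq
from typing import Dict, List, Tuple

PROPAGATION_DELAY = 0.5  # seconds (500 ms, the worst case assumed in the text)


class NeighborView:
    """Local copy of neighbor loads, fed by delayed status updates."""

    def __init__(self) -> None:
        self._pending: List[Tuple[float, int, str, int]] = []  # min-heap by delivery time
        self._seq = 0  # tie-breaker so equal delivery times keep send order
        self.loads: Dict[str, int] = {}

    def publish(self, neighbor: str, load: int, sent_at: float) -> None:
        """Queue a status update sent by a neighbor at time sent_at."""
        deliver_at = sent_at + PROPAGATION_DELAY
        heapq.heappush(self._pending, (deliver_at, self._seq, neighbor, load))
        self._seq += 1

    def read(self, now: float) -> Dict[str, int]:
        """Apply all updates delivered by now and return the local copy."""
        while self._pending and self._pending[0][0] <= now:
            _, _, neighbor, load = heapq.heappop(self._pending)
            self.loads[neighbor] = load
        return dict(self.loads)
```

An AP that reads its view within the delay still sees the previous load of the neighbor, which is how decisions based on outdated information arise in the distributed case.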
B. Results
Results are shown both as tables summarizing the performance metrics of the service delivered to users and
graphs showing the dynamics of the algorithm. The performance metrics are the number of requests accepted, the number of those rejected and the number of users moved from one AP to another one. To increase readability of the results, we report only the values of the APs involved in the first time window (i.e. APs 1, 2, 3, 5, 6, 9, 10, and 13) and the totals of all APs.
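As a reading aid for the tables that follow, the rejection percentages of Table 2 appear consistent with the ratio of rejected requests to all requests (accepted plus rejected). This formula is inferred from the reported numbers rather than stated in the text, and the function name is hypothetical.

```python
def rejection_rate(accepted: int, rejected: int) -> float:
    """Percentage of requests rejected, out of accepted + rejected.

    Assumption: this is the ratio behind the percentages in Table 2.
    """
    requests = accepted + rejected
    return 100.0 * rejected / requests if requests else 0.0


# AP5 in Table 2: 284 accepted, 97 rejected.
print(round(rejection_rate(284, 97), 2))    # 25.46
# Totals in Table 2: 5110 accepted, 351 rejected.
print(round(rejection_rate(5110, 351), 2))  # 6.43
```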
First we report the results of the central case and we make a comparison with the distributed approach to highlight the effect of the propagation of information in the network. The next steps consist in an optimization of the algorithm. First of all we show the presence of oscillations in the network and we enable the oscillation avoidance mechanism. Secondly we measure the overhead of the communication between APs and we change the propagation mechanism to reduce the number of messages exchanged.
1) Unbalanced network
As already clarified, we run a preliminary simulation to take a snapshot of the performance metrics without any load balancing algorithm. Results are reported in Table 2, which shows the effect of the unbalanced load distribution in terms of requests rejected.
2) Distributed algorithm
We run a first version of the distributed algorithm as defined in Table 1 to understand its behavior with respect to the central case. The difference is that now the status of neighbors is stored locally by each AP and must be updated regularly. As explained before, a delay of 500 ms is introduced in the status updates to take the transmission delay into account. Table 4 reports these results.
The totals reveal that the algorithm is slightly less effective than in the central case (Table 3). Since each AP receives the updated status of the other APs with a certain delay, corrective actions cannot be taken in time and the algorithm loses effectiveness: this is clearly shown in the totals of Table 4. On the other hand, the figures of single APs show a different behavior: for example, APs 5 and 13 register fewer rejected requests. The fact that a single AP registers better metrics should be regarded as a statistical deviation, caused by the AP deriving its configuration from outdated information. Over the entire simulation, the totals show worse values.
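The effect of delayed status updates can be illustrated with a minimal sketch of the local neighbor table held by one AP. The class `NeighborView` and its methods are hypothetical and are not taken from the simulator; only the 500 ms delay comes from the text. An update sent by a neighbor becomes visible only after the delay has elapsed, so any decision taken in between relies on the previous value.

```python
from dataclasses import dataclass, field

PROPAGATION_DELAY_S = 0.5  # 500 ms worst-case delay, as assumed in the text


@dataclass
class NeighborView:
    """Local copy of the neighbors' loads held by one AP (hypothetical)."""
    known: dict = field(default_factory=dict)      # neighbor id -> last delivered load
    in_flight: list = field(default_factory=list)  # (arrival time, neighbor id, load)

    def send(self, now: float, neighbor: str, load: int) -> None:
        """A neighbor emits a status update at time `now`."""
        self.in_flight.append((now + PROPAGATION_DELAY_S, neighbor, load))

    def deliver(self, now: float) -> None:
        """Apply, in arrival order, every update that has arrived by `now`."""
        arrived = sorted(m for m in self.in_flight if m[0] <= now)
        self.in_flight = [m for m in self.in_flight if m[0] > now]
        for _, neighbor, load in arrived:
            self.known[neighbor] = load


view = NeighborView()
view.send(0.0, "AP5", 20)
view.deliver(0.3)                  # too early: AP5 is still unknown locally
stale = dict(view.known)
view.deliver(0.5)                  # the update has now arrived
fresh = dict(view.known)
```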
AP      Accepted   Rejected        Dropped
AP1     141        3 (2.08%)       0 (0%)
AP2     188        5 (2.59%)       0 (0%)
AP3     140        0 (0%)          0 (0%)
AP5     311        41 (11.65%)     36 (11.57%)
AP6     265        0 (0%)          0 (0%)
AP9     284        5 (1.73%)       0 (0%)
AP10    330        37 (10.08%)     22 (6.67%)
AP13    306        17 (5.26%)      6 (1.96%)
Total   5362       169 (3.15%)     78 (1.45%)
Table 3 Central algorithm.
AP      Accepted   Rejected        Dropped
AP1     141        8 (0.01%)       0 (0%)
AP2     185        5 (0.01%)       0 (0%)
AP3     134        0 (0%)          0 (0%)
AP5     288        36 (0.01%)      15 (5.2%)
AP6     255        0 (0%)          0 (0%)
AP9     278        12 (0.01%)      0 (0%)
AP10    326        53 (0.01%)      17 (5.21%)
AP13    303        12 (0.01%)      0 (0%)
Total   5311       209 (3.79%)     66 (1.24%)
Table 4 Distributed algorithm with immediate propagation.

AP      Accepted   Rejected
AP1     131        0 (0%)
AP2     166        0 (0%)
AP3     134        0 (0%)
AP5     284        97 (25.46%)
AP6     246        0 (0%)
AP9     268        2 (0.74%)
AP10    313        86 (21.55%)
AP13    291        3 (1.02%)
Total   5110       351 (6.43%)
Table 2 Effects of load unevenly distributed.

Figure 3 Network model for simulations (25 APs, numbered 1 to 25).
The dynamics of the algorithm are shown in Figure 3, which reports the power levels as a function of time. AP5 receives most of the load and is therefore the only AP working with a reduced coverage area with respect to the initial condition; the other APs instead use more transmission power to capture part of the load. The figure shows that the APs enforce frequent commands and that oscillations are visible throughout the graph. This happens because load balancing alters the load arriving at each AP, and the new load distribution can in turn trigger new actions in the algorithm. Oscillations in the control parameter should be regarded as a drawback of a control algorithm, because continuous configuration changes can degrade the service delivered to customers. In the case of load balancing, the continuous change is reflected in a high number of reassociations caused by the power adjustments: for example, AP5 shows 15 reassociations in a period of 20 minutes, although we would expect adjustments only at the beginning of the time window.
To improve the behavior of the algorithm, the next section shows the effectiveness of a windowing mechanism for avoiding oscillations.
3) Oscillation avoidance
The impact of oscillations can be smoothed by the windowing mechanism described in section IV.C. In essence, we block adjustments during a certain time window. The effect on the service level is that users have to reassociate less frequently. On the other hand, the blocking window must be tuned so that the algorithm can still adapt to changes of the load distribution in the long term. Note that other mechanisms can be adopted, e.g. a control loop that estimates the optimal transmission power.
To achieve both effects, we used two different time windows, depending on the working conditions of the AP. If the AP is working near its initial state, a small blocking window is used: the rationale is that in the initial state we want the AP to respond promptly to changes in the network. If the state of an AP reaches its border values, a larger blocking window is used to impose a stricter limit on the oscillations. The two working conditions are defined as follows: an AP is working near its initial state if its transmitting power is close to its initial value.
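The two-window rule can be sketched as follows. This is only an illustration under stated assumptions: the class `BlockingWindow`, the threshold `near_initial_steps`, and the window lengths of 60 s and 300 s are hypothetical values chosen for the example and are not the settings used in our simulations. For simplicity the sketch applies the large window whenever the power is not close to its initial value.

```python
from dataclasses import dataclass


@dataclass
class BlockingWindow:
    """Blocks power adjustments for a state-dependent time window (hypothetical)."""
    initial_power: int
    near_initial_steps: int = 1      # assumed: "close" means within one power step
    short_window_s: float = 60.0     # assumed window near the initial state
    long_window_s: float = 300.0     # assumed window away from the initial state
    last_change: float = float("-inf")

    def window_for(self, power: int) -> float:
        """Select the blocking window from the current power step."""
        near = abs(power - self.initial_power) <= self.near_initial_steps
        return self.short_window_s if near else self.long_window_s

    def try_adjust(self, now: float, current_power: int) -> bool:
        """Return True if an adjustment is allowed at time `now`, and record it."""
        if now - self.last_change < self.window_for(current_power):
            return False  # still inside the blocking window
        self.last_change = now
        return True


bw = BlockingWindow(initial_power=5)
decisions = [
    bw.try_adjust(0.0, 5),     # first adjustment is always allowed
    bw.try_adjust(30.0, 5),    # blocked by the short window
    bw.try_adjust(60.0, 5),    # short window has elapsed
    bw.try_adjust(200.0, 9),   # far from initial power: long window applies
    bw.try_adjust(360.0, 9),   # long window has elapsed
]
```

Choosing the window from the current state rather than from the load keeps the rule local to the AP: no additional information from neighbors is needed to decide whether an adjustment is blocked.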
The analysis in section IV.B, as well as our simulations, clearly indicated the need for these two different blocking windows. A window that is too small did not mitigate all the oscillations. Conversely, a window that is too large caused poor performance when users were unevenly distributed in the network. This was evident, for example, after minute 120, because the algorithm was not able to move load away from the overloaded AP.
The behavioral graph is shown in Figure 5. The graph reveals that the oscillations are now mostly mitigated. The numerical results are shown in Table 5: the total number of reassociations is significantly reduced, while the performance of the algorithm in terms of rejections remains stable.
4) Propagation Filtering
To evaluate the overhead imposed by the distributed algorithm, we count the number of messages sent and received and study the effects of a filter applied to the propagation messages. Table 4 shows the results measured when the new status of an AP is propagated on every change. The number of packets generated is relevant for two reasons. First, the packet generation at the sending AP and the status update at the receiving AP have an impact on the computational resources of the devices. Second, the bandwidth available on the uplink interface of an AP can be limited, e.g. in a hotspot deployed as a mesh network. Therefore, we try to reduce the overhead introduced by the collaborative algorithm and we study the effects.

Figure 3 Power adjustments in the first version (power step versus time in minutes, from minute 120 to 140; curves for AP1, AP2, AP3, AP5, AP6, AP9, AP10, AP13).

AP      Accepted   Rejected        Dropped
AP1     141        10 (0.01%)      0 (0%)
AP2     179        4 (0.01%)       0 (0%)
AP3     130        0 (0%)          0 (0%)
AP4     51         0 (0%)          0 (0%)
AP5     284        39 (0.01%)      5 (1.76%)
AP6     272        0 (0%)          0 (0%)
AP9     274        11 (0.01%)      0 (0%)
AP10    307        20 (0.01%)      0 (0%)
AP13    311        14 (0.01%)      0 (0%)
Total   5277       197 (3.6%)      14 (0.26%)
Table 5 Figures with oscillations avoidance.

Figure 5 Effect of oscillation avoidance (power step versus time in minutes, from minute 116 to 140; curves for AP1, AP2, AP3, AP5, AP6, AP9, AP10, AP13).
The overhead can be reduced by introducing a filter in the propagation process: update messages generated during a certain time window are aggregated and sent only at the end of that period. The benefit is that fewer packets are generated. The drawback is that update messages can now be delayed, so each node might operate on outdated information about its neighbors.
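A minimal sketch of such a filter is given below. It assumes that aggregation means keeping only the most recent status of the window, which is one possible interpretation; the class `PropagationFilter` and its method names are hypothetical and do not reproduce the simulator code.

```python
class PropagationFilter:
    """Aggregates status updates and emits one message per time window (hypothetical)."""

    def __init__(self, period_s: float):
        self.period_s = period_s
        self.window_end = period_s   # end of the current filtering window
        self.pending = None          # latest status not yet propagated
        self.sent = []               # (send time, status) pairs actually emitted

    def update(self, now: float, status: int) -> None:
        """Record a status change; it replaces any pending one (assumed aggregation)."""
        self.flush(now)
        self.pending = status

    def flush(self, now: float) -> None:
        """Emit the pending status at the end of every window that has elapsed."""
        while now >= self.window_end:
            if self.pending is not None:
                self.sent.append((self.window_end, self.pending))
                self.pending = None
            self.window_end += self.period_s


f = PropagationFilter(period_s=60.0)
for t, load in [(5.0, 10), (20.0, 12), (50.0, 11), (70.0, 13)]:
    f.update(t, load)
f.flush(200.0)
messages = f.sent  # four status changes result in two messages
```

The neighbors see the value 11 only at second 60, although it was produced at second 50; this delay is the price paid for the reduced number of messages.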
We evaluated two filtering periods: one minute and three minutes. The effect is visible in the graphs of the dynamics in Figure 4. The dynamics are slightly delayed with respect to the previous case: this is evident at the instant when the line of the highest load starts to decrease. Compared to the previous case, shown in Figure 5, the propagation filter delays the effect of the algorithm.
Table 6 and Table 7 report the numeric values obtained for the two cases: releasing the blocked messages every one minute and every three minutes. As expected, the number of messages is heavily reduced, because we are applying a filter. On the other hand, the results show that the performance of the algorithm is only marginally affected. This is because the input process, i.e. the number of users, generally changes on a time scale longer than the filtering window. Therefore the algorithm does not wrongly guess the current status of its neighbors. This is true most of the time, i.e. when the user traffic is almost constant; when there is an abrupt change in the status of the network, instead, the knowledge of the neighbors’ statuses can be outdated. This behavior becomes visible in the graphs of the dynamics, as explained before. The increase in rejections registered in Table 6 and Table 7 is in fact generated mainly in this period.
VI. CONCLUSIONS
In this paper we analyzed the application of self-configuration to radio resource control in WLANs for the load balancing task. The purpose of the paper is to study the effects that typically affect self-configuring applications, in particular instability and overhead. Load balancing in wireless networks can be implemented in different ways; in this paper we chose the most generic approach, based on the “cell-breathing” technique.
The distributed control algorithm was designed through an analysis of static use cases. Applying such an algorithm to a dynamic scenario can therefore lead to unpredictable behavior. Indeed, the simulations showed that under high traffic load the algorithm enforces frequent corrective actions and the performance at service level (i.e. reassociations of users) is negatively affected. Unlike other approaches based on results from control theory, we mitigate the instability with a blocking mechanism based on time windows. Its purpose is not to follow exactly the behavior of the algorithm, but to mitigate its effects on the quality of experience of end users. For this reason the analysis was conducted with service-centric metrics, such as the number of rejections and reassociations in the network.
The overhead of the algorithm was evaluated through the number of messages exchanged between nodes. An uncontrolled propagation of statuses between base stations can be intolerable for constrained uplinks, as in mesh networks. Therefore, we also proposed a filtering mechanism that limits the propagation of status updates. The analysis in this case aimed at evaluating the impact of the filter on the efficiency of the algorithm, where efficiency was again measured with service-level metrics. The results showed that, for the load balancing task, a good trade-off between timeliness of status updates and overall efficiency of the algorithm can be achieved.
Figure 4 Filter of 1 minute (left) and 3 minutes (right). Both panels plot load versus time in minutes, from minute 120 to 128, for AP1, AP2, AP3, AP5, AP6, AP9, AP10, AP13.
AP      Accepted   Rejected        Dropped       Sent     Received
AP1     135        8 (0.01%)       0 (0%)        309      1452
AP2     184        3 (0.01%)       0 (0%)        418      2538
AP3     134        0 (0%)          0 (0%)        346      2507
AP4     51         0 (0%)          0 (0%)        123      1253
AP5     289        52 (0.01%)      13 (4.5%)     624      3023
AP6     267        0 (0%)          0 (0%)        619      4080
AP9     277        6 (0.01%)       0 (0%)        645      4814
AP10    307        18 (0.01%)      0 (0%)        704      4455
AP13    299        8 (0.01%)       0 (0%)        702      5192
Total   5244       230 (4.2%)      14 (0.27%)    12194    76890
Table 6 Filtering of 1 minute.
AP      Accepted   Rejected        Dropped       Sent     Received
AP1     139        10 (0.01%)      0 (0%)        312      1412
AP2     180        5 (0.01%)       0 (0%)        395      2502
AP3     138        0 (0%)          0 (0%)        357      2454
AP4     53         0 (0%)          0 (0%)        127      1240
AP5     282        44 (0.01%)      7 (2.48%)     607      2961
AP6     253        0 (0%)          0 (0%)        584      4046
AP9     277        4 (0.01%)       0 (0%)        647      4760
AP10    322        59 (0.01%)      0 (0%)        736      4353
AP13    294        3 (0.01%)       0 (0%)        694      5134
Total   5215       254 (4.64%)     8 (0.15%)     12065    76079
Table 7 Filtering of 3 minutes.
REFERENCES
[1] N. Niebert, et al, “Ambient networks: An architecture for communication networks beyond 3G”, IEEE Wireless Communications, vol. 11, pp. 14-22, April 2004.
[2] Peter K. Ibach, Tobias Hübner, Martin Schweigert, MagicMap – Kooperative Positionsbestimmung über WLAN, Chaos Communication Congress, Berlin, 27-29 December 2004.
[3] MagicMap, http://www2.informatik.hu-berlin.de/rok/MagicMap/download.htm.
[4] IEEE Std. 802.11, 1999 Edition (ISO/IEC 8802-11: 1999).
[5] GF Franklin, JD Powell, and A Emami-Naeini, Feedback Control of Dynamic Systems, 1993.
[6] Yixin Diao, Joseph L. Hellerstein, Adam Storm, Maheswaran Surendra, Sam Lightstone, Sujay Parekh, and Christian Garcia-Arellano. Incorporating cost of control into the design of a load balancing controller. Invited paper, Real-Time and Embedded Technology and Application Systems Symposium, 2004.
[7] Kai Zimmermann, Lars Eggert, Simon Schuetz and Marcus Brunner. Self-Management of Wireless Base Stations. IST Mobile Summit, Mykonos, Greece, June 2006.
[8] Shen, C. Pesch, D. Irvine, J., A framework for self-management of hybrid wireless networks using autonomic computing principles, Communication Networks and Services Research Conference, 2005. Proceedings of the 3rd Annual.
[9] Mohsen Kahani and Peter Beadle. Decentralized Approaches for Network Management. ACM Computer Communication Review Journal, 1997.
[10] Wang Fei Li Fan-Zhang, The design of an autonomic computing model and the algorithm for decision-making, Granular Computing, 2005 IEEE International Conference on.
[11] Y. Bejerano, S.-J. Han, and L. Li, “Fairness and Load Balancing in Wireless LANs Using Association Control”, Proceedings of MobiCom 2004, Philadelphia, PA, September 2004.
[12] Yigal Bejerano, Seung-Jae Han, “Cell Breathing Techniques for Load Balancing in Wireless LANs”, Infocom 05.
[13] Sandra Tartarelli and Giorgio Nunzi, QoS Management and Congestion Control in Wireless Hotspots, Proceedings of NOMS06, Vancouver, CA, 2006.
[14] Giorgio Nunzi, Sandra Tartarelli, Luca Vollero, QoS aware load balancing in 802.11e hotspots, WTC 06, Budapest, Hungary, 2006.
[15] M. Balazinska and P. Castro, Characterizing mobility and network usage in a corporate wireless local-area network. In Proc. USENIX MobiSys, 2003.
[16] T. Henderson, D. Kotz and I. Abyzov, The Changing Usage of a Mature Campus-wide Wireless Network. In Proc. ACM MobiCom 2004.
[17] C. Labovitz, R. Malan, and F. Jahanian, Origins of pathological Internet routing instability, in Proc. IEEE INFOCOM, 1999.
[18] Moore, B., et al., “Policy Core Information Model”, RFC 3060, IETF, Feb 2001.
[19] JL Hellerstein, Y Diao, S Parekh, and DM Tilbury, Feedback Control of Computing Systems, Wiley, 2004.
[20] Giorgio Nunzi, Marcus Brunner, Simon Schuetz, Generic Monitoring and Intervention on Self-Configuring Networks, Application Session, NOMS'06, Vancouver, CA, 2006.