Chapter 2 LITERATURE REVIEW
2.5 Spatial Join Systems
2.5.3 Distributed spatial join systems
2.5.3.1 Load balancing for distributed systems
The load balancing problem is long established in several fields of computer science,
including networking and HPC. Researchers have approached this problem from different
angles, ranging from practical and experimental perspectives to theoretical ones. In this
section, we summarize some of the most significant theoretical load balancing works. We
point out their pros and cons and their limitations compared to our proposed framework.
A general static load balancing model for job scheduling over a distributed computer
network, which minimizes the mean response time of a job, is proposed in [68]. The authors
formulated the load balancing task as a nonlinear optimization problem with n(n + 1)/2
variables, where n is the number of nodes. An optimal solution is presented using a Lagrange
multiplier approach. They also provided two efficient algorithms that determine the optimal
load for each host: a parametric-study algorithm, which generates the optimal solution as a
function of communication time, and a single-point algorithm, which gives the optimal
solution for given system parameters, including communication time. The framework has some
limiting assumptions. First, it does not consider the problem of partitioning a big job into
small tasks. In fact, it assumes that all nodes have the same processing capabilities and
that a job may be processed completely at any node in the system. However, in the big data
era, with jobs processing huge volumes of data, this is an impractical assumption. Second,
the model assumes that communication between nodes is one-way: if node A transfers a job to
node B, no job can be sent from B to A. Third, as the name implies, the framework is static:
the decision to transfer a job from one node to another does not depend on the state of the
system.
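The Lagrange multiplier idea can be illustrated on a much simpler problem than the one in [68]. The sketch below is not the model of [68]: it assumes each node is an independent M/M/1 queue with service rate mu[i], ignores communication delay entirely, and splits a total arrival rate across the nodes so that the mean response time is minimized. The function names and the queueing assumption are hypothetical choices made only for illustration.

```python
import math


def optimal_static_loads(mu, total_rate):
    """Split total_rate across M/M/1 nodes to minimize mean response time.

    Assumed toy model (not the formulation of [68]): no communication delay.
    Stationarity of the Lagrangian gives mu_i / (mu_i - b_i)^2 = alpha for
    every node that receives load, i.e. b_i = mu_i - c * sqrt(mu_i), where
    c = 1 / sqrt(alpha) is fixed by the constraint sum(b_i) = total_rate.
    """
    if total_rate < 0 or total_rate >= sum(mu):
        raise ValueError("total_rate must be in [0, sum(mu))")
    # Nodes ordered from fastest to slowest; slow nodes may receive no load.
    active = sorted(range(len(mu)), key=lambda i: mu[i], reverse=True)
    while True:
        c = (sum(mu[i] for i in active) - total_rate) / sum(
            math.sqrt(mu[i]) for i in active
        )
        slowest = active[-1]
        if mu[slowest] - c * math.sqrt(mu[slowest]) >= 0:
            break
        # A negative load is infeasible: drop the slowest node and re-solve.
        active.pop()
    loads = [0.0] * len(mu)
    for i in active:
        loads[i] = mu[i] - c * math.sqrt(mu[i])
    return loads


def mean_response_time(mu, loads):
    """Mean response time of a job under the assumed M/M/1 model."""
    total = sum(loads)
    return sum(b / (m - b) for m, b in zip(mu, loads)) / total


if __name__ == "__main__":
    mu = [9.0, 4.0]
    loads = optimal_static_loads(mu, 6.0)
    print(loads, mean_response_time(mu, loads))
```

With equal service rates the sketch returns an equal split, and with unequal rates it shifts load toward faster nodes and may leave the slowest nodes idle. The decision depends only on the fixed parameters mu and total_rate, not on the current queue lengths, which is what makes such a policy static.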
[69] proposed a framework for load balancing iterative computations in clusters with
heterogeneous computing nodes. The model assumes that application data is already
partitioned between the processing nodes, which form a virtual ring. In each iteration, the
computation involves independent calculations carried out in parallel on each node, followed by
to select a subset of processing nodes from all nodes and balance the load between them
such that execution time is minimized, even though these nodes are not fully connected and
pairs may share physical communication links due to resource limitations. The model considers
two scenarios: 1) SharedRing, in which several messages may share a link, and 2) SliceRing,
in which dedicated links are used for communication. Some heuristic algorithms are provided
in this work to solve these optimization problems. The main limitation of this model is that
it is suitable only for iterative computations with local exchange of partial results in
each step. In fact, it is limited to ring topologies. Moreover, the problem of big data I/O
and the initial partitioning of data between nodes is not addressed.
Dynamic load balancing on message-passing multiprocessors has been studied in [70]
using diffusion schemes. The authors provided convergence conditions as well as the
convergence rate for arbitrary topologies. A hypercube network analysis is provided as a case
example, and they showed that the diffusion approach to load balancing on a hypercube
topology of multiprocessors is inferior to the dimension exchange method. This well-presented
model has several limitations. First, it quantifies work in terms of tasks and assumes that
all tasks require an equal amount of computational time; nonuniform task partitioning of
heterogeneous data is not addressed. Second, although the model is designed for any topology,
it does not consider spatially related tasks, which means it is not suitable for load
balancing applications that need to maintain locality.
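The difference between the two schemes on a hypercube can be seen in a small experiment. The following is a minimal sketch, not the analysis of [70]: it assumes real-valued (infinitely divisible) load, a uniform diffusion parameter alpha = 1/(d + 1) chosen here for illustration, and pairwise averaging as the dimension exchange rule. All function names are hypothetical.

```python
def hypercube_neighbors(dim):
    """Neighbor lists of a dim-dimensional hypercube (node ids 0 .. 2^dim - 1)."""
    return [[i ^ (1 << k) for k in range(dim)] for i in range(1 << dim)]


def diffusion_step(load, neighbors, alpha):
    """One synchronous diffusion step: each node exchanges a fraction alpha
    of the load difference with every neighbor simultaneously."""
    return [
        w + alpha * sum(load[j] - w for j in neighbors[i])
        for i, w in enumerate(load)
    ]


def dimension_exchange_sweep(load, dim):
    """One sweep of dimension exchange: for each dimension in turn, every
    node averages its load with its neighbor along that dimension."""
    load = list(load)
    for k in range(dim):
        for i in range(len(load)):
            j = i ^ (1 << k)
            if i < j:
                load[i] = load[j] = (load[i] + load[j]) / 2
    return load


def imbalance(load):
    """Gap between the most and least loaded node."""
    return max(load) - min(load)


if __name__ == "__main__":
    dim = 3
    start = [8.0] + [0.0] * 7  # all work initially on node 0
    neighbors = hypercube_neighbors(dim)

    diffused = start
    for _ in range(dim):  # same number of communication rounds as one sweep
        diffused = diffusion_step(diffused, neighbors, alpha=1 / (dim + 1))

    exchanged = dimension_exchange_sweep(start, dim)
    print("diffusion imbalance:", imbalance(diffused))
    print("dimension exchange imbalance:", imbalance(exchanged))
```

In this example, one sweep over the d dimensions leaves every node with the average load, while d diffusion steps with the assumed alpha still leave a gap. Both routines treat load as a uniform divisible quantity, which mirrors the equal-task assumption noted above and ignores any locality between tasks.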
Two data migration-based load balancing models are provided in [71] and [72]. In [71],
a load balancing framework called Ursa is proposed for large-scale cloud storage services.
It formulates an ILP optimization problem that chooses a subset of objects from heavily
loaded nodes, called hotspots, and performs topology-aware migration to minimize latency
and bandwidth. Ursa is designed to identify cost-optimal source-destination node pairs for
dynamic and scalable load reconfiguration by applying 1) a workload-driven integer linear
programming optimization approach to eliminate hotspot nodes while minimizing
reconfiguration costs, and 2) a divide-and-conquer technique to break down expensive computations
cloud storage services, because it is designed at the storage layer, it is not
application-aware and does not consider data locality or other application-specific
requirements for distributing the workload between nodes. Furthermore, it assumes that the
architecture is organized as a spanning tree topology, which makes it unsuitable for other
network architectures. SWAT [72], a load balancing algorithm, is proposed to address the
problem of performance isolation of multi-tenant databases in cloud systems, which is caused
by resource sharing among co-located tenants. Similar to [71], the general idea is to select
tenant pairs for load swaps in a highly resource- and time-efficient manner. SWAT initially
tries to eliminate all the hotspots and balance the load among all the nodes by load
leveling. If it is not possible to balance the load, it eliminates the hotspots through a
hotspot elimination process. Finally, in case both load balancing and hotspot elimination
fail, SWAT tries to minimize the overload, rather than eliminating it, by hotspot migration.
Partitioning problem and maintaining locality