Load balancing provides a means to direct traffic flow to those instances that
should receive it. In the most basic form, incoming traffic can be split equally to all of the instances, which spreads load evenly and allows for better scaling. In more advanced forms, instances can be monitored so that traffic is split based on
availability, performance, and level of activity. In particular, if an instance goes down, it can be excluded from receiving traffic until it is restored to service.
Load balancers typically provide an easy means to configure how traffic should flow inside an application. A pool is created to monitor a particular service and servers can be added and removed from the pool on the fly. Load balancers
monitor the services of each server and determine what traffic should go to it, if any. A pool often has an IP address and port assigned to it. As long as at least one server in the pool is able to receive traffic, the pool’s IP address and port are active. Load balancers monitor the services in a pool by connecting to each service.
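The pool behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the data model of any particular load balancer: the `Pool` class, its method names, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pool:
    """A pool fronting one service on a virtual IP address and port (hypothetical model)."""
    vip: str
    port: int
    # Maps each member's address to its last known state (True = available).
    members: Dict[str, bool] = field(default_factory=dict)

    def add_server(self, address: str) -> None:
        # Assumption: a new member is unavailable until a check succeeds.
        self.members[address] = False

    def remove_server(self, address: str) -> None:
        self.members.pop(address, None)

    def mark(self, address: str, available: bool) -> None:
        # Record the result of the latest check for a known member.
        if address in self.members:
            self.members[address] = available

    def available_servers(self) -> List[str]:
        return [addr for addr, up in self.members.items() if up]

    def is_active(self) -> bool:
        # The pool's address and port stay active while any member is available.
        return bool(self.available_servers())


# Example addresses are placeholders from documentation ranges.
pool = Pool(vip="203.0.113.10", port=80)
pool.add_server("10.0.0.11:8080")
pool.add_server("10.0.0.12:8080")
pool.mark("10.0.0.11:8080", True)
```

With one of the two members marked available, `pool.is_active()` is true; it becomes false only when every member has failed its checks or been removed.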
Monitoring can be as simple as just connecting successfully to the service, or it can be as complex as connecting to the service and expecting a specific banner or
string to be returned. Some load balancers provide a means of attaching custom scripts to the checks so that complex checks can be performed, such as
authenticating to the service and performing some kind of action. Load balancers can also monitor performance indirectly, by measuring how long their checks take and basing decisions on that. A successful check marks the service as available, and an unsuccessful check marks it as unavailable.
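As a rough illustration of the simple and banner-based checks described above, the following Python sketch opens a TCP connection and optionally looks for an expected string in the first bytes returned. The function name `check_service`, its parameters, and the timeout value are assumptions made for this example; real load balancers implement their checks in their own ways.

```python
import socket
import time
from typing import Optional, Tuple


def check_service(host: str, port: int,
                  expected_banner: Optional[bytes] = None,
                  timeout: float = 2.0) -> Tuple[bool, float]:
    """Run one check and return (available, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # The simplest check: can a TCP connection be opened at all?
        with socket.create_connection((host, port), timeout=timeout) as conn:
            if expected_banner is None:
                available = True
            else:
                # Stricter check: the first bytes must contain the banner.
                # A single recv() is a simplification; a banner could arrive
                # in several pieces on a slow link.
                data = conn.recv(1024)
                available = expected_banner in data
    except OSError:
        # Refused connections, timeouts, and resets all count as failures.
        available = False
    return available, time.monotonic() - start
```

The elapsed time returned alongside the result is what a balancer could use as a crude performance signal, for example by preferring servers whose checks complete fastest.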
Once the load balancer has collected the data from the checks performed on the service, it needs to decide how to distribute the incoming traffic. A pool set up to use a round robin algorithm sends traffic to each service in turn, cycling back to the first after the last. A pool set up to use a least connections algorithm sends traffic to the service that has the fewest active connections. A pool could also be set up to send traffic to the service with the lowest network latency. More complex algorithms can also be supported, either by combining simple algorithms or by assigning priorities so that some services receive traffic before others.
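The three simple algorithms above can each be expressed in a line or two of Python. This is a minimal sketch for illustration only; the function names, server names, and the connection and latency figures are made up for the example.

```python
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers one after another, wrapping around at the end."""
    return itertools.cycle(servers)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=lambda server: active[server])


def least_latency(latency_ms: Dict[str, float]) -> str:
    """Pick the server with the lowest measured latency."""
    return min(latency_ms, key=lambda server: latency_ms[server])


rotation = round_robin(["web1", "web2", "web3"])
first_four = [next(rotation) for _ in range(4)]
# first_four == ["web1", "web2", "web3", "web1"]

fewest = least_connections({"web1": 12, "web2": 3, "web3": 7})
# fewest == "web2"

fastest = least_latency({"web1": 4.2, "web2": 9.8, "web3": 1.5})
# fastest == "web3"
```

In each case the input would contain only the servers that passed their most recent checks, so an unavailable server never receives traffic regardless of the algorithm chosen.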
There are many types of load balancers available. Hardware load balancers usually provide the most capabilities, the highest reliability, and the ability to handle large amounts of traffic. However, they are also more expensive than any other type of load balancer. In addition, hardware load balancers managed by another team may add complexity to their use. Nonetheless, if hardware load balancers are available, it is recommended to take advantage of them.
Software load balancers are cheaper and can be more flexible than hardware load balancers. You can build and incorporate software load balancers into the
application, tightly coupling how load balancing is done with the needs of the application. There are many types of software load balancers. One of the more popular choices is HAProxy. Load balancers built on Apache and on Java are also available.
OpenStack also provides Load-Balancing-as-a-Service (LBaaS), which is
implemented using Neutron. It supports many of the same features that regular load balancers support, such as service monitoring, management of the services in the pool, managing connection limits, and providing session persistence. Check with the OpenStack cloud administrators to see if LBaaS is available and how it can be used.
One of the things that need to be considered when setting up load balancing for an application is what kind of traffic will be going through it. Not all network traffic is stateless. When traffic belongs to a session that carries state, either
the application needs to share session information across all of the needed servers, or the load balancer needs to be configured to send a single session’s traffic to the same back-end server until that session is terminated.
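The second option, keeping a session on one back-end server until it ends, can be sketched as a small lookup table. This is an assumption-laden illustration: the `StickyBalancer` class, its methods, and the session and server names are hypothetical, and real load balancers commonly track persistence with cookies or source addresses rather than an explicit session ID passed in by the caller.

```python
import itertools
from typing import Dict, List


class StickyBalancer:
    """Round robin for new sessions, fixed back end for existing ones."""

    def __init__(self, backends: List[str]) -> None:
        self._rotation = itertools.cycle(backends)
        # Maps a session ID to the back end chosen for it.
        self._sessions: Dict[str, str] = {}

    def route(self, session_id: str) -> str:
        # Only the first request of a session consults the rotation.
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._rotation)
        return self._sessions[session_id]

    def end_session(self, session_id: str) -> None:
        # Once the session terminates, its mapping is no longer needed.
        self._sessions.pop(session_id, None)


balancer = StickyBalancer(["app1", "app2"])
first = balancer.route("alice")   # new session, assigned "app1"
second = balancer.route("bob")    # new session, assigned "app2"
again = balancer.route("alice")   # existing session, still "app1"
```

The trade-off is that the table lives in the load balancer, so a failed back end strands its sessions; sharing session information across servers avoids that at the cost of extra work in the application.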
Another consideration is that load balancing can noticeably increase logging on the servers in the pool. Load balancers generally check services every few seconds to make sure they are up. In an enterprise environment, there may be two or more load balancers configured identically, all of them checking those same services every few seconds. Unless the application is configured not to log those connections, logs can grow considerably.
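One way an application might suppress those entries, assuming it is written in Python and uses the standard `logging` module, is a filter that drops records originating from the load balancers. The sketch below assumes the application attaches the peer address to each record as a `client_ip` attribute (for example through the `extra` argument of a logging call); that attribute name, the class name, and the addresses are hypothetical.

```python
import logging
from typing import Iterable


class HealthCheckFilter(logging.Filter):
    """Drop log records whose client address belongs to a load balancer."""

    def __init__(self, balancer_ips: Iterable[str]) -> None:
        super().__init__()
        self.balancer_ips = set(balancer_ips)

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False tells the logging module to discard the record.
        # Records without a client_ip attribute are always kept.
        return getattr(record, "client_ip", None) not in self.balancer_ips


# Assumed addresses of two identically configured load balancers.
access_log = logging.getLogger("access")
access_log.addFilter(HealthCheckFilter({"10.0.0.2", "10.0.0.3"}))
```

Filtering by source address hides every connection from the load balancers, so it only suits deployments where the balancers' check addresses differ from the addresses that forwarded client traffic appears to come from.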
Ultimately, load balancing provides a valuable way to improve an application. It provides a means to monitor the services and to remove servers that are no longer working from a pool. It also provides a means to add and remove servers on the fly, which is an important part of application scalability.