Load balancing is a popular technique when you need to serve thousands of concurrent requests (or more) in a reliable manner. Load balancers distribute incoming network traffic across a group of backend servers (a.k.a. a server pool or server farm) to increase throughput. Although many people associate load balancing primarily with higher throughput, the technique is useful in several other important scenarios as well.
Benefits of load balancing
Higher availability
Having multiple servers running your application increases its overall availability. If, let’s say, you run three servers and Server C stops responding due to a failure, the other two will still be able to serve the incoming traffic.
Easier to scale up and down
There may be peaks in the popularity of your application, where the number of incoming requests is drastically higher than usual. In such situations you can easily add another server to the server pool and your load balancer will automatically begin distributing requests to it. When the peak is over, you can disconnect the extra servers again.
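To make this concrete, here is a minimal round-robin distributor in Python. The `ServerPool` class and the server names are illustrative, not a real load balancer API; a sketch of how attaching and detaching servers affects the rotation:

```python
class ServerPool:
    """Minimal round-robin server pool (illustrative sketch)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._i = 0  # index of the next server to use

    def add(self, server):
        """Attach an extra server during a traffic peak."""
        self.servers.append(server)

    def remove(self, server):
        """Detach a server when the peak is over."""
        self.servers.remove(server)

    def next_server(self):
        """Pick the next backend in round-robin order."""
        server = self.servers[self._i % len(self.servers)]
        self._i += 1
        return server

pool = ServerPool(["serverA", "serverB"])
print([pool.next_server() for _ in range(4)])  # alternates between A and B
pool.add("serverC")  # scale up: the new server joins the rotation automatically
print([pool.next_server() for _ in range(3)])
```

Real load balancers offer other algorithms too (least connections, weighted round-robin, etc.), but the attach/detach mechanics are the same idea.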
Load balancers can monitor the incoming traffic and deny access if a request seems suspicious. For example, your load balancer could ensure that incoming traffic is HTTP and allow only the GET and POST methods.
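A minimal sketch of such a filter, assuming the policy is simply “GET and POST only” (the function name and rule set are illustrative):

```python
ALLOWED_METHODS = {"GET", "POST"}

def is_request_allowed(method: str) -> bool:
    """Deny any HTTP method outside the allowed set (illustrative policy)."""
    return method.upper() in ALLOWED_METHODS

print(is_request_allowed("POST"))   # True
print(is_request_allowed("TRACE"))  # False: denied before reaching a backend
```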
Many web sites require an HTTPS connection to some resources (e.g., when logging in). In order to monitor the traffic in these scenarios, the load balancer can apply SSL offloading: you, as an administrator, install your SSL certificate on the load balancer, and when an HTTPS request arrives, the load balancer uses this certificate to decrypt it. After the request has been inspected, there are two possibilities:
- Encrypt the request again and send it to one of the backend servers. This is useful if your application checks for HTTPS in the requests or if you want to secure the internal communication. However, it consumes additional compute resources to re-encrypt each request.
- Send the request without re-encrypting it. This saves compute resources, but makes the internal communication less secure. Moreover, your application could fail to respond properly if the incoming request is not HTTPS. What typically happens in such situations is an endless redirect loop: the application responds with 301 Moved Permanently and a Location header pointing to the HTTPS version of the request URL. To resolve this, load balancers insert a special HTTP header, X-Forwarded-Proto, to indicate that the original request was actually HTTPS but has been forwarded over plain HTTP.
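On the application side, the fix can be sketched as follows: before redirecting to HTTPS, trust the X-Forwarded-Proto header set by the load balancer instead of the connection’s own scheme (the function name is hypothetical):

```python
def needs_https_redirect(headers: dict, connection_scheme: str) -> bool:
    """Decide whether to redirect the client to HTTPS.

    When the load balancer terminates SSL and forwards the request as
    plain HTTP, checking the connection scheme alone would trigger an
    endless redirect loop; X-Forwarded-Proto carries the client's real
    original scheme, so the application should check it first.
    """
    original_scheme = headers.get("X-Forwarded-Proto", connection_scheme)
    return original_scheme.lower() != "https"

# Request forwarded unencrypted after SSL offloading: no redirect, no loop.
print(needs_https_redirect({"X-Forwarded-Proto": "https"}, "http"))  # False
# Genuine plain-HTTP request from the client: redirect as usual.
print(needs_https_redirect({}, "http"))  # True
```

Note that the application should only trust this header when it is set by its own load balancer, since clients can forge it.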
Load balancers can help you improve your deployment process. When deploying a new version of your application, you could do the following for each server in the pool (one at a time):
- Detach the server from the load balancer
- Install the new version of the application
- Test if the installed version works
- Attach the server again to the load balancer
This is called zero-downtime deployment, as your users can still use your application while you are updating it. Even if an error occurs while deploying, you can take more time to debug, since the affected server remains detached from the server pool.
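The steps above can be sketched as a loop. The detach/deploy/attach helpers here are hypothetical stand-ins for your load balancer’s management API and your install scripts:

```python
def rolling_deploy(pool, detach, deploy, health_check, attach):
    """Update each server one at a time, keeping the rest serving traffic."""
    for server in pool:
        detach(server)            # 1. stop routing traffic to this server
        deploy(server)            # 2. install the new application version
        if health_check(server):  # 3. test the installed version
            attach(server)        # 4. put the server back into rotation
        # on failure the server stays detached, leaving time to debug

# Demo with stand-in functions:
attached = []
rolling_deploy(
    ["serverA", "serverB", "serverC"],
    detach=lambda s: None,
    deploy=lambda s: None,
    health_check=lambda s: s != "serverB",  # pretend B's deploy failed
    attach=attached.append,
)
print(attached)  # ['serverA', 'serverC'] – serverB stays detached for debugging
```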
Although load balancing can help you deploy seamlessly, you should be aware of possible issues on the user side. A user may be in the middle of an important activity when you deploy, or may get some requests (e.g., AJAX calls) served by the old version of the application and others by the new one. This can lead to UX inconsistencies or even security vulnerabilities. To reduce this risk, deploy small packages with only a few changes, ideally on a daily basis.
Things to be aware of
Although load balancers provide a lot of benefits, there are certain things you should be aware of.
Single point of failure
The load balancer becomes the front door to your application, and this makes it a single point of failure. If it stops responding due to an internal or external issue (e.g., a DDoS attack), your entire application becomes unavailable. To tackle this you could replicate the load balancer, i.e., keep a second one on standby, ready to replace the original if needed.
Stateful applications
Many web applications nowadays have state. This could be a database, physical files, or in-memory data. However, applications that rely on state have trouble with load balancers. This is especially problematic for session-based applications – applications keeping in-memory session data for their users. Imagine a user logs in and is served by Server A. In-memory session data is created for this user on that server. However, her next request is served by Server B, which knows nothing about her session. To address this issue, load balancers provide a feature called sticky sessions: the first request goes to a server chosen by the usual balancing algorithm, and all subsequent requests from the same client go to that same server.
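One common way a balancer implements stickiness is to hash a client attribute such as the IP address (real products often use a cookie instead); a minimal sketch:

```python
import hashlib

def sticky_server(client_ip: str, servers):
    """Map a client to a fixed server by hashing its IP address
    (one sticky-session strategy; illustrative, not a real balancer API)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

servers = ["serverA", "serverB", "serverC"]
# The same client always lands on the same server:
assert sticky_server("203.0.113.7", servers) == sticky_server("203.0.113.7", servers)
```

Note that adding or removing a server changes the modulus and remaps most clients to different servers – which is exactly why sticky sessions make scaling harder, as the next paragraph explains.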
Although sticky sessions do solve the session problem, they reduce the overall power of the load balancer. You can no longer freely attach and detach servers, as there may be session data on them, and detaching a server would wipe its users’ state. I personally believe that applications should be stateless: all data an application saves should live on a separate server (or service), accessible from all application servers. This gives you the full set of load-balancing benefits I described above.
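A stateless design can be sketched like this: session data lives in a shared store (here a plain dict stands in for an external service such as Redis or a database), so any server can handle any request:

```python
# Shared store stand-in: in production this would be an external service
# (e.g., Redis or a database) reachable from every application server.
shared_store = {}

def handle_request(server: str, session_id: str):
    """Serve a request on any server; session state comes from the store."""
    session = shared_store.setdefault(session_id, {"visits": 0})
    session["visits"] += 1
    return server, session["visits"]

# Requests for the same session can hit different servers:
print(handle_request("serverA", "s1"))  # ('serverA', 1)
print(handle_request("serverB", "s1"))  # ('serverB', 2) – state survived
```

With this design, servers can be attached and detached at will, because no user state is lost when one leaves the pool.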