Building resilient services with Java and Spring Boot
In this post I am going to touch on a subject that many only think about once they reach production and scale: how to build APIs that are resilient against the most common request-load issues you will encounter once you reach a certain amount of traffic.
For the examples in this post I am going to use Java and the Spring Framework with related libraries, but the problems are the same with any framework and similar tools can be found for most stacks.
Distributing load (load-balancing & auto-scaling)
Let's first talk about network load.
Usually network load is something you will not be able to predict before your application is running publicly and you start to get a decent amount of users. What counts as "a decent amount" depends on your application, but usually we are talking about thousands of requests per minute.
There are a few different ways to tackle this.
The first and recommended way is to run multiple instances of your application and distribute the load among them. When the load gets heavier, you (or your automation) spin up new instances and spread the load across them; when the load lightens again, you simply shut some instances down. If you are familiar with Blue-Green Deployments, A/B Testing or Canary Releases, then auto-scaling works similarly, except that instead of deploying different instances, the same instance is cloned over and over again.
Scaling your application by running multiple instances needs to be done at the infrastructure level. You will not (easily) be able to write application code that duplicates itself, so you need to set this up wherever you host the application. Both AWS and Google Cloud support this out of the box.
As setting up auto-scaling and configuring the built-in load-balancer of your hosting provider is out of scope for this article, I will not delve any deeper into that subject.
However, maybe you are hosting your application on a platform that does not provide load-balancing or auto-scaling. What then?
In those cases you will want to set up some kind of load-balancing entry-point application, also called an API Gateway, yourself, or set up a reverse proxy like NGINX to do it for you. Of course NGINX is not the only one; there are plenty of alternatives to choose from.
However, we can also write our own. It won't be as advanced as the ready-made solutions, you will have to maintain it yourself, and it is probably not the best fit for an enterprise environment. But sometimes we just need a simple load balancer we have full control over, and with it you can take some shortcuts such as hard-coding IPs, performing custom logging and analytics, and so on.
Spring Cloud offers a client side load balancer out of the box that can act as an entry point application.
What you basically do is create a single application that acts as middleware, passing requests from the public internet through to your micro-services while performing load-balancing at the same time.
Spring Cloud comes with this support in the form of a LoadBalancer module.
There are multiple ways you can use the Spring load balancer, and it integrates with supporting technologies like Eureka (a micro-service registry server with auto-discovery) or Netflix Ribbon (an inter-process load balancer, now in maintenance mode).
However, to keep things simple, if you don't have tens or hundreds of microservices (yet!), you might just want to hold off on including the full stack of supporting applications and write your own small load balancer with Spring Boot.
There are two pieces of information your load-balancer will need:
- The routing information: the set of backend servers that incoming requests can be sent to
- The selection criteria: which of those servers a given request should be sent to
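As a minimal sketch of the second piece, the selection logic can be a simple round-robin over a fixed list of backends. Everything here (the class name, the hard-coded addresses) is an illustrative placeholder, not a real API:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin selection over a fixed list of backend servers.
// The addresses passed in are placeholders, not real hosts.
public class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinBalancer(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    // Pick the next server; thread-safe thanks to the atomic counter.
    public String next() {
        int idx = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(idx);
    }
}
```

In a real Spring Boot gateway you would call something like `next()` inside a catch-all controller and forward the request to the chosen backend; the hard part the sketch skips is health-checking, so dead servers get taken out of the rotation.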
Whatever option you select, keep in mind that if you are in a cloud environment that offers a load-balancing solution, the best option is almost always to use that instead of rolling your own, as it will integrate more seamlessly with your environment and cause you fewer headaches in the future.
Preventing error propagation (using circuit breakers)
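To illustrate the idea behind a circuit breaker, here is a toy version of the pattern: after a number of consecutive failures the circuit "opens" and calls fail fast to a fallback until a cool-down has passed, at which point one trial call is let through. The class and method names are my own; a library like Resilience4j provides the production-grade version:

```java
import java.util.function.Supplier;

// Toy circuit breaker: after `threshold` consecutive failures the circuit
// opens and calls return the fallback immediately until `resetMillis` have
// passed, when one trial call is allowed through again (half-open).
public class CircuitBreaker {
    private final int threshold;
    private final long resetMillis;
    private int failures = 0;
    private long openedAt = 0;
    private boolean open = false;

    public CircuitBreaker(int threshold, long resetMillis) {
        this.threshold = threshold;
        this.resetMillis = resetMillis;
    }

    public synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (open) {
            if (System.currentTimeMillis() - openedAt < resetMillis) {
                return fallback.get();          // fail fast while open
            }
            open = false;                       // half-open: allow a trial call
        }
        try {
            T result = action.get();
            failures = 0;                       // success resets the failure count
            return result;
        } catch (RuntimeException e) {
            if (++failures >= threshold) {
                open = true;                    // too many failures: open the circuit
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }
}
```

The point of failing fast is that a struggling downstream service gets breathing room instead of being hammered by retries, which is how one slow dependency stops cascading into a full outage.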
Handling request spikes and API over-use (rate limiting)
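A common way to rate-limit is a token bucket: each request consumes one token, and tokens refill at a fixed rate, so short bursts are absorbed while sustained over-use is rejected. This is a bare-bones sketch with names of my own choosing; libraries such as Bucket4j implement the same idea far more completely:

```java
// Minimal token-bucket rate limiter: each request consumes one token,
// tokens refill at a fixed rate, and requests are rejected (rather than
// queued) when the bucket is empty.
public class TokenBucket {
    private final long capacity;
    private final double refillPerMilli;
    private double tokens;
    private long lastRefill;

    public TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerMilli = refillPerSecond / 1000.0;
        this.tokens = capacity;
        this.lastRefill = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Top up the bucket based on how much time has passed, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMilli);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```

In an API you would typically keep one bucket per client (keyed by API key or IP) and answer `429 Too Many Requests` when `tryAcquire()` returns false.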
Preventing thread hogging API (request limiting)
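The usual tool for keeping one slow endpoint from hogging every worker thread is a bulkhead: a cap on how many requests may run a given operation concurrently, with callers over the cap rejected immediately instead of queueing. A minimal sketch using a plain `Semaphore` (the class name is illustrative):

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Bulkhead: caps how many callers may run `action` at the same time, so a
// slow operation cannot tie up the whole thread pool. Callers that cannot
// get a permit fail fast to `rejected` instead of waiting.
public class Bulkhead {
    private final Semaphore permits;

    public Bulkhead(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    public <T> T execute(Supplier<T> action, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get();   // over capacity: reject immediately
        }
        try {
            return action.get();
        } finally {
            permits.release();       // always give the permit back
        }
    }
}
```

Rejecting fast is deliberate: a queue in front of a slow operation just moves the thread-hogging problem into the queue, while a hard cap keeps the rest of the API responsive.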
Offloading frequent database access (caching)
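The core of offloading database reads is a read-through cache: the expensive lookup only runs when the key is missing or its cached entry has expired. Here is a tiny sketch with a per-entry time-to-live (all names are my own); in Spring you would normally reach for `@Cacheable` backed by a provider such as Caffeine instead:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Tiny read-through cache with per-entry expiry: the loader (e.g. a
// database query) only runs on a miss or after the entry has expired.
public class TtlCache<K, V> {
    private record Entry<V>(V value, long expiresAt) {}

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public V get(K key, Function<K, V> loader) {
        long now = System.currentTimeMillis();
        Entry<V> entry = store.get(key);
        if (entry == null || entry.expiresAt() < now) {
            V value = loader.apply(key);                 // miss: hit the database
            store.put(key, new Entry<>(value, now + ttlMillis));
            return value;
        }
        return entry.value();                            // hit: no database access
    }
}
```

The TTL is the knob that trades freshness for load: a few seconds is often enough to collapse thousands of identical queries per minute into one.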
Extra
By applying the above techniques you will be well on your way to ensuring your service can take the traffic the real world will throw at it.
However, here we only discussed the traffic-load aspect of building a service; security is a whole other subject. To help you out on that front as well, check out these nice security cheat sheets from Snyk as bonus material.