Having good software is not enough. Just ask the GitLab IT guys what happened recently.
In a production environment, you cannot trust your servers. Never trust your system.
Every single service must be highly available (HA). But what is HA?
This short post will introduce you to HA. It is not a full guide, just a simple use case explained.
HA in a nutshell
HA means having at least two resources, working alone or together. The goal is to still have at least one working resource when one fails.
You can provide HA in different ways. Choose the one that fits your situation and knowledge. Here's a short list:
- shared IP
- load balancer
- live migration
The best HA solution
There is no single best solution that fits every case. You have to understand what is best for you and, first of all, read the documentation.
When you deploy, you have to guarantee your customers that there will be no downtime. For a zero-downtime deploy, you have to rely on HA. In our server farm, we have two servers for our API service. Those servers are in a round-robin configuration, handled by HAProxy on a shared IP. When one API server is shut down, HAProxy sends all the requests to the server that is still alive. Then the stopped server is started with the new version, and requests are sent to both servers again. After a few seconds, we shut down the other one and repeat the same steps, and the cycle is complete with no downtime.
This whole process is handled by a simple deploy script. We are planning to move to Jenkins, but for now a simple bash script does the job just fine.
As I said in a previous post, if you want to split the codebase into many microservices, you also have to think about making each of them highly available.
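The rolling restart described above can be sketched as a small simulation. This is not our actual deploy script, only a toy model of the idea: the names `Server`, `Balancer`, `rolling_deploy`, `api01` and `api02` are made up for illustration, and "shutting down" a server is just flipping a flag.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    version: str
    alive: bool = True


class Balancer:
    """Toy round-robin balancer that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = servers
        self._next = 0

    def route(self):
        # Try each server at most once, starting from the round-robin cursor.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server.alive:
                return server
        raise RuntimeError("no server alive: this would be downtime")


def rolling_deploy(balancer, new_version, requests_per_step=4):
    """Upgrade one server at a time and record which server handled each request."""
    handled = []
    for server in balancer.servers:
        server.alive = False  # shut down one server only
        handled += [balancer.route().name for _ in range(requests_per_step)]
        server.version = new_version  # start it again with the new version
        server.alive = True
        handled += [balancer.route().name for _ in range(requests_per_step)]
    return handled


# Hypothetical two-server setup, both starting on version "v1".
servers = [Server("api01", "v1"), Server("api02", "v1")]
balancer = Balancer(servers)
log = rolling_deploy(balancer, "v2")
print(log)
print([s.version for s in servers])
```

If both servers were ever down at the same time, `route()` would raise, which is exactly the downtime the one-at-a-time order is meant to avoid.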
Here's an example of a round-robin configuration for a web service, with a third server as backup. The backup is used only if both server01 and server02 are down.
backend sms.freeluna.it
    mode http
    server server01.brugnara.me 192.168.10.51:8080 check port 8080
    server server02.brugnara.me 192.168.10.52:8080 check port 8080
    server backup01.brugnara.me 192.168.10.151:8080 check port 8080 backup
    balance roundrobin
    option forwardfor
    option httpclose
    option http-server-close
    option httpchk GET /alive HTTP/1.1
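
To make the role of the backup server more concrete, here is a rough sketch of the selection rule the configuration above relies on: rotate over the healthy primary servers, and fall back to the backup only when none of them is healthy. It is a simplified model, not how HAProxy is implemented. The function `pick_server` and the `health` dictionary are hypothetical, and in the real setup the health status would come from the `GET /alive` checks.

```python
from itertools import count


def pick_server(primaries, backups, health, counter):
    """Round-robin over healthy primaries; use a backup only if no primary is healthy."""
    alive = [s for s in primaries if health.get(s, False)]
    if alive:
        # Simplified rotation: index into the currently healthy primaries.
        return alive[next(counter) % len(alive)]
    alive_backups = [s for s in backups if health.get(s, False)]
    if alive_backups:
        # Assumption of this sketch: only the first healthy backup takes traffic.
        return alive_backups[0]
    return None  # nothing left to serve the request


primaries = ["server01", "server02"]
backups = ["backup01"]
health = {"server01": True, "server02": True, "backup01": True}
counter = count()

both_up = [pick_server(primaries, backups, health, counter) for _ in range(4)]

health["server02"] = False  # one primary fails its health check
one_down = [pick_server(primaries, backups, health, counter) for _ in range(2)]

health["server01"] = False  # now both primaries are down
all_down = pick_server(primaries, backups, health, counter)

print(both_up, one_down, all_down)
```

Note that the backup never receives traffic while at least one primary is healthy, so it can be a smaller or slower machine kept only for emergencies.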