As I write this I am days away from transitioning to a Limited company.
I thought it would be a good time to reflect on my journey up to this point.
This is a very high-level look at where it all began.
Back in December 2018 I decided to create my hosting company. I went through several name changes before finally settling on Codo Digital.
How do you start to run a hosting company anyway?
There were plenty of challenges ahead, with many technical hurdles to overcome.
The main issue was that I was using a residential BT Fibre service with a dynamic IP address.
You ran a hosting company on a dynamic IP address?
I overcame this by running a cron job every minute to check the external IP address. I set up a Heroku app running the ipify service, which was able to tell me my external IP address. When the IP address changed, the job would update the record in AWS Route 53.
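A minimal sketch of what such a cron job could look like, assuming the AWS CLI is configured; the hosted zone ID, record name, and cache path are placeholders, and I'm using the public ipify endpoint here rather than a self-hosted Heroku copy:

```shell
#!/usr/bin/env bash
# Dynamic-DNS sketch: every minute, compare the external IP to the last
# known one and UPSERT the Route 53 A record if it has changed.
set -euo pipefail

ZONE_ID="Z0000000EXAMPLE"      # hypothetical Route 53 hosted zone ID
RECORD="www.example.com."      # hypothetical record name
CACHE_FILE="/var/tmp/last_ip"  # where the last-seen IP is remembered

# Build the Route 53 change batch JSON for an UPSERT of the A record.
build_change_batch() {
  local ip="$1"
  printf '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"%s","Type":"A","TTL":60,"ResourceRecords":[{"Value":"%s"}]}}]}' \
    "$RECORD" "$ip"
}

update_dns() {
  local current_ip last_ip
  current_ip="$(curl -fsS https://api.ipify.org)"        # ask ipify for the external IP
  last_ip="$(cat "$CACHE_FILE" 2>/dev/null || true)"
  if [ "$current_ip" != "$last_ip" ]; then
    aws route53 change-resource-record-sets \
      --hosted-zone-id "$ZONE_ID" \
      --change-batch "$(build_change_batch "$current_ip")"
    printf '%s' "$current_ip" > "$CACHE_FILE"
  fi
}

# Only hit the network when explicitly asked, e.g. from cron:
# * * * * * RUN_DDNS=1 /usr/local/bin/ddns.sh
if [ -n "${RUN_DDNS:-}" ]; then
  update_dns
fi
```

Caching the last-seen IP means Route 53 is only called when the address actually changes, which matters when the job runs every minute.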
Don’t you need a lot of cabling?
N’ah, powerline adapters to the rescue. The speed between the Internet router and the server through the powerline adapter was about 200 Mb/s – far greater than the line speed at that time. One issue I faced was that the powerline adapters needed a restart every 6-9 months. This involved a tried-and-tested sequence: flick the switch next to the router, then the one next to the server, to turn them both off; then switch them back on in the same order, router first, then server. This was the most stressful part of the whole setup. There was a lot of trial and error getting the power-down and power-up sequence correct, and the real frustration was the reliance on me being in the house at the time. It often happened just before a family holiday, and I’d be on edge hoping it wouldn’t happen again.
I initially worked around this by having a 4G router acting as a failover. This kept the websites online, but some months I would be charged circa £200 for 4G data.
It became apparent that the powerline adapter was the weak link. It wasn’t feasible to move the servers nearer to the primary Internet connection, so instead I bought some Cat 6A cable and a couple of wall plates. I’m pleased to say that this solved the issue.
When High Availability does not mean 100% uptime 🙁
Back when I started this venture I invested in two servers which would run in High Availability mode. The idea behind this was to ensure the websites remained online even when a server suffered an issue. Although I thankfully never experienced an issue where High Availability was needed, it did more harm than good whenever the servers required a restart after a software upgrade.
I naively expected the first server to reboot, and the second server to reboot only once the first was back up and ready. This was disappointingly not the case: both servers would restart at the same time, knocking out all services.
Back to the drawing board
Backing out of High Availability mode would require some serious planning.
Port forwarding was already set up to direct both HTTP and HTTPS traffic to the High Availability port. One weekend I was determined to resolve it. After updating the port forwarding for HTTP and HTTPS and switching off High Availability mode, all the traffic was being sent to a single server.
I was now free to reformat the server. I took this opportunity to add a hot spare drive and reinstall all the software and document this process. Who knows when it’ll come in handy.
So, now I had one really clean server – all the software had been added back on and the sites were ready to switch. The next stage was to think about how websites could still be served when one server was down. The obvious choice would be to replicate the setup, but that is a lot of work: every site has to be set up twice, and how do you keep the database state consistent without conflicts? It just seemed too dangerous to risk one change being made in one database and another change being made in a different database after the network had switched servers.
I thought about having a remote database, but I like the idea of having internal databases that cannot be reached from the outside world.
After several days of pondering I came to a solution: wget. Yes, that’s right – it seems wget can do a lot more than download files – it can back up sites. So every day, the second server downloads all the assets and HTML and stores them in a directory.
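A sketch of what that nightly mirror job could look like; the site URL and backup directory are made-up placeholders:

```shell
#!/usr/bin/env bash
# Nightly static mirror sketch: pull a browsable copy of the live site
# onto the standby server using wget.
set -euo pipefail

SITE="https://www.example.com"      # hypothetical site to mirror
DEST="/srv/backup/www.example.com"  # hypothetical local mirror directory

mirror_site() {
  # --mirror          recursive download with timestamping (only fetch changes)
  # --convert-links   rewrite links so the copy works when served locally
  # --page-requisites also fetch the CSS, JS, and images each page needs
  # --no-parent       never wander above the starting URL
  wget --mirror --convert-links --page-requisites --no-parent \
       --directory-prefix "$DEST" "$SITE"
}

# Intended to run from cron, e.g. once a day at 03:00:
# 0 3 * * * RUN_MIRROR=1 /usr/local/bin/mirror.sh
if [ -n "${RUN_MIRROR:-}" ]; then
  mirror_site
fi
```

The resulting directory can be served as-is by any web server on the second machine, which is what makes this a workable failover target for mostly-static sites.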
Switching Servers elegantly
Next I needed a way of monitoring the websites. What I wanted was for traffic to be pointed at the second server, holding the HTML and assets, whenever the website became unresponsive. I looked at Route 53 health checks, which on paper looked like the right approach; however, in my testing I found it took circa 60 seconds for the endpoint to be deemed unhealthy. Even though I could see many health checkers around the globe reporting issues, it was still reported as healthy. After some digging I found the rule Route 53 uses to class an endpoint as healthy:
If more than 18% of health checkers report that an endpoint is healthy, Route 53 considers it healthy
18%? It goes on to read:
The 18% value was chosen to ensure that health checkers in multiple regions consider the endpoint healthy. This prevents an endpoint from being considered unhealthy only because network conditions have isolated the endpoint from some health-checking locations. This value might change in a future release.
It seemed I needed something a bit more immediate. Then it dawned on me: I could set up a cron job (like back in the day), but this time making a HEAD request to each of the domains. If the status code is 200, take no action; otherwise, update the AWS Route 53 record to point the domain at the second server.
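A minimal sketch of that per-minute failover check, again assuming a configured AWS CLI; the zone ID, domains, and standby server IP are all placeholders:

```shell
#!/usr/bin/env bash
# Failover sketch: HEAD each domain, and repoint its A record at the
# standby (static-mirror) server if it does not answer with a 200.
set -euo pipefail

ZONE_ID="Z0000000EXAMPLE"   # hypothetical Route 53 hosted zone ID
STANDBY_IP="198.51.100.10"  # hypothetical second server's address

# Return the HTTP status of a HEAD request, or 000 on connection failure.
head_status() {
  local code
  code="$(curl -s -o /dev/null -I -w '%{http_code}' --max-time 10 "$1")" || code="000"
  printf '%s' "$code"
}

# Only a 200 counts as healthy.
is_healthy() {
  [ "$1" = "200" ]
}

# UPSERT the domain's A record to point at the standby server.
failover() {
  local domain="$1"
  aws route53 change-resource-record-sets \
    --hosted-zone-id "$ZONE_ID" \
    --change-batch "{\"Changes\":[{\"Action\":\"UPSERT\",\"ResourceRecordSet\":{\"Name\":\"$domain.\",\"Type\":\"A\",\"TTL\":60,\"ResourceRecords\":[{\"Value\":\"$STANDBY_IP\"}]}}]}"
}

check_domains() {
  local domain status
  for domain in "$@"; do
    status="$(head_status "https://$domain")"
    if ! is_healthy "$status"; then
      failover "$domain"
    fi
  done
}

# Run every minute from cron, e.g.:
# * * * * * RUN_CHECK=1 /usr/local/bin/failover.sh
if [ -n "${RUN_CHECK:-}" ]; then
  check_domains "www.example.com" "www.example.org"
fi
```

With a 60-second TTL on the record, clients should start resolving to the standby server within a minute or two of the primary going down – rather than waiting on Route 53's multi-region consensus.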
I’ve thoroughly enjoyed this journey. What you’ve read above is a very high-level overview. I will be creating more in-depth articles focussing on certain areas.