For a while now we’ve been putting little pieces together to make our web/file/database servers as rock solid as possible. The next logical step is to implement high availability for our services.

High Availability is a great theory on paper, but with an environment as strange as ours, there are always hurdles that we have to tackle one by one.

The last few days have been filled with academic conversations between Jay and myself about how we implement HA with our services and data.

Our toolset has really shaped up to be the following:

  • Linux ( We prefer Debian and Gentoo ) – both are stable and easy to manage once you understand them
  • Apple Xserve RAID attached through Fibre Channel – great cheap storage
  • VMware VI3 (ESX) – Without this, we couldn’t even think about doing what we do
  • heartbeat – a great tool to make your file systems and services highly available
  • lsyncd – a new project that I think has great promise; it uses the inotify file system monitor to trigger rsync to a remote host
  • Barracuda Load Balancer – a GREAT product that allows us to load balance over many Linux/Windows machines. It’s a flexible device that really lets us do some stupid things
  • MySQL – who can imagine doing web hosting without this database?
  • A Colocation facility – While our two sites are still at RIT, they are in different buildings, which really is good enough for us.

So somehow we need to bring all these tools together and make them work for us… There is one catch: it has to work with our current models, and we have to be able to ‘roll it in’, meaning we don’t ever get an opportunity to just ‘start over’. I can imagine that many people out there have the luxury of starting new projects and making them highly available from the start, but our money, time and needs don’t allow us to do that. So, we have to make do with what we have.

So where do we start in our task of being a 99.99% uptime facility? Like I said, we already have a good deal invested in our current server and storage setup, so there’s not a whole lot of wiggle room…
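To put that 99.99% number in perspective, here is a quick back-of-the-envelope sketch of how much downtime it actually leaves. The function name is just something made up for illustration, and it assumes a plain 365-day year.

```python
# Downtime budget for a given uptime target.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Return how many minutes of downtime the target allows in the period."""
    return period_minutes * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    per_year = downtime_budget_minutes(target)
    per_month = per_year / 12
    print(f"{target}% uptime: {per_year:.1f} min/year, {per_month:.1f} min/month")
```

At 99.99% that works out to a bit under an hour of total downtime per year, which is why every maintenance window and failover has to be planned.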

Our goal is going to be to tackle small chunks at a time as we add highly available services.

Let’s look at what we already have that is HA.

VMware VI3 (ESX)

I love this product. I have loved it from the first day we got it. We started out with one server, and have slowly been adding and replacing servers.

We are now at a 3 server cluster, which really suits our needs well. If need be, we can now run all of our VMs on one of our beefier nodes (of course with some noticeable performance issues). With all the great features of ESX we have been able to do in-place upgrades from 2.5 to 3.0 and just lately 3.0 to 3.5. VMotion makes our lives much easier. The ability to migrate virtual machines from one physical host to another has been our bread and butter. We can take machines down for hardware upgrades or even replacement. It takes around 20 minutes to stand up a brand spanking new ESX node and join it to our cluster.

So, on top of all these inherent features of VMware that lend themselves to High Availability, it also allows us to create butt loads of VMs, which we can then load balance across.

 

Barracuda Network Load Balancer 340

This has been another one of those purchases that I can’t believe we ever lived without! It was a fairly inexpensive device, and we use it a lot!

This device allows us to load balance one public facing IP address across multiple servers, and also do some funky port forwarding. As is our motto around here, we’ve done some pretty stupid things with random port forwards, and so far the Barracuda 340 has taken everything we’ve thrown at it.
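For anyone who hasn’t played with a load balancer, the basic idea of spreading one public address across several backends can be sketched in a few lines. This is only an illustration of plain round-robin with a health flag, not how the Barracuda works internally (I don’t know which algorithm it uses); the class name and the backend addresses are made up.

```python
class RoundRobinBalancer:
    """Toy round-robin picker that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)     # fixed rotation order
        self.healthy = set(self.backends)  # backends currently eligible
        self._index = 0                    # next position in the rotation

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Try each backend at most once per call, starting where we left off.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._index]
            self._index = (self._index + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical backend addresses, for illustration only.
lb = RoundRobinBalancer(["10.0.0.11:80", "10.0.0.12:80", "10.0.0.13:80"])
print([lb.next_backend() for _ in range(4)])
lb.mark_down("10.0.0.12:80")  # simulate a failed web server
print([lb.next_backend() for _ in range(3)])
```

The point is that a dead backend simply drops out of the rotation, so the public facing address keeps answering as long as at least one server behind it is healthy.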

 

Well… that’s enough for today. More to come in Part 2, when I’ll talk about how we are testing all these new parts.
