Server failover

James McPhee jmcphe at gmail.com
Mon Jul 7 00:03:59 MST 2008


I agree that for the web-tier (apache), network directors (ultramonkey,
localdirector, CSM) are the best way to go.  They require extra hardware, but
they can be really small boxes.  These devices sit between the client and
your apache servers and hold the IP bound to your DNS entry (this is the way
I've implemented them; there are other ways).  The director then proxies
your requests to one of the web servers based on a set of rules (round-robin,
load balancing, failover, content).  In this case, each of the web servers
would be running apache and would have a webroot shared with the others.
Normally for performance it's something like an rsync'd local filesystem
(there's a quick sketch of that below).  If you plan on updating the files a
lot, you could either design the release process to act against all of the
systems at once, or use a shared filesystem.  A shared filesystem would be
something like NFS, OpenAFS, or some kind of clustered filesystem (gpfs,
vxCFS, etc).
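
To make the rsync'd-webroot idea concrete, here's a rough sketch (hostnames
and paths are made up, adjust for your setup).  A small script run from cron,
or from your release process, on whichever box you publish to can push the
webroot out to its peers:

  #!/bin/sh
  # push the local webroot to the other web servers
  # web2/web3 and /srv/www/htdocs are placeholders
  for host in web2 web3; do
      rsync -az --delete /srv/www/htdocs/ ${host}:/srv/www/htdocs/
  done

That keeps the peers in sync without any shared-filesystem infrastructure,
at the cost of a short window where the boxes can differ.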

For the data tier (mysql), you would want to either use an HA failover
cluster or the built-in MySQL clustering.  My experience with MySQL
clustering is dated, and back then it required extra hardware, so let's go
with HA clustering for now.

You need to have a NIC cabled into the same segment (or VLAN) on each of
the boxes you want to cluster.  This NIC will hold the IP of the mysql
server.  The IP will be one of the resources that fails over with the
cluster.  You will also need shared storage.  That's where drbd comes in.
You could use a SAN or some kind of iSCSI implementation, but that requires
additional infrastructure.  Put your MySQL server data in its own logical
volume and then mirror that volume active/passive with drbd to all of the
nodes in the cluster.  Active/passive simply means that only 1 box will use
the data at a time.  The other boxes COULD access it, but won't until the
cluster fails over to them.  That's where your HA cluster comes in.
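
Just to make that concrete, the drbd side of it is a single resource
definition in /etc/drbd.conf on both nodes.  A minimal sketch, with the node
names, LV paths, and replication IPs all invented:

  resource mysql {
    protocol C;                   # synchronous replication
    on db1 {
      device    /dev/drbd0;
      disk      /dev/vg0/mysql;   # the LV holding the MySQL data
      address   10.0.0.1:7788;    # IP on the dedicated replication link
      meta-disk internal;
    }
    on db2 {
      device    /dev/drbd0;
      disk      /dev/vg0/mysql;
      address   10.0.0.2:7788;
      meta-disk internal;
    }
  }

Only the node that's currently Primary mounts /dev/drbd0; the Secondary just
keeps its copy of the volume current.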

I recommend running pacemaker with heartbeat if you're on SUSE.  For this
you'll need at least 2 keepalives (I won't call them heartbeats because of
the name of the app) for the cluster.  That's normally a serial null-modem
daisy chain, a dedicated ether-NIC, or a disk-ping.  For disk-ping you'll
need shared storage (see drbd8).  Or you could just use 2 ether-NICs and
hope you never turn on your firewall without any rules :)  Add the IP from
the above paragraph, the mysql init script (it'll default to using
/etc/init.d/mysql if you don't give it any other options), and a filesystem
resource to mount up the shared storage (unless you're using raw devices) to
a single group.  This group means the IP, mysql server, and active drbd will
always sit on the same box (except in split-brain, see the 2 ether-NIC and
firewall thing above).  If you tweak it well enough it will even monitor
itself.
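
Sketching that out in config terms (names, IPs, and mount points below are
invented): the two keepalive paths go in /etc/ha.d/ha.cf, and if you're using
the old v1-style haresources file rather than the pacemaker CRM, the group is
just one line listing the resources in start order:

  # /etc/ha.d/ha.cf -- two keepalive paths: serial + dedicated ether-NIC
  serial /dev/ttyS0
  bcast  eth2
  node   db1 db2
  auto_failback off

  # /etc/ha.d/haresources -- IP, drbd, filesystem mount, then mysql
  db1 IPaddr::192.168.1.50/24/eth1 drbddisk::mysql Filesystem::/dev/drbd0::/var/lib/mysql::ext3 mysql

Pacemaker expresses the same thing as a resource group in the CRM, but the
v1 syntax is a lot easier to show in an email, and the grouping idea is
identical.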

Note*  The mysql configs will normally be in /etc and will not fail over,
so any time you update the mysql config, you'll need to replicate that to
the other nodes.  OR you can link them to files that you are sharing in your
cluster.
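
The symlink route looks something like this (assuming the drbd volume is
mounted at /var/lib/mysql and the config lives at /etc/my.cnf, which is
where SUSE keeps it):

  # move the real config onto the replicated volume, leave a symlink behind
  mv /etc/my.cnf /var/lib/mysql/my.cnf
  ln -s /var/lib/mysql/my.cnf /etc/my.cnf

You'd create the same symlink on the other node, where it will dangle until
the volume fails over to it; that's fine, since mysql isn't running there
anyway.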

You could do the same kind of HA cluster with apache.  You'd always have 1
apache server up, but you'd also only ever have 1 up :)  In other words, you
couldn't split the load over multiple machines.
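
If you did want to go that route, it's the same pattern with the apache init
script in the group instead of mysql, e.g. a haresources line like this (IP
and names invented again, apache2 being the init script name on SUSE):

  web1 IPaddr::192.168.1.40/24/eth0 apache2

but again, only one box is actually serving traffic at any given time.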

Anyway, that's just my opinion.  Linux HA has a problem with split-brain
because of its lack of decent fencing, but that (and STONITH) are another
conversation.  As is the app-tier.

-- 
James McPhee
jmcphe at gmail.com