Last chance failover

Fri Nov 18 22:11:12 MST 2005

I am trying to set up network fail-over on my home server but am at a
loss as to the best way to do it.  Google is no help as either my  
situation is unique or (more likely) my search terms are horrid.

Here's my situation:  I have a 100Mbps Ethernet card on eth0 and an
802.11b wireless card on eth1.  The server is on a UPS that will keep
it running for some time during a power outage.  My cable modem and  
wireless router/ap are also on a (separate) UPS that will keep them  
up for over an hour after power loss.  So if the power goes off, my  
server will stay up and my Internet connection will stay up.

Unfortunately, my server is on the opposite side of the house from
my Internet router, and its traffic goes through two other switches
before getting there.  Neither of these switches is UPS protected.
So if the power goes out, the server stays up and the 'net connection  
stays up... but the server can't access the 'net because the switches  
in between are now dead.

What I want to do in this situation is automatically switch to the
wireless eth1.  It's much, much slower, but it will still work in this
case.  What has me stumped is how to do that.

I read up a bit on NIC bonding or teaming and I like the fact that it  
presents a unified front-end to the sockets so if one NIC goes down,  
no socket connections need to be broken.  However, bonding treats  
both NICs as equals and sends packets round-robin.  I don't want the  
wireless eth1 to be used at all unless it absolutely has to.  So  
bonding/teaming is out.

I then found a number of references to setting up a fail-over router  
by setting two routes and having the kernel automatically switch from  
the primary to the secondary route.  The exclusive nature of that  
appeals to me, but that won't work either since both eth0 and eth1  
will have the same default route in my setup.  I had the brief  
thought of maybe setting up an embedded router using UML or Xen or
VMware and having eth1 point to the router as the default route...
but that's such a hack.  And besides, I would like whichever is the  
dominant NIC to have the main server IP.

I could probably approximate this with a script that just checks the
link status on eth0 and, if the link drops for more than a few seconds,
runs 'ifconfig eth0 down' and 'ifconfig eth1 up'.  That would likely
work but would have the unfortunate problem of terminating any existing
socket connections and would be, IMO, ugly.
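
To make the script idea concrete, here is a rough sketch in Python.  It
assumes a Linux box that exposes link state in
/sys/class/net/<iface>/carrier and has ifconfig available.  The 5-second
grace period, the 1-second poll interval, and all function and class
names are made up for illustration.  Unlike the two commands mentioned
above, it leaves eth0 administratively up so that its carrier can still
be checked for fail-back.  It does nothing about moving the server IP or
routes, and it would still break existing sockets.

```python
import subprocess
import time

PRIMARY = "eth0"     # wired NIC from the post
BACKUP = "eth1"      # wireless NIC from the post
GRACE_SECONDS = 5    # assumed value for "more than a few seconds"
POLL_SECONDS = 1     # assumed polling interval


def link_is_up(iface):
    """Return True if the kernel reports carrier on iface (Linux sysfs)."""
    try:
        with open("/sys/class/net/%s/carrier" % iface) as f:
            return f.read().strip() == "1"
    except (IOError, OSError):
        # Reading fails if the interface is missing or administratively down.
        return False


class FailoverDecider(object):
    """Debounce link loss: pick the backup only after a sustained outage."""

    def __init__(self, grace_seconds=GRACE_SECONDS):
        self.grace_seconds = grace_seconds
        self.down_since = None  # time the primary was first seen down

    def update(self, primary_up, now):
        """Return 'primary' or 'backup' for the given link state and time."""
        if primary_up:
            self.down_since = None
            return "primary"
        if self.down_since is None:
            self.down_since = now
        if now - self.down_since >= self.grace_seconds:
            return "backup"
        return "primary"


def set_backup(enabled):
    """Bring the backup NIC up or down with ifconfig (needs root)."""
    subprocess.call(["ifconfig", BACKUP, "up" if enabled else "down"])


def run_forever():
    """Poll the primary link and toggle the backup NIC when the choice changes."""
    decider = FailoverDecider()
    current = "primary"
    while True:
        wanted = decider.update(link_is_up(PRIMARY), time.time())
        if wanted != current:
            set_backup(wanted == "backup")
            current = wanted
        time.sleep(POLL_SECONDS)
```

Calling run_forever() as root would start the polling loop.  The
decision logic sits in FailoverDecider so it can be tried without
touching any interface.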

So is there some elegant solution to this that I am missing?
Kurt