> transactionally correct . . .

Transactionally correct. That's my new favorite term of the month. :)

You said that both boxen are in the same data center, and you can't
install a second NIC in the NT box. Here's a thought. If both boxen
have an unused serial port, and they're within a few hundred feet of
each other, how about hooking up a null modem serial cable between
them? The Linux box could determine if the NT box was dead based on,
say, a combination of RS-232 signals being in a specific state. Or you
could get fancy and have the NT box pulse a "heartbeat" (raise and drop
DTR every second or something like that).

Trent is correct that if the web server is doing any transactions (or
write operations), that will be a PITA. If that is the case, I think I
would have the Linux web server simply serve up a static page that
reads "We are experiencing technical difficulties. Our fsckin NT box
has blue screened *AGAIN*. This is the 729th time this *WEEK*. Below,
please find a mesmerizing flaming logo and links to Unix uptime
statistics. Please stand by." until the NT box comes back up. Well, as
"up" as any NT box can be, that is.

D

* On Sat, Nov 04, 2000 at 02:57:30PM -0700, Trent Shipley wrote:
>
> > -----Original Message-----
> > From: plug-discuss-admin@lists.PLUG.phoenix.az.us
> > [mailto:plug-discuss-admin@lists.PLUG.phoenix.az.us]On Behalf Of Kevin
> > Buettner
> > Sent: Friday, November 03, 2000 7:08 PM
> > To: plug-discuss@lists.PLUG.phoenix.az.us
> > Subject: Re: Linux as backup (failover) machine
> >
> > On Nov 4, 8:11am, Ken Bowley wrote:
> >
> > > I've been posed with a question, and I'm a little stumped... please
> > > bear with me.
> > >
> > > Problem:
> > > Make a Linux machine automatically kick in as a failover machine for
> > > http when the NT machine goes down.
> > >
> > > Restrictions:
> > > Need to be able to monitor the NT box without installing anything
> > > extra on the NT machine.
> > > Linux machine needs to be able to kick in
> > > automatically when the NT box goes down, and give control back to
> > > the NT box when it comes back up. No access to installing any type
> > > of router/proxy between the NT and Linux box and the rest of the
> > > net.
> > >
> > > Please send your ideas either directly to myself, or to the list if
> > > this problem is of interest to others.
> >
> > First, I'm sure that there's some code already out there somewhere
> > for this, but it doesn't sound terribly difficult to implement from
> > scratch either. (Maybe about five lines of Perl?)
> >
> > Anyway, the NT box is pingable, right?
> >
> > Set up a script which continuously pings the NT box; when the
> > pings stop coming back, use ifconfig to assign the NT box's IP
> > address to your network interface.
> >
> > The relinquishing control part is harder, but could be easily
> > solved if the NT machine had two network adapters; you could ping
> > the second one to know when to give up the NT machine's IP
> > address.
> >
> > So... thinking about this some more, it'd probably be best if
> > both machines had two network cards. Weird things happen
> > when two machines attempt to use the same IP address.
> >
> > So here's how it'd look:
> >
> >   ====+==+==============+==+=========  Network
> >       |  |              |  |
> >      A| B|             C| D|
> >       |  |              |  |
> >      -+--+-           --+--+-
> >     |  NT  |         | Linux |
> >     --------         ---------
> >
> > Now suppose that NT is supplying its services via interface A and
> > that you want Linux to use C when it acts as the failover.
> >
> > So... start out with C disabled ("ifconfig eth0 down", or somesuch).
> > Ping B via D. When the pings stop coming back, do "ifconfig eth0 up ..."
> > Now, you continue to ping B from D, and when the pings resume, just
> > do "ifconfig eth0 down" again to allow the NT machine to take over
> > again.
> >
> > It may be possible to make it work with a single NIC on the NT box,
> > but I have doubts about the reliability.
> > (But someone who knows
> > more about networking than I do might have some ideas.)
> >
> > Note too that you can tighten the whole arrangement up by doing:
> >
> >   ====+=================+============  Network
> >       |                 |
> >      A|                C|
> >       |                 |
> >      -+-----         ---+----
> >     |  NT  +----~----+ Linux |
> >     -------- B     D ---------
> >
> > where the cable between B and D is a crossover cable. That way too
> > you could assign B and D network addresses intended for private
> > networks (192.168.X.Y or 10.X.Y.Z).
> >
> > Okay, so maybe it's around 25 lines of Perl. (It sounds interesting
> > enough that I'm tempted to code it myself.)
> >
>
> If _In Search of Clusters, Second Edition_ by Gregory F. Pfister is any
> indication, you are looking at a lot more than 25 lines of code. Also, since
> you are going to want to run the failover monitor on the Linux box as a
> background daemon, it brings into question using a scripting language for
> the implementation.
>
> Not being able to install a proxy or router between the dual failover boxes
> is not much of a limitation. A proxy or router is a dead end anyway because
> it just introduces another point of failure.
>
> Not being able to alter the primary may mean that your boss just
> ordered miracle-ware. This is particularly true if the failover has to be
> transactionally correct . . . and if the box is mission critical, then the
> accountants are going to INSIST that no data be lost or created during the
> failover. (Transactional semantics may mean that the project cannot be done
> in-house. . . .)
>
> Failback is just as problematic, though you will get to recycle a lot of
> code (but not all of it; the problems are not identical).
>
> Unless you can find a canned freeware solution you might want to tell them
> to look at buying another NT license (you might get away with a workstation
> instead of a server version), two MTS licenses, and a proprietary failover
> system.
>
> Also, Oracle has a feature called "standby database" that is standard.
> It probably won't help with your problem, but it might be useful as an
> example.
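
For what it's worth, here is a rough sketch of the ping-and-ifconfig scheme Kevin describes in the quoted text: keep interface C down, ping B from D, bring C up when the pings stop, and take it down again when they resume. It is in Python rather than Perl, and it is only a sketch under some assumptions that are not in the thread: the names FailoverMonitor, ping_ok and run are made up, the miss/hit thresholds are arbitrary debounce values so that one dropped ping doesn't cause a takeover, the ping flags are the Linux iputils ones, and eth0 is assumed to already be configured with the service address so that a bare "up" is enough. It does nothing about the transactional problems Trent raises.

```python
import subprocess
import time


def ping_ok(host, timeout_s=1):
    """Return True if one ICMP echo to host is answered (Linux iputils flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class FailoverMonitor:
    """Debounced monitor: 'standby' (NT is serving) or 'active' (Linux is serving)."""

    def __init__(self, fail_threshold=3, recover_threshold=5):
        # Thresholds are illustrative assumptions, not values from the thread.
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.state = "standby"
        self.misses = 0  # consecutive failed probes
        self.hits = 0    # consecutive successful probes

    def observe(self, primary_alive):
        """Feed one probe result; return 'takeover', 'release', or None."""
        if primary_alive:
            self.hits += 1
            self.misses = 0
        else:
            self.misses += 1
            self.hits = 0
        if self.state == "standby" and self.misses >= self.fail_threshold:
            self.state = "active"
            return "takeover"
        if self.state == "active" and self.hits >= self.recover_threshold:
            self.state = "standby"
            return "release"
        return None


def run(heartbeat_host, iface="eth0", interval_s=1.0):
    """Poll forever; bring iface up on takeover, down on release (needs root)."""
    monitor = FailoverMonitor()
    while True:
        action = monitor.observe(ping_ok(heartbeat_host))
        if action == "takeover":
            # Assumes iface already carries the service address.
            subprocess.run(["ifconfig", iface, "up"], check=False)
        elif action == "release":
            subprocess.run(["ifconfig", iface, "down"], check=False)
        time.sleep(interval_s)
```

You would start it with something like run("192.168.1.1"), where that address is a stand-in for whatever you assign to interface B on the crossover link. The decision logic is kept separate from the ping and ifconfig calls so it can be exercised without touching the network.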