On Wednesday 22 January 2003 01:03 pm, David Mandala wrote:
> A few more reasons that people go to clusters are:
>
> 1) Failover/Down time. If a single unit in a cluster dies the rest keep
> on working.
This works for web clusters because of the non-persistent nature of web
connections. It does NOT work for storage clusters. For example, if a member
of a clustered AppleTalk storage network goes down, any Mac connected to that
particular node must be re-booted before it will re-connect. Believe me or
not, but I have a Fortune 5 customer using one of my storage clusters, and
that's what happens when a node fails.
SMB is a little more graceful: it just pops up a requester saying that the
connection has been broken and asks you to re-attach. Do note that any
writes outstanding at the time that the node goes down are *LOST*.
Finally, virtually all clusters have a "choke point". In the case of web
clusters, that's often the database server. In the case of distributed file
servers (such as clusters built using GFS, the Global File System), that's
often the lock server or the Fibre Channel bus between the RAID array and the
leaf nodes, or the master controller board on the RAID array. If this choke
point goes down, the whole cluster goes down.
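
To put rough numbers on the choke point argument, here is a minimal sketch. It assumes failures are independent, and the availability figures are made up for illustration (they are not measurements from any real cluster). The function name `cluster_availability` is my own.

```python
def cluster_availability(node_avail: float, n_nodes: int, choke_avail: float) -> float:
    """Probability that the cluster can serve requests.

    Assumes independent failures. The cluster counts as "up" when at least
    one leaf node is up AND the shared choke point (lock server, bus,
    RAID controller, ...) is up.
    """
    all_nodes_down = (1.0 - node_avail) ** n_nodes
    return choke_avail * (1.0 - all_nodes_down)


if __name__ == "__main__":
    # Illustrative figures only: 99% per leaf node, 99.9% for the choke point.
    for n in (2, 8, 32):
        print(n, round(cluster_availability(0.99, n, 0.999), 6))
```

However many leaf nodes you add, the result never rises above the availability of the choke point itself.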
So let's make it highly available, you say? Fine and dandy. Been there, done
that. It takes me approximately 90 seconds to detect that the master node of
a highly redundant pair has failed. It then takes me another 60 to 90 seconds
to bring up the services on the slave node. So that's approximately 3 minutes
that the cluster is not available. During that time, the Macs are frozen. The
Windows boxes are popping up their window saying you need to reconnect. Any
writes in progress are lost. 3 minutes is a lot better than 3 days, but is
nowhere near what a modern mainframe can achieve -- less than three minutes
of downtime PER YEAR.
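
The arithmetic behind that comparison, as a small sketch. The 90 second detection time, the 60 to 90 second restart time and the three minutes per year figure are the ones quoted above; the function names and the 365-day year are my own assumptions.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a non-leap year


def failover_outage(detect_s: float, restart_s: float) -> float:
    """Seconds the cluster is unavailable during one failover."""
    return detect_s + restart_s


def availability(downtime_s_per_year: float) -> float:
    """Fraction of the year the service is up, given yearly downtime."""
    return 1.0 - downtime_s_per_year / SECONDS_PER_YEAR


best_case = failover_outage(90, 60)    # 150 seconds
worst_case = failover_outage(90, 90)   # 180 seconds
mainframe_budget = 3 * 60              # 180 seconds of downtime per year

# How many worst-case failovers fit into the mainframe's yearly budget?
failovers_in_budget = mainframe_budget // worst_case

if __name__ == "__main__":
    print("outage per failover:", best_case, "to", worst_case, "seconds")
    print("failovers within budget:", failovers_in_budget)
    print("availability at 3 min/year: %.5f%%" % (100 * availability(mainframe_budget)))
```

In other words, one worst-case failover on the HA pair uses up the entire yearly downtime budget of the mainframe.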
In a big iron mainframe, if one CPU goes down, the rest keep on working --
completely transparently. There is no 3 minute switchover. If one memory card
goes down, the rest keep on working -- completely transparently. And of
course with RAID, hard drive failures aren't an issue either. You can
hot-swap CPUs or add CPUs on an as-needed basis with a big iron mainframe.
Same deal with memory cards. An IBM mainframe has uptime in the nine nines
range.
> If the cluster is big enough it may even be hard to notice
> that a single unit dies. (Google uses special clusters for their search
> engine.)
Google's search engine is a special case cluster that is enabled by the fact
that it's a web cluster. As a web cluster, a failing node results in a short
read. You click the refresh button, you get connected again to a non-failed
node, and things work again. Secondly, all interactive accesses are reads.
Writes to their back-end databases are done as a batch process and then
distributed in parallel to the various nodes of the cluster; they are not
real-time updates. This approach is utterly unsuited for a general purpose
storage cluster, whether we are talking about a file storage cluster or a
database cluster. I use a similar approach to replicate storage migration
data to the multiple nodes of a storage cluster, but I have complete control
over the database and all software that accesses it -- if we were offering a
database service to the outside world, this approach would not work *at all*.
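
A toy simulation of that read-only, batch-updated pattern, to show why a retry is enough for reads. Everything here (`ReadOnlyNode`, `push_batch`, `read_with_retry`) is a hypothetical name of mine, not anything Google or my own product actually uses; it is a sketch under those assumptions.

```python
import random


class NodeDown(Exception):
    """Raised when a (simulated) node fails to answer."""


class ReadOnlyNode:
    def __init__(self, name, snapshot, alive=True):
        self.name = name
        self.snapshot = dict(snapshot)  # private copy of the batch-built data
        self.alive = alive

    def read(self, key):
        if not self.alive:
            raise NodeDown(self.name)
        return self.snapshot.get(key)


def push_batch(nodes, new_snapshot):
    """Batch update: replace the copy on every live node. No real-time writes."""
    for node in nodes:
        if node.alive:
            node.snapshot = dict(new_snapshot)


def read_with_retry(nodes, key, rng=random):
    """Try nodes in random order until one answers (the 'refresh button')."""
    for node in rng.sample(nodes, len(nodes)):
        try:
            return node.read(key)
        except NodeDown:
            continue  # a failed read loses nothing, so just try another node
    raise NodeDown("all nodes down")
```

Retrying is safe only because a read changes nothing. A write that was in flight on the failed node has nowhere to be replayed from, and a node that was down during `push_batch` comes back with stale data, which is exactly where this stops working for a general purpose storage or database cluster.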
> 2) Cost, for many problems that can be split over a cluster it is
> usually cheaper to build a big cluster then buy one machine and item one
> becomes a factor too.
The operative point is "can be split over a cluster". Not all problems can be
split over a cluster (or if they can, it is in a very clunky and unreliable
way). In reality, if it is a CPU-intensive application, a cluster is cheaper.
If it is an IO-intensive application, often a big iron machine is cheaper. I
would not attempt to run my corporation's accounting systems off of an Oracle
database cluster. I'd want to run it off of a big honkin' database server
that had fail-safe characteristics.
> Some mainframes look like large clusters to the software running on
> them. The IBM 390 running Linux can have thousands of Linux instances
> running on the same machine. The software thinks it's on a cluster but
> the actual hardware is the mainframe. This is a special purpose item.
> The mainframe of course cost big $$$ but there are cases where this is
> cheaper then a cluster of real hardware when you calculate MBTF and
> floorspace and head costs.
Indeed.
--
Eric Lee Green GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:eric@badtux.org Web: http://www.badtux.org