On Wednesday 22 January 2003 01:03 pm, David Mandala wrote:
> A few more reasons that people go to clusters are:
>
> 1) Failover/Down time. If a single unit in a cluster dies the rest keep
> on working.

This works for web clusters because of the non-persistent nature of web connections. It does NOT work for storage clusters. If a member of a clustered AppleTalk storage network goes down, any Mac connected to that particular node must be rebooted before it will reconnect, for example. You can believe me or not; I have a Fortune 5 customer using one of my storage clusters, and that's what happens when a node fails. SMB is a little more graceful: it just pops up a requester saying that the connection has been broken, and requests that you re-attach. Do note that any writes outstanding at the time the node goes down are *LOST*.

Finally, virtually all clusters have a "choke point". In the case of web clusters, that's often the database server. In the case of distributed file servers (such as clusters built using GFS, the Global File System), it's often the lock server, the Fibre Channel bus between the RAID array and the leaf nodes, or the master controller board on the RAID array. If this choke point goes down, the whole cluster goes down.

So let's make it highly available, you say? Fine and dandy. Been there, done that. It takes me approximately 90 seconds to detect that the master node of a highly redundant pair has failed. It then takes me another 60 to 90 seconds to bring up the services on the slave node. So that's approximately 3 minutes that the cluster is not available (there's a sketch of that failover loop at the end of this message). During that time, the Macs are frozen. The Windows boxes are popping up their window saying you need to reconnect. Any writes in progress are lost.

3 minutes is a lot better than 3 days, but it is nowhere near what a modern mainframe can achieve -- less than three minutes of downtime PER YEAR (the arithmetic is spelled out below). In a big iron mainframe, if one CPU goes down, the rest keeps on working -- completely transparently. There is no 3-minute switchover. If one memory card goes down, the rest keeps on working -- completely transparently. And of course with RAID, hard drive failures aren't an issue either. You can hot-swap CPUs or add CPUs on an as-needed basis with a big iron mainframe. Same deal with memory cards. An IBM mainframe has uptime in the five nines range.

> If the cluster is big enough it may even be hard to notice
> that a single unit dies. (Google uses special clusters for their search
> engine.)

Google's search engine is a special case, enabled by the fact that it's a web cluster. As a web cluster, a failing node results in a short read: you click the refresh button, you get connected again to a non-failed node, and things work again. Secondly, all interactive accesses are reads. Writes to their back-end databases are done as a batch process and then distributed in parallel to the various nodes of the cluster; they are not real-time updates.

This approach is utterly unsuited to a general purpose storage cluster, whether we are talking about a file storage cluster or a database cluster. I use a similar approach to replicate storage migration data to the multiple nodes of a storage cluster, but I have complete control over the database and all software that accesses it -- if we were offering a database service to the outside world, this approach would not work *at all*.
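Since I keep citing that 90-second detection figure: here is a minimal sketch, in Python, of the sort of heartbeat-driven failover loop I'm describing, just to show where the roughly 3 minutes goes. The host name, the timings, and the start_services() helper are illustrative assumptions for this sketch, not code from any actual HA product.

#!/usr/bin/env python
# Hypothetical heartbeat-based failover monitor, for illustration only.
# The host name, timings, and service commands are assumptions.

import os
import time

MASTER = "storage-master"     # hypothetical master node
HEARTBEAT_INTERVAL = 10       # seconds between probes
MISSED_LIMIT = 9              # 9 misses * 10s = ~90s to declare death

def master_alive():
    # One ICMP ping; exit status 0 means the master answered.
    return os.system("ping -c 1 -W 2 %s > /dev/null 2>&1" % MASTER) == 0

def start_services():
    # Mount the shared storage and start the file services on the
    # slave.  On real hardware, this mount/service-start sequence is
    # what eats the additional 60 to 90 seconds.
    os.system("mount /dev/shared/vol0 /export")
    os.system("/etc/init.d/netatalk start")
    os.system("/etc/init.d/samba start")

missed = 0
while missed < MISSED_LIMIT:
    time.sleep(HEARTBEAT_INTERVAL)
    if master_alive():
        missed = 0            # master answered; reset the counter
    else:
        missed += 1           # another missed heartbeat

# ~90 seconds of silence: the master is presumed dead.  Take over.
start_services()              # another 60-90 seconds before clients reconnect

Tighten the heartbeat interval and you detect failure sooner, but you also start failing over on transient network hiccups; and no tuning of the monitor makes the mount-and-start time on the slave go away.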
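And for what it's worth, the back-of-the-envelope availability arithmetic behind the comparison (the failures-per-year counts are assumptions, picked only to illustrate the scale):

# Rough availability arithmetic for the downtime figures above.
SECONDS_PER_YEAR = 365 * 24 * 3600    # 31,536,000

def availability(downtime_seconds):
    # Percentage of the year the service is reachable.
    return 100.0 * (1.0 - float(downtime_seconds) / SECONDS_PER_YEAR)

print("%.5f%%" % availability(180))            # one 3-minute failover/year: 99.99943%
print("%.5f%%" % availability(4 * 180))        # four failovers/year:        99.99772%
print("%.5f%%" % availability(3 * 24 * 3600))  # one 3-day outage:           99.17808%

Less than three minutes a year keeps the mainframe above five nines. A cluster that fails over even a few times a year has already burned several years' worth of a mainframe's downtime budget, and a 3-day outage isn't on the same chart at all.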
> 2) Cost, for many problems that can be split over a cluster it is
> usually cheaper to build a big cluster then buy one machine and item one
> becomes a factor too.

The operative point is "can be split over a cluster". Not all problems can be split over a cluster (or if they can, it is in a very clunky and unreliable way). In reality, if it is a CPU-intensive application, a cluster is cheaper. If it is an IO-intensive application, a big iron machine is often cheaper. I would not attempt to run my corporation's accounting systems off of an Oracle database cluster. I'd want to run them off of a big honkin' database server that had fail-safe characteristics.

> Some mainframes look like large clusters to the software running on
> them. The IBM 390 running Linux can have thousands of Linux instances
> running on the same machine. The software thinks it's on a cluster but
> the actual hardware is the mainframe. This is a special purpose item.
> The mainframe of course cost big $$$ but there are cases where this is
> cheaper then a cluster of real hardware when you calculate MBTF and
> floorspace and head costs.

Indeed.

-- 
Eric Lee Green    GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:eric@badtux.org    Web: http://www.badtux.org