Clustering VS. Mainframe

Thu, 23 Jan 2003 23:53:18 -0700 (MST)

On 22 Jan 2003, Eric Lee Green chatted thus:

> This works for web clusters because of the non-persistent nature of web
> connections. It does NOT work for storage clusters. If a member of a
> clustered Appletalk storage network goes down, any Mac connected to that
> particular node must be re-booted before it will re-connect, for example. You
> can believe me or not, I have a Fortune 5 customer using one of my storage
> clusters and that's what happens when a node fails.
>
> SMB is a little more graceful, it just pops up a requester saying that the
> connection has been broken, and requests that you re-attach. Do note that any
> writes outstanding at the time that the node goes down are *LOST*.

Sounds like problems in AppleTalk and SMB.

> Finally, virtually all clusters have a "choke point". In the case of web

Everything has a bottleneck.

> clusters, that's often the database server. In the case of distributed file
> servers (such as clusters built using GFS, the Global File System), that's
> often the lock server or the Fiber Channel bus between the RAID array and the
> leaf nodes, or the master controller board on the RAID array. This choke
> point goes down, the whole cluster goes down.
>
> So let's make it highly available, you say? Fine and dandy. Been there, done
> that. It takes me approximately 90 seconds to detect that the master node of
> a highly redundant pair has failed. It then takes me another 60 to 90 seconds
> to bring up the services on the slave node. So that's approximately 3 minutes
> that the cluster is not available. During that time, the Macs are frozen. The

Hmm. Mot had high availability boxen doing sub-second failover when I left
in 1999. Pull the network cable, hard drive, CPU, or power and the other
took over.
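
The gap between a 90-second detection time and sub-second failover comes down largely to how often the nodes check on each other and how many missed checks they tolerate. Here is a minimal sketch of a timeout-based heartbeat monitor. The class name, the field names, and the interval and threshold values are all hypothetical illustrations, not the settings of any product mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Declares a peer dead after `max_missed` heartbeat intervals of silence.

    All names and numbers here are illustrative assumptions.
    """
    interval_s: float          # how often the peer is expected to send a heartbeat
    max_missed: int            # consecutive missed beats tolerated before failover
    last_seen_s: float = 0.0   # timestamp of the most recent heartbeat

    def beat(self, now_s: float) -> None:
        # Record the arrival time of a heartbeat from the peer.
        self.last_seen_s = now_s

    def peer_failed(self, now_s: float) -> bool:
        # The peer counts as failed once the silence exceeds the budget.
        return (now_s - self.last_seen_s) > self.detection_budget_s()

    def detection_budget_s(self) -> float:
        # Longest silence tolerated before the peer is declared dead.
        return self.interval_s * self.max_missed


if __name__ == "__main__":
    slow = HeartbeatMonitor(interval_s=30.0, max_missed=3)
    fast = HeartbeatMonitor(interval_s=0.1, max_missed=3)
    print("slow budget (s):", slow.detection_budget_s())
    print("fast budget (s):", fast.detection_budget_s())
```

A short interval detects failures quickly but risks false alarms on a congested link, which is one reason software clusters tend to pick conservative values, while hardware that shares a dedicated interconnect can afford much tighter ones.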

> Windows boxes are popping up their window saying you need to reconnect. Any
> writes in progress are lost. 3 minutes is a lot better than 3 days, but is
> nowhere near what a modern mainframe can achieve -- less than three minutes
> of downtime PER YEAR.
>
> In a big iron mainframe, if one CPU goes down, the rest keeps on working --
> completely transparently. There is no 3 minute switchover. If one memory card
> goes down, the rest keeps on working -- completely transparently. And of
> course with RAID, hard drive failures aren't an issue either. You can
> hot-swap CPU's or add CPU's on an as-needed basis with a big iron mainframe.
> Same deal with memory cards. An IBM mainframe has uptime in the nine nines
> range.

What happens if a network card goes down?

You're right, though: cluster != mainframe. You have to look at what you
need and use the appropriate tool for the job. Clusters have encroached
greatly upon the domain of big iron, but certainly haven't reached the point
where big iron is obsolete.
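
To put the downtime figures quoted in this thread on one scale, here is a small back-of-the-envelope sketch that converts between yearly downtime and "nines" of availability. The function names are made up for illustration, and the year is taken as 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def availability(downtime_min_per_year: float) -> float:
    """Fraction of the year the service is up, given yearly downtime in minutes."""
    return 1.0 - downtime_min_per_year / MINUTES_PER_YEAR


def downtime_min_per_year(nines: int) -> float:
    """Yearly downtime in minutes allowed by `nines` nines (5 means 99.999%)."""
    return MINUTES_PER_YEAR * 10.0 ** -nines


if __name__ == "__main__":
    # The "less than three minutes of downtime PER YEAR" figure from the quote.
    print(f"3 min/year down -> {availability(3) * 100:.5f}% available")
    for n in (3, 5, 9):
        print(f"{n} nines allow {downtime_min_per_year(n):.7f} min/year")
```

By this arithmetic, three minutes a year falls between five and six nines, and a single three-minute failover uses up more than half of a five-nines yearly budget (about 5.26 minutes). Read literally, nine nines would allow only about 32 milliseconds a year, so the "nine nines" in the quote may be meant more loosely.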

ciao,

der.hans
-- 
#  https://www.LuftHans.com/    http://www.TOLISGroup.com/
#  Stell dir vor, es ist Krieg und keiner geht hin...