On Wednesday 22 January 2003 02:42 pm, David Mandala wrote:
> Hmm, it does work for some storage clusters. AppleTalk clusters may have
> a problem, but other databases are able to cluster and accommodate a dead
> node without the need to reboot the clients. A well designed cluster
> built for a purpose can be designed to avoid choke points. Again,
> depending upon the devices and the design, it can and should take less
> than 1 second to detect a failed hardware point. Anything at the time
> limits you are describing (3 minutes) is unacceptable performance.
>
> With the correct design the customer, be it a web, database or calculation
> cluster, never knows a node went down, nor should they.

I'm sorry, but I am not aware of any current cluster designs for Linux that
replicate socket state between the nodes. MOSIX was supposedly working on
one, but as far as I know has never released it. Please correct me if I'm
wrong.

The fact of the matter is that for protocols that use a persistent socket
(such as SMB or AppleTalk, or databases for that matter), unless socket
state is replicated you cannot have transparent failover, no matter how
much other process state you replicate. The only folks I'm aware of as
having anything close to socket state replication are the Mission Critical
Linux folks, and they actually replicate NFS connection state for NFS RPC
sockets, not the sockets themselves, so that NFS failover can occur
transparently. However, note that a Kimberlite cluster can be in a failed
state for as long as 60 seconds before the slave determines that the master
has failed and assumes command of the cluster.

Now, for special-purpose applications, you can do your database failover on
the client side (rather than on the server side). That is, if a transaction
fails, you can repeat the transaction using a different database server.
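A minimal sketch of that client-side failover idea: try each server in turn
and repeat the transaction on the next one when a node is dead. The server
names and the transaction callable here are hypothetical placeholders, not
any particular database client library.

```python
# Client-side failover sketch: retry a failed transaction on another server.

class AllServersFailed(Exception):
    pass

def run_with_failover(transaction, servers):
    """Try the transaction on each server in turn; return the first success."""
    last_error = None
    for server in servers:
        try:
            return transaction(server)
        except ConnectionError as err:  # a dead node looks like a failed connection
            last_error = err            # remember the failure, try the next node
    raise AllServersFailed("no server completed the transaction: %s" % last_error)

# Simulated transaction for illustration: the first server is "down",
# the second one answers.
def demo_transaction(server):
    if server == "db1.example.com":
        raise ConnectionError("connection refused")
    return "committed on " + server

print(run_with_failover(demo_transaction, ["db1.example.com", "db2.example.com"]))
# -> committed on db2.example.com
```

The point is that the retry logic lives in the application, not in the
cluster: the client sees the failure and routes around it itself.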
Similarly, for write transactions, you can perform the transaction against
multiple database servers in order to maintain clustered database
redundancy. Your cluster members can then check their state at bootup to
make sure that they have all queued transactions, and can be "caught up" on
delayed transactions at that time (but indicate to the client that they
aren't available until the delayed transactions have been replayed). My
understanding is that some of the "name" databases have this support
already built into them. But this is part of the database/application, not
something that can be handled transparently by a cluster.

You can modify most any application to be clustered. But you can't take a
non-clustered network application and have it transparently handle the
situation where a cluster member goes away.

> Sorry, a Google cluster is not just a web cluster. A Google cluster
> consists of approximately 80 machines. Some are database slices (their
> database is HUGE), some are logic, and some are web. If any machine
> in the cluster fails you never see it; it does not matter if it is a
> database, logic or web machine.

Incorrect. If it is a web machine, you may get an aborted transfer error,
or, rather, the transfer appears to "hang". This occurs; I've encountered
it. As for the rest of a Google cluster, as I mentioned, you can modify
most any application to be clustered. That does not help if you want to use
standard applications that were not designed to be clustered, or that you
have no source code to (such as the SMB protocol stack on Windows, or the
AppleTalk protocol stack on Macs).

> Not all problems lend themselves to clusters. Those that do can make use
> of commodity hardware and Linux, and save big bucks compared to a big
> iron machine.

As I said:

> > The operant point is "can be split over a cluster". Not all problems can
> > be split over a cluster (or if they can, it is in a very clunky and
> > unreliable way).
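The write-side scheme described above can be sketched like so: apply each
write to every replica, queue writes for any replica that is down, and
replay the queue before the recovered replica serves clients again. All of
the class and server names here are illustrative, not a real database API.

```python
# Sketch of replicated writes with catch-up replay for a recovered node.
from collections import defaultdict

class ReplicatedWriter:
    def __init__(self, servers):
        self.servers = servers
        self.up = {s: True for s in servers}  # liveness as seen by the client
        self.applied = defaultdict(list)      # writes each replica has seen
        self.pending = defaultdict(list)      # writes queued for dead replicas

    def write(self, txn):
        for s in self.servers:
            if self.up[s]:
                self.applied[s].append(txn)   # normal path: apply everywhere
            else:
                self.pending[s].append(txn)   # node is down: queue for replay

    def mark_down(self, server):
        self.up[server] = False

    def recover(self, server):
        """Replay queued writes; the node is unavailable until caught up."""
        self.applied[server].extend(self.pending[server])
        self.pending[server].clear()
        self.up[server] = True

w = ReplicatedWriter(["db1", "db2"])
w.write("INSERT 1")
w.mark_down("db2")
w.write("INSERT 2")       # db2 misses this write; it is queued instead
w.recover("db2")          # db2 replays the queue before serving clients
print(w.applied["db2"])   # -> ['INSERT 1', 'INSERT 2']
```

Again, all of this lives in the database/application layer: the cluster
framework itself never sees or replays these transactions.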
-- 
Eric Lee Green          GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:eric@badtux.org  Web: http://www.badtux.org