On 06/19/2012 12:48 PM, Eric Shubert wrote:
> On 06/19/2012 06:28 AM, Lisa Kachold wrote:
>> Hi Mark,
>>
>> On Mon, Jun 18, 2012 at 10:05 PM, Mark Jarvis wrote:
>>
>> I'm considering buying a Dell desktop (Inspiron 620), but a few years ago I was warned off them because Dell did something different to their disks so that you had to buy replacement/additional disks only from Dell. Any chance that it's still true?
>>
>> Unless you have a hardware RAID card, and you are buying a desktop, you should not need enterprise grade drives, but check with Dell Support for the model you are interested in. You are referring to TLER/ERC/CCTL:
>>
>> Hard drive manufacturers draw a distinction between "desktop" grade and "enterprise" grade drives. The "desktop" grade drives can take a long time (~2 minutes) to respond when they find an error, which causes most RAID systems to label them as failed and drop them from the array. The solution provided by the manufacturers is for us to purchase the "enterprise" grade drives, at twice the cost, which report errors promptly enough that this isn't a problem. This "enterprise" feature is called TLER, ERC, or CCTL.
>>
>> *The Problem:*
>>
>> There are three problems with this situation:
>>
>> The first is that it flies in the face of the word *Inexpensive* in the acronym *Redundant Arrays of Inexpensive Disks (RAID)*.
>>
>> The second is that when a drive starts to fail, you want to know about it, as Miles Nordin wrote in a long thread.
>>
>> *Possible Solutions:*
>>
>> For a while, Western Digital released a program (WDTLER.EXE) that made it possible to enable TLER on desktop grade drives. This no longer works.
>>
>> *Linux:*
>>
>> This message implies that it's impossible to tell a drive to cancel its bad read operation:
>>
>> You can set the ERC values of your drives. Then they'll stop processing their internal error recovery procedure after the timeout and continue to react. Without an ERC timeout, the drive tries to correct the error on its own (not reacting to any requests), mdraid assumes an error after a while and tries to rewrite the "missing" sector (assembled from the other disks). But the drive will still not react to the write request, as it is still doing its internal recovery procedure. Now mdraid assumes the disk to be bad and kicks it.
>>
>> There's nothing you can do about this vicious circle except either enabling ERC or using RAID-edition disks (which have ERC enabled by default).
>>
>> There is evidence that ATA ERC commands don't always work. Both Linux and FreeBSD can use normal desktop drives without TLER, and in fact you *would not even want TLER* in such a case, since *TLER can be dangerous* in some circumstances. Read on.
>>
>> *What is TLER/CCTL/ERC?*
>> TLER (Time-Limited Error Recovery)
>> CCTL (Command Completion Time Limit)
>> ERC (Error Recovery Control)
>>
>> These basically mean the same thing: limit the number of seconds the hard drive spends trying to recover a weak or bad sector. TLER and the other variants are typically configured to 7 seconds, meaning that if the drive has not managed to recover that sector within 7 seconds, it will give up, forfeit recovery, and return an I/O error to the host instead.
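>>
>> For example, on Linux you can query and set the ERC timeout with smartctl from smartmontools. A minimal sketch in Python (the device name is only an example; the values are in tenths of a second, so 70 means 7.0 seconds):
>>
>>     import subprocess
>>
>>     # Show the drive's current SCT Error Recovery Control settings.
>>     subprocess.run(["smartctl", "-l", "scterc", "/dev/sda"], check=True)
>>
>>     # Cap read/write error recovery at 7.0 seconds (70 deciseconds),
>>     # mimicking the factory default on "enterprise" drives.
>>     subprocess.run(["smartctl", "-l", "scterc,70,70", "/dev/sda"], check=True)
>>
>> Note that many desktop drives reject this command, and most forget the setting after a power cycle, so it is typically reapplied at boot.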
>>
>> The behavior without TLER is that up to 120 seconds (20-60 is more common) may pass before a disk gives up recovery. This behavior wreaks havoc on all hardware RAID and Windows-based software/onboard/driver RAIDs. The RAID controller is typically configured to consider disks that don't respond within 10 seconds completely failed, which is bizarre to say the least! This smells like the vendors have some sort of deal causing you to buy HDDs at twice the price just for a simple firmware fix. LOL!! Don't get yourself ripped off; read on!
>>
>> *When do I need TLER?*
>> You need TLER-capable disks when using any hardware RAID or any Windows-based software RAID; bummer if you're on the Windows platform! But this also means hardware RAID on any OS (FreeBSD/Linux) would need TLER disks, even when configured to run as a 'JBOD' array. There may be controllers with firmware that allows you to set the timeout limit for I/O, but I've not yet heard about specific products, except some LSI 1068E in IR mode; reputable vendors like Areca (FW1.43) certainly require TLER-enabled disks, or drives will drop out like candy whenever you encounter a bad/weak sector that needs longer recovery than 10 seconds.
>>
>> Basically, if you use a RAID platform that DEMANDS the disks respond within 10 seconds, and will KICK OUT disks that do not respond in time, then you need TLER.
>>
>> *When don't I need TLER?*
>> When using FreeBSD/Linux software RAID on an HBA, which is a RAID-less controller. Areca HW RAID running in JBOD mode is still a RAID controller; it controls whether the disks are detached, not the OS. With a true HBA like the LSI 1068E (Intel SASUC8i) your OS has control over whether to detach the disk, and Linux/BSD won't, at least not for a simple bad sector. Not sure about Apple OS X actually, but since it's based on FreeBSD I could speculate that it would have the same behavior as FreeBSD, perhaps tuned differently.
>>
>> *Why don't you want TLER even if your disks are capable?*
>>
>> If you don't need TLER, then you don't want TLER! Why? Well, because *TLER is dangerous!* Nonsense? Consider this:
>>
>> 1. You have a nice RAID5 array on hardware RAID; being a valuable customer, you spent the premium price on TLER-capable disks.
>> 2. Now one of your disks dies; oh bummer! But hey, I have RAID5; I'm protected, RIGHT?
>> 3. So I buy a new disk and replace the failed one! So easy.
>> 4. A bad sector on one of the remaining member disks causes TLER to forfeit; now I get an I/O error while rebuilding my degraded array, the rebuild stops, and I've lost access to my data!
>>
>> The danger of TLER is that once you have lost your redundancy, if a weak sector occurs that COULD be recovered, TLER will force the drive to STOP TRYING after 7 seconds. If it hasn't fixed it by then, and you have lost your redundancy, then TLER is a harmful property instead of a useful one.
>>
>> TLER works best when you have a lot of redundancy and can swap disks easily, and want disks that show any sign of weakness - even just a fart - to be kicked out and replaced ASAP, without causing hiccups, which are unacceptable to a heavy-duty online money-transaction server, for example. So TLER can be useful, but for consumers this is more like an interesting way for vendors to make some more money from you poor souls!
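>>
>> One way to blunt that rebuild scenario on Linux md (a sketch of common practice, not something the controller vendors give you; the device and array names are examples) is to scrub the array regularly, so weak sectors get rewritten from redundancy while the redundancy still exists, and to raise the kernel's per-device I/O timeout above a non-TLER drive's worst-case recovery time:
>>
>>     from pathlib import Path
>>
>>     # Raise the block-layer command timeout (seconds; default 30) past the
>>     # ~120 s worst case, so a slow non-TLER drive is retried, not dropped.
>>     Path("/sys/block/sda/device/timeout").write_text("180\n")
>>
>>     # Start an md consistency check (scrub). md rewrites unreadable sectors
>>     # from the remaining redundancy, which makes the drive remap them.
>>     Path("/sys/block/md0/md/sync_action").write_text("check\n")
>>
>> Run it as root; several distros ship a cron job that triggers exactly this kind of periodic check.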
>>
>> *What is Bit-Error Rate and how does it relate to TLER?*
>>
>> The uncorrectable Bit-Error Rate has been steady at 10^-14 (roughly one unreadable sector per ~12 TB read), but capacities are growing while the BER stays the same. That means that modern high-capacity hard drives are now more likely to be affected by amnesia; they sometimes really cannot read a sector. This could be physical damage to the sector itself, or just a weak charge, meaning no physical damage to that sector but it is unreadable all the same.
>>
>> So 2TB disks with 512-byte sectors have a relatively high chance of hitting an uncorrectable error. This makes them even more susceptible to dropping out of conventional Windows/hardware RAIDs, and is why the TLER feature has become more important. But I consider it rather a curse than a blessing.
>>
>> *So, explain again please: why don't I need TLER on Linux/BSD?*
>>
>> Simple: the OS does not detach a disk that times out, but resets the interface and retries the I/O. Also, when using ZFS, it will write to a bad sector, causing that bad sector to be instantly fixed/healed/corrected, since writing to a bad sector makes the disk perform a sector swap right away. In the SMART data, the "Current Pending Sector" count (active bad sectors) would then move to the "Reallocated Sector Count" (passive bad sectors which no longer cause harm and can no longer be seen or used by the host operating system).
>>
>> *That includes ZFS?*
>> Yes. ZFS is, of course, the most reliable and advanced filesystem you can use to store your files right now. It's free, it's available, it's hot. So use it whenever you can.
>>
>> --
>
> Thanks Lisa. That's the best writeup I've read about this.
>
> I'll continue to steer clear of HW raid, as well as raid-5. :)

So yeah, no RAID is perfect... I've been using software RAID1 (md) for a while now on my desktops and laptops, work and home, and since my adventures in ATI GPU land, I've twice now had video software/hardware cause my software RAID to fail ugly, though both times it was survivable while I rebuilt the array manually. The last time was just a few days ago. Both times I was using GL functions (this time toggling compositing on/off, last time I think Minecraft) that caused the ATI fglrx drivers to spew hardware errors, seemingly glitching the card itself. Two separate cards now, as well. Getting back to the desktop, it went into VESA mode with the GPU unavailable. Then I saw my RAID was degraded, again, with the same timestamp as the GPU glitch.

The first time, one of the two disks in the md for boot went offline; I simply added sdb1/2 back. This time one partition on each disk of the two md's (boot/else) went offline alternately (sda1/sdb2) - very odd. The second disk wouldn't respond to hdparm/fdisk queries until a reboot, which was done very hesitantly and not before I backed up anything I cared about to an NFS share. Data on both remained available, which was really the odd part. To its testament, it rebooted, both disks reported healthy (hdparm, Ubuntu Disk Utility), I re-added each partition, let it rebuild, and it works again. It still worries me, as one of my last set of SSDs got unstable after less than 9 months of use, and I'm probably about there with these, which are known to get cranky. SMART reports them as OK, so I wonder how badly ATI taints kernel space that it causes disk controller/driver exceptions.

Moral of the story: know when and how to repair whatever RAID you use, as software and hardware are seemingly still prone to exceptions from unlikely places. Last time a disk died with md, I just mounted the secondary in an enclosure, copied off the data as a pluggable drive, and copied it to the new pair of RAID disks. Hardware RAID is never this easy, especially fakeraids.
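For anyone who wants to script the manual recovery I described (a rough sketch; the array and partition names are examples from my setup, not necessarily yours):

    import subprocess

    # See which arrays exist and whether any are degraded.
    print(open("/proc/mdstat").read())

    # Re-add the partition that dropped out; md resyncs it into the mirror.
    subprocess.run(["mdadm", "--manage", "/dev/md0", "--re-add", "/dev/sdb1"],
                   check=True)

    # Check the rebuild state and progress.
    subprocess.run(["mdadm", "--detail", "/dev/md0"], check=True)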
-mb
---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss