Re: Bad Spot on Disk

Author: Ed Skinner
Date:  
To: plug-discuss
Subject: Re: Bad Spot on Disk
On Thursday 04 November 2004 16:27, Rob Wultsch wrote:
> Check out http://llg.cubic.org/docs/hdrescue.html


     Thanks to one and all. The above link is one of the more valuable ones on 
this topic and, therein, the tip on dd_rescue is undoubtedly the best.
     For what it's worth to the archive, here's what I did / learned.


     IMPORTANT NOTE: This procedure only worked because 1) the old and new 
disks had identical geometry (and sizes) and 2) the "wrong" step I took 
actually did one crucial thing "right" that later on turned out to be 
essential.


     The problems first showed up as an "input error" when attempting 
to burn an ISO image to CD-R. (I burned 2 or 3 different CD-Rs before "input" 
finally sank in and I realized it was my hard disk that was failing, not the 
CD-R I was trying to burn.)
     As root, I found that "/sbin/badblocks /dev/hdg3" (hdg3 being the correct 
partition) would find 23 bad blocks. Re-running this on a different drive and 
seeing zero bad blocks, I concluded that "23" was not good. Since the Dell 
unit was still under warranty, I booted their diagnostics (that were still in 
the first disk partition) and ran the obligatory Dell-sanctioned tests. They 
also said the disk was broken.
     Dell shipped a replacement and it arrived the next day (thank you DHL 
Airborne Express and Dell).
     I made a quick trip to Fry's and bought a $6.99 "2.5" to 3.5" Disk 
Adapter" and a spare power cable Y (not needed, after all). (With power off) 
I removed the failing drive from the notebook, spent 10 minutes figuring out 
where pin #1 was (as documented on the paper label but not on the connector), 
and connected (with power off) the adapter and the notebook disk to my other 
system, a traditional workstation (paying attention to Master / Slave, etc.).
     After booting up, I compared the output from "cat /proc/partitions" and 
"mount" and figured out where the 2.5" drive was showing up. In my system, it 
was "hdg" -- so the /proc/partitions output showed:
hdg -- the whole disk
hdg1 -- Dell's "secret" partition
hdg2 -- Windows XP partition (yes, I'm a sinner, forgive me)
hdg3 -- "/", and
hdg4 -- Swap.
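
     A minimal sketch of pulling one disk's entries out of that listing, 
for the archive. The function name list_partitions and its optional second 
argument are made up for illustration; the only thing assumed about 
/proc/partitions is its usual "major minor #blocks name" column layout.

```shell
# list_partitions DISK [TABLE]
# Print "name blocks" for DISK and each of its partitions, e.g. DISK=hdg.
# TABLE defaults to /proc/partitions; the second argument is hypothetical
# and exists only so the function can be tried against a saved copy.
list_partitions() {
    disk=$1
    table=${2:-/proc/partitions}
    # Columns are: major minor #blocks name. Keep rows whose name starts
    # with the disk name (the header and blank line never match).
    awk -v d="$disk" 'index($4, d) == 1 { print $4, $3 }' "$table"
}
```

     Running "list_partitions hdg" should print one line for the whole disk 
and one per partition, with sizes in 1 KiB blocks.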
     I then tried what I hoped would be the simplest "clone" operation:
dd if=/dev/hdg of=whole_enchilada bs=128k
     Not paying a lot of attention, I didn't notice I only got about 30G off 
the 60G drive. Charging ahead in ignorant but, as it turns out, slightly 
profitable bliss nonetheless, I powered off, switched to the new disk, booted 
up and ran:
dd if=/whole_enchilada of=/dev/hdg bs=128k
     This finished "successfully" but when I installed the new disk in the 
notebook and attempted to boot from it, all I got on the display was "GRUB" 
and then it stayed there.
     I then noticed the short (30G) copy.
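
     A short copy like this can be caught right away by comparing byte 
counts before moving on. The sketch below is one way to do that; size_of and 
same_size are illustrative names, and it assumes the util-linux "blockdev" 
tool is available for block devices.

```shell
# size_of PATH: print the size in bytes of a block device or regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"      # block device: ask the kernel (root)
    else
        wc -c < "$1" | tr -d ' '       # regular file: count its bytes
    fi
}

# same_size SOURCE COPY: succeed only if both hold the same number of bytes.
same_size() {
    [ "$(size_of "$1")" = "$(size_of "$2")" ]
}
```

     For example, "same_size /dev/hdg whole_enchilada || echo SHORT COPY" 
would have flagged the 30G image immediately.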
     I went back to the failing disk and dd'd each of the four partitions. 
(Please don't tell me I didn't need to copy the swap partition. Anal people 
like me do it anyway -- and we feel much better because of it.)
dd if=/dev/hdg1 of=part1 bs=128k
dd if=/dev/hdg2 of=part2 bs=128k
dd if=/dev/hdg3 of=part3 bs=128k
dd if=/dev/hdg4 of=part4 bs=128k
     Oddly, all four of these appeared to get the correct amount of data. All 
told, I had the requisite 60G (approx.) of data. At this point, I 
don't know why the one massive "dd" of the entire disk came up short but, one 
partition at a time, it worked.
     Regardless, I then plunked each of those partition copies into the new 
drive by reversing the above.
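
     "Reversing the above" means swapping if= and of= for each partition 
image. A sketch of that step with a read-back check added; the check and the 
function name restore_and_verify are additions for illustration, not part of 
what was actually run.

```shell
# restore_and_verify IMAGE TARGET
# Write IMAGE onto TARGET (e.g. part1 onto /dev/hdg1), then read the same
# number of bytes back from TARGET and compare them with IMAGE.
restore_and_verify() {
    image=$1
    target=$2
    # conv=notrunc leaves anything past the end of the image untouched.
    dd if="$image" of="$target" bs=128k conv=notrunc 2>/dev/null || return 1
    n=$(wc -c < "$image" | tr -d ' ')
    head -c "$n" "$target" | cmp -s - "$image"
}
```

     Usage would be along the lines of "restore_and_verify part3 /dev/hdg3 
&& echo part3 OK", repeated for each of the four images.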
     Before proceeding, I then ran "/sbin/e2fsck -c /dev/hdg3" to fix any 
problems that may have been created by the rather blind copy I did. Luckily 
for me, it came up clean.
     For good measure, I ran "/sbin/badblocks" on the new disk (hdg3) and it 
said the disk was perfect -- no bad blocks.
     Putting the new drive into the notebook, I then booted the Dell 
diagnostics and ran them. All tests passed. I then booted Windows and checked 
some key files, and finally Linux. Everything seems to be back to normal now.
     To the best of my knowledge, other than the one file that reported "input 
error" when accessed, no other data was lost. (And I removed that one by hand 
AFTER cloning it to the new drive.)
     The bad drive is waiting for DHL pickup. Good riddance.


     Note that copying each of the four partitions as I did in the latter part 
of this does not restore the boot block and partition table. My first, failed 
attempt at restoring the entire drive re-initialized that part of the 
disk. Doing this from scratch "next time", I suppose I could copy off the 
first "bunch" of blocks and clone that over to the new drive to get the 
partition table and boot blocks properly copied.
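
     A sketch of that idea, untested on real disks. The first 512-byte 
sector holds the boot code and the partition table; a boot loader may keep 
more code in the sectors that follow, so the right count is an assumption to 
check against where the first partition starts. The name copy_boot_area is 
made up, and this only makes sense when both disks have identical geometry.

```shell
# copy_boot_area SOURCE TARGET [SECTORS]
# Copy the first SECTORS 512-byte sectors (default 1: boot code plus
# partition table) from SOURCE to TARGET, leaving the rest of TARGET alone.
copy_boot_area() {
    src=$1
    dst=$2
    sectors=${3:-1}
    dd if="$src" of="$dst" bs=512 count="$sectors" conv=notrunc 2>/dev/null
}
```

     The source can be the old disk itself or a file saved from it earlier, 
e.g. "copy_boot_area /dev/hdg bootarea" on the old disk and then 
"copy_boot_area bootarea /dev/hdg" on the new one.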
     If I have to do this again, I'll look at dd_rescue which says it will 
deal with bad blocks automatically and simplify some of this. But the 
"badblocks" program was useful in proving (to me) that the disk was bad 
(SMART tools did not report anything wrong!) and, if there had been unreadable 
block(s), e2fsck would've been essential in putting the file system back into 
usable condition.
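
     Where dd_rescue is not installed, plain dd can be told to keep going 
past read errors. The sketch below uses dd's own conv=noerror,sync options as 
a stand-in; it is not how dd_rescue itself works, and the function name 
clone_skipping_errors is made up for illustration.

```shell
# clone_skipping_errors SOURCE DEST
# Copy SOURCE to DEST without stopping at read errors.
#   noerror: continue after a failed read instead of aborting
#   sync:    pad every short or failed block with NULs up to bs bytes,
#            so later data stays at the same offset in the copy
# bs=512 keeps the loss from one unreadable sector to 512 bytes, at the
# cost of a slow copy. Note that sync also pads the final block.
clone_skipping_errors() {
    dd if="$1" of="$2" bs=512 conv=noerror,sync
}
```

     Something like "clone_skipping_errors /dev/hdg3 part3" followed by 
e2fsck on the restored partition would be the dd-only version of the rescue.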


     Later, dudes...


--
Ed Skinner, http://www.flat5.net/

---------------------------------------------------
PLUG-discuss mailing list -
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss