(Long story ahead, but ReiserFS comes at the end...)

A few months ago I had a cheapie Celeron motherboard from Camelback Computers. That machine had the DPT RAID controller installed. After 2 of the RAID drives died (due to overheating probably - they ran too hot) and I gave up on the idea, I got a new Seagate IDE disk and tried to use it with the motherboard's built-in controller, which up to that point had seen no use. Well, I started seeing kernel messages about IDE problems (0x51 blah blah blah or something like that). I tried several kernels of different vintages, tried various IDE-related kernel config options, etc., to no avail. Sometimes I'd hear some violent-sounding seek noises, like it was banging the heads against the drive case or something. (I didn't think that was possible with an IDE drive - it should be smart enough to take care of itself even if given wrong cylinder requests, shouldn't it?) A few days later the drive died.

I took it back to Fry's, said "so much for Seagate" and got an IBM. As soon as I got Linux installed on it, I started seeing the same errors and hearing the same kinds of unusual noises. So I figured maybe it was the controller's fault and maybe I was about to kill another hard drive, so I immediately quit using it, and now that motherboard is over at my dad's place running Win98 as the manufacturer intended, so my younger sisters can play their silly girlie games etc.

So I figured I was tired of crappy hardware and got a "guaranteed overclockable" setup online, consisting of a BP6 motherboard and two Celeron 366's and two humongous 32CFM fansinks. I put the same IBM hard drive in there, and on the HPT366 IDE controller too (it has both a garden-variety UDMA33 controller and the UDMA66 controller).
I tried overclocking but it wasn't stable; once I got it to run for 15 minutes or so at the next higher bus speed, but then it crashed, and for some reason I've never been able to make slight adjustments to the bus clock and have it work; it only works if I go to the next major speed bump. Odd. Anyway it seems reliable enough at 366MHz.

I never got any more IDE kernel errors that I can remember, but once in a while the machine would just freeze up. Usually either X was running, and the machine would freeze with a frozen graphical display, or else the stinkin' console screen saver would have blanked the screen so that there weren't any errors on the screen either. I didn't find anything weird in any logs after rebooting, and then it would be fine for several days again. But the fsck errors kept getting worse; I was getting some fairly severe filesystem corruption, and stuff was showing up in lost+found. I tried putting the drive onto the slower IDE controller, but it still continued to hang up every few days. I figured out how to get rid of the console screen saver (setterm -blank 0), and this weekend, it hung once, while the console was up, and I saw no error messages at all.

So I said "this blows" and dug out my old Buslogic SCSI controller and an extra 4 gig drive I had lying around, and installed a fresh copy of Potato on it. I tried to back up the files from the IDE drive into a directory on the SCSI drive, and twice while trying to do that using my usual 2.3.42 SMP kernel I'd been running on that machine, it hung again. So I rebooted to the old 2.0.36 kernel that came with Slink, and successfully copied the files. Then I built a fresh 2.4.0.test7+reiserfs kernel, on the SCSI drive, and booted with that. So far I have had no more hangs or crashes with this setup. But I wanted to reformat the drive and torture-test it to see if it's going to be reliable, or if the problems have been its fault all along.
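
For what it's worth, a torture test of this kind could be as simple as writing a pile of files, remembering a checksum for each, and reading them back later. The Python below is only a rough sketch of that idea, not something that was actually run on this box; the function names `write_files` and `verify_files` are made up for illustration, and the file count and sizes are whatever you pass in.

```python
import hashlib
import os


def write_files(directory, count, size):
    """Write `count` files of `size` random bytes each into `directory`.

    Returns a dict mapping each file path to the SHA-256 hex digest of
    the data that was written to it.
    """
    digests = {}
    for i in range(count):
        data = os.urandom(size)
        path = os.path.join(directory, "torture-%06d.bin" % i)
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            # Ask the kernel to push the data out to the device now,
            # rather than leaving it in the write cache.
            os.fsync(f.fileno())
        digests[path] = hashlib.sha256(data).hexdigest()
    return digests


def verify_files(digests):
    """Re-read every file and return the paths whose contents changed."""
    bad = []
    for path, expected in digests.items():
        with open(path, "rb") as f:
            actual = hashlib.sha256(f.read()).hexdigest()
        if actual != expected:
            bad.append(path)
    return bad
```

Call `write_files` on a directory on the suspect drive, then call `verify_files` on the returned dict; an empty list means everything read back intact. One caveat: verifying right after writing will mostly be served from the page cache, so a meaningful check needs either more data than there is RAM or an unmount/remount (or reboot) in between, with the digests saved somewhere off the drive under test.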
Since it had an iffy chance of success anyway, I formatted it with ReiserFS. Mounted it at /var. My next step is to create a Postgres database for weather data like the one currently on electron (http://gw.kb7pwd.ampr.org) and see how long that runs. The weather data gets written fairly continuously, and last time I tried this experiment on the previous install on the IDE drive, the database got corrupted in a couple days; so this should be a good test. Then again, maybe I should do it on an ext2 fs so if it fails, I will know the drive is the problem; whereas ReiserFS might be able to mask some of its problems, or fail of its own accord. The drive is now on the slower IDE controller. If it survives I'll try the fast one again.

So anyway... anybody else trying ReiserFS, and has it been stable so far? It's a little troublesome that right now, according to the web page, reiserfsck is not working very well, so if I do get some corruption, I'll be SOL. At least the weather data is no great loss. I got this relatively large IDE drive with the idea of using it as my main NFS server, but so far I can't trust it enough for that. (17 gigs... ha, it was big when I got it, but now a 30 gig costs less than it did. Geez.)

The other experiment ongoing with this machine is TV-watching, via my Hauppauge TV card. (I had TiVo envy, what can I say.) So I might try recording video on that drive too; that ought to stress it a bit. Ironically, when it crashed early last week, xawtv was running at the time, and it kept working all week, but I noticed I couldn't change channels and was suspecting the batteries in the wireless keyboard, until I noticed I couldn't ping the box either. So I guess the TV operation really is CPU independent - just two PCI cards talking to each other. Neato.

My dual-Pentium has developed a habit of hanging now and then too.
My dad's gateway dials in via PPP, and for a couple weeks, it's been hosed because of a power outage; I went and fixed it Saturday - just had to walk it through fsck errors (fix this? uh yeah, what else would I do!?!?) and changed the init script so hopefully next time, it won't ask dumb questions and just fix the filesystem. If ReiserFS proves stable enough, that machine's going to get it eventually. Anyway... while my dad's gateway was down, my dual-Pentium gateway machine had a perfect uptime record. And when I went to fix Dad's gateway, it managed to get one PPP connection, and then my machine hung again. So whatever it is, it seems to be PPP-related. How odd.

--
  _______    Shawn T. Rutledge / KB7PWD    ecloud@bigfoot.com
 (_  | |_)   http://www.bigfoot.com/~ecloud    kb7pwd@kb7pwd.ampr.org
 __) | | \________________________________________________________________
Get money for spare CPU cycles at http://www.ProcessTree.com/?sponsor=5903