Thanks, I'll try (some of) that.
--Phil M.

-------------------------------

Message: 14
Date: Sat, 10 Aug 2002 14:50:12 -0400
From: George Toft
To: plug-discuss@lists.plug.phoenix.az.us
Subject: Re: Linux Stability Question
Reply-To: plug-discuss@lists.plug.phoenix.az.us

This is going to be a fun one.

You need to set up some monitoring to see what's going on inside the box. How many processes are running? How much CPU are the top 10 taking up? How much RAM is there, and are you paging excessively? What is the load average? What is the CPU temperature? How many file descriptors are you using, and are your network connections being cleared in a timely fashion?

Once you can track these items, record them each minute to a log. Next time it happens, look at the log.

I just spent several weeks in this exercise to track down *two* heat issues. My box would run for 3-6 days, then slow way down and thrash the hard drive. If I restarted X, I could salvage the machine about 1/2 the time. Otherwise, I had to reset the computer and suffer the fsck from hell.

Final analysis on my problem? The video chip on my $25 S3 "nothing special" video card was overheating, and my CPU was overheating. I put a fan blowing across the video card (it was not made to be cooled), and put a 2 GHz cooler (overkill) on my 600 MHz CPU. Box has been rock solid ever since.

George

--------------------------------

--
Phil Mattison
Ohmikron Corp.
480-722-9595
602-820-9452 Mobile
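
For anyone wanting to try the "record them each minute to a log" advice from George's message, here is a rough sketch in Python. It is not what George ran; it assumes a Linux box with /proc mounted, and the function names, the log file name health.log, and the log line layout are all made up for illustration. It covers process count, load average, free RAM and swap, and allocated file handles. CPU temperature is left out because reading it depends on the hardware sensor driver, and per-process CPU and network connection state are also omitted to keep it short.

```python
import os
import time


def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   12345 kB" style lines; skip anything else.
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def count_processes(proc_entries):
    """Count the numeric names in a /proc listing; each one is a PID."""
    return sum(1 for name in proc_entries if name.isdigit())


def parse_file_nr(text):
    """Return allocated file handles, the first field of /proc/sys/fs/file-nr."""
    return int(text.split()[0])


def format_sample(timestamp, nproc, load1, mem, open_files):
    """Build one log line (hypothetical layout, one sample per line)."""
    return "%s procs=%d load1=%.2f mem_free_kb=%d swap_free_kb=%d open_files=%d" % (
        timestamp,
        nproc,
        load1,
        mem.get("MemFree", -1),   # -1 means the field was not found
        mem.get("SwapFree", -1),
        open_files,
    )


def read_text(path):
    with open(path) as handle:
        return handle.read()


def take_sample():
    """Read the live counters from /proc and return one log line."""
    return format_sample(
        time.strftime("%Y-%m-%d %H:%M:%S"),
        count_processes(os.listdir("/proc")),
        os.getloadavg()[0],       # 1-minute load average
        parse_meminfo(read_text("/proc/meminfo")),
        parse_file_nr(read_text("/proc/sys/fs/file-nr")),
    )


def main(log_path="health.log", interval=60):
    """Append one sample per interval (seconds) until interrupted."""
    while True:
        # Reopen each time so every line is on disk before a hang or reset.
        with open(log_path, "a") as log:
            log.write(take_sample() + "\n")
        time.sleep(interval)
```

Calling main() starts the loop; stop it with Ctrl-C. After the next slowdown, the last few lines of health.log should show whether free memory, swap, load, or file handles were drifting before the box went bad. A steadily falling swap_free_kb alongside a rising load1 would point at the paging George describes.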