kernel very unstable
Pete Buechler
Pete Buechler <peter.buechler@home.com>
Mon, 12 Mar 2001 21:48:34 -0700
On Sunday 11 March 2001 02:51 pm, Lucas Vogel wrote:
> > > My stock SuSE 2.2.16 kernel keeps intermittently crashing and hosing my
> > > machine to where I have to do a hard reboot. It seems to generate some kind
> > > of oops on the kmem_free function.
> >
> > In general, any regular crashing by a stock kernel points to hardware
> > problems on your end. No kernel that ships with any of the major
> > distributions will give these kinds of problems on halfway decent
> > hardware. In general.
>
> How do I diagnose and fix something like this then?
>
Yikes. I did not see any replies from anybody who is an expert. So you will
have to take my advice.
First, examine your syslogs. You already did that, because you told us that
you saw MARK in there a bunch of times. Anything else of use in there?
Second, try to think of what you were doing at the time of the crashes. See if
that gives you any ideas.
Third, if you have any Western Digital drives, use hdparm to turn off DMA for
them. Their implementation of UDMA-66 ignores CRC problems (dumb).
Fourth, make sure that you do not have any IRQ conflicts. Make sure you note
which IRQs are used by what and double-check this by looking at the contents
of /proc/interrupts.
Fifth, capture the oops (maybe with a pencil and paper) and run it through
ksymoops (look for directions with the Linux kernel source code, under the
Documentation directory in a file called oops-tracing.txt). If you can figure
out where the code was when it crashed maybe you can get a hint as to the
problem.
Sixth, see if any diagnostics came with your hardware. Or, go to the web
sites of the companies that made it and look for diagnostics there.
Seventh, strip your system down to the bare essentials: monitor, keyboard,
mouse, motherboard and one drive. See how that runs. Then add hardware back
in one piece at a time and see what causes the destabilization.
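
The first and fourth checks above are mechanical enough to script. What
follows is only a rough sketch in Python, not a tested tool. It assumes that
syslogd writes its heartbeat as "-- MARK --", that the keyword list is a
reasonable starting point (it is my guess, nothing official), and that
/proc/interrupts uses the older layout with one count column per CPU, a
single-word controller name, and comma-separated device names. The function
names are made up for illustration. Keep in mind that a shared IRQ is not
automatically a conflict; it is just a place to look.

```python
import re

# Assumption: syslogd's periodic heartbeat line contains "-- MARK --".
MARK = re.compile(r"-- MARK --")
# Assumption: a starting list of keywords worth a second look, not exhaustive.
SUSPECT = re.compile(r"oops|panic|unable to handle|lost interrupt|crc|dma",
                     re.IGNORECASE)


def interesting_lines(lines):
    """Return log lines that are not MARK heartbeats and match a keyword."""
    return [ln.rstrip("\n") for ln in lines
            if not MARK.search(ln) and SUSPECT.search(ln)]


def shared_irqs(text):
    """Map IRQ number -> device names, for IRQs listing more than one device."""
    rows = text.splitlines()
    if not rows:
        return {}
    ncpus = len(rows[0].split())  # header row looks like: CPU0 CPU1 ...
    shared = {}
    for row in rows[1:]:
        label, sep, rest = row.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip non-numeric rows such as NMI: or ERR:
        # Split into per-CPU counts, controller name, then the device list.
        fields = rest.split(None, ncpus + 1)
        if len(fields) < ncpus + 2:
            continue  # no device registered on this line
        devices = [d.strip() for d in fields[-1].split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared
```

To try it, pass interesting_lines() the lines of your syslog file (often
/var/log/messages, but that depends on your syslog.conf) and pass
shared_irqs() the text read from /proc/interrupts.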
Hardware problems can be a real pain to diagnose. I say we all go out and get
computers built with self-checking pairs. That will help stimulate the
economy :-)
-Pete-