[phxjug-java] Help Wanted: Squashing JVM on Linux Bug
Steve Jovanovic
plug-discuss@lists.plug.phoenix.az.us
Tue, 10 Jun 2003 01:06:07 -0500
Hi Huang and Chris,
Thanks very much for your suggestions and thoughts!
We're running RedHat 7.2. I checked with our system administrator, and
we're definitely not having a problem with file descriptor limits. I
continue to see things like this in the process table:
tangent:~/.ssh$ ps aux | grep java
skribe 21348 0.0 0.5 224256 5204 ? SN Jun07 0:00 /home/skribe/java
skribe 21351 0.0 0.5 224256 5204 ? SN Jun07 0:00 /home/skribe/java
skribe 21352 0.0 0.0 0 0 ? ZN Jun07 0:00 [java <defunct>]
skribe 21353 0.0 0.0 0 0 ? ZN Jun07 0:00 [java <defunct>]
skribe 21354 0.0 0.0 0 0 ? ZN Jun07 0:00 [java <defunct>]
skribe 21355 0.0 0.0 0 0 ? ZN Jun07 0:07 [java <defunct>]
skribe 21356 0.0 0.0 0 0 ? ZN Jun07 0:00 [java <defunct>]
skribe 21357 0.0 0.0 0 0 ? ZN Jun07 0:00 [java <defunct>]
skribe 21358 0.0 0.0 0 0 ? ZN Jun07 0:01 [java <defunct>]
skribe 21367 0.0 0.0 0 0 ? ZN Jun07 0:00 [java <defunct>]
skribe 21368 0.0 0.0 0 0 ? ZN Jun07 0:00 [java <defunct>]
skribe 21369 0.0 0.0 0 0 ? ZN Jun07 0:03 [java <defunct>]
skribe 17657 0.0 0.5 224256 5204 ? SN Jun08 0:00 /home/skribe/java
skribe 17730 0.0 0.5 224256 5204 ? SN Jun08 0:00 /home/skribe/java
skribe 20572 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20589 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20590 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20591 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20592 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20596 0.0 0.5 224984 5704 ? S Jun08 0:04 /usr/local/java/b
skribe 20597 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20600 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20601 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 20706 0.0 0.5 224984 5704 ? S Jun08 0:00 /usr/local/java/b
skribe 22637 0.0 0.0 1764 628 pts/121 S 22:45 0:00 grep java
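As a rough aid for reading a listing like the one above, here is a minimal sketch that tallies process states from `ps aux`-style output. The function names `count_defunct` and `count_by_state` are made up for illustration, and the sketch assumes the usual `ps aux` column order, with the state in the eighth field.

```shell
# Sketch: summarise `ps aux`-style output read from stdin.
# Assumed column order: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND,
# so the process state is field 8. Lines whose second field is not a PID
# (the header, or continuation lines from wrapped text) are ignored.

# Print the number of processes whose state starts with Z (zombie/defunct).
count_defunct() {
    awk '$2 ~ /^[0-9]+$/ && $8 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Print "STATE COUNT" pairs, one per line, keyed on the first letter of STAT.
count_by_state() {
    awk '$2 ~ /^[0-9]+$/ { n[substr($8, 1, 1)]++ } END { for (s in n) print s, n[s] }' | sort
}
```

One way to use it is `ps aux | grep '[j]ava' | count_by_state`; the `[j]ava` pattern keeps the `grep` process itself out of the count. A defunct entry remains in the table until its parent collects its exit status, so `ps -eo pid,ppid,stat,comm` may also be worth a look to see which parent the defunct entries belong to.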
Chris, we've tried several different JVMs, all with the same result.
It's possible that it's bad JNI code in Orion 2.0, and it's also
possible that there's a problem with the libs on the particular server
we're using, but so far we haven't been able to isolate it.
You're right that ideally we should test on a similarly configured
machine, but unfortunately, this particular box is the one that we're
going to be running on in production. Actually, it's not really
unfortunate, but interesting. I'm really, intensely curious about what,
exactly, the problem is, and the process for tracking it down and
solving it. I hope that whatever we find will be helpful to others, and
I'll share it once we've solved the problem.
If any other ideas strike you, please let me know!
Thanks again for your help!
Steve
PS Chris, <laugh> you've helped more than you know!!
Steve Jovanovic
Director of Engineering
Noumenaut Software
(262) 632-7755
-----Original Message-----
From: plug-discuss-admin@lists.plug.phoenix.az.us
[mailto:plug-discuss-admin@lists.plug.phoenix.az.us] On Behalf Of Huang
Haitao-G17843
Sent: Wednesday, June 04, 2003 11:11 AM
To: 'Steve Jovanovic'; java@phxjug.org;
plug-discuss@lists.plug.phoenix.az.us
Subject: RE: [phxjug-java] Help Wanted: Squashing JVM on Linux Bug
What distribution of Linux?
If it is a desktop distribution rather than a server distribution, it
could be running into the file descriptor limit. Check ulimit and
/etc/security/limits.conf.
Haitao Huang
Motorola
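A minimal sketch of the limits.conf half of that check, assuming the usual four-field layout of that file (domain, type, item, value). The function name `nofile_entries` is made up for illustration.

```shell
# Sketch: list the open-file (nofile) entries from a limits.conf-style file.
# Each non-comment line is assumed to have four fields:
#   <domain> <type> <item> <value>
# The default path is the one named in the message; pass another file to override it.
nofile_entries() {
    awk '$1 !~ /^#/ && $3 == "nofile" { print $1, $2, $4 }' "${1:-/etc/security/limits.conf}"
}
```

Running `ulimit -n` in the same shell that launches the JVM shows the soft limit actually in effect, while `nofile_entries` shows what is configured, if anything. No output from the function just means the file has no nofile lines.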
----
This probably won't be much help, but just in case, ...
You are getting a segmentation violation (signal 11), which means the
process is attempting to access memory it is not permitted to touch.
Since pure Java code cannot do this, it is either: 1) a bug in the JVM,
2) buggy JNI code, or 3) a bad configuration of shared libraries on your
machine.
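For the shared-library possibility, one quick first check is whether every dependency resolves at all. The sketch below filters `ldd` output down to unresolved libraries; the function name `missing_libs` is made up for illustration, and it relies on `ldd` marking such entries with "=> not found".

```shell
# Sketch: filter `ldd` output (read from stdin) down to unresolved libraries.
# ldd reports a dependency it cannot locate as "<name> => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Usage would be something like `ldd /path/to/binary-or-lib.so | missing_libs`, where the path is a placeholder for the JVM binary or a JNI library. Empty output only shows that the libraries were found, not that they are the right versions, so it narrows the problem rather than ruling it out.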
The best thing you can do is attempt to narrow down the problem as much
as possible. Also check whether you can reproduce the problem on another
Linux box that is not identically configured and is presumably in a
stable configuration.
I would guess that it is not a bug in the JVM only because I have found
the Linux 1.4.1 JVM to be very stable, but if you can find no other
explanation, you should send it to Sun. This won't help you much
though.
Good luck,
Chris