Methodology or Philosophy of Troubleshooting

Kevin Buettner plug-discuss@lists.PLUG.phoenix.az.us
Mon, 6 Aug 2001 10:14:08 -0700


On Aug 5,  9:44pm, Eric Van Buskirk wrote:

> I am trying to develop a bag of tricks here.  Because now I must admit that
> when I get anything other than a networking problem (I am not a coder), I am
> frozen; a deer in the headlights.  So I would like to ask whether anyone has
> some general thoughts on how to tackle a Linux problem; whether anyone has
> some heuristics or general methodologies or truths that they would like to
> share.  So perhaps I could explain the problem, and someone could explain
> the thought process of how to solve it?
> 
> I loaded RH 7.1 on my laptop (again).  Linuxconf was somehow not installed,
> so I installed it via rpm ("rpm -ih linuxfonf-1.24r2-10.i386.rpm").  But rpm
> would not install it because it failed a dependency: it first wanted "gd."
> So I installed gd-2.0.1-1.i386.rpm.  Then I typed "linuxconf" and the
> following error message was displayed: "linuxconf: error while loading
> shared libraries: libgd.so.1.8: cannot load shared object file: No such file
> or directory".
> 
> Not knowing what the hell that meant, I loaded libgd-1.3-4.i386.rpm which
> seemingly had not been loaded during the original install.  Still, however,
> I receive the same error message when I type "linuxconf."
> 
> So I thought maybe I should look at the source code (is that how problems
> like this are solved)??? But I am not sure which file to look at.  I don't
> even know how to look at linuxconf , as I don't know where it is ("find
> / -name linuxconf*" did not turn it up).

I don't think it's likely that the source code will help you much here
unless you come to believe that there's a problem with the dynamic
linker.  (Which isn't very likely in this case.)

The problem that you have is that an application can't find a particular
library.  How could this be?

One possibility is that the library doesn't exist on your system, or
does exist, but under a different name (i.e., with a different version
number).  You should first attempt to determine whether this library
exists or not.  Use your package management tool to list the
contents of the library packages that you've installed to find out the
name and location at which the library was installed.
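
A minimal sketch of that lookup, assuming an RPM-based system such as
Red Hat.  The helper name lib_files is made up for illustration, and
"gd" is simply the package from this thread.  rpm -ql lists every file
an installed package owns; the filter keeps only the names that look
like shared libraries.

```shell
#!/bin/sh
# Keep only shared-library paths (names ending in .so or .so.<version>).
# lib_files is a hypothetical helper, not part of rpm.
lib_files() {
    grep -E '\.so(\.[0-9]+)*$'
}

pkg=gd   # package name from this thread; substitute your own
if command -v rpm >/dev/null 2>&1; then
    # rpm -ql lists every file owned by an installed package.
    rpm -ql "$pkg" | lib_files || echo "no shared libraries listed for $pkg"
fi
```

If the name printed here differs from the one in the error message,
you are looking at a version mismatch rather than a missing package.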

It is possible that the library in question doesn't exist, in which
case you've simply installed the wrong software package.  OTOH, what's
more likely (in this case anyway) is that it does exist, but using a
different path or version number than expected.  You should attempt to
verify (using ``ls'' or some other tool) that the library in question
does indeed exist on your system and that the permissions are set
reasonably.
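
Something along these lines would do for the existence and permission
check.  The function name check_lib is invented for this sketch, and
the /usr/lib path is only an assumption about where the library might
live on your system.

```shell
#!/bin/sh
# check_lib FILE...: show a long listing for each readable file,
# or say so when a file is missing or unreadable.
# (check_lib is a hypothetical helper written for this example.)
check_lib() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            ls -lL "$f"      # -L follows symlinks to the real file
        else
            echo "missing or unreadable: $f"
        fi
    done
}

# Assumed location; if nothing matches, the pattern itself is reported.
check_lib /usr/lib/libgd.so*
```

A library is normally readable by everyone; if the listing shows
otherwise, that alone can produce the "No such file" style of failure
for non-root users.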

If the version number differs from the version that your application
is expecting, it would be best to attempt to find the correct version
of the library.  If that can't be done, it may be the case that an
older or newer version of the library will work, particularly if the
interfaces that your application uses haven't changed.  You can make your
application use an older or newer version of the library by creating
symbolic links to the existing library.
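
Here is a sketch of the symlink trick, done in a scratch directory so
that it is harmless to run.  The file names are illustrative
assumptions, not the contents of any real package.  On a real system
you would create the link (as root) in the directory that holds the
library, and the result is only as good as the compatibility between
the two versions.

```shell
#!/bin/sh
# Demonstrate the link in a throwaway directory.
libdir=$(mktemp -d)

# Stand-in for the library that is actually installed (assumed name).
touch "$libdir/libgd.so.2.0.1"

# Create the name the application asks for, pointing at what exists.
ln -s libgd.so.2.0.1 "$libdir/libgd.so.1.8"

ls -l "$libdir"
```

If the application then dies with unresolved symbols, the interfaces
did change and you need the right version after all.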

It could also be a path problem, i.e., the dynamic loader looks in
some specific predetermined places when attempting to find the library
to load.  Some of these places are listed in /etc/ld.so.conf, but the
directories which are searched also depend upon whether or not you
have an LD_LIBRARY_PATH variable set, whether the application was
compiled with a particular setting of LD_RUN_PATH, or whether -rpath
options were passed to the linker.  If the path to the library looks
like it ought to be one of the standard paths that the dynamic linker
is searching, you ought to check /etc/ld.so.conf to see if the path in
question is indeed in the list.  If it's not, you can add it and rerun
ldconfig (as root).  OTOH, if it's a nonstandard path that you don't
want searched for every program, you can add it to LD_LIBRARY_PATH
instead.
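
The check might be sketched like this.  The function name in_ld_conf
and the directory /usr/local/lib are assumptions made for the example.
The test is a plain whole-line match, so it will not follow any
"include" lines your /etc/ld.so.conf may contain.

```shell
#!/bin/sh
# in_ld_conf DIR [FILE]: succeed if DIR appears as a whole line in the
# loader configuration file (default /etc/ld.so.conf).
# (in_ld_conf is a hypothetical helper written for this example.)
in_ld_conf() {
    grep -qxF "$1" "${2:-/etc/ld.so.conf}"
}

dir=/usr/local/lib   # hypothetical directory holding the library
if in_ld_conf "$dir" 2>/dev/null; then
    echo "$dir is already listed"
else
    echo "$dir is not listed: as root, add it to /etc/ld.so.conf and run ldconfig,"
    echo "or for a single run use: LD_LIBRARY_PATH=$dir linuxconf"
fi
```

Setting LD_LIBRARY_PATH on the command line like that affects only the
one command, which makes it a handy way to test a theory before
touching the system-wide configuration.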

For debugging this type of problem, it is also useful to know about
``ldd''.  This is a tool which displays the names of the shared
libraries that an application expects to be able to load and the
locations at which these shared libraries may be found.  (It should
be noted, however, that libraries opened only via dlopen() are not
listed by ldd.)  E.g., on my Red Hat 7.1 system, I see the following:

[root@mesquite kev]# ldd /sbin/linuxconf
        libncurses.so.5 => /usr/lib/libncurses.so.5 (0x4002d000)
        libdl.so.2 => /lib/libdl.so.2 (0x4006f000)
        libcrypt.so.1 => /lib/libcrypt.so.1 (0x40073000)
        libgd.so.1.8 => /usr/lib/libgd.so.1.8 (0x400a1000)
        libpng.so.2 => /usr/lib/libpng.so.2 (0x400d4000)
        libz.so.1 => /usr/lib/libz.so.1 (0x400f5000)
        libstdc++-libc6.2-2.so.3 => /usr/lib/libstdc++-libc6.2-2.so.3 (0x40103000)
        libm.so.6 => /lib/i686/libm.so.6 (0x40147000)
        libc.so.6 => /lib/i686/libc.so.6 (0x4016b000)
        libttf.so.2 => /usr/lib/libttf.so.2 (0x4029b000)
        libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0x402c5000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

Furthermore, I see...

[root@mesquite kev]# rpm -q -f /usr/lib/libgd.so.1.8
gd-1.8.3-7

So, it looks to me like you've installed a much newer version of the
``gd'' package than you ought to have.  (I.e., the version number is
wrong.)  You'll probably want to uninstall the newer one and install
the older one (that came with your distribution) instead.  (BTW,
that's yet another way of diagnosing this sort of thing... simply use
ldd on an existing box to find out the location and version of the
library.)
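
On the broken box, ldd marks each library it cannot resolve as "not
found", so a small filter will pull out just the missing names.  This
is a sketch: only_missing is a made-up helper name, and the package
file name in the comments is an assumption based on the gd-1.8.3-7
version shown above.

```shell
#!/bin/sh
# Print the name of every library that ldd reports as "not found".
# (only_missing is a hypothetical helper written for this example.)
only_missing() {
    awk '/not found/ { print $1 }'
}

if [ -x /sbin/linuxconf ]; then
    ldd /sbin/linuxconf | only_missing
fi

# Swapping the packages would then look roughly like this (as root).
# The .rpm file name is assumed; use the one from your install media.
#   rpm -e gd
#   rpm -ivh gd-1.8.3-7.i386.rpm
```

No output from the filter means every library resolved, and the
problem lies elsewhere.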

Kevin