GDB and debugging techniques

Kevin Buettner kev@primenet.com
Sat, 10 Feb 2001 13:48:58 -0700


On Feb 9,  4:00pm, Jason wrote:

> I've found that if you understand the source code, often *adding* lines
> that output information to a file somewhere as the program runs beats
> the snot out of any "after the fact" debugger. Doing it that way makes
> it possible to spot verify any area of a program ...
> 
> What's your opinion on this approach to debugging? (obviously you don't
> want those code snippets showing up in a production release, however!)

I think it depends upon what it is that you're debugging.  If you're
debugging problems in your program's logic and the problem in question
is reproducible, I much prefer a tool like gdb which allows me to put
a breakpoint near the suspected problem area and then step through the
code and examine variables to see precisely what the problem is.

If the problem is somewhat elusive, then creating a mechanism for
logging what the program is doing at various stages is entirely
appropriate and, if done right, probably ought to be left in (though
disabled) even in the production version of the code.
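As a rough illustration of a logging mechanism that can stay in the
code but be switched off, here is a minimal sketch in C.  The names
dbg_configure, dbg_log, dbg_enabled and dbg_stream are hypothetical
and not taken from any particular program; a real program would
probably set the flag from a command-line option or a config file.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical names, for illustration only. */
static int dbg_enabled = 0;      /* off by default, as in a production build */
static FILE *dbg_stream = NULL;  /* NULL means "log to stderr" */

/* Turn logging on or off; pass NULL for out to log to stderr. */
void dbg_configure(int enabled, FILE *out)
{
    dbg_enabled = enabled;
    dbg_stream = out;
}

/* Write one formatted line if logging is enabled.  Returns the number
   of characters produced by the format (not counting the newline),
   or 0 when logging is disabled. */
int dbg_log(const char *fmt, ...)
{
    va_list ap;
    FILE *out;
    int n;

    assert(fmt != NULL);
    if (!dbg_enabled)
        return 0;               /* cheap no-op when disabled */

    out = dbg_stream ? dbg_stream : stderr;
    va_start(ap, fmt);
    n = vfprintf(out, fmt, ap);
    va_end(ap);
    fputc('\n', out);
    fflush(out);                /* keep the log current if the program dies */
    return n;
}
```

A call such as dbg_log("state=%d", state) then costs little more than
a test of one flag when logging is off, which is what makes it
reasonable to leave such calls in place.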

A logging mechanism is often useful for debugging problems with
daemons in which the problem (probably) isn't in the program's logic.
E.g. I've found the debug output from pppd to be amazingly useful
when debugging problems with connecting with my ISP.  Often, the
bug is in the configuration and not in the program.

Another mechanism which should not be discounted is the liberal use
of assert() in your code.  Basically, the idea is to identify certain
conditions that *ought* to be true at various points in your program
and then use assert() to make sure that these conditions actually
do hold.  If they don't hold, the assert can be made to do different
things; if you do "man assert", you'll see that assert will abort
(thus producing a core dump) if one of these conditions isn't met.
But it is easy to write your own version of assert which (e.g.) logs
a message to a file.  Personally, I think that calling abort() and
creating a core dump is quite appropriate in many settings since this
gives you a core dump to work from at a fairly early point at which
the problem was noticed.
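For the "write your own version of assert" idea, a minimal sketch in
C might look like the following.  CHECK, check_failed, check_log and
check_failures are hypothetical names chosen for this example; the
sketch logs the failed condition and carries on instead of calling
abort().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names, for illustration only. */
static FILE *check_log = NULL;           /* NULL means "log to stderr" */
static unsigned long check_failures = 0; /* how many checks have failed */

/* Record a failed condition.  Always returns 0 so that the CHECK
   macro below can be used as an expression. */
int check_failed(const char *expr, const char *file, int line)
{
    FILE *out = check_log ? check_log : stderr;

    assert(expr != NULL && file != NULL);
    check_failures++;
    fprintf(out, "%s:%d: check failed: %s\n", file, line, expr);
    fflush(out);
    return 0;
}

/* Evaluates to 1 if cond holds; otherwise logs a message and
   evaluates to 0.  Unlike the standard assert, it does not abort. */
#define CHECK(cond) \
    ((cond) ? 1 : check_failed(#cond, __FILE__, __LINE__))
```

The trade-off is the one described above: logging lets the program
keep running, while the standard assert stops at the first sign of
trouble and leaves a core dump to examine.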

Now back to gdb... your use of the phrase "after the fact" seems
to indicate that you think that gdb is only a post-mortem debugger.
It can certainly be used in this fashion, but I usually use gdb
to debug programs while they are running.  You can even use gdb
to attach to an already running program.  This is particularly
handy if a program appears to be hung and you want to find out
where the problem is.  Simply attach to it and get a backtrace.
Often that's enough.  If it isn't, you can set breakpoints, single
step, examine variables, etc.

Also, it is possible to use gdb to add debugging printfs to your
program without really adding them.  E.g. suppose you want to
print out the value of i every time your program executes line
42.  You can use the following gdb commands to do this:

    break 42
    commands
     print i
     continue
    end

The above puts a breakpoint at line 42.  The ``commands'' command
causes the ``print i'' and ``continue'' commands to be run every time
the breakpoint at line 42 is reached.

I once used a similar approach to debug a memory leak.  The program in
question was a client in a client/server relationship (though the same
approach would just as easily work for the server) and it was observed
that the program would leak memory even with no network activity.  I
put breakpoints on the C++ memory allocation and deallocation routines
and used the ``commands'' command to add commands for printing out the
memory that was being either allocated or deallocated along with a
limited (four or five frame) backtrace (e.g.  use ``bt 5'' to do
this).  I then let the program run for a while, stopped it, and then
used a perl script to filter the output.  The perl script removed the
lines which had matching allocations and deallocations.  The
allocations which were left had to be the leak.  The backtrace was
necessary to find where each allocation came from, since there were
multiple layers of allocation functions in the program.  It turned out
that the problem was in a customer-supplied library.  They
(fortunately) had left debugging symbols in the library that they had
supplied to us and were very much surprised when I was able to tell
them the exact file and line number at which to look for the problem
(even though they had not supplied source code).
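
The filtering step can be sketched as follows.  This is not the
original perl script; it is a small C illustration of the same idea,
and it assumes the gdb output has already been parsed into an array
of events.  The struct event layout and the function name
unmatched_allocs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one logged allocation or deallocation. */
enum ev_kind { EV_ALLOC, EV_FREE };

struct event {
    enum ev_kind kind;
    unsigned long addr;   /* address allocated or freed */
};

/* Pair each deallocation with the most recent still-live allocation
   of the same address.  On return, leaks[0..result-1] holds the
   indices (into ev) of allocations that were never freed, in log
   order.  The leaks array must have room for n entries. */
size_t unmatched_allocs(const struct event *ev, size_t n, size_t *leaks)
{
    size_t live = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].kind == EV_ALLOC) {
            leaks[live++] = i;          /* live until proven freed */
            continue;
        }
        /* A free: search the live list from newest to oldest. */
        for (size_t j = live; j > 0; j--) {
            if (ev[leaks[j - 1]].addr == ev[i].addr) {
                /* Drop entry j-1, keeping the rest in order. */
                for (size_t k = j - 1; k + 1 < live; k++)
                    leaks[k] = leaks[k + 1];
                live--;
                break;
            }
        }
        /* A free with no matching allocation is simply ignored. */
    }
    assert(live <= n);
    return live;
}
```

The indices that remain point back at the log records, and so at the
backtraces, for the allocations that were never released.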

Kevin