Transcript www.csci.psu.edu
Application Debugging
Debugging
A methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, thus making it behave as expected.
Basic Steps
1. Recognize that a bug exists
2. Isolate the source of the bug
3. Identify the cause of the bug
4. Determine a fix for the bug
5. Apply the fix and test it
Common Debug Method
print statements
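As a minimal sketch of what print-statement debugging looks like in C, the function below traces each step of a loop. The name sum_with_trace and the log format are hypothetical and do not come from the slides; they only illustrate the technique.

```c
#include <stdio.h>

/* Hypothetical helper (not from the slides): sums an array and prints
 * every intermediate value so a wrong partial sum can be spotted. */
double sum_with_trace(const double *a, int n, FILE *log)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += a[i];
        /* One trace line per iteration, tagged with file and line.
         * Passing stderr as log is a common choice because it is
         * typically unbuffered, so output survives a later crash. */
        fprintf(log, "DEBUG %s:%d i=%d a[i]=%g sum=%g\n",
                __FILE__, __LINE__, i, a[i], sum);
    }
    return sum;
}
```

A call such as sum_with_trace(values, count, stderr) keeps the trace separate from the program's normal output on stdout. Each new question about the program's state means editing, recompiling and re-running.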
Preferable Debug Method
Use a debugger
monitor an application program in situ
catch memory errors
print statements can't be attached to a running program
graphical debuggers can provide visual aids
Available Debuggers
dbx
gdb
PGI pgdbg
Intel idb
ladebug
TotalView
DDT
Plus many others available
Step 1 - Identify that there is a bug
If an error is severe enough to cause the program to terminate abnormally, the existence of a bug becomes obvious!
If the error is minor and only causes wrong results, it becomes much more difficult to detect that a bug exists.
This is especially true if it is difficult or impossible to verify the results of the program.
Goal: identify the symptoms of the bug.
Knowing under what conditions the problem is detected will greatly help the remaining steps of debugging the problem.
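One way to make wrong results detectable is to compare the program's output against a slow but simple reference on a small input. The sketch below assumes square, row-major matrices. The names matmul_ref and results_match are hypothetical, and the tolerance is for the caller to choose.

```c
#include <stddef.h>

/* Naive reference: C = A * B for n x n row-major matrices. */
void matmul_ref(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < n; k++)
                s += a[i * n + k] * b[k * n + j];
            c[i * n + j] = s;
        }
    }
}

/* Returns 1 if every element of got is within tol of want, else 0.
 * The negated comparison also rejects NaN differences. */
int results_match(size_t n, const double *got, const double *want, double tol)
{
    for (size_t i = 0; i < n * n; i++) {
        double d = got[i] - want[i];
        if (d < 0.0)
            d = -d;
        if (!(d <= tol))
            return 0;
    }
    return 1;
}
```

Running the optimized routine and matmul_ref on the same small matrices, then calling results_match, turns a silent wrong answer into an explicit failure that can be reproduced.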
Example
[nucci@lionxo debug]$ ./dgemm_ex1
Segmentation fault
Steps to follow
recompile to enable debug support
often this option is '-g'; check compiler documentation to be sure!
all modules need to be compiled with this option
re-run application
Steps to follow
want the failure to generate a core dump
by default, core dumps are disabled on HPC machines
re-enable with the command: ulimit -c unlimited
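The same limit can also be raised from inside the program, using the POSIX getrlimit and setrlimit calls. This is only a sketch: the function name enable_core_dumps is hypothetical, and unlike ulimit -c unlimited it only raises the soft limit as far as the existing hard limit.

```c
#include <sys/resource.h>

/* Raises the soft core-file size limit to the current hard limit for
 * this process. Returns 0 on success, -1 on failure (errno is set by
 * the failing call). */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to, but not
     * beyond, the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling this early in main affects the process and any children it starts afterwards. If the administrator has set the hard limit to 0, the call succeeds but core dumps stay disabled.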
Example
[nucci@lionxo debug]$ ./dgemm_ex1
Segmentation fault (core dumped)
Core File
contains the memory image of a particular process, along with other information such as the values of processor registers
very useful debugging tool
name format is: core.PID
Using the Core File
examine its contents with a debugging tool such as gdb
command format is: gdb exe_file core.PID
if the application was compiled with '-g', then odds are good you will be taken directly to the offending source line
Example
[nucci@keuka debug]$ gdb ./dgemm_ex1 core.11469
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
.
.
.
Core was generated by `./dgemm_ex1'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from BLAH BLAH BLAH .
.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
#0 0x0000003d5534d03f in _IO_vfscanf_internal () from /lib64/tls/libc.so.6
(gdb) where
#0 0x0000003d5534d03f in _IO_vfscanf_internal () from /lib64/tls/libc.so.6
#1 0x0000003d55358406 in fscanf () from /lib64/tls/libc.so.6
#2 0x0000000000400c25 in main (argc=1, argv=0x7fbffff648) at /home1/nucci/proj/debug/dgemm_ex1.c:41
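The backtrace places the crash inside fscanf, called from line 41 of dgemm_ex1.c. That line is not shown here, so the actual cause is unknown. Two common ways to crash inside fscanf are passing a NULL FILE pointer after an unchecked fopen, and passing a value where a pointer is expected. The sketch below uses a hypothetical read_dimension helper, with an illustrative file format, to show the defensive pattern for both.

```c
#include <stdio.h>

/* Hypothetical reader (name and file format are illustrative only).
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_dimension(const char *path, int *n)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        /* Report the failure instead of handing NULL to fscanf. */
        perror(path);
        return -1;
    }

    /* n is already a pointer, so it is passed as-is; fscanf returns the
     * number of items it converted. */
    int ok = (fscanf(fp, "%d", n) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

Checking the return values of fopen and fscanf turns a segmentation fault into an error message that names the file involved.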