Debugging Techniques


Debugging: Locating and correcting faults and bugs

Dr. Pedro Mejia Alvarez
CINVESTAV-IPN

What You Won’t Learn

• How to solve all your problems
• The one true way to avoid all bugs
• Many platform-specific debugging techniques

Avoiding bugs in the first place

• Coding style: use a clear, consistent style and useful naming standards.
• Document everything, from architecture and interface specification documents to comments on code lines.
• Hold code reviews.
• Program defensively.
• Use/implement exception handling liberally; think constantly about anomalous conditions.
• Be suspicious of cut/paste.
• Consider using an integrated development environment (IDE) with dynamic syntax checking.

Code reviews

• The primary programmer(s) for a piece of code present and explain that code, line by line.
• The audience consists of programmers experienced in the language and in the code’s general domain. It may also include designers, testers, customers and others less versed in code but concerned with quality and consistency.
• A review is a dialogue: the audience pushes the presenters to reevaluate and justify their implementation decisions.
• Extremely useful: reviews often turn up outright errors, inconsistencies, inefficiencies and unconsidered exceptional conditions. They also help familiarize a project team with a member’s code.

Debugging

• Debugging is a black art. Some things to go over, though, so they’ll be concrete in our brains:
  – relation to testing
  – why debugging is hard
  – types of bugs
  – process
  – techniques
  – tools
  – avoiding bugs

Debugging and testing

• Testing and debugging go together like peas in a pod:
  – Testing finds errors; debugging localizes and repairs them.
  – Together these form the “testing/debugging cycle”: we test, then debug, then repeat.
  – Any debugging should be followed by a reapplication of all relevant tests, particularly regression tests. This reduces the risk of introducing new bugs while debugging.
  – Testing and debugging need not be done by the same people (and often should not be).

Why debugging is hard

• There may be no obvious relationship between the external manifestation(s) of an error and its internal cause(s).
• Symptom and cause may be in remote parts of the program.
• Changes (new features, bug fixes) in the program may mask (or modify) bugs.
• A symptom may be due to a human mistake or misunderstanding that is difficult to trace.
• A bug may be triggered by a rare or difficult-to-reproduce input sequence, program timing (threads) or other external causes.
• A bug may depend on other software/system state, such as things others did to your systems weeks or months ago.

Designing for Debug/Test

• When you write code, think about how you are going to test/debug it; lack of thought always translates into bugs.
• Write test cases when you write your code.
• If something should be true, assert() it.
• Create functions to help visualize your data.
• Design for testing/debugging from the start.
• Test early, test often.
• Test at abstraction boundaries.
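The advice to assert() what should be true and to write routines that visualize your data can be sketched as follows. This is only an illustration: the IntStack type, its capacity and all function names are hypothetical and were chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical fixed-capacity stack used only for illustration. */
#define STACK_CAP 8

typedef struct {
    int    items[STACK_CAP];
    size_t count;              /* number of valid entries in items[] */
} IntStack;

/* "If something should be true, assert() it": the invariant is checked
 * on entry to and exit from every operation. */
static void stack_check(const IntStack *s)
{
    assert(s != NULL);
    assert(s->count <= STACK_CAP);
}

void stack_init(IntStack *s)
{
    assert(s != NULL);
    s->count = 0;
}

/* Returns 1 on success, 0 if the stack is full. */
int stack_push(IntStack *s, int value)
{
    stack_check(s);
    if (s->count == STACK_CAP)
        return 0;
    s->items[s->count++] = value;
    stack_check(s);
    return 1;
}

/* "Create functions to help visualize your data": a dump routine that can
 * be called from test code or from a debugger prompt. */
void stack_dump(const IntStack *s, FILE *out)
{
    stack_check(s);
    fprintf(out, "IntStack(count=%zu):", s->count);
    for (size_t i = 0; i < s->count; i++)
        fprintf(out, " %d", s->items[i]);
    fputc('\n', out);
}
```

A dump routine like stack_dump() is also handy inside a debugger, where it can usually be invoked by hand on a live structure.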

Fault Injection

• Many bugs only happen in the uncommon case.
• Make this case more common by having switches that cause routines to fail.
  – File open, file write and memory allocation are all good candidates.
  – Have “test drivers” which test with the uncommon data. If the case is deeply buried, test with a debugger script.
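A minimal sketch of such a switch for memory allocation is shown below. The wrapper name xmalloc(), the fail_after counter and dup_string() are assumptions made for this example, not part of any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical switch: when > 0, the Nth call to xmalloc() from now fails.
 * 0 means "never inject a failure". */
static int fail_after = 0;

void fault_inject_malloc_after(int n) { fail_after = n; }

/* Wrapper used by the code under test instead of malloc(). */
void *xmalloc(size_t size)
{
    if (fail_after > 0 && --fail_after == 0)
        return NULL;                 /* simulated out-of-memory */
    return malloc(size);
}

/* Code under test: must cope with allocation failure. */
char *dup_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;        /* +1 for the terminating NUL */
    char *copy = xmalloc(n);
    if (copy == NULL)
        return NULL;                 /* the uncommon path we want to test */
    memcpy(copy, s, n);
    return copy;
}
```

A test driver calls fault_inject_malloc_after(1) and then checks that dup_string() returns NULL cleanly instead of crashing. The same pattern applies to wrappers around file open and file write.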

Finding and Fixing Bugs

• In order to create quality software you need to find your bugs:
  – testing
  – user reports
• The best bugs are those that are always reproducible.

Types of bugs

• Types of bugs (gotta love ’em):
  – Compile time: syntax, spelling, static type mismatch.
    • Usually caught by the compiler
  – Design: flawed algorithm.
    • Incorrect outputs
  – Program logic (if/else, loop termination, select case, etc.).
    • Incorrect outputs
  – Memory nonsense: null pointers, array bounds, bad types, leaks.
    • Runtime exceptions
  – Interface errors between modules, threads, programs (in particular, with shared resources: sockets, files, memory, etc.).
    • Runtime exceptions
  – Off-nominal conditions: failure of some part of the software or of the underlying machinery (network, etc.).
    • Incomplete functionality
  – Deadlocks: multiple processes fighting for a resource.
    • Freeze-ups, never-ending processes

The ideal debugging process

• A debugging algorithm for software engineers:
  – Identify test case(s) that reliably show the existence of the fault (when possible).
  – Isolate the problem to small fragment(s) of the program.
  – Correlate the incorrect behavior with a program logic/code error.
  – Change the program (and check for other parts of the program where the same or similar logic may also occur).
  – Regression test to verify that the error has really been removed, without inserting new errors.
  – Update documentation when appropriate.
  (Not all these steps need be done by the same person!)

General Advice

• Try to understand as much of what is happening as possible.
• “It compiles” is NOT the same as “it works”.
• When in doubt, ask. Then test the answer!
• Error messages are generally just a vague hint and can be misleading.
• Don’t always trust the comments/documents; they can be out of date.

What is a Debugger?

“A software tool that is used to detect the source of program or script errors, by performing step-by-step execution of application code and viewing the content of code variables.” -MSDN

What is a Debugger? (cont’d)

• A debugger is not an IDE.
  – Though the two can be integrated, they are separate entities.
• A debugger loads a program (compiled executable, or interpreted source code) and allows the user to trace through its execution.
• Debuggers typically can do disassembly, stack traces, expression watches, and more.

Why use a Debugger?

• No need for precognition of what the error might be.
• Flexible
  – Allows for “live” error checking: no need to rewrite and recompile when you realize a certain type of error may be occurring
  – Dynamic
  – Can view the entire relevant scope

Why people don’t use a Debugger?

• With simple errors, may not want to bother with starting up the debugger environment.
  – Obvious error
  – Simple to check using prints/asserts
• Hard-to-use debugger environment
• Error occurs in optimized code
• Changes execution of program (error doesn’t occur while running debugger)

Ways NOT to Debug

• Guess at what’s causing it
• Don’t try to understand what’s causing it
• Fix the symptom instead of the cause
  – Special case code
• Blame it on someone else’s code
  – Only after extensive testing/proof
• Blame it on the compiler/computer
  – Yes, it happens, but almost never is this the real cause

Debugging techniques, 1

• Execution tracing
  – running the program
  – print
  – trace utilities
  – single stepping in debugger
  – hand simulation

Debugging techniques, 2

• Assertions: include range constraints or other information with data.
• Interface checking
  – check procedure parameter number/type (if not enforced by compiler) and value
  – defensive programming: check inputs/results from other modules
  – documents assumptions about caller/callee relationships in modules, communication protocols, etc.
• Skipping code: comment out suspect code, then check if error remains.

Other Functions of a Debugger

• Disassembly (in context and with live data!)
• Execution tracing/stack tracing
• Symbol watches

Disassembly

• Most basic form of debugging
• Translates machine code into assembly instructions that are more easily understood by the user.
• Typically implementable as a simple lookup table
• No higher-level information (variable names, etc.)
• Relatively easy to implement.

Execution Tracing

• Follows the program through its execution. Users can step through line by line, or use breakpoints.
• Typically allows for “watches” on registers, memory locations and symbols
• Allows for tracing up the stack from a runtime error (back traces)
• Allows the user to trace the causes of unexpected behavior and fix them

Symbol Information

• Problem: a compiler/assembler translates variable names and other symbols into internally consistent memory addresses.
• How does a debugger know which location is denoted by a particular symbol?
• We need a “debug” executable.

Debug vs. Release Builds

• Debug builds usually are not optimized.
• Debug executables contain:
  – the program's symbol tables
  – the location of the source file
  – line number tags for assembly instructions.
• GCC/GDB allows debugging of optimized code.

Bug hunting with print

• Weak form of debugging, but still common
• How bug hunting with print can be made more useful:
  – print variables other than just those you think suspect.
  – print valuable statements (not just “hi\n”).
  – use exit() to concentrate on a part of a program.
  – move print through a program to track down a bug.

Debugging with print (continued)

• Building debugging with print into a program (more common/valuable):
  – print messages, variables/test results in useful places throughout program.
  – use a ‘debug’ or ‘debug_level’ global flag to turn debugging messages on or off, or change “levels”
  – possibly use a source file preprocessor (#ifdef) to insert/remove debug statements.
  – Often part of “regression testing” so automated scripts can test output of many things at once.
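One way to combine a debug_level flag with the preprocessor is sketched below. The DBG macro, the meaning of the levels and the sum_array() routine are assumptions made for the example; the choice of NDEBUG as the compile-time switch is also just one option.

```c
#include <assert.h>
#include <stdio.h>

/* Global verbosity switch (name taken from the slide; the level meanings
 * are an assumption): 0 = silent, 1 = summary, 2 = detailed trace. */
int debug_level = 0;

#ifdef NDEBUG
/* Release build: the statements vanish at preprocessing time. */
#define DBG(level, ...) ((void)0)
#else
/* Debug build: print "file:line: message" to stderr when enabled. */
#define DBG(level, ...)                                        \
    do {                                                       \
        if (debug_level >= (level)) {                          \
            fprintf(stderr, "%s:%d: ", __FILE__, __LINE__);    \
            fprintf(stderr, __VA_ARGS__);                      \
            fputc('\n', stderr);                               \
        }                                                      \
    } while (0)
#endif

/* Example routine instrumented with leveled messages. */
long sum_array(const int *values, int count)
{
    assert(values != NULL && count >= 0);
    long total = 0;
    DBG(1, "sum_array: count=%d", count);
    for (int i = 0; i < count; i++) {
        total += values[i];
        DBG(2, "  i=%d value=%d running total=%ld", i, values[i], total);
    }
    DBG(1, "sum_array: result=%ld", total);
    return total;
}
```

Setting debug_level from a command-line option or environment variable lets an automated regression script turn the trace on without recompiling.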

Finding Reproducible Bugs

• A bug happens when there is a mismatch between what you (someone) think is happening and what is actually happening.
• Confirm things you believe are true.
• Narrow down the causes one by one.
• Make sure you understand your program state.
• Keep a log of events and assumptions.
• Try explaining what should be happening. Verbalization/writing often clarifies muddled thoughts.
• Have a friend do a quick sanity check.
• Don’t randomly change things; your actions should have a purpose.
  – If you are not willing to check it into CVS with a log that your boss may read, then you are not ready to make that change to the code.
  – Think it through first, both locally and globally.

(semi) irreproducible bugs

• Sometimes undesired behavior only happens sporadically.
• Tracking down these heisenbugs is hard.
• The error could be at any level:
  – circuits (e.g. bad RAM chip at high memory address)
  – compiler
  – OS
  – linker
  – irreproducible external “data” and timing

Finding HeisenBugs

• Use good tools like Purify. The most common “heisenbugs” are memory or thread related.
• Try to make the bug reproducible by switching platforms/libraries/compilers.
• Insert checks for invariants and have the program stop everything when one is violated.
• Verify each layer with small, simple tests.
• Find the smallest system which demonstrates the bug.
• Test with “canned data”, replayed over the net if needed.

Timing and Threading Bugs

• Ensure the functionality works for a single thread.
• If adding a printf() removes the bug, it is almost certainly a timing/threading bug or a trashed memory bug.
• Try using coarse-grained locking to narrow down the objects involved.
• Try keeping an event (transaction) log.
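An event log that disturbs timing less than printf() can be kept in memory and dumped afterwards. The sketch below assumes C11 atomics are available; the Event layout, the buffer size and the evlog_* names are hypothetical. Slots are overwritten once the buffer wraps, so it keeps only the most recent events.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

#define EVLOG_SIZE 1024   /* hypothetical capacity */

typedef struct {
    unsigned long seq;     /* global order in which the event was logged */
    int           thread;  /* caller-supplied thread identifier */
    const char   *what;    /* static string describing the event */
    long          arg;     /* one event-specific value */
} Event;

static Event        evlog[EVLOG_SIZE];
static atomic_ulong evlog_next;      /* zero-initialized: static storage */

/* Record one event. Only an atomic increment and a few stores are done
 * here; formatting is deferred to evlog_dump(). */
void evlog_add(int thread, const char *what, long arg)
{
    assert(what != NULL);
    unsigned long seq = atomic_fetch_add(&evlog_next, 1);
    Event *e = &evlog[seq % EVLOG_SIZE];
    e->seq = seq;
    e->thread = thread;
    e->what = what;
    e->arg = arg;
}

/* Number of events logged so far (including overwritten ones). */
unsigned long evlog_count(void) { return atomic_load(&evlog_next); }

/* Print the surviving events, oldest first. Call after the threads have
 * stopped, e.g. from a debugger or a failure handler. */
void evlog_dump(FILE *out)
{
    unsigned long end = evlog_count();
    unsigned long start = end > EVLOG_SIZE ? end - EVLOG_SIZE : 0;
    for (unsigned long s = start; s < end; s++) {
        const Event *e = &evlog[s % EVLOG_SIZE];
        fprintf(out, "#%lu thread=%d %s arg=%ld\n",
                e->seq, e->thread, e->what, e->arg);
    }
}
```

Typical calls would be evlog_add(tid, "lock", id) and evlog_add(tid, "unlock", id) around the suspect operations, followed by evlog_dump(stderr) once the failure has been detected.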

Memory Bugs and Buffer Overflow

• Trashing the stack/heap often causes difficult-to-find bugs.
• The manifestation can be far from the actual bug.
  – “Free list” information is generally stored just after a “malloced” chunk of data. Overwriting it may not cause a problem until the data is “freed”, or until something else does a malloc after the free.
  – Stack variables: overwriting past the end changes other variables, sometimes the return address (buffer overflow).
  – Bad “string” ops are notorious; using input data can also be problematic.

An example….

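#include <stdio.h>
#include <stdlib.h>

/* Writes length consecutive values through vect, starting at startvalue.
   Performs no bounds checking on vect. */
void myinit(int startindex, int startvalue, int length, int *vect) {
    int i;
    for (i = startindex; i < startindex + length; i++)
        *vect++ = startvalue++;
}

/* Never called directly; reachable only if the stack gets trashed. */
void whattheheck() {
    printf("How did I ever get here????\n");
    exit(2);
}

int main(int argc, char **argv) {
    float d;
    int a, b[10], c, i, start, end;
    if (argc != 3) {
        printf("Usage:%s start, end\n", argv[0]);
        exit(-1);
    }
    start = atoi(argv[1]);
    end = atoi(argv[2]);
    a = 0; c = 0; d = 3.14159; /* bad style but shorter */
    printf("Initially a %d, c %d, d %f, start %d, end %d\n", a, c, d, start, end);
    /* Deliberate bug: writes end ints starting at b[start], which runs
       past b[9] whenever start + end > 10. */
    myinit(start, start, end, b + start);
    printf("finally a %d, c %d, d %f start %d, end %d \n", a, c, d, start, end);
    /* Deliberate bug: stores a hard-coded value past the end of b. */
    if (end > 10)
        b[end - 1] = 134513723;
    return 0;
}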
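For contrast, a bounds-checked variant of myinit() might look like the sketch below. The name myinit_checked() and its parameter order are assumptions; the point is only that the callee is told the array capacity and can refuse a request that would run past the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical safer variant of myinit(): the caller passes the array base
 * and its capacity, so the routine can reject an out-of-range request
 * instead of silently overwriting neighbouring stack variables.
 * Returns 0 on success, -1 if the range does not fit. */
int myinit_checked(int *vect, size_t capacity,
                   size_t startindex, size_t length, int startvalue)
{
    assert(vect != NULL);
    if (startindex > capacity || length > capacity - startindex)
        return -1;                   /* would write past vect[capacity-1] */
    for (size_t i = 0; i < length; i++)
        vect[startindex + i] = startvalue + (int)i;
    return 0;
}
```

The range test is written as length > capacity - startindex rather than startindex + length > capacity so that the check itself cannot overflow.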

An Approach to Debugging

1. Stabilize the error
2. Locate the source
3. Fix the defect
4. Test the fix
5. Look for similar errors

• Goal: Figure out completely why it occurs and fix it

1. Stabilize the Error

• Find a simple test case to reliably produce the error
  – Narrow it to as simple a case as possible
• Some errors resist this
  – Failure to initialize
  – Pointer problems
  – Timing issues
• Converge on the actual (limited) error
  – Bad: “It crashes when I enter data”
  – Better: “It crashes when I enter data in non-sorted order”
  – Best: “It crashes when I enter something that needs to be first in sorted order”
• Create a hypothesis for the cause
  – Then test the hypothesis to see if it’s accurate

2. Locate the Source

• This is where good code design helps
• Again, hypothesize where things are going wrong in the code itself
  – Then, test to see if there are errors coming in there
  – Simple test cases make it easier to check

When it’s Tough to Find Source

• Create multiple test cases that cause the same error
  – But from different “directions”
• Refine existing test cases to simpler ones
• Try to find a source that encompasses all errors
  – Could be multiple ones, but less likely
• Brainstorm for sources, and keep a list to check
• Talk to others
• Take a break

Finding Error Locations

• Process of elimination
  – Identify cases that work/failed hypotheses
  – Narrow the regions of code you need to check
  – Use unit tests to verify smaller sections
• Process of expansion:
  – Be suspicious of:
    • areas that previously had errors
    • code that changed recently
  – Expand from suspicious areas of code

Alternative to Finding Specific Source

• Brute Force Debugging
  – “Guaranteed” to find bug
  – Examples:
    • Rewrite code from scratch
    • Automated test suite
    • Full design/code review
    • Fully output step-by-step status
• Don’t spend more time trying to do a “quick” debug than it would take to brute-force it.

3. Fix the Defect

• Make sure you understand the problem
  – Don’t fix only the symptom (e.g. no magic “subtract one here” fixes)
• Understand what’s happening in the program, not just the place the error occurred
  – Understand interactions and dependencies
• Save the original code
  – Be able to “back out” of change

Fixing the Code

• Change only code that you have a good reason to change
  – Don’t just try things till they work
• Make one change at a time

4. Check Your Fix

• After making the change, check that it works on test cases that caused errors
• Then, make sure it still works on other cases
  – Regression test
  – Add the error case to the test suite

5. Look for Similar Errors

• There’s a good chance similar errors occurred in other parts of the program
• Before moving on, think about the rest of the program
  – Similar routines, functions, copied code
  – Fix those areas immediately

Debugging: Finding and Fixing Errors

Material for this lecture has been taken from Code Complete by Steve McConnell


What is Debugging?

• Debugging is the process of identifying the cause of an error and correcting it.
• Debugging is not a way to improve software quality.
• You should view debugging as a last resort: you should be trying to develop programming habits that greatly reduce the need to find errors.
• If you spend the time observing and analyzing the errors that you do make, then maybe you can discover ways to avoid making those errors in the future.

Errors as Opportunities

• Learn about the program you're working on.
  – Is the origin of the defect in the requirements, specifications, design or implementation?
• Learn about the kind of mistakes you make.
  – Can one debugging experience help to eliminate future defects of a similar nature?
• Learn about the quality of your code from the point of view of someone who has to read it.
  – The ability to read programs is not a well developed skill in most programmers.
  – The result of this inability is also the inability to write readable code.
• Learn about how you solve problems.
  – Taking the time to observe and analyze how you debug can decrease the total amount of time that it takes you the next time you develop a program.
• Learn about how you fix errors.
  – You should strive to make systematic corrections.
  – This demands an accurate diagnosis of the problem and an appropriate prescription that attacks the root cause of the defect.

The Devil's Guide to Debugging

• Find the error by guessing.
  – Scatter print statements randomly throughout the code.
  – If the print statements do not reveal the error, start making changes until something appears to work.
  – Do not save the original version of the code and do not keep a record of the changes that have been made.
• Debugging by superstition
  – Why blame yourself when you can blame the computer, the operating system, the compiler, the data, other programmers (especially the ones who write library routines!), and best of all, the stupid users!
• Don't waste time trying to understand the problem.
  – Why spend an hour analyzing the problem (in your head and on paper) and evaluating possible solutions or methodologies when you can spend days trying to debug your code?
• Fix the error with the most obvious fix.
  – For example, why try to understand why a particular case is not handled by a supposedly general subroutine when you can make a quick fix?

The Scientific Method of Debugging

• Gather data through repeatable experiments.
  – Stabilize the error, narrowing possible explanations.
• Form a hypothesis that accounts for as much of the relevant data as possible.
  – Locate what seems to be the source of the error.
• Design another experiment to prove or disprove the hypothesis.
  – Fix the error according to the hypothesis, then test the fix.
• Repeat as needed.
  – Look for similar errors.

Tips for Finding Errors

• Use all the data available to make your hypothesis.
• Refine the test cases that produce the error.
  – Reproducing the error several ways helps to diagnose the cause of the error.
  – Errors often arise from a combination of factors; one test case is often not enough to find the root of the problem.
• Generate more data to generate more hypotheses.
  – Run more test cases to help in hypothesis development and refinement.
• Use the results of negative tests.
  – Negative results can eliminate parts of your search space.
  – You have gained more knowledge about the program and the problem.
• Brainstorm for possible hypotheses.
  – Do not limit yourself to just one hypothesis; generate several.
  – Do not limit the problem solvers to just yourself; other people can bring a new perspective and experience to your problem.
  – Concentration on a single line of reasoning can result in a mental logjam.
• Narrow the suspicious region of code.
  – Systematically eliminate parts of your program to isolate the part that contains the defect.
  – Yet another argument for modularization!
• Be suspicious of modules that have had errors before.
  – Re-examine error-prone modules.
• Check code that has recently been changed.
  – Compare the old and new versions of your code.
• Expand the suspicious region of the code.
  – Do not focus on a narrow piece of the system, even if you are sure that the error must be in that particular section.
• Integrate incrementally.
  – Add pieces to the system, one at a time.
  – After each addition, test the system to detect errors.
  – If there are errors, you know the area to focus on!
• Set a maximum time limit for quick and dirty debugging.
  – If you cannot find the error within your time limit, then it has to be admitted that the error is a hard one and a quick and dirty fix is probably not appropriate.
• Check for common errors.
  – Use code-quality checklists to stimulate your thinking about possible errors.
• Talk to someone else about the problem.
  – Confessional debugging can help even by just forcing you to organize your thoughts about the problem in a way that someone else can understand. This process may in fact be enough for you to arrive at the solution yourself without any input from others.
• Take a break from the problem.
  – The subconscious can be a great problem solver and, of course, food, sleep and sunlight are necessary for sustaining life functions.
  – If you have debugging anxiety, stop and take a break.

Syntax Errors

• Do not trust the line numbers in compiler messages.
• Never completely trust any software tool; always understand the tool. The more you know about software tool x, the more use it can be.
• Do not trust compiler messages, particularly second messages.
• Divide and conquer.
  – Separate out parts of the code and run them through the compiler to isolate the syntax error and/or get a more reasonable error message from the compiler.
• Study and understand the syntax errors peculiar to the programming language that you are using.

Fixing an Error

• Understand the problem before you fix it.
• Understand the program, not just the problem.
  – A study done with short programs found that programmers who achieve a global understanding of program behaviour have a better chance of modifying it successfully than programmers who focus on local behaviour, learning about the program only as they need to.
  – A large program may not be understood in total, but the code in the vicinity (a few hundred lines) of the error should be understood.
• Confirm the error diagnosis.
• Relax. Never debug standing up.
• Save the original code.
  – It is easy to forget which change in a group of changes is the most significant.
  – It is always useful to be able to compare the old and new versions of code to verify all changes.
• Remember that defect corrections have more than a 50 percent chance of being wrong the first time.
• Fix the problem, not the symptom.
  – If you do not understand the problem, then you are not fixing the code; you are fixing the symptom and could be making the code worse.
  – Problems with this approach include:
    • The fixes will not work most of the time.
    • The system will become unmaintainable.
      – Too many special cases become unmanageable.
    • It is a misuse of computers: understanding the problem involves the programmer, not the computer.
• Change the code only for good reasons.
• Make one change at a time.
• Check your fix: do regression testing.
• Look for similar errors.
  – If you cannot figure out how to look for similar errors, then you probably do not completely understand the problem.

Bug Identification & Elimination

1. Bug reports should contain a test case, output, and the version number of the software.
2. Reproduce the bug using the same version the customer used.
3. Find the root cause of the bug.
4. Check if the bug still occurs with the latest version. If it does, fix it.
5. If it doesn’t, make sure it is not just masked by other changes to the software.
6. Add test cases used to reproduce the bug to the regression test suite.
7. Keep Records!

Debugging Techniques

• Methodology is key.
• Knowing about lots of debugging tools helps.
• The most critical tool in your arsenal is your brain.
• The second most important tool is a debugger.
  – Core dumps are your friends. Learn how to use them.
• Memory debugging tools are third most important.
• A profiler is fourth.

Debugging Pointer and Dynamic Data Structure Problems

• Pointers and explicitly allocated dynamic data structures are a central feature in many popular procedural and object-oriented languages
  – Great power, especially in extreme cases (e.g. C/C++)
  – Can be very painful to debug

Common Pointer Problems

• Pointer to bogus memory
• Corrupt data structure segments
• Data sharing errors
• Accessing data elements of the wrong type
• Attempting to use memory areas after freeing them

Pointers to Bogus Memory

• Uninitialized pointers
• Failing to check memory allocation errors
• Using stomped pointers corrupted by previous memory operations
• Reminder: Bogus memory access does not necessarily trigger a memory protection fault
• Remedy: Add data type info to dynamic data structures
• Special case: Indices above/below array space
  – Remedy: index checks

Corrupt Data Structure Segments

• Incorrect adds/deletes in trees/lists/etc.
• Stomped pointer values from previous memory operations
• Remedy 1: Add type info to dynamic data structures
• Remedy 2: Create routines to check integrity of data structures
• Remedy 3: Flag deleted memory areas
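The three remedies can be combined in one small structure, as in the sketch below for a doubly linked list. The magic constants, the List and Node types and the list_* names are hypothetical. Writing to a node just before freeing it only raises the chance of catching a stale pointer; it is not a guarantee.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tag values; any distinctive constants would do. */
#define NODE_MAGIC 0x4E4F4445u   /* "NODE": marks a live list node */
#define NODE_DEAD  0xDEADBEEFu   /* written just before a node is freed */

typedef struct Node {
    unsigned     magic;           /* Remedy 1: type info in the structure */
    int          value;
    struct Node *prev, *next;
} Node;

typedef struct { Node *head, *tail; size_t count; } List;

/* Append a value; returns the new node, or NULL if allocation fails. */
Node *list_append(List *l, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->magic = NODE_MAGIC;
    n->value = value;
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail) l->tail->next = n; else l->head = n;
    l->tail = n;
    l->count++;
    return n;
}

/* Unlink and free a node. Remedy 3: flag the memory as deleted first. */
void list_remove(List *l, Node *n)
{
    assert(n != NULL && n->magic == NODE_MAGIC);
    if (n->prev) n->prev->next = n->next; else l->head = n->next;
    if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
    l->count--;
    n->magic = NODE_DEAD;
    free(n);
}

/* Remedy 2: walk the structure and verify every invariant.
 * Returns 1 if the list is consistent, 0 otherwise. */
int list_check(const List *l)
{
    size_t seen = 0;
    const Node *prev = NULL;
    for (const Node *n = l->head; n != NULL; prev = n, n = n->next) {
        if (n->magic != NODE_MAGIC) return 0;   /* wrong type or freed */
        if (n->prev != prev)        return 0;   /* broken back link */
        if (++seen > l->count)      return 0;   /* cycle or bad count */
    }
    return seen == l->count && l->tail == prev;
}
```

Calling assert(list_check(&list)) after every mutating operation in a debug build narrows the window in which corruption can go unnoticed.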

Data Sharing Errors

• Logically separate program entities often share data
• Problem 1: Bogus pointer handoff
• Problem 2: Incorrect data format assumptions
• Problem 3: Multiple ownership issues
• Remedy 1: Type info in dynamic data
• Remedy 2: Owner count in memory areas
• Remedy 3: Flag deleted data structures
• Remedy 4: Think through synchronization problems in the design stage

Accessing Elements of Wrong Type

• Access data element of type x, but think you are accessing one of type y
• Can be a source of frequent headaches depending on application/implementation
• Remedy: Include type info in memory allocations

Accessing Data After Freeing It

• Can be a source of many headaches
• Big Brother Problem: Accessing a data structure after adding it to a “free” list for quick future reuse
• Remedy 1: Include a freed flag in memory (not a guaranteed solution)
• Remedy 2: Create a list of “freed” memory, but do not deallocate it. Check the list when dereferencing pointers (very expensive in both time and space)
• Remedy 3: Remedy 1 plus a use counter (also not a guaranteed solution)

Final Pointer Comments

• Pointers are powerful, but are often a major source of program errors
• Adding extra state and data structure walk routines can be a big help in debugging (degrades performance/increases memory footprint, but can be removed in release)

Debugging Multitasking Programs

• Multiple-process/multi-threaded code is ubiquitous in modern programs
• Many debuggers will work with these programs, but it is not always elegant or easy.
• Fallback method: Put new processes to sleep and then attach a debugger to them before they awake.
• Better solution: Read the debugger documentation, and find a better debugger if it is weak in this area.
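The fallback method of parking a new process until a debugger attaches can be sketched as follows for a POSIX system. The environment variable WAIT_FOR_DEBUGGER, the flag debugger_wait and the function name are all assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* POSIX: getpid(), sleep() */

/* A debugger clears this flag to release the process, for example in GDB:
 *   (gdb) set var debugger_wait = 0
 * volatile forces the loop to re-read it from memory on every pass. */
volatile int debugger_wait = 1;

/* Call early in a newly created process. If the (hypothetical) environment
 * variable WAIT_FOR_DEBUGGER is set, print the PID and sleep until a
 * debugger attaches and clears debugger_wait. Returns 1 if it waited,
 * 0 if the variable was not set. */
int wait_for_debugger_if_requested(void)
{
    if (getenv("WAIT_FOR_DEBUGGER") == NULL)
        return 0;
    fprintf(stderr, "pid %ld waiting for debugger\n", (long)getpid());
    while (debugger_wait)
        sleep(1);
    assert(!debugger_wait);   /* only reached once a debugger released us */
    return 1;
}
```

With GDB, for instance, one would attach with gdb -p followed by the printed PID, clear the flag, and then continue or single-step from inside the new process.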

A Few Tips

• Pointers and multithreading together can be extremely difficult to debug
• Analogous strategies to those used in pointer debugging can be a big help
• Try to debug parts by themselves before tackling the combined system
• Thread/process timing is an important concern in the debugging process

Core Dumps

• (Unix) If you run your code outside of the debugger and there is a fault, a core file may be generated (depending on your system settings) in which the current program state is stored.
• You can debug your code post-mortem via: gdb executable-file core-file

Debug Prompts

• Windows does not use core files.
• If you run your code outside of a debugger and a problem occurs, you will be given the option of either debugging the code or killing the executing process.

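Abort Signal (Unix)

• You can use the signal that aborted the program to help determine the cause of your problem
• SIGSEGV: Likely a dereference of a NULL or bogus pointer, an invalid write to code space, or a bad branch to data space
• SIGBUS: Likely a misaligned access or an access to memory that cannot be physically addressed (some systems also report a NULL pointer dereference this way)
• SIGFPE: Arithmetic error, such as integer division by zero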
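The mapping from signal to likely cause can be kept in code, for instance to print a hint from a crash reporter. The sketch below is an assumption-laden illustration: the hint strings are only first guesses, and signal_hint(), record_signal() and install_recorder() are hypothetical names. The handler only records the signal number, since very little else is safe to do inside a signal handler.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Map a fatal signal number to a first-guess diagnosis. The wording of the
 * hints is illustrative; the real cause always needs confirming. */
const char *signal_hint(int sig)
{
    switch (sig) {
    case SIGSEGV: return "invalid memory reference (NULL or bogus pointer?)";
    case SIGFPE:  return "arithmetic error (integer division by zero?)";
    case SIGABRT: return "abort() called (failed assert()?)";
#ifdef SIGBUS   /* POSIX, not ISO C */
    case SIGBUS:  return "bus error (misaligned access?)";
#endif
    default:      return "no hint for this signal";
    }
}

/* Last signal seen by record_signal(); sig_atomic_t is the only object
 * type that is safe to write from a handler. */
static volatile sig_atomic_t last_signal = 0;

static void record_signal(int sig) { last_signal = sig; }

/* Install the recording handler for one signal. Returns 0 on success. */
int install_recorder(int sig)
{
    assert(sig > 0);
    return signal(sig, record_signal) == SIG_ERR ? -1 : 0;
}
```

Returning normally from a handler is only well defined when the signal was sent with raise() or kill(); after a real hardware fault the program should report and exit rather than continue.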

Blame the Compiler

• Sometimes software crashes in optimized code but not in debug code
• The tendency is to blame the compiler and de-optimize the file or function where the bug occurred
• Most often the problem is in the code and is just exposed by the optimizer, typically an uninitialized variable
• Of course, sometimes it really is an optimizer bug. In that case, please submit a bug report to the compiler vendor with a nice short test program

Memory Leaks

• Memory bugs
  – Memory corruption: dangling refs, buffer overflows
  – Memory leaks
    • Lost objects: unreachable but not freed
    • Useless objects: reachable but not used again
• Managed languages
  – 80% of new software in Java or C# by 2010 [Gartner] (personally TB does not believe it..)
  – Type safety & GC eliminate many bugs
  – Leak-detection tools for managed languages: Cork, JRockit, JProbe, LeakBot, .NET Memory Profiler
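In an unmanaged language, a crude first check for lost objects is to count live allocations. The wrappers below are a hypothetical sketch (the dbg_* names are invented for the example). They report only how many blocks are outstanding, not where they were allocated, and they cannot detect useless objects that are still reachable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Number of blocks handed out by dbg_malloc() and not yet released. */
static long live_blocks = 0;

/* Hypothetical wrappers; the program calls these instead of malloc/free. */
void *dbg_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

void dbg_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);   /* more frees than allocations is a bug */
        live_blocks--;
    }
    free(p);
}

long dbg_live_blocks(void) { return live_blocks; }

/* Report at a point where everything should have been released, e.g. at
 * exit. Returns the number of leaked blocks. */
long dbg_report_leaks(FILE *out)
{
    if (live_blocks != 0)
        fprintf(out, "leak check: %ld block(s) still allocated\n",
                live_blocks);
    return live_blocks;
}
```

Recording the file and line of each allocation alongside the count would be the natural next step toward locating the leak.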

War Stories

Nasty Problems

• Overwriting the return address on the stack
• Overwriting virtual function tables
• Compiler bugs: rare, but my students and I have a talent for finding them
  – try -O, -O2, -Os
• Wrong version of a shared library gets loaded
  – make the runtime linker be verbose
• Difference in MS “debug/development” stacks vs performance/runtime stacks
• OS/library bugs
  – create small examples
  – verify preconditions
  – check that postconditions fail
• Static initialization problems with C++
• Processor bugs

some fun (simple) bugs

– In java (or whatever):

    public void foo(int p, int q) {
        int x = p;
        int y = p;
    }

– In perl:

    $foo = $cgi->param('foo');
    if (!$foo) {
        webDie("missing parameter foo!");
    }

– In C:

    char *glue(char *left, char sep, char *right) {
        char *buf = malloc(sizeof(char) * (strlen(left) + 1 + strlen(right)));
        sprintf(buf, "%s%c%s", left, sep, right);
        return buf;
    }
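One reading of the C puzzle is that the buffer leaves no room for the terminating NUL, and that the result of malloc() is never checked. Under that assumption, a corrected version might look like the sketch below; the name glue_fixed() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Joins left and right with sep between them. Returns a newly allocated
 * string the caller must free(), or NULL if allocation fails. */
char *glue_fixed(const char *left, char sep, const char *right)
{
    assert(left != NULL && right != NULL);
    /* left + one separator + right + terminating NUL */
    size_t size = strlen(left) + 1 + strlen(right) + 1;
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;
    /* snprintf never writes more than size bytes, so a miscounted size
     * would truncate the result instead of overrunning the heap block. */
    snprintf(buf, size, "%s%c%s", left, sep, right);
    return buf;
}
```

The original version may appear to work for a long time, because a one-byte heap overrun often lands in allocator padding. That is what makes this class of bug hard to spot.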

Tricks

• Write a custom assert()
• Stub routines on which to break
• Dump to a file instead of standard out
• Make rand() deterministic by controlling the seed
• Fight memory corruption with tools that use sentinels, etc. (and if no tools, do it yourself)
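Three of these tricks fit in a few lines, sketched below. MY_ASSERT, my_assert_failed() and fill_random() are hypothetical names. This custom assert reports and continues rather than aborting, which is a design choice for the example and not a requirement.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stub routine on which to set a breakpoint ("break my_assert_failed" in
 * GDB). Counting failures also lets tests observe that a check fired. */
static int assert_failures = 0;

void my_assert_failed(const char *expr, const char *file, int line)
{
    assert_failures++;
    fprintf(stderr, "%s:%d: MY_ASSERT(%s) failed\n", file, line, expr);
}

/* Custom assert: reports and continues instead of aborting, so several
 * violations can be collected in one run. */
#define MY_ASSERT(cond) \
    ((cond) ? (void)0 : my_assert_failed(#cond, __FILE__, __LINE__))

/* Fill out[0..n-1] from rand() after seeding it: the same seed gives the
 * same sequence within one C library, which makes "random" failures
 * replayable. */
void fill_random(unsigned seed, int *out, int n)
{
    assert(out != NULL && n >= 0);
    srand(seed);
    for (int i = 0; i < n; i++)
        out[i] = rand() % 100;
}
```

Logging the seed at program start is what makes a failing random run reproducible later.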

The Future of Debugging

• Better debuggers and programs to help you visualize your program's state
• Simple model checkers
• Programs keep getting bigger; finding bugs is going to get harder!
• Parallel/distributed debuggers as we move to more parallel/distributed systems.