Transcript
Slide 1
Presenter: Tai-Feng, Chen
April, 2012
2015/7/17

Slide 2
It is possible to debug programs by inserting code that prints the values of selected variables. Indeed, in some situations, such as debugging kernel drivers, this may be the preferred method. There are low-level debuggers that let you step through the executable program instruction by instruction, displaying registers and memory contents in binary. But it is much easier to use a source-level debugger, which lets you step through a program's source, set breakpoints, print variable values, and perhaps use a few other features, such as calling a function in your program from within the debugger. The problem is how to coordinate two completely different programs, the compiler and the debugger, so that the program can be debugged.

Slide 3
How do the compiler and the debugger coordinate so that the program can be debugged?
What information is needed when source code is translated into an executable file?
What does the debugging process normally require, and which extra features are useful?
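
The print-based approach described on slide 2 can be sketched as follows. This is only an illustration in Python, not code from the talk: the `debug` helper and the `average` function are hypothetical names chosen for the example.

```python
import sys


def debug(label, value):
    """Hypothetical helper: print a labelled value to stderr and pass it through."""
    print(f"[debug] {label} = {value!r}", file=sys.stderr)
    return value


def average(values):
    """Compute a mean, printing the intermediate variables of interest."""
    total = sum(values)
    debug("total", total)
    count = len(values)
    debug("count", count)
    return total / count


result = average([2, 4, 9])
```

Every value to be inspected needs its own inserted line and a re-run of the program, which is the cost a source-level debugger avoids.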

Slide 4
COFF
  Stands for Common Object File Format
  But not so "common"
PE-COFF
  For Microsoft Windows, from Windows 95 onward
  Documentation is too sketchy and hard to obtain
OMF
  Stands for Object Module Format
  Only rudimentary support for debuggers
IEEE-695
  The original standard is easy to obtain, but the extensions are poorly documented, and the IEEE never revised the standard to include them

Slide 5
DWARF: Debugging With Arbitrary Record Formats
Block structured
  DIE (Debugging Information Entry)
  Parent DIEs, child DIEs and sibling DIEs (like a tree)
Extensible in a uniform fashion
  A debugger can recognize and ignore an extension even if it does not understand its meaning
Associated with the ELF object file format
  But independent of it
  Can be, and already has been, used with other formats

Slide 6
A DIE consists of a tag and attributes
  Tag: specifies what the DIE describes
  Attribute: fills in details and further describes the entity
Contained in or owned by a parent DIE
May have sibling DIEs and/or child DIEs
Attributes may contain:
  Constants (e.g. a function name)
  Variables (e.g. the start address of a function)
  A reference to another DIE (e.g. the type of a return value)

Slide 7
Simple program example: Hello.c
Topmost DIE: the compilation unit
Children of the topmost DIE:
  DIE: Subprogram -> main
  DIE: Base Type -> int (the return value of main)

Slide 8
DIEs generally split into two types:
Describing data, including data types
  Base Types
  Type Composition
  Array
  Structures, Classes, Unions and Interfaces
  Variables
  Location Expressions
Describing functions and other executable code
  Functions and Subprograms
  Compilation Unit
  Data Encoding

Slide 9
Base Types
Basic types like int or double in C and Java
An int can be 16, 32 or even 64 bits
  This makes compatibility between compiler and debugger difficult
Thus, DWARF provides the lowest-level mapping between simple data types and how they are implemented on the target machine's hardware

Slide 10
Base type example (figure annotations):
  The DWARF tag
  Attribute: the name is "word"
  The size is 4 bytes (32 bits)
  The value occupies 16 bits of it
  Starting at offset position 0
  Encoded as a signed number

Slide 11
Type Composition
Uses the base types to construct other data type definitions by composition
Example (figure annotations):
  Tag for variables
  Name of the variable
  Tag for pointers

Slide 12
Array
Column-major order (e.g. Fortran)
Row-major order (e.g. C / C++)
DW_TAG_subrange_type for index control
  Zero as the lowest index (C / C++)
  Any value for the low and high bounds (Pascal or Ada)

Slide 13
Structures, Classes, Unions and Interfaces
Grouped data in a structure or class
  struct in C / C++, class in C++ and record in Pascal
A union is different from a structure
In DWARF, taking a class as an example:
  The class DIE is the parent of the DIEs that describe the class members
  Each member looks much like a simple variable, with similar attributes
  Additional attributes such as private, public, etc.

Slide 14
Variables
A variable has a name and represents a chunk of memory or a register that can contain some kind of value
The kind of value can add restrictions (e.g. const)
Variables are distinguished by scope
  Declared inside a function => function scope
  Declared outside any function => global or file scope
  This allows variables with the same name without conflict
DWARF documents them with triplets of (file, line, column)

Slide 15
Example walkthrough:
  The function name comes first (as a subprogram DIE)
  Then variable b, whose type int is declared in DIE <4>, with location = reg 0
  Then variable c, also of type int, at an offset of -12 from the function's stack frame
  Finally variable a, which lives at a fixed memory location that the linker assigns later; DW_AT_external gives its scope

Slide 16
Functions and Subprograms
Terminology:
  Functions: DO return a value
  Subroutines: DO NOT return a value
DWARF treats them as the same thing: both are described by a Subprogram DIE
A Subprogram DIE contains:
  Name
  Source location triplet
  External attribute: is the subprogram visible outside the current compilation?
  Low / high memory address, or a list of memory ranges
    The low PC is assumed to be the entry point
  Return value type, return address, frame base, etc.

Slide 17
DIE <6> to <9>: parameters of the subprogram and local variables in the subprogram

Slide 18
Compilation Unit
Definition: each separately compiled source file
The Compilation Unit DIE contains:
  Directory where the source file is located
  Name of the source file
  Programming language used
  Producer of the generated DWARF data
  Offsets into the DWARF data section
  Low / high memory address
The Compilation Unit DIE is the parent of all DIEs that describe the compilation unit

Slide 19
Data Encoding
Without compression, DWARF is unwieldy
Solution 1: flatten the tree structure of DIEs
Solution 2: use abbreviations

Slide 20
DWARF is a straightforward way to describe a program
  A tree with nodes representing the various functions, data and types in a language- and machine-independent fashion
  It nevertheless has to express many different nuances for a wide range of programming languages and machine architectures
Future directions:
  Improve the description of optimized code

Slide 21
Brief, but clear enough to explain how DWARF works
Inspired me to think about applying the DWARF format to hardware description (e.g. HDL) and about approaches to machine independence
Refreshed my memory of OS and compiler concepts
Did not mention multi-threaded debugging
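
As a recap, the DIE tree for the Hello.c example earlier in the talk (a compilation unit whose children are a subprogram DIE for main and a base type DIE for int) can be modelled with a small sketch. This is an illustrative in-memory model only, not a DWARF parser: the `DIE` class, the `show` and `dump` helpers, and the 4-byte size of int are assumptions made for the example, while the tag and attribute strings use the standard DWARF names.

```python
from dataclasses import dataclass, field


@dataclass
class DIE:
    """Illustrative Debugging Information Entry: a tag, attributes, child DIEs."""
    tag: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child DIE and return it for convenience."""
        self.children.append(child)
        return child


def show(value):
    """Render an attribute value; a reference to another DIE is shown by its tag."""
    if isinstance(value, DIE):
        return f"<ref to {value.tag}>"
    return repr(value)


def dump(die, depth=0):
    """Return the tree as indented text lines, one line per DIE."""
    attrs = ", ".join(f"{k}={show(v)}" for k, v in die.attributes.items())
    lines = [f"{'  ' * depth}{die.tag} ({attrs})"]
    for child in die.children:
        lines.extend(dump(child, depth + 1))
    return lines


# Topmost DIE: the compilation unit for Hello.c.
unit = DIE("DW_TAG_compile_unit", {"DW_AT_name": "Hello.c"})
# Base type DIE for int; the 4-byte size is an assumption about the target.
int_type = unit.add(
    DIE("DW_TAG_base_type", {"DW_AT_name": "int", "DW_AT_byte_size": 4})
)
# Subprogram DIE for main; DW_AT_type is a reference to the int DIE.
main_die = unit.add(
    DIE("DW_TAG_subprogram", {"DW_AT_name": "main", "DW_AT_type": int_type})
)

print("\n".join(dump(unit)))
```

The return type of main is stored as a reference to the int DIE rather than as a copy, which mirrors the slide 6 point that an attribute may refer to another DIE.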