Mapping PL Objects to the Machine –
managing the address space
David E. Culler
CS61CL
Feb 9, 2009
Lecture 3
UCB CS61CL F09 Lec 3
11/7/2015
1
Big Ideas
• Review:
– Computers manipulate finite representations of things.
– A bunch of bits can represent anything; it is all a matter of
what you do with them.
– Finite representations have limitations.
• Today
– Type constructors to compose complex types
– Mapping of program objects to machine storage
– An object, its value, its location, its reference
• Pointers are THE most subtle concept in C
– Very powerful
– Easy to misuse
– Completely hidden in Java
C Types - the big picture
• Basic Types
– “understood” by the machine
– char, unsigned, int, float, double
• Array
– sequence of indexed objects of
homogeneous type
• Struct
– collection of named objects of
heterogeneous types
• Pointer
– reference to an object of
specified type
• Union
– an object of one of a specific
collection of types
Composing Complex Types in C
• Complex types are really tools for composing
new types
– Strings – sequences of characters
– Vectors – sequences of numbers
– Matrixes – 2D collections of numbers
– Records – finite sets of strings and numbers
– Lists, Tables
– Sounds, Images, Graphs
– …
– Think induction
• Pointers are fundamentally “understood” by the
machine as well
– address
Where do Objects live and work?
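[Figure: Memory is a sequence of words at addresses 000..0 through FFF..F (for example 00..1AA0, 0F..FAC0, n). The Processor loads a word from memory into a register, operates on it, and stores the result back.]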
Where do complex objects reside?
• Arrays are stored in memory
• The variable (i.e., name) is associated with the location (i.e., address) of the collection
– Just like variables of basic type
• Elements are stored consecutively
– Can locate each of the elements
• Can operate on the indexed object just like an object of that type
– A[2] = x + Y[i] - 3;
[Figure: memory from 000..0 to FFF..F, with the array stored at the location labeled A.]
Where do complex objects reside?
• Structs are stored in memory
• The variable (i.e., name) is associated with the location (i.e., address) of the collection
– Just like variables of any type
• Elements are stored at fixed offsets
– Can locate each of the elements
• Can operate on the named member object just like an object of that type
– S.row = x + S.col - 3;
[Figure: memory from 000..0 to FFF..F, with the struct stored at the location labeled S.]
All objects have a size
• The size of their representation
• The size of static objects is given by sizeof
operator
#include <stdio.h>
int main() {
    char c = 'a';
    int x = 34;
    int y[4];
    /* sizeof yields a size_t, which printf prints with %zu */
    printf("sizeof(c)=%zu\n", sizeof(c));
    printf("sizeof(char)=%zu\n", sizeof(char));
    printf("sizeof(x)=%zu\n", sizeof(x));
    printf("sizeof(int)=%zu\n", sizeof(int));
    printf("sizeof(y)=%zu\n", sizeof(y));
    printf("sizeof(7)=%zu\n", sizeof(7));
    return 0;
}
What can be done with a complex object?
• Access its elements
– A[i], S.row
• Pass it around
– Sort(A)
– x = max(A, n)
• Copy it
– T=S
– z = munge(S, 3)
• Note the name of an array behaves as a
reference to the object
• The name of a struct behaves as the object
Administration
• HW3 due at start of next week’s lab
– Try to have it give practice in test tools
• Lab changers and waitlisters must give target TA
your prioritized lab request list this week
• Readings are shifting from K&R to P&H
• Project 1 goes out on Tuesday, Due Friday 10/1
An object and its value…
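X = X + 1;
• On the right-hand side, X means the value of variable X
• On the left-hand side, X means the storage that holds the value X
[Figure: memory from 000..0 to FFF..F; the location labeled x holds 3 before the assignment and 4 after it.]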
Every object in memory has an address
• That address is a pointer to the object.
• It is a fixed size object itself
• Just like a basic type
[Figure: memory from 000..0 to FFF..F, with S labeling the address of an object.]
What can be done with a reference?
• Dereference it
– Obtain the object that it refers (points) to
– X = *P; Y = S->row; z = A[0]; z = A[i];
• Pass it around, copy it, store it
– Q = P;
– clearfields(S);
• Do type-based arithmetic on it
– P-1
– Q++
• Do both
– S->next = P;
– A[i] = 3;
• Cast it to a uint and mess with it (!!!)
Array variables are also a reference for
the object
#include <stdio.h>
int main() {
    char *c = "abc";
    char ac[4] = "def";
    printf("c[1]=%c\n", c[1]);
    printf("ac[1]=%c\n", ac[1]);
    return 0;
}
[Figure: c holds a pointer (*) to the characters ‘a’ ‘b’ ‘c’ \0 stored elsewhere in memory; ac is itself the storage for ‘d’ ‘e’ ‘f’ \0.]
• Array name is essentially the address of (pointer to) the zeroth object in the array
• There are a few subtle differences
– Can change what c refers to, but not what ac refers to
What kinds of variables (storage)?
• Visibility vs Lifetime
• Variables declared within a function
– Arguments and Local Variables
– Visible in remainder of function
– Lifetime = Function Call
– Each call obtains a new set of variables
» Recursive calls too
– C “internals”
int ave(int A, int B) {
    int C = (A + B)/2;
    return C;
}
• Variables declared outside any function
– Visible in remainder of file (!!!)
» include .h file
» extern vs static
– Lifetime = Whole Program
– C “externals”
int count = 0;
…
int fib(int n) {
    count++;
    if (n <= 2) return 1;
    return fib(n-1)+fib(n-2);
}
• Malloc’d objects
Where does the program itself reside?
• In memory, just like the data
• Processor contains a special register – PC
– Program counter
– Address of the instruction to execute (i.e. ptr)
• Instruction Execution Cycle
– Instruction fetch
– Decode
– Operand fetch
– Execute
– Result Store
– Update PC
[Figure: memory from 000..0 to FFF..F with the labels 00401B20, n, 0020FAC0, and main; the PC drives a repeating loop of Instruction Fetch and Execute.]
What’s a Process
• Address Space + a thread of control
Logical Structure of an Executing
Program
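[Figure: an executing program consists of code (with entry points such as printf, main, and nextword), static data, a heap, and a stack in memory, plus the processor's regs and PC.]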
Address Space
Address space, from low to high addresses:
0000000:   reserved
0040000:   code          <= instructions
1000000:   static data   <= externs
1008000:   heap          <= malloc
           unused
           stack         <= Local variables
7FFFFFFF:
           <= OS, etc.
FFFFFFFF:
Processor registers shown alongside: PC, regs, gp, sp
Breaking the Abstraction…
• Attack
– Cause the OS (or service or application) to do things it should not.
– Pass unterminated strings, bad length parameters, bad ptrs
– Corrupting system data may cause it to do other harm
• “Smashing the stack”
– Send bad mesg causing system code to overwrite parts of its stack
» Local vars and rtn address
» Bad return
– OS or app starts executing out of stack as if it were instructions
– Message contains jump instructions to send it off to attacker code
[Figure: the address space (reserved, code, static data, heap, unused, stack, OS) from 0000000 to FFFFFFFF, with the return address RA marked in the stack.]
Summary
• Arrays, Structs, and Pointers allow you to define sophisticated data structures
– Compiler protects you by enforcing type system
– Avoid dropping beneath the abstraction and munging the bits
• All map into untyped storage, ints, and addresses
• Executing program has a specific structure
– Code, Static Data, Stack, and Heap
– Mapped into address space
– “Holes” allow stack and heap to grow
– Compiler defines what the bits mean by enforcing type
» Chooses which operations to perform
• Poor coding practices, bugs, and architecture limitations lead to vulnerabilities