Storage 6-Nov-15

Storage

1-May-20

Stacks

- Stacks obey a simple regimen: last in, first out (LIFO)
- When you enter a function, procedure, or method, storage is allocated for you on the stack
- When you leave, the storage is released
- In Java, this is even more fine-grained: storage is allocated and deallocated for individual blocks, and even for for statements
- Since this is so well-defined, your compiler writes the code to do it for you
- Since virtually every language supports recursion these days (and all the popular languages do), computers typically provide machine-language instructions to simplify stack operations
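The LIFO behavior can be observed with a small recursive method. This is only an illustrative sketch: the class and method names (StackFrames, trace, descend) are made up for this example. Each call gets its own copy of the local variable, and calls are left in the reverse of the order in which they were entered.

```java
import java.util.ArrayList;
import java.util.List;

class StackFrames {
    // Records the order in which calls are entered and left.
    static List<String> trace(int depth) {
        List<String> log = new ArrayList<>();
        descend(depth, log);
        return log;
    }

    static void descend(int n, List<String> log) {
        int local = n * 10;              // each call has its own 'local'
        log.add("enter " + n);
        if (n > 1) {
            descend(n - 1, log);         // a new frame is pushed on top
        }
        // By the time we get here, every deeper frame has been released.
        log.add("leave " + n + " local=" + local);
    }

    public static void main(String[] args) {
        for (String line : trace(3)) {
            System.out.println(line);
        }
    }
}
```

With a depth of 3 the log reads enter 3, enter 2, enter 1, then leave 1, leave 2, leave 3: the last frame entered is the first one left.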

Heaps

- Stacks are great, but they have their limitations
- Suppose you want to write a method to read in an array
  - You enter the method and declare the array, thus dynamically allocating space for it
  - You read values into the array
  - You return from the method and POOF! your array is gone
- You need something more flexible: something where you have control over allocation and deallocation
- The invention that allows this (which came somewhat later than the stack) is the heap
  - You explicitly get storage via malloc (C) or new (Java)
  - The storage remains until you are done with it
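In Java the "read in an array" method works because new puts the array on the heap. A minimal sketch, assuming a hypothetical method readValues that fills the array with squares as a stand-in for real input:

```java
class ReadArray {
    // The local variable 'values' (a reference) lives in this method's
    // stack frame; the array it refers to is allocated on the heap by 'new'.
    static int[] readValues(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i * i;       // stand-in for reading a value
        }
        return values;               // the frame goes away; the array does not
    }

    public static void main(String[] args) {
        int[] data = readValues(4);
        System.out.println(data[3]); // the array is still usable here
    }
}
```

Only the reference is copied back to the caller; the heap storage stays allocated for as long as something refers to it.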

Stacks vs. heaps

- Stack allocation and deallocation is very regular
- Heap allocation and deallocation is unpredictable
- Stack allocation and deallocation is handled by the compiler
- Heap allocation is at the whim of the programmer
- Heap deallocation may also be up to the programmer (C, C++) or handled by the programming language system (Java)
- Values on stacks are typically small and uniform in size
  - In Java, arrays and objects don't go on the stack; references to them do
- Values on the heap can be any size
- Stacks are tightly packed, with no wasted space
- Deallocation can leave gaps in the heap

Implementing a heap

- A heap is a single large area of storage
- When the program requests a block of storage, it is given a pointer (reference) to some part of this storage that is not already in use
- The task of the heap routines is to keep track of which parts of the heap are available and which are in use
- To do this, the heap routines create a linked list of blocks of varying sizes
- Every block, whether available or in use, contains header information about the block
- We will describe a simple implementation in which each block header contains two items of information:
  - A pointer to the next block, and
  - The size of this block

[Figure: a block consists of a two-word header (pointer to next, size of block) followed by the user data (an Object); the user gets the storage from the end of the header on down]

Anatomy of a block

- Here is our simple block:

    ptr-2      pointer to next
    ptr-1      size of block
    ptr        user data (an Object)
    ptr+1
    ptr+2
     :
    ptr+N-1

  The user gets N words, from ptr to the end of the block

- Java Objects hold more information than this (for example, the class of the object)
- Notice that our implementation will return a pointer to the first word available to the user
- Data with negative offsets are header data
  - ptr-1 contains the size of this block, including header information
  - ptr-2 will be used to construct a free space list of available blocks
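The block layout can be modeled with a plain int array standing in for the heap, one int per word. This is a sketch under that assumption; the names BlockLayout, next, size, and userWords are invented for the illustration.

```java
class BlockLayout {
    static final int HEADER_WORDS = 2;

    // The two header words sit at negative offsets from the user pointer.
    static int next(int[] heap, int ptr) { return heap[ptr - 2]; }
    static int size(int[] heap, int ptr) { return heap[ptr - 1]; }

    // Number of words the user may actually store into (N in the figure).
    static int userWords(int[] heap, int ptr) {
        return size(heap, ptr) - HEADER_WORDS;
    }

    public static void main(String[] args) {
        int[] heap = new int[20];
        int ptr = 13;            // a block occupying words 11..15
        heap[ptr - 2] = 0;       // next: 0 is used here to mean "no next block"
        heap[ptr - 1] = 5;       // size: 3 user words + 2 header words
        System.out.println(userWords(heap, ptr));
    }
}
```

Because the size word includes the header, a block of size 5 leaves 3 words for the user.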

The heap, I

free = 2

    words    contents
    0-1      header: next = 0, size = 20
    2-19     free

- Initially, the user has no blocks, and the free space list consists of a single block
- In our implementation, we will allocate space from the end of the block
- To begin, let's assume that the user asks for a block of two words

The heap, II

free = 2

    words    contents
    0-1      header: next = 0, size = 16
    2-15     free
    16-17    header: next = 0, size = 4
    18-19    given to user

- The user has asked for a block of size 2
- The "free" block is reduced in size from 20 to 16 (two words asked for by the user, plus two for a new header)
  - The new block has size 4 and the next field is not used
- Next, assume the user asks for a block of three words

The heap, III

free = 2

    words    contents
    0-1      header: next = 0, size = 11
    2-10     free
    11-12    header: next = 0, size = 5
    13-15    given to user
    16-17    header: next = 0, size = 4
    18-19    given to user

- The user has asked for a block of size 3
- The "free" block is reduced in size from 16 to 11 (three words asked for by the user, plus two for a new header)
  - The new block has size 5 and the next field is not used
- Next, assume the user asks for a block of just one word

The heap, IV

free = 2

    words    contents
    0-1      header: next = 0, size = 8
    2-7      free
    8-9      header: next = 0, size = 3
    10       given to user
    11-12    header: next = 0, size = 5
    13-15    given to user
    16-17    header: next = 0, size = 4
    18-19    given to user

- The user has asked for a block of size 1
- The "free" block is reduced in size from 11 to 8 (one word for the user, plus two for a new header)
  - The new block has size 3 and the next field is not used
- Next, the user releases the second block (at 13)
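The three allocations traced so far can be reproduced with a short simulation. This is a sketch, not a production allocator: it assumes the heap is an int array of 20 words, that 0 serves as the null pointer, and that a free block always keeps its own header. The name EndAllocator is invented for the example.

```java
class EndAllocator {
    static final int HEADER = 2;   // header words per block: next, size
    final int[] mem;               // the whole heap, one int per word
    int free;                      // user pointer of the first free block; 0 = empty list

    EndAllocator(int words) {
        mem = new int[words];
        free = HEADER;             // the single free block's user area starts at word 2
        mem[free - 2] = 0;         // next = 0 marks the end of the list
        mem[free - 1] = words;     // size counts the header too
    }

    // First fit: carve the request off the END of the first free block that
    // can hold it while keeping its own header. Returns 0 if nothing fits.
    int allocate(int userWords) {
        int need = userWords + HEADER;
        for (int p = free; p != 0; p = mem[p - 2]) {
            int size = mem[p - 1];
            if (size - HEADER >= need) {
                mem[p - 1] = size - need;      // shrink the free block
                int ptr = p + size - need;     // user pointer of the new block
                mem[ptr - 2] = 0;              // next: unused while allocated
                mem[ptr - 1] = need;           // size of the new block
                return ptr;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        EndAllocator heap = new EndAllocator(20);
        for (int request : new int[] {2, 3, 1}) {
            int ptr = heap.allocate(request);
            System.out.println("ptr " + ptr + ", free block size " + heap.mem[1]);
        }
    }
}
```

Requests of 2, 3, and 1 words return pointers 18, 13, and 10, and the free block shrinks from 20 to 16, 11, and 8, matching the figures.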

The heap, V

free = 13

    words    contents
    0-1      header: next = 0, size = 8
    2-7      free
    8-9      header: next = 0, size = 3
    10       given to user
    11-12    header: next = 2, size = 5
    13-15    free
    16-17    header: next = 0, size = 4
    18-19    given to user

- The user has released the block of size 5
- The freed block is added to the front of the free space list:
  - Its next field is set to the old value of free
  - free is set to point to this block
- Next, the user requests a block of size 4
- The first block on the free list isn't large enough, so we have to go to the next free block
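Releasing a block is just a push onto the front of a linked list. The sketch below assumes the heap is an int array and hard-codes the state reached after the allocations of 2, 3, and 1 words; the names FreeListPush, release, and heapAfterThreeAllocations are made up for the illustration.

```java
class FreeListPush {
    // Pushes the block at 'ptr' onto the front of the free list and
    // returns the new head of the list.
    static int release(int[] heap, int free, int ptr) {
        heap[ptr - 2] = free;   // next field := old value of free
        return ptr;             // free := this block
    }

    // The heap after allocating blocks of 2, 3, and 1 words from 20 words.
    // Every next field starts out as 0.
    static int[] heapAfterThreeAllocations() {
        int[] heap = new int[20];
        heap[1] = 8;    // free block in words 0..7
        heap[9] = 3;    // block at 10
        heap[12] = 5;   // block at 13
        heap[17] = 4;   // block at 18
        return heap;
    }

    public static void main(String[] args) {
        int[] heap = heapAfterThreeAllocations();
        int free = 2;
        free = release(heap, free, 13);
        System.out.println("free = " + free + ", next of block 13 = " + heap[11]);
    }
}
```

After the release, free is 13 and the freed block's next field holds 2, the old head of the list.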

The heap, VI

free = 13

    words    contents
    0-1      header: next = 0, size = 2
    2-3      header: next = 0, size = 6
    4-7      given to user
    8-9      header: next = 0, size = 3
    10       given to user
    11-12    header: next = 2, size = 5
    13-15    free
    16-17    header: next = 0, size = 4
    18-19    given to user

- The user requests a block of size 4
- The block is carved from the second free block (the one whose header is in words 0-1): its size is now 2, and its next field does not change
- The user gets a pointer to the new block
- Now the user releases the smallest block (at 10)
- Again, this will be added to the beginning of the free space list

The heap, VII

free = 10

    words    contents
    0-1      header: next = 0, size = 2
    2-3      header: next = 0, size = 6
    4-7      given to user
    8-9      header: next = 13, size = 3
    10       free
    11-12    header: next = 2, size = 5
    13-15    free
    16-17    header: next = 0, size = 4
    18-19    given to user

- The user releases the smallest block (at 10)
- The freed block is added to the front of the free space list:
  - Its next field is set to the old value of free
  - free is set to point to this block
- Now the user requests a block of size 4
- Currently, we cannot satisfy this request
  - We have enough space, but no single block is large enough (free space is fragmented)
  - However, free blocks 10 and 13 are adjacent to each other
  - We can coalesce blocks 10 and 13
- Coalescing blocks is somewhat expensive, because adjacent blocks are not necessarily adjacent nodes in the free space list

The heap, VIII

free = 10

    words    contents
    0-1      header: next = 0, size = 2
    2-3      header: next = 0, size = 6
    4-7      given to user
    8-9      header: next = 2, size = 8
    10-15    free
    16-17    header: next = 0, size = 4
    18-19    given to user

- Blocks at 10 and 13 have now been coalesced
- The size of the new block is the sum of the sizes of the old blocks
- We had to adjust the links
- Now we can give the user a block of size 4
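One way to coalesce is to check, for each free block, whether the block that physically follows it is also on the free list, and if so to unlink that neighbor and absorb its size. The sketch below takes this simple (and slow) approach on the fragmented heap from the walkthrough; it is an assumption about how coalescing might be done, not the only way, and the names Coalescer, isFree, unlink, and fragmentedHeap are invented.

```java
class Coalescer {
    // True if 'ptr' is the user pointer of a block on the free list.
    static boolean isFree(int[] heap, int free, int ptr) {
        for (int p = free; p != 0; p = heap[p - 2]) {
            if (p == ptr) return true;
        }
        return false;
    }

    // Removes block q from the free list; returns the (possibly new) head.
    static int unlink(int[] heap, int free, int q) {
        if (free == q) return heap[q - 2];
        for (int p = free; p != 0; p = heap[p - 2]) {
            if (heap[p - 2] == q) {
                heap[p - 2] = heap[q - 2];   // skip over q
                break;
            }
        }
        return free;
    }

    // Merges every free block with a free block that directly follows it.
    static int coalesce(int[] heap, int free) {
        boolean merged = true;
        while (merged) {                      // restart the scan after each merge
            merged = false;
            for (int p = free; p != 0 && !merged; p = heap[p - 2]) {
                int q = p + heap[p - 1];      // user pointer of the physically next block
                if (isFree(heap, free, q)) {
                    free = unlink(heap, free, q);
                    heap[p - 1] += heap[q - 1];   // new size = sum of both sizes
                    merged = true;
                }
            }
        }
        return free;
    }

    // The fragmented heap: free list is 10 -> 13 -> 2, with sizes 3, 5, 2.
    static int[] fragmentedHeap() {
        int[] heap = new int[20];
        heap[1] = 2;                   // free block at 2
        heap[3] = 6;                   // allocated block at 4
        heap[8] = 13; heap[9] = 3;     // free block at 10
        heap[11] = 2; heap[12] = 5;    // free block at 13
        heap[17] = 4;                  // allocated block at 18
        return heap;
    }

    public static void main(String[] args) {
        int[] heap = fragmentedHeap();
        int free = coalesce(heap, 10);
        System.out.println("free = " + free + ", next = " + heap[8] + ", size = " + heap[9]);
    }
}
```

The repeated walks over the free list are what make this expensive: physical neighbors have to be looked up in a list that is ordered by release time, not by address.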

Declaring variables in Java

- In Java, all local variables occupy space on the stack
- All Objects occupy space on the heap
  - In Java, you create an object (on the heap) with new
- Example of defining a variable whose value is a primitive: int count = 0;
  - count is the name of a location on the stack
  - The name is used by the compiler; it doesn't "really" exist (occupy storage) at run time
  - The named location occupies memory on the stack; it contains a zero
- Example of defining a variable whose value is an object: Person p = new Person();
  - p is a variable; it is the name of a location on the stack
  - That location occupies memory on the stack; it contains a reference to the object
  - The Person object is on the heap
- Thus, Person p = new Person(); allocates space on both the stack and the heap
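The difference shows up when a variable is copied. In this sketch the Person class and its name field are hypothetical, introduced only so there is something to modify: copying a primitive copies the value, while copying an object variable copies the reference, so both variables lead to the same heap object.

```java
class StackAndHeap {
    // Hypothetical class for the illustration; the 'name' field is an assumption.
    static class Person {
        String name = "unnamed";
    }

    static String demo() {
        int count = 0;              // the stack location holds the value 0
        int copy = count;           // copies the value
        copy++;                     // count is unaffected

        Person p = new Person();    // stack location holds a reference; object is on the heap
        Person q = p;               // copies the reference, not the object
        q.name = "Ann";             // the change is visible through p as well

        return count + " " + copy + " " + p.name;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The result is "0 1 Ann": the two int variables are independent, but p and q share one Person.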

Pointers

- In C and C++ you get a pointer to the new storage; in Java you get a reference
  - The implementation is essentially identical; the difference is that there are more operations on pointers than on references
- C and C++ provide operations on pointers
  - C and C++ let you do arithmetic on pointers, for example, p++;
- Pointers are pervasive in C and C++; you can't avoid them

Advantages/disadvantages

- Pointers give you:
  - Greater flexibility and (maybe) convenience
  - A much more complicated syntax
  - More ways to create hard-to-find errors
  - Serious security holes
- References give you:
  - Less flexibility (no pointer arithmetic)
  - Simpler syntax, more like that of other variables
  - Much safer programs with fewer mysterious bugs
  - More opportunities for the compiler to optimize the compiled code
- Pointer arithmetic is inherently unsafe
  - You can accidentally point to the wrong thing
  - You cannot be sure of the type of the thing you are pointing to

Deallocation

- There are two potential errors when de-allocating (freeing) storage yourself:
  - De-allocating too soon, so that you have dangling references (pointers to storage that has been freed and possibly reused)
    - A dangling reference is not a null link: it points to something (you just don't know what)
  - Forgetting to de-allocate, so that unused storage accumulates and you have a memory leak
- If you have to de-allocate storage yourself, a good strategy is to keep track of which function or method "owns" the storage
  - The function that owns the storage is responsible for de-allocating it
  - Ownership can be transferred to another function or method
  - You just need a clearly defined policy for determining ownership
  - In practice, this is easier said than done

Discipline

- Most C/C++ advocates say:
  - It's just a matter of being disciplined
  - I'm disciplined, even if other people aren't
  - Besides, there are good tools for finding memory problems
- However:
  - Virtually all large C/C++ programs have memory problems

Garbage collection

   

- Garbage is storage that has been allocated but is no longer available to the program
- It's easy to create garbage:
  - Allocate some storage and save the pointer to it in a variable
  - Assign a different value to that variable
- A garbage collector automatically finds and de-allocates garbage
  - This is far safer (and more convenient) than having the programmer do it
  - Dangling references cannot happen
  - Memory leaks, while not impossible, are pretty unlikely
- Most modern languages, with C++ a notable exception, use a garbage collector

Garbage collection algorithms

- There are two well-known algorithms (and several not so well-known ones) for doing garbage collection:
  - Reference counting
  - Mark and sweep

Reference counting

- When a block of storage is allocated, it includes header data that contains an integer reference count
  - The reference count keeps track of how many references the program has to that block
- Any assignment to a reference variable modifies reference counts
  - If the variable previously referenced an object (was not null), the reference count of that object is decremented
  - If the new value is an object (not null), the reference count for the new object is incremented
- When a reference count reaches zero, the storage can immediately be garbage collected
- For this to work, the reference count has to be at a known displacement from the reference (pointer)
  - If arbitrary pointer arithmetic is allowed, this condition cannot be guaranteed
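The bookkeeping on assignment can be simulated in a few lines. This is a sketch only: Block and Ref are invented classes standing in for a heap block and a reference variable, and "freeing" is modeled by setting a flag. One deliberate deviation is that the new target is incremented before the old one is decremented, so that assigning a variable to itself cannot drop the count to zero by accident.

```java
class RefCountDemo {
    // Stand-in for a heap block with a reference count in its header.
    static class Block {
        int refCount = 0;
        boolean freed = false;
    }

    // Stand-in for a reference variable whose assignments maintain the counts.
    static class Ref {
        private Block target;                  // null means "refers to nothing"

        void assign(Block newTarget) {
            if (newTarget != null) {
                newTarget.refCount++;          // new value gains a reference
            }
            if (target != null) {
                target.refCount--;             // old value loses one
                if (target.refCount == 0) {
                    target.freed = true;       // can be reclaimed immediately
                }
            }
            target = newTarget;
        }
    }

    // The Java locals x and y are only handles for inspecting the simulation;
    // they are not counted as references.
    static String demo() {
        Block x = new Block();
        Block y = new Block();
        Ref a = new Ref();
        Ref b = new Ref();
        a.assign(x);        // x: 1
        b.assign(x);        // x: 2
        a.assign(null);     // x: 1
        b.assign(y);        // x: 0, so x is freed; y: 1
        return "x freed=" + x.freed + ", y count=" + y.refCount;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The block x is reclaimed at the exact moment its last reference is overwritten, with no separate collection pass.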

Problems with reference counting

- If object A points to object B, and object B points to A, then each is referenced, even if nothing else in the program references either one
  - This fools the garbage collector, which doesn't collect either object A or object B
- Thus, reference counting is imperfect and unreliable; memory leaks still happen
- However, reference counting is a simple technique and is occasionally used
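The cycle problem is easy to reproduce in a simulation. In this sketch, Node and the helper assign are invented for the illustration; assign applies the usual counting rule and returns the new value so it can be stored in the variable or field being assigned.

```java
class CycleLeak {
    static class Node {
        int refCount = 0;
        Node other;                 // a reference stored inside the object
    }

    // Applies the counting rule for "variable = newTarget" and returns newTarget.
    static Node assign(Node oldTarget, Node newTarget) {
        if (newTarget != null) newTarget.refCount++;
        if (oldTarget != null) oldTarget.refCount--;
        return newTarget;
    }

    // The Java locals a and b are only handles for inspecting the simulation;
    // varA and varB play the role of the program's reference variables.
    static int[] countsAfterDroppingCycle() {
        Node a = new Node();
        Node b = new Node();
        Node varA = assign(null, a);      // a: 1
        Node varB = assign(null, b);      // b: 1
        a.other = assign(a.other, b);     // b: 2
        b.other = assign(b.other, a);     // a: 2
        varA = assign(varA, null);        // a: 1
        varB = assign(varB, null);        // b: 1
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingCycle();
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Both counts stay at 1 after the program's own variables are cleared, so neither object would ever be reclaimed by counting alone.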

Mark and sweep

- When memory runs low, languages that use mark-and-sweep temporarily pause the program and run the garbage collector
  - The collector marks every block
  - It then does an exhaustive search, starting from every reference variable in the program, and unmarks all the storage it can reach
  - When done, every block that is still marked must not be accessible from the program; it is garbage that can be freed
- In order for this technique to work,
  - It must be possible to find every block (so they are in a linked list)
  - It must be possible to find and follow every reference
  - The mark has to be at a known displacement from the reference
    - Again, this is not compatible with arbitrary pointer arithmetic
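The three steps (mark everything, unmark what is reachable, free what is still marked) can be sketched on a small object graph. The Block class, the explicit list of all blocks, and the list of roots are assumptions of this illustration; a real collector finds blocks and references through the heap's own data structures.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class MarkSweepDemo {
    static class Block {
        final String label;
        boolean marked;                              // true = presumed garbage
        final List<Block> refs = new ArrayList<>();  // references held by this block
        Block(String label) { this.label = label; }
    }

    // Returns the labels of the blocks found to be garbage, in allocation order.
    static List<String> collect(List<Block> allBlocks, List<Block> roots) {
        for (Block b : allBlocks) {
            b.marked = true;                         // 1. mark every block
        }
        Deque<Block> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {                 // 2. unmark everything reachable
            Block b = pending.pop();
            if (!b.marked) continue;                 // already visited
            b.marked = false;
            pending.addAll(b.refs);
        }
        List<String> garbage = new ArrayList<>();
        for (Block b : allBlocks) {                  // 3. whatever is still marked is garbage
            if (b.marked) garbage.add(b.label);
        }
        return garbage;
    }

    static List<String> demo() {
        Block a = new Block("A");
        Block b = new Block("B");
        Block c = new Block("C");
        Block d = new Block("D");
        a.refs.add(b);          // A -> B, and A is referenced by the program
        c.refs.add(d);          // C and D refer to each other,
        d.refs.add(c);          // but nothing else refers to them
        return collect(Arrays.asList(a, b, c, d), Arrays.asList(a));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The cycle of C and D is reported as garbage, which is exactly the case reference counting misses.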

Problems with mark and sweep

- Mark-and-sweep is a complex algorithm that takes substantial time
- Unlike reference counting, it must be done all at once: nothing else can be going on
- The program stops responding during garbage collection
- This is unsuitable for many real-time applications

Garbage collection in Java

- Java uses mark-and-sweep
- Mark-and-sweep is highly reliable, but may cause unexpected slowdowns
- You can ask Java to do garbage collection at a time you feel is more appropriate
  - The call is System.gc();
  - But not all implementations respect your request
- This problem is known and is being worked on
  - There is also a "Real-time Specification for Java"

No garbage collection in C or C++

- C and C++ do not have garbage collection: it is up to the programmer to explicitly free storage when it is no longer needed by the program
- C and C++ have pointer arithmetic, which means that pointers might point anywhere
  - There is no way to do reference counting if the programming language does not have strict control over pointers
  - There is no way to do mark-and-sweep if the programming language does not have strict control over pointers
- Pointer arithmetic and garbage collection are incompatible; it is essentially impossible to have both

The End
