Transcript Document

Chapter 6
Structured Data Types
Arrays
Records
Definitions
• data type
– collection of data objects
– a set of predefined operations
• descriptor : collection of attributes for a
variable
• object : instance of a user-defined (abstract
data) type
Copyright © 2007 Addison-Wesley. All rights reserved.
Structured Data Types
• Built out of other types
– usually composed of multiple elements.
– homogeneous : all elements have the
same type
– heterogeneous : elements have different
types
Structured Data Types
• Arrays
– aggregate of homogeneous data elements,
each identified by its position
• Associative arrays
– unordered collection of key-value pairs
• Records
– heterogeneous aggregate of data elements
indexed by element name
Array Operations
• Whole array operations:
– assignment
– catenation
• Elemental operations same as those of
base type
• Indexing : mapping from indexes to
elements
array_name (index_value_list) → an element
Array Design Issues
• What types are legal for subscripts?
• Are subscripting expressions in
element references range checked?
• When are subscript ranges bound?
• When does allocation take place?
• What is the maximum number of
subscripts?
• Can array objects be initialized?
• Is any kind of slice allowed?
Binding Time Choices
• Static: compile-time binding of subscript range
and memory
• Fixed stack-dynamic: subscript ranges are static, but
storage is allocated when the declaration is elaborated (C, C++)
• Stack-dynamic: run-time binding of subscript
range and memory, fixed for the variable's lifetime
• Fixed heap-dynamic: storage binding is dynamic
but fixed after allocation (Java, C and C++)
• Heap-dynamic: binding of subscript ranges and
storage allocation is dynamic and can change during
the array's lifetime (Perl and JavaScript)
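Three of these categories can be sketched in C. This is only an illustration, not part of the original slides; the names N, static_table, sum_fixed_stack, and sum_fixed_heap are made up for the example.

```c
#include <stdlib.h>

#define N 4

/* Static: subscript range and storage are both bound before run time. */
static int static_table[N] = {1, 2, 3, 4};

/* Fixed stack-dynamic: the range (N) is fixed at compile time, but
   storage for `local` is allocated each time the function is entered. */
int sum_fixed_stack(void) {
    int local[N];
    int total = 0;
    for (int i = 0; i < N; i++) {
        local[i] = static_table[i] * 2;
        total += local[i];
    }
    return total;
}

/* Fixed heap-dynamic: the size is chosen at run time and storage comes
   from the heap; the size does not change after allocation. */
int sum_fixed_heap(size_t n) {
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return -1;  /* allocation failed */
    }
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)i;
        total += a[i];
    }
    free(a);
    return total;
}
```

Calling sum_fixed_heap(5) allocates five elements for the duration of the call; a different argument gives a different size, which is what separates it from the stack case.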
Array Initialization
• Some languages allow initialization at the
time of storage allocation
– C, C++, Java, C# example
int list [] = {4, 5, 7, 83};
– Character strings in C and C++
char name [] = "freddie";
– Arrays of strings in C and C++
char *names [] = {"Bob", "Jake", "Joe"};
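A small C sketch of what the compiler infers from these initializers. The helper function names are assumptions added for the example; the array declarations follow the slide.

```c
#include <stddef.h>

int list[] = {4, 5, 7, 83};                   /* length taken from the initializer */
char name[] = "freddie";                      /* 7 letters plus the terminating '\0' */
const char *names[] = {"Bob", "Jake", "Joe"}; /* array of pointers to string literals */

/* Number of elements in list. */
size_t list_count(void) { return sizeof list / sizeof list[0]; }

/* Bytes occupied by name, including the terminator. */
size_t name_bytes(void) { return sizeof name; }

/* Number of pointers in names. */
size_t names_count(void) { return sizeof names / sizeof names[0]; }
```

The point is that the subscript range is never written out; it is bound from the initializer when the array is declared.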
Memory for arrays
• For 1D arrays, contiguous block of
memory with equal amount of space for
each element
• Two approaches for multi-dimensional
arrays
– Single block of contiguous memory for all
elements
• Arrays must be rectangular
• Address of array is starting memory location
– Implement as arrays of arrays (Java)
• Jagged arrays are possible
• Array variable is a pointer (reference)
Element Access
• Access function maps subscript expressions to
an address in the array
• Access function for single-dimensioned arrays:
address(list[k]) = address (list[lower_bound])
+ ((k-lower_bound) * element_size)
• Two common ways to organize 2D arrays
– Row major order (by rows) – used in most languages
– Column major order (by columns) – used in Fortran
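The single-dimension access function above can be written directly in C. This is a sketch for illustration; element_address is a hypothetical name, and addresses are treated as plain integers.

```c
#include <stddef.h>
#include <stdint.h>

/* address(list[k]) = address(list[lower_bound])
                      + (k - lower_bound) * element_size
   Assumes k >= lower_bound. */
uintptr_t element_address(uintptr_t base, long k, long lower_bound,
                          size_t element_size) {
    return base + (uintptr_t)(k - lower_bound) * element_size;
}
```

With a lower bound of 1, a base address of 1000, and 4-byte elements, element 3 sits two elements past the base, at address 1008.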
Memory Allocation for 2D Array
• Row major (by rows) or column major order (by columns) for 2D array
• Access function maps subscript expressions to an address in the array
Row-major access formula
Location (a[i,j])
= address of a [row_lb,col_lb]
+ (((i - row_lb) * n) + (j - col_lb)) * element_size
where n is the number of elements per row
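The row-major formula, and its column-major counterpart, can be sketched as C functions that return the byte offset from the first element. The function names are assumptions; n is the number of elements per row and m the number of rows.

```c
#include <stddef.h>

/* Row-major: skip (i - row_lb) full rows of n elements, then
   (j - col_lb) elements within the row. */
size_t row_major_offset(long i, long j, long row_lb, long col_lb,
                        size_t n, size_t element_size) {
    return ((size_t)(i - row_lb) * n + (size_t)(j - col_lb)) * element_size;
}

/* Column-major (as in Fortran): skip (j - col_lb) full columns of
   m elements, then (i - row_lb) elements within the column. */
size_t col_major_offset(long i, long j, long row_lb, long col_lb,
                        size_t m, size_t element_size) {
    return ((size_t)(j - col_lb) * m + (size_t)(i - row_lb)) * element_size;
}
```

For a 3 x 4 array of 4-byte elements with lower bounds of 0, element [1,2] is 24 bytes from the start in row-major order and 28 bytes from the start in column-major order.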
2D Arrays in Java
Rectangular and Jagged Arrays
• A rectangular array is a multidimensioned array in which all of the
rows have the same number of
elements and all columns have the
same number of elements
• A jagged array has rows with a varying
number of elements
– Possible when multi-dimensioned arrays
actually appear as arrays of arrays
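In C, the arrays-of-arrays approach can be approximated with an array of row pointers. The following is a minimal sketch, with hypothetical function names, in which row r holds r + 1 elements.

```c
#include <stdlib.h>

/* Builds a jagged triangle: row r has r + 1 elements, each set to r.
   Returns NULL on allocation failure. */
int **make_triangle(size_t rows) {
    int **t = malloc(rows * sizeof *t);
    if (t == NULL) {
        return NULL;
    }
    for (size_t r = 0; r < rows; r++) {
        t[r] = malloc((r + 1) * sizeof **t);
        if (t[r] == NULL) {
            while (r > 0) {
                free(t[--r]);   /* release the rows already built */
            }
            free(t);
            return NULL;
        }
        for (size_t c = 0; c <= r; c++) {
            t[r][c] = (int)r;
        }
    }
    return t;
}

/* Frees every row, then the array of row pointers. */
void free_triangle(int **t, size_t rows) {
    if (t == NULL) {
        return;
    }
    for (size_t r = 0; r < rows; r++) {
        free(t[r]);
    }
    free(t);
}

/* Sums all elements of a freshly built triangle; -1 on allocation failure. */
long triangle_sum(size_t rows) {
    int **t = make_triangle(rows);
    if (t == NULL) {
        return -1;
    }
    long total = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c <= r; c++) {
            total += t[r][c];
        }
    }
    free_triangle(t, rows);
    return total;
}
```

Each row is a separate allocation, so the rows need not be contiguous or equal in length, which is what makes the array jagged.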
Slices
• A slice is some substructure of an array;
it is nothing more than a referencing
mechanism
• Slices are only useful in languages that
have array operations
– Java allows row slices from 2D arrays
– Fortran 95
Integer, Dimension (10) :: Vector
Integer, Dimension (3, 3) :: Mat
Integer, Dimension (3, 3, 4) :: Cube
Vector (3:6) is a four-element array
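C has no slice syntax, but the idea that a slice is only a referencing mechanism can be sketched with a pointer and a length. The struct and function names here are assumptions for illustration; the bounds are 1-based and inclusive to match the Fortran example.

```c
#include <stddef.h>

struct int_slice {
    int *first;      /* points into the original array; nothing is copied */
    size_t length;
};

int vector[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

/* View of elements lo..hi (1-based, inclusive) of v. */
struct int_slice slice_1based(int *v, size_t lo, size_t hi) {
    struct int_slice s;
    s.first = v + (lo - 1);
    s.length = hi - lo + 1;
    return s;
}
```

slice_1based(vector, 3, 6) refers to the same storage as vector[2] through vector[5], so writing through the slice changes the original array.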
Slices Examples in Fortran 95
Compile-Time Descriptors
Single-dimensioned array
Multi-dimensional array
Associative Arrays
• An associative array is an unordered
collection of data elements that are
indexed by an equal number of values
called keys
– A hash table has the same behavior
• Design Issues:
1. What is the form of references to
elements?
2. Is the size static or dynamic?
Associative Arrays in Perl
• Names begin with %; literals are
delimited by parentheses
%hi_temps = ("Mon" => 77, "Tue"
=> 79, "Wed" => 65, …);
• Subscripting is done using braces and
keys
$hi_temps{"Wed"} = 83;
– Elements can be removed with delete
delete $hi_temps{"Tue"};
Other Languages
• Ruby has hashes
– ht = {key1 => value1, …}
– use ht[key1] to access
• Python has dictionary type
– ht = {key1 : value1, …}
– use ht[key1] to access
• C++ and Java provide library classes
• In C, a user-defined type must be used
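A minimal sketch of such a user-defined type in C, assuming string keys, int values, a fixed capacity, and linear search instead of hashing. All names here (MAX_ENTRIES, struct table, table_put, and so on) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16   /* hypothetical fixed capacity */

struct entry { const char *key; int value; };
struct table { struct entry items[MAX_ENTRIES]; size_t count; };

/* Linear search; returns the slot index of key, or -1 if absent. */
static long table_find(const struct table *t, const char *key) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->items[i].key, key) == 0) {
            return (long)i;
        }
    }
    return -1;
}

/* Inserts or updates a pair; returns 0 on success, -1 if the table is full. */
int table_put(struct table *t, const char *key, int value) {
    long i = table_find(t, key);
    if (i >= 0) {
        t->items[i].value = value;
        return 0;
    }
    if (t->count == MAX_ENTRIES) {
        return -1;
    }
    t->items[t->count].key = key;
    t->items[t->count].value = value;
    t->count++;
    return 0;
}

/* Stores the value in *out and returns 1 if key is present, else returns 0. */
int table_get(const struct table *t, const char *key, int *out) {
    long i = table_find(t, key);
    if (i < 0) {
        return 0;
    }
    *out = t->items[i].value;
    return 1;
}

/* Removes key by moving the last entry into its slot (order is not kept). */
int table_delete(struct table *t, const char *key) {
    long i = table_find(t, key);
    if (i < 0) {
        return 0;
    }
    t->items[i] = t->items[t->count - 1];
    t->count--;
    return 1;
}

/* Repeats the Perl operations: build, assign to "Wed", delete "Tue". */
int demo_hi_temps(void) {
    struct table t = { .count = 0 };
    int wed = -1, tue = -1;
    table_put(&t, "Mon", 77);
    table_put(&t, "Tue", 79);
    table_put(&t, "Wed", 65);
    table_put(&t, "Wed", 83);
    table_delete(&t, "Tue");
    if (table_get(&t, "Tue", &tue)) {
        return -1;   /* "Tue" should be gone */
    }
    if (!table_get(&t, "Wed", &wed)) {
        return -1;   /* "Wed" should still be present */
    }
    return wed;
}
```

A real implementation would normally hash the keys; linear search is used here only to keep the sketch short.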
Record Types
• A possibly heterogeneous aggregate of
data elements
• Individual elements identified by field
name
• Like a class with no methods and only
public data.
• Design issues:
– What is the syntactic form of references to
the field?
– Are elliptical references allowed?
Definition of Records in Ada
• Record structures are indicated in an
orthogonal way
type Emp_Rec_Type is record
First: String (1..20);
Mid: String (1..10);
Last: String (1..20);
Hourly_Rate: Float;
end record;
Emp_Rec: Emp_Rec_Type;
structs in C
• Define a record in C using the struct syntax
struct record {
  int var1;
  double var2;
};
• Structs can be copied
struct record r1, r2; // memory for 2 records
r1.var1 = 1; r1.var2 = 2.3;
r2 = r1; // copy data from r1 into r2
References to Record Fields
• Most languages use dot notation
Emp_Rec.Name
• Fully qualified references must include all
record names
• Elliptical references allow leaving out
record names as long as the reference is
unambiguous, for example in COBOL
FIRST, FIRST OF EMP-NAME, and
FIRST OF EMP-REC are elliptical
references to the employee’s first name
Operations on Records
• Assignment is very common if the types
are identical
• Ada allows record comparison
• Ada records can be initialized with
aggregate literals
• COBOL provides MOVE
CORRESPONDING
– Copies each field of the source record to the
field with the same name in the target record
Records vs. Arrays
• Records are a straightforward and safe design
• Use records when collection of data
values is heterogeneous
• Access to array elements is much
slower than access to record fields
– subscripts are dynamic
– field names are static
Implementation of Record Type
An offset address, relative to
the beginning of the record,
is associated with each field
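In C, these per-field offsets are visible through the standard offsetof macro. The sketch below reuses the struct record shape from the earlier C slide; the function names are assumptions added for illustration.

```c
#include <stddef.h>

struct record {
    int var1;
    double var2;
};

/* Offsets are compile-time constants fixed by the record layout. */
size_t var1_offset(void) { return offsetof(struct record, var1); }
size_t var2_offset(void) { return offsetof(struct record, var2); }

/* Field access done by hand: base address plus the static offset. */
double read_var2(const struct record *r) {
    const char *base = (const char *)r;
    return *(const double *)(base + offsetof(struct record, var2));
}

/* Reads var2 of a sample record through the manual access path. */
double demo_read(void) {
    struct record r = {1, 2.5};
    return read_var2(&r);
}
```

The offset of var2 may be larger than sizeof(int) because the compiler can insert padding to align the double; the exact value depends on the platform.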
Union Types
• A type whose variables are allowed to
store values of different types at different times
during execution
• Fortran, C, and C++ provide free unions
– no language support for type checking
• Type checking requires extra element
– Type indicator called a discriminant
– Supported by Ada
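A C union has no discriminant of its own, but one can be added by hand by pairing the union with a tag field. This is a sketch of that convention, with hypothetical names; unlike in Ada, the language does not enforce the check, so the code must test the tag itself.

```c
enum number_kind { KIND_INT, KIND_REAL };

struct number {
    enum number_kind tag;   /* discriminant: says which member is valid */
    union {
        int i;
        double d;
    } as;
};

struct number make_int(int i) {
    struct number n;
    n.tag = KIND_INT;
    n.as.i = i;
    return n;
}

struct number make_real(double d) {
    struct number n;
    n.tag = KIND_REAL;
    n.as.d = d;
    return n;
}

/* Reads only the member that the tag says is currently stored. */
double number_to_double(struct number n) {
    switch (n.tag) {
    case KIND_INT:
        return (double)n.as.i;
    case KIND_REAL:
        return n.as.d;
    }
    return 0.0;   /* not reached when the tag is valid */
}
```

Nothing stops other code from setting tag and then reading the wrong member, which is the sense in which the C union remains a free union.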
Evaluation of Unions
• Potentially unsafe construct
– Do not allow type checking
• Java and C# do not support unions
– Reflective of growing concerns for safety in
programming languages
Type Equivalence
• Consider the problem of two structured types:
– Are two record types compatible if they are
structurally the same but use different field
names?
– Are two array types compatible if they are
the same except that the subscripts are
different?
(e.g. [1..10] and [0..9])
– Are two enumeration types compatible if
their components are spelled differently?
Two approaches
• Name type compatibility : two variables have
compatible types if they are in either the same
declaration or in declarations that use the same type
name
– Easy to implement but highly restrictive:
– Subranges of integer types are not compatible with integer types
– Formal parameters must be the same type as their corresponding
actual parameters (Pascal)
• Structure type compatibility means that two variables
have compatible types if their types have identical
structures
– More flexible, but harder to implement
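C treats struct types by name, which gives a small way to see the difference. The sketch below assumes a C11 compiler, since it relies on _Generic; the struct, typedef, and function names are hypothetical.

```c
struct celsius    { double degrees; };
struct fahrenheit { double degrees; };   /* same structure, different name */
typedef struct celsius temperature;      /* alias: the same type as struct celsius */

/* _Generic selects on the static type of its operand. Both struct types
   can be listed because C considers them distinct despite the identical
   structure. */
#define TYPE_CODE(x) _Generic((x), struct celsius: 1, struct fahrenheit: 2)

int code_celsius(void)    { struct celsius c = {20.0};    return TYPE_CODE(c); }
int code_fahrenheit(void) { struct fahrenheit f = {68.0}; return TYPE_CODE(f); }
int code_alias(void)      { temperature t = {20.0};       return TYPE_CODE(t); }
```

Under structure type compatibility the two structs would be the same type. In C they are not, so assigning a struct fahrenheit to a struct celsius is rejected at compile time, while the typedef alias is accepted.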