MT311 Java Application Development and Programming Languages Li Tak Sing (李德成)


Data types

Primitive data types are useful by themselves, and they are also the building blocks for user-defined data types (e.g. record structures and arrays) in languages that allow them. The following primitive data types are commonly available:

Primitive data type

  

– Numeric types: integer, floating-point and decimal. The size of an integer is usually one word. The size of a floating-point number is usually four bytes.

– Boolean types: usually one byte in size, for efficient access.

– Character types: usually one byte in size, except those for the Unicode character set.

The language C is special in that the differences between these three primitive types are very vague. First of all, classic C (before C99 added _Bool) has no Boolean type, and values of both numeric types and character types can be used wherever a Boolean expression is required.

Secondly, variables of character types and integer types are interchangeable. The only constraint is the size difference between an integer variable and a character variable. This philosophy makes the language very flexible. For example, we can change the value of a character variable from ‘a’ to ‘b’ by adding 1 to it. With other languages, you have to call a function to do that. The disadvantage is that the compiler's type checking is weakened, because an expression that mixes different primitive types is still considered valid. This is another example of the conflict between the writability and the reliability of a language.
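The flexibility described above can be sketched in C as follows. This is a minimal illustration only; the function names next_char and is_set are hypothetical and do not come from the course material.

```c
/* Character and integer types are interchangeable in C:
   adding 1 to a char gives the next character code. */
char next_char(char c)
{
    return (char)(c + 1);   /* 'a' + 1 is 'b' in ASCII */
}

/* Any numeric or character value can stand where a Boolean
   expression is expected: zero is false, non-zero is true. */
int is_set(char flag)
{
    if (flag)               /* no explicit comparison is needed */
        return 1;
    return 0;
}
```

The compiler accepts both functions without complaint, which is exactly the reduced type checking that the text describes.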

Character string types

The key questions to ask when analysing the design of character string types in a programming language are:

– Are character strings a primitive type in the language, or are they constructed as an array of characters?

– Are character strings in the language declared with fixed lengths, or can they have variable lengths?

– What operations are allowed on the character string type?

User-defined ordinal types

The two kinds of user-defined ordinal types are the enumeration type and the subrange type. The main advantage of using these types is the improved readability and reliability of the program. However, the enumeration type provided in C only increases readability, because values of an enumeration type are treated internally as integers. A function that accepts a parameter of an enumeration type will therefore also accept any integer, so using an enumeration type in C does not increase reliability.
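A small C sketch of this weakness is given below. The enumeration colour and both function names are hypothetical, introduced only for illustration.

```c
/* A hypothetical enumeration of colours. */
enum colour { RED, GREEN, BLUE };

/* Returns 1 when c is one of the three declared enumerators. */
int is_valid_colour(enum colour c)
{
    return c == RED || c == GREEN || c == BLUE;
}

/* A C compiler accepts this call even though 42 is not an enumerator. */
int check_out_of_range(void)
{
    return is_valid_colour(42);   /* compiles; yields 0 at run time */
}
```

The names RED, GREEN and BLUE make the code easier to read, but the call with 42 shows that the compiler does not enforce the set of legal values.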

Array types

The key points in the design of array types in a language can be emphasized by asking these questions:

What types are legal as subscripts?

Readability and reliability increase if enumerated types are accepted as subscripts.

Are subscript ranges checked at run time?

Some compilers insert run-time range checks into the generated code to detect an array reference that is out of range. Some compilers, including most C compilers, do not. Such checking increases reliability, but it also increases running cost.
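Because C compilers typically do not generate such checks, a programmer who wants one has to write it by hand. The helper below is a hypothetical sketch (checked_get is not a standard function); it shows that the cost of the check is one extra comparison per access.

```c
#include <stddef.h>

/* Hypothetical helper: reads a[i] only if i lies in [0, n).
   Returns 1 and stores the element in *out on success,
   returns 0 and leaves *out untouched otherwise. */
int checked_get(const int *a, size_t n, size_t i, int *out)
{
    if (i >= n)
        return 0;       /* out of range: refuse the access */
    *out = a[i];
    return 1;
}
```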

When are subscript ranges bound?

Some arrays can have their sizes determined at compile time; others must be determined at run time.

When is storage allocated?

The storage can be bound statically (at compile time) or dynamically (at run time). For a dynamically bound array, the storage can be allocated from the stack or from the heap.
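The three storage choices can be sketched in C as follows. The names global_arr, sum_stack and sum_heap are hypothetical, and the arithmetic inside the functions is only there to give the arrays something to do.

```c
#include <stdlib.h>

/* Static allocation: storage is reserved before the program runs. */
int global_arr[4];

/* Stack allocation: local[] is created on each call and released
   on return. Returns 0*0 + 1*1 + 2*2 + 3*3. */
int sum_stack(void)
{
    int local[4];
    int i, total = 0;
    for (i = 0; i < 4; i++) {
        local[i] = i * i;
        total += local[i];
    }
    return total;
}

/* Heap allocation: the size n is only known at run time.
   Returns 0 + 1 + ... + (n - 1), or -1 if n <= 0 or malloc fails. */
int sum_heap(int n)
{
    int *p;
    int i, total = 0;
    if (n <= 0)
        return -1;
    p = malloc((size_t)n * sizeof *p);
    if (p == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        p[i] = i;
        total += p[i];
    }
    free(p);            /* heap storage must be released explicitly */
    return total;
}
```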

How many subscripts are allowed?

Most modern languages do not put any limit on the number of subscripts.

Can arrays be initialized at storage allocation?

Allowing this increases writability, because in a language without this facility the initialization has to be done with a number of assignment statements.
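C allows initialization at allocation, so the difference in writability can be seen side by side. The names primes_init and fill_primes are hypothetical.

```c
/* Initialized at the point where the storage is allocated. */
const int primes_init[5] = { 2, 3, 5, 7, 11 };

/* Without an initializer, each element needs its own assignment. */
void fill_primes(int p[5])
{
    p[0] = 2;
    p[1] = 3;
    p[2] = 5;
    p[3] = 7;
    p[4] = 11;
}
```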


Is there a way of defining an array type with no subscript bounds?

Consider the case when we need to write a subprogram to sort an array of integers. In Pascal, we would have the following fragment:

type
  arr_type = array [1..10] of integer;
......
procedure sort(var a: arr_type);
begin
  .......

The problem with this code is that sort is only suitable for sorting arrays of type arr_type. It cannot be used to sort an array of integers that has other than ten members. We would need another procedure for sorting an array with 11 members, one for 12 members, and so on. Ada solves this problem by defining an unconstrained array. The same fragment in Ada would be:

type arr_type is array (Integer range <>) of Integer;
......
procedure sort(a: in out arr_type) is
begin
  .......

Now, arr_type is an array type whose subscript range is not specified. If we declare two variables A and B as:

A: arr_type(0..9);
B: arr_type(3..11);

then both A and B are of type arr_type and can therefore be sorted by using sort. Within sort, the lower and upper bounds of the array can be accessed using the standard attributes of arrays in Ada:

A'First is the index of the first element in A.

A'Last is the index of the last element in A.

Since C uses pointers to access arrays, the problem does not apply. However, the function has no way of getting the size of the array from the pointer, so in C we have to pass the size explicitly. The same fragment in C would be:

void sort(int *a, int size) {
  .. .. .. ..
}

We can see that if there is a way of defining an array type without bounds, then the writability would be increased.
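To make the C fragment concrete, the sketch below fills in a body for sort. The source does not say which algorithm is intended; insertion sort is used here purely as an assumption because it is short.

```c
/* Insertion sort over a[0..size-1]. The same function works for
   an array of any length because the size is passed explicitly. */
void sort(int *a, int size)
{
    int i, j, key;
    for (i = 1; i < size; i++) {
        key = a[i];
        /* shift elements larger than key one place to the right */
        for (j = i - 1; j >= 0 && a[j] > key; j--)
            a[j + 1] = a[j];
        a[j + 1] = key;
    }
}
```

A call such as sort(x, 10) and a call such as sort(y, 3) both use this one function, which is what the Pascal version with a fixed arr_type cannot offer.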

Row-major order

In row-major storage, a multidimensional array is laid out in linear memory such that rows are stored one after the other. It is the approach used by the C programming language as well as many other languages, with the notable exception of Fortran. When using row-major order, the address difference between cells in adjacent rows is larger than the address difference between cells in adjacent columns.

For example, consider this 2×3 array:

1 2 3
4 5 6

Declaring this array in C as

int A[2][3];

would find the array laid out in linear memory as:

1 2 3 4 5 6


The difference in offset from one column to the next is 1 and from one row to the next is 3. The linear offset from the beginning of the array to any given element A[row][column] can then be computed as:

offset = row*NUMCOLS + column

where NUMCOLS represents the number of columns in the array (in this case, 3).
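The formula can be compared with the layout that a C compiler actually produces. In the sketch below, measured_offset and formula_offset are hypothetical helper names; the first measures the distance in memory on a real 2×3 array, and the second evaluates row*NUMCOLS + column.

```c
#include <stddef.h>

#define NUMROWS 2
#define NUMCOLS 3

/* Offset, in elements, of A[row][col] from the start of A,
   measured on a real C array through byte addresses. */
ptrdiff_t measured_offset(int row, int col)
{
    static int A[NUMROWS][NUMCOLS] = { { 1, 2, 3 }, { 4, 5, 6 } };
    ptrdiff_t bytes = (char *)&A[row][col] - (char *)A;
    return bytes / (ptrdiff_t)sizeof(int);
}

/* The same offset computed from the row-major formula. */
ptrdiff_t formula_offset(int row, int col)
{
    return (ptrdiff_t)row * NUMCOLS + col;
}
```

For every valid pair of indices the two functions should agree, since C stores arrays in row-major order.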

To generalize the above formula, if we have the following C array:

int A[n1][n2][n3][n4][n5]

then the offset of the element A[m1][m2][m3][m4][m5] is:

offset = m1*n2*n3*n4*n5 + m2*n3*n4*n5 + m3*n4*n5 + m4*n5 + m5
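The general row-major offset can be evaluated for any number of dimensions with a short loop. The function name row_major_offset and its parameter layout are assumptions made for this sketch; the loop uses the nested form (((m1*n2 + m2)*n3 + m3)*n4 + m4)*n5 + m5, which expands to the same sum of products.

```c
#include <stddef.h>

/* Row-major offset, in elements, of the index idx[0..rank-1]
   in an array whose dimensions are dim[0..rank-1]. */
size_t row_major_offset(const size_t *dim, const size_t *idx, size_t rank)
{
    size_t offset = 0;
    size_t k;
    for (k = 0; k < rank; k++)
        offset = offset * dim[k] + idx[k];
    return offset;
}
```

Note that the first dimension n1 never contributes to the result, just as it does not appear in the written formula.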

Column-major order

Column-major order is a similar method of flattening arrays onto linear memory, but the columns are listed in sequence. The programming language Fortran uses column-major ordering.

The array

1 2 3
4 5 6
7 8 9

if stored in memory with column-major order, would look like the following, with the columns listed one after another:

1 4 7 2 5 8 3 6 9

The memory offset can then be computed as:

offset = row + column*NUMROWS

where NUMROWS is the number of rows in the array.

To generalize the above formula, if an array is declared with the following dimensions (written in C notation):

int A[n1][n2][n3][n4][n5]

then the column-major offset of the element A[m1][m2][m3][m4][m5] is:

offset = m1 + m2*n1 + m3*n2*n1 + m4*n3*n2*n1 + m5*n4*n3*n2*n1
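The column-major case can be sketched in the same way. The function name column_major_offset is hypothetical; the loop keeps a running stride, which is the product of the dimensions to the left of the current subscript.

```c
#include <stddef.h>

/* Column-major offset, in elements, of the index idx[0..rank-1]
   in an array whose dimensions are dim[0..rank-1]. */
size_t column_major_offset(const size_t *dim, const size_t *idx, size_t rank)
{
    size_t offset = 0;
    size_t stride = 1;      /* 1, then n1, then n1*n2, ... */
    size_t k;
    for (k = 0; k < rank; k++) {
        offset += idx[k] * stride;
        stride *= dim[k];
    }
    return offset;
}
```

Here the last dimension never contributes to the result, mirroring the way the first dimension drops out in row-major order.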

Example

Consider the following array:

int A[3][7][8];

Assume that A[0][0][0] is at address 20000. What is the address of A[2][3][4]

(i) if row-major order is used?

(ii) if column-major order is used?

(i) Assuming an integer has 4 bytes, the address of A[2][3][4] in row-major order is:

20000 + (2*7*8 + 3*8 + 4)*4 = 20000 + 140*4 = 20560

(ii) If column-major order is used, the address is:

20000 + (2 + 3*3 + 4*3*7)*4 = 20000 + 95*4 = 20380