Array and String Addresses
We have seen the notation “p=&a;” used to set the pointer
“p” to the address of another variable. For arrays and strings,
the technique for setting the pointer to the address of the
start of an array or string is a little different.
Consider the following declarations:
char s[100];
int x[10];
char *cptr;
int *p;
To set cptr to point to the first element of s[], we can simply
say:
cptr = s;
This is equivalent to:
cptr = &s[0];
If an array name is used without a subscript, the meaning is
“the address of the first element of the array”.
We can set p to point to the start of the x[] array with:
p = x;
Once again the statement “p=x;” is the same as “p=&x[0];”
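The statement “p=&x;” is not equivalent: &x has the type
“pointer to an array of 10 ints”, not “pointer to int”, so the
compiler will complain about incompatible pointer types.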
A string constant evaluates to the address of the first character of
the string. So the statement:
cptr = "This is a string";
will set cptr to the “address of” the first character of the
string, i.e. cptr would point to the character ‘T’ in memory.
Note that the variable cptr does not contain the string (it can’t:
cptr is only a pointer, typically 4 or 8 bytes long). Rather, the
string is placed somewhere in memory by the compiler and the
address of the allocated location is assigned to cptr.
Dynamically Allocating Space – the malloc() function
The malloc() function is frequently used to request that a
block of memory be allocated somewhere in memory and that
a pointer be returned to where the allocated space is. For
example:
char *s;
int *p;
s = (char *) malloc(100);
p = (int *) malloc(100*sizeof(int));
The argument passed to malloc() is the number of bytes that
we want to have allocated. In the first example we are
requesting that 100 bytes be allocated and that the pointer
variable s be set to point to the space. The function malloc()
is, by default, of type void *, meaning that the function
returns a pointer but the type of what it points to is
undefined.
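By prefixing the call to malloc() with “(char *)” we are type-casting the function’s return value (i.e. converting it to the pointer type we want).
In C a void * converts to any other object pointer type automatically, so
the cast is optional; it is required only if the code is compiled as C++.
Be sure to #include <stdlib.h> so that the compiler knows the return type of malloc().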
The second malloc is being used to allocate space for 100
integer values. Once again, remember that the argument to
malloc() is the number of bytes to be allocated. The
sizeof(int) operator yields the byte size of a single int
variable, so on a machine with 4-byte ints this malloc request
is requesting 400 bytes of storage.
As with any uninitialized variable or storage area, we can
make no assumptions about the initial values in the allocated
space.
Address Arithmetic
We have seen that we can add and subtract values from a
pointer. For example, in the function zstrcpy(), we used the
pointers src and dst to step through the strings by
incrementing each at the end of the loop.
The meaning of doing an arithmetic operation on a pointer is
actually a little more involved (and natural) than this example
would suggest. Arithmetic operations on pointers are always
done in units related to the size (in bytes) of the object the
pointer points to.
For example, consider the following code:
char s[] = "The cat sat on the mat";
int x[] = {1, 2, 3, 4, 5};
double f[] = {3.2, 5.43, 6.99, 4.24, 6.11};
char *sptr = s;
int *xptr = x;
double *fptr = f;
Each element of “s” is 1 byte long, each element of “x” is 4
bytes long (on most current machines), and each element of “f” is
8 bytes long (a double precision floating point number occupies 64 bits).
Let’s assume that the first element of s[] is at location 20,000
in memory, the first element of x[] is at location 22,000 in
memory and the first element of f[] is at location 24,000. With
the above initialization statements, initially sptr=20,000,
xptr=22,000 and fptr=24,000.
Now consider the result of each of the following increment
statements:
sptr++;
xptr++;
fptr++;
In each case, the BYTE length of the data type of what the
pointer is pointing to is added. That is, after these increments
sptr=20,001, xptr=22,004, and fptr=24,008.
This is fortunate, since this means that the increment
positions the pointer to the start of the next element of the
array – which is generally just what we want.
This holds for all arithmetic operations on pointers. For
example the statement:
xptr+=3;
would actually add 3*4 = 12 to the value of xptr, and will
move xptr 3 elements forward in the array.
Using a pointer to process an array of numbers
This program demonstrates
(1) how to process command line parameters and
(2) how to process an array of ints using a pointer.
/* p20.c */
/* This program illustrates the use of pointers in */
/* reading and processing an array of integers */
/* It also shows how to access command line args */
/* An upper bound on the number of ints to be read */
/* must be specified on the command line.. */
/* p20 1000 */
#include <stdio.h>
#include <stdlib.h> /* exit(), atoi(), malloc() */
int main(
int argc, /* number of command line args */
char* argv[]) /* array of pointers to args */
{
int* base; // points to start of the array
int* loc; // points to the current array element
int max; // maximum number of values to read in
int count; // actual number of values read in
int largest; // largest number in the array
int i;
if (argc < 2) /* Make sure at least one arg was given */
{
printf("Usage is p20 upper-bound \n");
exit(1);
}
Processing the command line argument
The argc parameter contains the number of command line
parameters, including the program name itself. Thus a
command such as
./a.out 300
will cause argc to be set to 2. It's very important to ensure
that the user has provided the number of parameters you
need before you attempt to process them!
This program expects that a numeric value representing the
maximum number of values that are present in the standard
input will be present on the command line.
/* The pointer argv[0] points to the name of the program (p20) */
/* argv[1] points to the first command line argument */
/* The atoi() function converts the ascii character */
/* representation of an integer to a binary int value. */
max = 0;
max = atoi(argv[1]);
if (max <= 0)
{
printf("upper-bound must be a positive integer \n");
exit(2);
}
Allocating storage for the array of ints.
Note that the size of the area allocated must be specified in
bytes. The code below hard-codes 4 bytes per int; a more
portable way to do this would be to
malloc(max * sizeof(int));
count = 0;
base = (int *)malloc(4 * max);
loc = base;
Reading the input values
Note that loc and not &loc is passed to scanf(). What would
happen if a programmer “accidentally” passed &loc?
Also note that as each integer is read loc is incremented by 1
and not by 4. The C language automagically takes into
account the size of the element pointed to when doing pointer
arithmetic! If you were to printf() the value of loc using the %p
format code, you would see that the actual value does
increase by 4 each time it is incremented (assuming 4-byte ints).
/* Read in the integers from standard input making sure */
/* not to overrun the size of the array */
while ((scanf("%d", loc) == 1) && (count != max))
{
loc = loc + 1;
count = count + 1;
}
Identifying the largest value in the array
Before starting the search for the largest number the value of
loc is reset to point to the base of the array.
loc = base; // point loc back to the start of the array
largest = *loc; // init largest to the first value in the array
loc = loc + 1;
for (i = 0; i < count; i++)
{
if (*loc > largest)
largest = *loc;
loc = loc + 1;
}
printf("Largest was %d \n", largest);
}
Exercise: the previous program actually has a nasty bug in it.
Use gdb to find and fix it.
Other ways to consume command line parameters
The standard runtime library of functions that is normally
distributed with C compilers provides a variety of ways to
consume command line parameters. In some ways they are
better than atoi() because they are better at indicating that
the user entered incorrect data. Nevertheless, atoi() is
probably the most widely used.
The sscanf() function
The sscanf() function may be used to attempt to convert
ASCII strings in a memory resident buffer to a numeric value.
Since argv[1] is a pointer to a memory resident buffer
containing the string entered as parameter 1 we could replace
the atoi() call in the previous example by:
code = sscanf(argv[1], "%d", &max);
if (code != 1)
{
fprintf(stderr, "Yeow! bad string in parm 1 \n");
}
Since sscanf() returns the number of values it converted, the
variable code will be 1 if it was successful.
The strtol() function
The strtol() function is more powerful still. It will fill in a
pointer to the first illegal character it encounters in the string.
If it was successful in producing a valid value, badchar
will point to the null character ('\0') that terminates the string.
char *badchar = NULL;
long max = 0;
max = strtol(argv[1], &badchar, 10);
if (*badchar != 0)
{
fprintf(stderr, "Yeow! bad character %c in value\n",
*badchar);
}
Representation of multidimensional data
Some data, for example a grayscale image, is most naturally
represented as a two dimensional array of the form:
#define NUM_ROWS 768
#define NUM_COLS 1024
unsigned char pixmap[NUM_ROWS][NUM_COLS];
In this representation each byte represents a single pixel. A
value of 0 represents completely black and a value of 255
represents the brightest possible white. Intermediate values
provide a more or less linear brightness ramp.
To access a specific pixel in such an array one would use:
pixval = pixmap[row][col];
where row and col identify the location of the target pixel
within the image.
As with all arrays in C, legal values of row range from 0 to
NUM_ROWS - 1, and legal values of col range from 0 to NUM_COLS - 1.
A disadvantage of this representation is that the value of
NUM_ROWS and NUM_COLS must generally be established at
compile time.
We would like to be able to read in the dimensions of the
image from the .ppm header and then declare the pixmap
array using the actual dimensions of the image.
There is no good way to do that with a statically allocated array
in C because its dimensions must be compile-time constants.
Possible approaches to the unknown dimension problem
Thus there are two ways to approach the problem:
1. The naive way:
unsigned char pixmap[MAX_ROWS][MAX_COLS];
where MAX_ROWS and MAX_COLS represent the
dimensions of the largest image our program is able to
handle.
This approach
• wastes space
• forces us to read or write the image one row at a time.
• constrains the program to work only with images
having size within the predefined limits
2. The correct way:
Use a single dimensional malloc'd array and handle the
indexing ourselves
Using a single dimension array to represent 2-D data
Suppose the integer variables numrows and numcols
represent the number of rows and columns in the image and
that they have been correctly set using information in the
.ppm header.
Grayscale images
Space for a grayscale image encoded in binary can be
allocated by:
unsigned char* imageloc;
imageloc = (unsigned char *)malloc(numrows * numcols);
To read in the grayscale image from the standard input:
pixcount = fread(imageloc, 1, numrows * numcols, stdin);
if (pixcount != numrows * numcols)
{
fprintf(stderr, "pix count err - wanted %d got %d \n",
numrows * numcols, pixcount);
exit(1);
}
Processing an image one row at a time
In this example we print pixel values of an entire image with
one line of output per row of pixel data:
for (row = 0; row < numrows; row = row + 1)
{
unsigned char* loc;
loc = imageloc + row * numcols; // first pix in row
printf("\n");
for (col = 0; col < numcols; col = col + 1)
{
printf("%03x", *loc);
loc = loc + 1;
}
}
Color images
A color image is often called an rgb image because the red,
green, and blue intensities of each pixel are stored together.
Space for a color image in binary rgb format can be allocated
by:
unsigned char* imageloc;
imageloc = (unsigned char *)malloc(3 * numrows * numcols);
To read in the rgb image:
pixcount = fread(imageloc, 3, numrows * numcols, stdin);
if (pixcount != numrows * numcols)
{
fprintf(stderr, "pix count err - wanted %d got %d \n",
numrows * numcols, pixcount);
exit(1);
}
Accessing a specific element in malloc'd 2-dimensional data
The value of any grayscale pixel at location (row, col) within
the image is accessed in the following way:
*(imageloc + row * numcols + col)
or equivalently
imageloc[row * numcols + col]
For example, if the value of numcols is 10, then there are 10
pixels per image row. To reach the pixel whose (row, column)
address is (3, 5) it is necessary to pass over three complete
rows (row 0, row 1, and row 2) and 5 pixels in row three
(pixels 0, 1, 2, 3, and 4).
Thus, the offset of the pixel at (3, 5) is 3 * 10 + 5 as shown
above.
Floating point images
Even though the pixels of a binary grayscale image require
one byte of storage and those of a floating point image
require four bytes of storage, the value of a floating point
pixel at location (row, col) is also
*(imageloc + row * numcols + col)
Why? Because pointer arithmetic automagically takes into
account the size of the object pointed to and the C compiler
knows that unsigned chars require one byte and floats
require 4.
Accessing the individual pixels of the binary rgb image
Here the process is slightly grubbier because each pixel is
represented by three constituent components (r, g, and b),
where each of r, g, and b is represented by a single
unsigned char. Nevertheless, a bit of reflection yields:
red = *(imageloc + 3 * row * numcols + 3 * col);
green = *(imageloc + 3 * row * numcols + 3 * col + 1);
blue = *(imageloc + 3 * row * numcols + 3 * col + 2);
It is necessary to multiply by 3 because each pixel occupies
three bytes of memory. Since our pointer is a pointer to type
unsigned char which is a single byte, there is no magic
available from the compiler to help us out here.