Transcript gyan.frcrce.ac.in
Programming Fundamentals
Software Life Cycle
●
History
– The "waterfall model", documented in 1970 by Royce was the first publicly documented life cycle model. The model was developed to help cope with the increasing complexity of aerospace products. The waterfall model followed a documentation driven paradigm. The next revolutionary new look at the development lifecycle was the "spiral model", presented by Boehm in 1985. The spiral model is focused on risk management.
Software Life Cycle
●
DEFINITION
– System Development Life Cycle (SDLC) is the overall process of developing information systems through a multistep process from investigation of initial requirements through analysis, design, implementation and maintenance. There are many different models and methodologies, but each generally consists of a series of defined steps or stages.
Software Life Cycle
●
Why use Software Models?
– Once upon a time, software development consisted of a programmer writing code to solve a problem or automate a procedure. Nowadays, systems are so big and complex that teams of architects, analysts, programmers, testers and users must work together to create the millions of lines of custom-written code that drive our enterprises.
Software Life Cycle
●
Why use Software Models?
– To manage this, a number of system development life cycle (SDLC) models have been created: waterfall, fountain, spiral, build and fix, rapid prototyping, incremental, and synchronize and stabilize.
– The oldest of these, and the best known, is the waterfall: a sequence of stages in which the output of each stage becomes the input for the next.
Software Life Cycle
●
Why use Software Models?
– These stages can be characterized and divided up in different ways, including the following: ● Project planning, feasibility study: Establishes a high-level view of the intended project and determines its goals.
● Systems analysis, requirements definition: Refines project goals into defined functions and operation of the intended application. Analyzes end-user information needs.
● Systems design: Describes desired features and operations in detail, including screen layouts, business rules, process diagrams, pseudocode and other documentation.
Software Life Cycle
●
Why use Software Models?
– These stages can be characterized and divided up in different ways, including the following: ● Implementation: The real code is written here.
● Integration and testing: Brings all the pieces together into a special testing environment, then checks for errors, bugs and interoperability.
● Acceptance, installation, deployment: The final stage of initial development, where the software is put into production and runs actual business.
● Maintenance: What happens during the rest of the software's life: changes, correction, additions, moves to a different computing platform and more. This, the least glamorous and perhaps most important step of all, goes on seemingly forever.
Software Life Cycle
●
Why use Software Models?
– But It Doesn't Work!
● The waterfall model is well understood, but it's not as useful as it once was. In a 1991 Information Center Quarterly article, Larry Runge says that SDLC "works very well when we are automating the activities of clerks and accountants. It doesn't work nearly as well, if at all, when building systems for knowledge workers -- people at help desks, experts trying to solve problems, or executives trying to lead their company into the Fortune 100."
Software Life Cycle
●
Why use Software Models?
– But It Doesn't Work!
● Another problem is that the waterfall model assumes that the only role for users is in specifying requirements, and that all requirements can be specified in advance. Unfortunately, requirements grow and change throughout the process and beyond, calling for considerable feedback and iterative consultation. Thus many other SDLC models have been developed.
● The fountain model recognizes that although some activities can't start before others -- such as you need a design before you can start coding -- there's a considerable overlap of activities throughout the development cycle.
Software Life Cycle
●
Why use Software Models?
– But It Doesn't Work!
● The spiral model emphasizes the need to go back and reiterate earlier stages a number of times as the project progresses. It's actually a series of short waterfall cycles, each producing an early prototype representing a part of the entire project. This approach helps demonstrate a proof of concept early in the cycle, and it more accurately reflects the disorderly, even chaotic evolution of technology.
● Build and fix is the crudest of the methods. Write some code, then keep modifying it until the customer is happy. Without planning, this is very open-ended and can be risky.
Software Life Cycle
●
Why use Software Models?
– But It Doesn't Work!
● In the rapid prototyping (sometimes called rapid application development) model, initial emphasis is on creating a prototype that looks and acts like the desired product in order to test its usefulness. The prototype is an essential part of the requirements determination phase, and may be created using tools different from those used for the final product. Once the prototype is approved, it is discarded and the "real" software is written.
● The incremental model divides the product into builds, where sections of the project are created and tested separately. This approach will likely find errors in user requirements quickly, since user feedback is solicited for each stage and because each build is tested soon after it is completed.
Software Life Cycle
●
Why use Software Models?
– Big Time, Real Time ● The synchronize and stabilize method combines the advantages of the spiral model with technology for overseeing and managing source code. ● This method allows many teams to work efficiently in parallel. This approach was defined by David Yoffie of Harvard University and Michael Cusumano of MIT. ● They studied how Microsoft Corp. developed Internet Explorer and Netscape Communications Corp. developed Communicator, finding common threads in the ways the two companies worked. ● For example, both companies did a nightly compilation (called a build) of the entire project, bringing together all the current components.
Software Life Cycle
●
Why use Software Models?
– Big Time, Real Time ● They established release dates and expended considerable effort to stabilize the code before it was released. ● The companies did an alpha release for internal testing; one or more beta releases (usually feature-complete) for wider testing outside the company, and finally a release candidate leading to a gold master, which was released to manufacturing. At some point before each release, specifications would be frozen and the remaining time spent on fixing bugs.
Software Life Cycle
●
Methods
– Life cycle models describe the interrelationships between software development phases. The common life cycle models are: ● spiral model ● waterfall model ● throwaway prototyping model ● evolutionary prototyping model ● incremental/iterative development ● reusable software model ● automated software synthesis
Software Life Cycle
●
Methods
– Because the life cycle steps are described in very general terms, the models are adaptable and their implementation details will vary among different organizations. – The spiral model is the most general. – Most life cycle models can in fact be derived as special instances of the spiral model. – Organizations may mix and match different life cycle models to develop a model more tailored to their products and capabilities.
Software Life Cycle
●
Learn
– A software life cycle model depicts the significant phases or activities of a software project from conception until the product is retired. It specifies the relationships between project phases, including transition criteria, feedback mechanisms, milestones, baselines, reviews, and deliverables. – Typically, a life cycle model addresses the following phases of a software project: requirements phase, design phase, implementation, integration, testing, operations and maintenance. – Much of the motivation behind utilizing a life cycle model is to provide structure to avoid the problems of the "undisciplined hacker".
Software Life Cycle
●
Spiral Model
– The spiral model is the most generic of the models. Most life cycle models can be derived as special cases of the spiral model. – The spiral uses a risk management approach to software development.
Software Life Cycle
● Spiral Model – Some advantages of the spiral model are: ● defers elaboration of low risk software elements ● incorporates prototyping as a risk reduction strategy ● gives an early focus to reusable software ● accommodates life-cycle evolution, growth, and requirement changes ● incorporates software quality objectives into the product ● focuses on early detection of errors and design flaws ● sets completion criteria for each project activity to answer the question: "How much is enough?" ● uses identical approaches for development and maintenance
Software Life Cycle
● Waterfall Model – The least flexible and most obsolete of the life cycle models. – Well suited to projects that have low risk in the areas of user interface and performance requirements, but high risk in budget and schedule predictability and control.
Software Life Cycle
● Throwaway Prototyping Model – Useful in "proof of concept" or situations where requirements and user's needs are unclear or poorly specified. – The approach is to construct a quick and dirty partial implementation of the system during or before the requirements phase.
Software Life Cycle
● Evolutionary Prototyping Model – Used in projects that have low risk in such areas as losing budget, schedule predictability and control, large-system integration problems, or coping with information sclerosis, but high risk in user interface design.
Software Life Cycle
● Incremental/iterative Development – The process for constructing several partial deliverables, each having incrementally more functionality.
Software Life Cycle
● Automated Software Synthesis – This process relies on tools to transform requirements into operational code. Formal requirements are created and maintained using specification tools. – This is an active research area, and practical tools for this approach are yet to be developed.
What Is Programming?
● Simply put, programming is the process of creating a set of instructions for the computer to follow.
● These instructions are written in a language or languages which people can understand, and then "compiled" (translated) into machine language, or interpreted by the computer from reading a script. JavaScript and VBScript are scripting languages used this way; HTML, strictly speaking, is a markup language rather than a programming language. ● The "syntax" of a programming language consists of the various language elements, conventions, and operators that are used to write the instructions.
Definitions of algorithm
● A mathematical relation between an observed quantity and a variable used in a step-by-step mathematical process to calculate a quantity. In the context of remote sensing, algorithms generally specify how to determine higher-level data products from lower-level source data. For example, algorithms prescribe how atmospheric temperature and moisture profiles are determined from a set of radiation observations originally sensed by satellite sounding instruments.
● A procedure or formula for solving a problem.
Definitions of algorithm
● A computer program (or set of programs) which is designed to systematically solve a certain kind of problem. WSR-88D radars (NEXRAD) employ algorithms to analyze radar data and automatically determine storm motion, probability of hail, VIL, accumulated rainfall, and several other parameters ● A finite set of well-defined rules for the solution of a problem in a finite number of steps.
● A set of rules that a search engine uses to rank the listings contained within its index, in response to a particular query.
●
Difference between Algorithm & Program
For small programs there's little value in planning; in fact, you'll probably end up wasting time. ● As far as individual algorithms go, it's usually best to go with whatever seems the easiest to understand, and only do more advanced things once you know that you need more speed or something like that.
●
Difference between Algorithm & Program
The following is a pseudocode implementation of an algorithm. You can describe an algorithm in human language, in pseudocode or in a programming language.
● create an empty jumble word
  while the chosen word has letters in it
      extract a random letter from the chosen word
      add the random letter to the jumble word
●
Difference between Algorithm & Program
That's actually *longer* than just writing the algorithm in Python directly, which is why we say that Python is executable pseudocode :). ● Pseudo-code design has IMO little value in Python and should stop at a higher level if it's used at all. In this example I'd have said this before going to Python code: randomize letters in word
●
Difference between Algorithm & Program
By going into too much detail, you've actually overlooked the fact that you're recreating functionality from the Python standard library: random.shuffle.
● If the program is like:
Difference between Algorithm & Program
#Game List
list = ["GT4", "GTA Sanandreas", "Takken 5"]
print "My Games: "
for i in list:
    print i
print "\nI've got", len(list), "games."
Difference between Algorithm & Program
add = raw_input("\n\nAdd more games: ")
adding = [add]
list += adding
print "Total games are..."
print list
raw_input("exit.")
●
Difference between Algorithm & Program
The pseudocode could be something like this:
    list all games
    show nr. of games
    get the name of a new game
    add the new game
    list all games
    exit
●
Difference between Algorithm & Program
If you go into more detail, you'll end up writing a long version of the Python code, which kind of defeats the purpose of pseudocode.
●
Difference between Algorithm & Program
Useful advice for algorithms – It's more important (at least for larger programs) to think about good design in terms of modules, classes and functions than in terms of individual lines of code. Always stop designing when designing becomes as low-level as just writing the code.
Sequential Algorithm
● Write an algorithm to find the average marks a student has obtained in three subjects:
Step 1: Start
Step 2: Accept num1, num2, num3
Step 3: sum = num1 + num2 + num3
Step 4: avg = sum/3
Step 5: display avg
Step 6: Stop
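● A minimal C sketch of the same steps might look like this (num1, num2, num3 follow the step names above; the scanf/printf calls are just one possible choice for the input and output):
    #include <stdio.h>

    int main(void)
    {
        float num1, num2, num3, sum, avg;

        /* Step 2: Accept num1, num2, num3 */
        scanf("%f %f %f", &num1, &num2, &num3);

        /* Steps 3 and 4: compute the sum and the average */
        sum = num1 + num2 + num3;
        avg = sum / 3;

        /* Step 5: display avg */
        printf("Average = %f\n", avg);
        return 0;
    }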
Good Programming Style
● Consistency – One of the hallmarks of good programming style is consistency--the fewer surprises, the better. – Consistency makes it easier to read the program primarily by reducing distractions. – But it also helps guide the reader's eyes, for example, by being consistent about ordering functions or including files to simplify finding them in the future. – It also makes it easier to deal with other style problems by making it easier for the reader to get used to them.
Good Programming Style
● Clarity – Good style is about making your program clear and understandable as well as easily modifiable. – When in doubt, choose the most easily understood technique for a problem. – Remember that whatever you write, you will likely have to read at some point after you no longer remember exactly what you were thinking. – Do your future self a favor and be clear the first time.
●
Good Programming Style
Whitespace and Formatting – Whitespace can set things off and reduce the strain on the reader's eyes. Because the compiler ignores whitespace, you're free to place things anywhere you want and format it however you want.
– If you choose wisely, this can be a boon.
– Whitespace comes in several forms, including indentation, how you put space around operators, how you lay out function declarations, and where you place arguments to functions. – Of course, this is hardly everything covered by whitespace, but it should give you an idea of the many places whitespace can be used to improve readability.
Good Programming Style
● Whitespace and Formatting – Indentation
  ● If you don't indent your code, you soon will. It's just a natural thing to do because it lets you use your eye to quickly pick out the flow of control or mistakes in that flow of control.
  ● The quick primer on indentation is that you should indent every block of code:
    if ( true )
    {
        // code block
    }
Good Programming Style
● Whitespace and Formatting – Brace Styles
  ● There are a variety of ways to indent code, and many of them are tied to where you place your braces. Some people prefer the style used above.
  ● Some people prefer the style below:
    if ( true ) {
        // code block
    }
Good Programming Style
● Whitespace and Formatting – Brace Styles
  ● Other styles include
    if ( true )
        {
        // code block
        }
  ● or even
    if ( true )
        {
            // code block
        }
Good Programming Style
● Whitespace and Formatting – Brace Styles ● Any brace style you choose is up to you, although I'd recommend using the same brace style as everyone else working on your project. At any rate, there are arguments for each style. A good consideration is to use a brace style that allows you to put as much code on a screen at a time, but consistency with past practices is probably just as important.
Good Programming Style
● Whitespace and Formatting – Indentation Depth ● How far you choose to indent is a matter of personal preference--in any case, it's usually best to choose an indentation size that's small enough to comfortably fit your code on one screen. ● I've found that any indentation width from 2 to 8 is reasonable and readable, although I've found that anything over four spaces for indentation can lead to lines that are too long. ● In general, the best solution to dealing with lines that are too long is to reduce the complexity of the code or at least to pull out some functionality into separate functions. ● Doing so will reduce the number of levels of indentation and can make the code more readable (if it is done correctly).
Good Programming Style
● Whitespace and Formatting – Tabs vs. Spaces ● There is something of an argument over whether to indent using tabs or spaces.
● Note that this is not the same as asking whether you indent with the spacebar or the tab--most people let their editors take care of that problem for them (or choose to have tabs expanded to spaces). ● The real issue of tabs vs. spaces is what happens when someone else opens your code. ● Since you can set tabs to take up any number of columns, the person opening your code might not have the same tabstop width. ● This can play havoc with otherwise well-formatted code. Using spaces for indentation avoids this problem.
Good Programming Style
● Whitespace and Formatting – Tabs vs. Spaces ● Sometimes a decent code formatter, such as might be found in your text editor, can help mitigate this problem by reformatting the code. ● You can also play with your own tab settings to display the code correctly (although this is obviously annoying). ● The best solution, if you do decide to use tabs, is to be very careful with what you use tabs for. ● The real problem, in fact, comes up when tabs are used not just for indentation but also as a quick way of moving four or eight characters to the right.
Good Programming Style
● Whitespace and Formatting – Tabs vs. Spaces
  ● For instance, let's look at the following code:
    if ( a_long_case_number_one_stretching_across_most_of_the_screen &&
         a_long_case_number_two_also_stretching_across_most_of_the_screen )
    {
        // code
    }
Good Programming Style
● Whitespace and Formatting – Tabs vs. Spaces
  ● If the second condition had been formatted using a tab with a four-space indent followed by a space, then when loaded with a tab-width of eight, it would look ugly:
    if ( a_long_case_number_one_stretching_across_most_of_the_screen &&
              a_long_case_number_two_also_stretching_across_most_of_the_screen )
    {
        // code
    }
Good Programming Style
● Whitespace and Formatting – Tabs vs. Spaces
  ● If spaces were used for the formatting, it would open correctly:
    if ( a_long_case_number_one_stretching_across_most_of_the_screen &&
         a_long_case_number_two_also_stretching_across_most_of_the_screen )
    {
        // code
    }
Good Programming Style
● Whitespace and Formatting – Mis-Use of Whitespace ● How much white space you use is, to some extent, a personal choice. There are some issues to be aware of. First, the more your whitespace helps emphasize the logic of your program, the better; you don't want your whitespace to "lie". This might not confuse the you-at-this-moment, but it can confuse the you-of-the-future or just someone else reading your code. What does lying look like?
if ( true )
    ++i;
    ++j;
Good Programming Style
● Whitespace and Formatting – Mis-Use of Whitespace ● The indentation makes it look like both statements get executed when the if statement executes--but that's not what actually happens. ● When you're tracking down a bug or syntax error in hundreds or thousands of lines of code, you may end up doing a visual scan instead of checking every line carefully.
● The easier you make it to scan the code and pick out salient details, the faster you can scan without making mistakes.
●
Programming Style: Writing for Readability
Using Functions – Unlike prose, where repeating the same word or phrase may seem redundant, in programming, it's perfectly fine to use the same construction over and over again. Of course, you may want to turn a repeated chunk of code into a function: this is even more readable because it gives the block of code a descriptive name. (At least you ought to make it descriptive!) – You can also increase readability by using standard functions and data structures (such as the STL). Doing so avoids the confusion of someone who might ask, "why did you create a new function when you had a perfectly good one already available?" The problem is that people may assume that there's a reason for the new function and that it somehow differs from the standard one.
●
Programming Style: Writing for Readability
Using Functions – Moreover, by using standard functions you help your reader understand the names of the arguments to the function. There's much less need to look at the function prototype to see what the arguments mean, or their order, or whether some arguments have default values.
●
Programming Style: Writing for Readability
Use Appropriate Language Features – There are some obvious things to avoid: don't use a loop as though it were an if statement. Choose the right data type for your data: if you never need decimal places in a number, use an integer. If you mean for a value to be unsigned, use an unsigned type. When you want to indicate that a value should never change, use const to make it so. – Try to avoid uncommon constructions unless you have good reason to use them; put another way, don't use a feature just because the feature exists. One rule of thumb is to avoid do-while loops unless you absolutely need one.
●
Programming Style: Writing for Readability
Use Appropriate Language Features – People aren't generally as used to seeing them and, in theory, won't process them as well. I've never run into this problem myself, but think carefully about whether you actually need a do-while loop. Similarly, although the ternary operator is a great way of expressing some ideas, it can also be confusing for programmers who don't use it very often. – A good rule of thumb is to use it only when necessary (for instance, in the initialization list of a constructor) and stick with the more standard if-else construction for everything else. Sure, it'll make your program four lines longer, but it'll make it that much easier for most people to understand.
●
Programming Style: Writing for Readability
Use Appropriate Language Features – There are some less obvious ways of using standard features. When you are looping, choose carefully between while, do-while, and for. – For loops are best when you can fill in each part (initialization, conditional, and increment) with a fairly short expression. While loops are good for watching a sentinel variable whose value can be set in multiple places or whose value depends on some external event such as a network event.
●
Programming Style: Writing for Readability
Use Appropriate Language Features – While loops are also better when the update step isn't really a direct "update" to the control variable--for instance, when reading lines from a text file, it might make more sense to use a while loop than a for loop because the control depends on the result of the function call, not the value of the variable of interest:
    while ( fgets(buf, sizeof(buf), fp) != NULL )
    {
        /* do stuff with buf */
    }
Programming Style: Writing for Readability
● Unpack Complex Expressions – There's no reason to put everything on a single line. If you have a complex calculation with multiple steps and levels of parentheses, it can be extremely helpful to go from a one-line calculation to one that uses temporary variables. This gives you two advantages; first, it makes it easier to follow the expression. Second, you can give a distinct name to each intermediate step, which can help the reader follow what is happening. Often, you'll want to reuse those intermediate calculations anyway. In addition to mathematical calculations, this principle also applies to nested function calls. The fewer events that take place on a single line of code, the easier it is to follow exactly what's happening.
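● For instance, a hypothetical distance calculation becomes easier to follow once the intermediate differences are pulled out and named:
    #include <math.h>

    /* One-line version: sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1)) */
    double distance(double x1, double y1, double x2, double y2)
    {
        double dx = x2 - x1;                   /* difference in x, named */
        double dy = y2 - y1;                   /* difference in y, named */
        double dist = sqrt(dx * dx + dy * dy);
        return dist;
    }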
●
Programming Style: Writing for Readability
Unpack Complex Expressions – Another advantage to unpacking an expression is that you can put more comments in-line to explain what's going on and why.
●
Programming Style: Writing for Readability
Avoid Magic Numbers – Magic numbers are numbers that appear directly in the code without an obvious reason. For instance, what does the number 80 in the following expression mean?
for ( int i = 0; i < 80; ++i )
    printf( "-" );
●
Programming Style: Writing for Readability
Avoid Magic Numbers – It might be the width of the screen, but it might also be the width of a map whose wall is being drawn. You just don't know. The best solution is to use macros, in C, or constants in C++. This gives you the chance to descriptively name your numbers. Doing so also makes it easier to spot the use of a particular number and differentiate between numbers with the same value that mean different things. Moreover, if you decide you need to change a value, you have a single point where you can make the change, rather than having to sift through your code.
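● For instance, the loop shown earlier might be rewritten with a named constant (SCREEN_WIDTH and draw_separator are illustrative names, not from any standard):
    #include <stdio.h>

    #define SCREEN_WIDTH 80   /* width of the output screen in characters */

    void draw_separator(void)
    {
        for ( int i = 0; i < SCREEN_WIDTH; ++i )
            printf( "-" );
        printf( "\n" );
    }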
●
Programming Style, Naming Conventions
Good naming can make a huge difference in program readability. Names like proc1, proc2, and proc3 mean next-to-nothing, but even apparently decent names can be ambiguous. ● For instance, to name a function that takes two ranges and computes the amount of the first that lies within the second, names like computeRangeWithinRange might sound reasonable.
●
Programming Style, Naming Conventions
Unfortunately, this name gives no information about the order of arguments to the function. ● Sometimes this won't be a problem because an intelligent IDE should help you determine the names of the arguments to the function, but it can be confusing when the code is printed, or when you try to quickly read through the code. ● A better name might be computeRangeWithinFirstRange, which at least gives a sense of the order of the arguments to the function.
●
Programming Style, Naming Conventions
General Naming Conventions – It's usually best to choose a consistent set of naming conventions for use throughout your code. Naming conventions usually govern things such as how you capitalize your variables, classes, and functions, whether you include a prefix for pointers, static data, or global data, and how you indicate that something is a private field of a class.
●
Programming Style, Naming Conventions
General Naming Conventions – There are a lot of common naming conventions for classes, functions and objects. Usually these are broken into several broad categories: c-style naming, camelCase, and CamelCase. C-style naming separates words in a name using underscores: this_is_an_identifier. There are two forms of camelCase: one that begins with a lowercase letter and then capitalizes the first letter of every ensuing word, and one that capitalizes the first letter of every single word.
●
Programming Style, Naming Conventions
General Naming Conventions – One popular convention is that leading capital letter CamelCase is used for the names of structs and classes, while normal camelCase is used for the names of functions and variables (although sometimes variables are written in c-style to make the visual separation between functions and variables more clear).
●
Programming Style, Naming Conventions
General Naming Conventions – It can be useful to use prefixes for certain types of data to remind you what they are: for instance, if you have a pointer, prefixing it with "p_" tells you that it's a pointer. If you see an assignment between a variable starting with "p_" and one that doesn't begin with "p_", then you immediately know that something fishy is going on. It can also be useful to use a prefix for global or static variables because each of these has a different behavior than a normal local variable. In the case of global variables, it is especially useful to use a prefix in order to prevent naming collisions with local variables (which can lead to confusion).
●
Programming Style, Naming Conventions
General Naming Conventions – Finally, a common convention is to prefix the private fields and methods of a class with an underscore: e.g., _private_data. This can make it easier to find out where to look in the body of a class for the declaration of a method, and it also helps keep straight what you should and should not do with a variable. For instance, a common rule is to avoid returning non-const references to fields of a class from functions that are more public than the field. For instance, if _age is a private field, then the public getAge function probably shouldn't return a non-const reference since doing so effectively grants write access to the field!
●
Programming Style, Naming Conventions
Hungarian Notation – Hungarian notation has commonly been associated with prefixing variables with information about their type--for instance, whether a variable is an integer or a double. This is usually not a useful thing to do because your IDE will tell you the type of a variable, and it can lead to bizarre and complicated looking names. The original idea behind Hungarian notation, however, was more general and useful: to create more abstract "types" that describe how the variable is used rather than how the variable is represented. This can be useful for keeping pointers and integers from intermixing, but it can also be a powerful technique for helping to separate concepts that are often used together, but that should not be mixed.
●
Programming Style, Naming Conventions
Abbreviations – Abbreviations are dangerous--vowels are useful and can speed up code reading. Resorting to abbreviations can be useful when the name itself is extremely long, because names that are too long can be as hard to read as names that are too short. When possible, be consistent about using particular abbreviations, and restrict yourself to using only a small number of them.
●
Programming Style, Naming Conventions
Abbreviations – Common abbreviations include "itr" for "iterator" or "ptr" for "pointer". Even names like i, j, and k are perfectly fine for loop counter variables (primarily because they are so common). Bad abbreviations include things like cmptRngFrmRng, which at the savings of only a few letters eliminates a great deal of readability. If you don't like typing long names, look into the auto-complete facilities of your text editor. You should rarely need to type out a full identifier. (In fact, you rarely want to do this: typos can be incredibly hard to spot.)
Procedural and Object-Oriented Programming
● The first item of business is to identify the 2 kinds of programming that exist, and how each one fits into the overall picture: – Procedural Programming – Object Oriented Programming
Procedural Programming
● Procedural Programming was the first kind of programming to develop, and involves a relatively simple way of doing things. ● In procedural programming, the computer is given a set of instructions which it executes, and then waits for input, which it reacts to by executing another set of instructions, and so on.
Procedural Programming
● There are 3 basic elements of procedural programming: – Sequence - The order in which instructions are executed is the sequence of the programming. This is far more important than it might seem; having things in the proper sequence is essential.
– Selection - If/else conditional statements and other forms of selection form the second element. This is how a program makes and reacts to choices.
– Iteration - The use of "loops" and other forms of repetitive sets of instructions forms the last of the three basic elements.
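● A minimal C sketch showing all three elements together (the even/odd example is only illustrative):
    #include <stdio.h>

    int main(void)
    {
        int i, n = 5;                    /* Sequence: statements run in order */

        for (i = 1; i <= n; i++) {       /* Iteration: repeat a block of code */
            if (i % 2 == 0)              /* Selection: choose between actions */
                printf("%d is even\n", i);
            else
                printf("%d is odd\n", i);
        }
        return 0;
    }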
Object Oriented Programming
● Object Oriented Programming came along later, with the advent of multi-tasking operating systems such as Windows. ● In point of fact, procedural programming is at the heart of all programming, including object-oriented, but because certain kinds of objects have many characteristics in common (such as windows, for example), it is more convenient to treat them as objects, rather than as sets of instructions.
Object Oriented Programming
● An object has properties, methods, and event handlers.
– Properties - Properties of an object are the characteristics which define how the object behaves. In a web page, for example, the page itself has certain properties, which are defined in the <body> tag, such as the background color, style source page, etc.
Object Oriented Programming
● An object has properties, methods, and event handlers.
– Methods - Methods are actually blocks of instructions that can be executed by an object itself, and each object has its own set of methods. A simple example of this would be the submit() method of a form. When you click on the Submit button of a form, or invoke the submit() method for a form with a Javascript command, the form is submitted using the form's submit() method.
Object Oriented Programming
● An object has properties, methods, and event handlers.
– Event Handlers - An "event" is when something happens (duh!), either something that the user has done, or something the program itself has done, or another program has done. The simplest example I can think of for an event is using the form example above. When you click the "Submit" button, you have generated an event, and the event handler is the set of instructions which is programmed to execute when that event occurs. Because of the nature of multi-tasking systems like Windows, one is never sure where the next event is going to come from, so event handlers are designed to react to events whenever they occur.
Object Oriented Programming
● One last thing that is important to remember regarding object-oriented programming: – It employs procedural programming. In other words, an object-oriented program is going to contain both object-oriented and procedural code. – An event handler, for instance, may simply be a procedure that executes when the event occurs. This procedure may or may not include object oriented code.
Object Oriented Programming
● The program uses an interface for communicating with the user. ● The interface is a method of the computer talking to the user in a way that the user can understand, and the user talking back to the program in a way that the computer can understand. In a web application, the web pages themselves comprise the interface.
Object Oriented Programming
● The pages contain information which the user reads, and contain active elements, such as hyperlinks and forms, that enable the user to input instructions to the computer. ● Think of the interface as something like the old Star Trek universal translator: It helps the user and the computer to communicate, even though they think and communicate quite differently.
Functions :: The Basics
● Why should we make functions in our programs when we can just do it all under main? ● Think for a minute about high-end stereo systems. These stereo systems do not come in an all-in-one package, but rather come in separate components: pre-amplifier, amplifier, equalizer, receiver, cd player, tape deck, and speakers. The same concept applies to programming. ● Your programs become modularized and much more readable if they are broken down into separate components.
Functions :: The Basics
● This type of programming is known as top down programming, because we first analyze what needs to be broken down into components. Functions allow us to create top-down modular programs.
● Each function consists of a name, a return type, and a possible parameter list. This abstract definition of a function is known as its interface.
Functions :: The Basics
● Here are some sample function interfaces:
    char *strdup(char *s)
    int add_two_ints(int x, int y)
    void useless(void)
● The first function header takes in a pointer to a string and outputs a char pointer. The second header takes in two integers and returns an int. The last header doesn't return anything nor take in parameters.
Functions :: The Basics
● A function can return a single value to its caller in a statement using the keyword return. The return value must be the same type as the return type specified in the function's interface.
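● For example, the add_two_ints interface shown above might be implemented and called like this (a minimal sketch):
    #include <stdio.h>

    int add_two_ints(int x, int y)
    {
        return x + y;            /* the returned value matches the int return type */
    }

    int main(void)
    {
        int sum = add_two_ints(2, 3);
        printf("%d\n", sum);     /* prints 5 */
        return 0;
    }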
Functions :: Prototypes
● Function prototypes are abstract function interfaces. These function declarations have no bodies; they just have their interfaces.
● Function prototypes are usually declared at the top of a C source file, or in a separate header file ● For example, if you wanted to grab command line parameters for your program, you would most likely use the function getopt. But since this function is not part of ANSI C, you must declare the function prototype, or you will get implicit declaration warnings when compiling with our flags.
Functions :: Prototypes
● /* This section of our program is for Function Prototypes */
    int getopt(int argc, char * const argv[], const char *optstring);
    extern char *optarg;
    extern int optind, opterr, optopt;
Functions :: Prototypes
● So if we declared this function prototype in our program, we would be telling the compiler explicitly what getopt returns and its parameter list. ● What are those extern variables? Recall that extern creates a reference to variables across files, or in other words, it creates file global scope for those variables in that particular C source file. ● That way we can access these variables that getopt modifies directly. More on getopt in the next section about Input/Output.
Functions :: Functions as Parameters
● This is a little more advanced section on functions, but is very useful. ● Take this for example:
    int applyeqn(int F(int), int max, int min)
    {
        int itmp;
        itmp = F(max) + min;
        itmp = itmp - max;
        return itmp;
    }
Functions :: Functions as Parameters
● What does this function do if we call it with applyeqn(square, y, z);? ● What happens is that the parameter int F(int) is a reference to the function that is passed in. ● Thus inside applyeqn, where there is a call to F, it actually is a call to square! ● This is very useful if we have one set function, but wish to vary the input according to a particular function.
Functions :: Functions as Parameters
● So if we had a different function called cube, we could change what applyeqn computes simply by calling it as applyeqn(cube, y, z);.
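● Putting the pieces together, one possible runnable sketch (square, cube, and the sample values are only illustrative; the body of applyeqn is the reconstructed version shown above):
    #include <stdio.h>

    int square(int x) { return x * x; }
    int cube(int x)   { return x * x * x; }

    int applyeqn(int F(int), int max, int min)
    {
        int itmp = F(max) + min;
        itmp = itmp - max;
        return itmp;
    }

    int main(void)
    {
        printf("%d\n", applyeqn(square, 10, 2));  /* F is square here */
        printf("%d\n", applyeqn(cube, 10, 2));    /* F is cube here */
        return 0;
    }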
Functions :: The Problem
● There are four major ways that parameters are passed into functions. ● The two that we should be concerned with are Pass by Value and Pass by Reference. ● In C, all parameters are passed by value.
Functions :: The Problem
● In simplistic terms, functions in C create copies of the passed in variables. ● These variables remain on the stack for the lifetime of the function and then are discarded, so they do not affect the inputs! ● This is important. Let's repeat it again. Passed in arguments will remain unchanged.
Functions :: The Problem
● Let's use this swapping function as an example:
    void swap(int x, int y)
    {
        int tmp = 0;
        tmp = x;
        x = y;
        y = tmp;
    }
● If you were to simply pass in parameters to this swapping function that swaps two integers, this would fail horribly. You'll just get the same values back.
Functions :: The Problem
● You can circumvent this pass by value limitation in C by simulating pass by reference. ● Pass by reference changes the values that are passed in when the function exits. ● This isn't how C works technically but can be thought of in the same fashion. ● So how do you avoid pass by value side effects? ● By using pointers and, in some cases, using macros.
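● A minimal sketch of swap rewritten with pointers (pointer syntax is covered in detail in the Pointers section):
    #include <stdio.h>

    void swap(int *x, int *y)
    {
        int tmp = *x;    /* dereference to read the caller's value */
        *x = *y;
        *y = tmp;        /* writes go back to the caller's variables */
    }

    int main(void)
    {
        int a = 1, b = 2;
        swap(&a, &b);              /* pass the addresses of a and b */
        printf("%d %d\n", a, b);   /* prints 2 1 */
        return 0;
    }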
The C Preprocessor :: Overview
● The C Preprocessor is not part of the compiler, but is a separate step in the compilation process. In simplistic terms, a C Preprocessor is just a text substitution tool.
● All preprocessor lines begin with #.
The C Preprocessor :: Overview
● The unconditional directives are:
    #include - Inserts a particular header from another file
    #define - Defines a preprocessor macro
    #undef - Undefines a preprocessor macro
The C Preprocessor :: Overview
● The conditional directives are:
    #ifdef - If this macro is defined
    #ifndef - If this macro is not defined
    #if - Test if a compile time condition is true
    #else - The alternative for #if
    #elif - #else and #if in one statement
    #endif - End preprocessor conditional
The C Preprocessor :: Overview
● Other directives include:
    # - Stringization, replaces a macro parameter with a string constant
    ## - Token merge, creates a single token from two adjacent ones
The C Preprocessor :: Overview
● Some examples of the above:
    #define MAX_ARRAY_LENGTH 20
Tells the CPP to replace instances of MAX_ARRAY_LENGTH with 20. Use #define for constants to increase readability. Notice the absence of the ;.
The C Preprocessor :: Overview
● Some examples of the above:
    #include <stdio.h>
Tells the CPP to insert the contents of the stdio.h header into this file before compilation.
The C Preprocessor :: Overview
● Some examples of the above:
    #undef MEANING_OF_LIFE
    #define MEANING_OF_LIFE 42
Tells the CPP to undefine MEANING_OF_LIFE and then define it as 42.
The C Preprocessor :: Overview
● Some examples of the above:
    #ifndef IROCK
    #define IROCK "You wish!"
    #endif
Tells the CPP to define IROCK only if IROCK isn't defined already.
The C Preprocessor :: Overview
● Some examples of the above:
    #ifdef DEBUG
    /* Your debugging statements here */
    #endif
Tells the CPP to do the following statements if DEBUG is defined. This is useful if you pass the -DDEBUG flag to gcc. This will define DEBUG, so you can turn debugging on and off on the fly!
The C Preprocessor :: Parameterized Macros
● One of the powerful functions of the CPP is the ability to simulate functions using parameterized macros. ● For example, we might have some code to square a number:
    int square(int x)
    {
        return x * x;
    }
●
The C Preprocessor :: Parameterized Macros
We can instead rewrite this using a macro:
    #define square(x) ((x) * (x))
● First, notice that in square(x) the left parenthesis must "cuddle" with the macro identifier (no space between them). ● The next thing that should catch your eye are the parentheses surrounding the x's. ● These are necessary... what if we used this macro as square(1 + 1)?
The C Preprocessor :: Parameterized Macros
● Imagine if the macro didn't have those parentheses? ● It would become ( 1 + 1 * 1 + 1 ). Instead of our desired result of 4, we would get 3. ● Thus the added parentheses will make the expression ( (1 + 1) * (1 + 1) ). ● This is a fundamental difference between macros and functions. You don't have to worry about this with functions, but you must consider this when using macros.
The C Preprocessor :: Parameterized Macros
● Remember that pass by value vs. pass by reference issue earlier? I said that you could get around this by using a macro. Here is swap in action when using a macro:
    #define swap(x, y) { int tmp = x; x = y; y = tmp; }
● Now we have swapping code that works. Why does this work? It's because the CPP just simply replaces text. Wherever swap is called, the CPP will replace the macro call with the defined text.
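● A small usage sketch of the macro version (the values are only illustrative):
    #include <stdio.h>

    #define swap(x, y) { int tmp = x; x = y; y = tmp; }

    int main(void)
    {
        int a = 1, b = 2;
        swap(a, b);                /* the CPP pastes the block of code in here */
        printf("%d %d\n", a, b);   /* prints 2 1 */
        return 0;
    }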
Functions and C Preprocessor Review
● Functions allow for modular programming. You must remember that all parameters passed into function in C are passed by value!
● The C Preprocessor allows for macro definitions and other pre-compilation directives. It is just a text substitution tool before the actual compilation begins.
Input/Output and File I/O
● With most of the basics of C under our belts, let's focus now on grabbing Input and directing Output.
I/O :: printf(3)
● printf(3) is one of the most frequently used functions in C for output.
● The prototype for printf(3) is: int printf(const char *format, ...); ● printf takes in a formatting string and the actual variables to print.
I/O :: printf(3)
● An example of printf is:
    int x = 5;
    char str[] = "abc";
    char c = 'z';
    float pi = 3.14;
    printf("\t%d %s %f %s %c\n", x, str, pi, "WOW", c);
I/O :: printf(3)
● You can format the output through the formatting string. ● By modifying the conversion specification, you can change how the particular variable is placed in output.
I/O :: scanf(3)
● scanf(3) is useful for grabbing things from input. Beware though, scanf isn't the greatest function that C has to offer.
● Some people brush off scanf as a broken function that shouldn't be used often.
● The prototype for scanf is: int scanf( const char *format, ...);
I/O :: scanf(3)
● scanf's major "flaw" is its inability to digest incorrect input. ● If scanf is expecting an int and your standard in keeps giving it a string, scanf will keep trying at the same location. ● If you looped scanf, this would create an infinite loop.
I/O :: scanf(3)
● Take this example code:
    int x, args;

    for ( ; ; ) {
        printf("Enter an integer bub: ");
        if (( args = scanf("%d", &x)) == 0) {
            printf("Error: not an integer\n");
            continue;
        } else {
            if (args == 1)
                printf("Read in %d\n", x);
            else
                break;
        }
    }
I/O :: scanf(3)
● The code above will fail. Why? ● It's because scanf isn't discarding bad input. ● So instead of using just continue;, we have to add a line before it to digest input. ● We can use a function called digestline().
I/O :: scanf(3)
void digestline(void)
{
    scanf("%*[^\n]"); /* Skip to the End of the Line */
    scanf("%*1[\n]"); /* Skip One Newline */
}
● Using assignment suppression, we can use * to suppress anything contained in the set [^\n]. This skips all characters until the newline. The next scanf allows one newline character to be read. Thus we can digest bad input!
File I/O :: fgets(3)
● One of the alternatives to scanf/fscanf is fgets. ● The prototype is: char *fgets(char *s, int size, FILE *stream); ● fgets reads in at most size - 1 characters from the stream and stores them in the buffer pointed to by s. The string is automatically null-terminated.
● fgets stops reading in characters if it reaches an EOF or newline.
File I/O :: sscanf(3)
● To scan a string for a format, the sscanf library call is handy. ● It's prototype: int sscanf(const char *str, const char *format, ...); ● sscanf works much like fscanf except it takes a character pointer instead of a file pointer.
● Using the combination of fgets/sscanf instead of scanf/fscanf, you can avoid the "digestion" problem, as the sketch below shows.
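● A minimal sketch of the fgets/sscanf combination reading an integer from standard input:
    #include <stdio.h>

    int main(void)
    {
        char buf[128];
        int x;

        printf("Enter an integer: ");
        if (fgets(buf, sizeof(buf), stdin) != NULL) {  /* read a whole line */
            if (sscanf(buf, "%d", &x) == 1)            /* then parse it */
                printf("Read in %d\n", x);
            else
                printf("Error: not an integer\n");     /* bad input was already consumed */
        }
        return 0;
    }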
File I/O :: fprintf(3)
● It is sometimes useful also to output to different streams. fprintf(3) allows us to do exactly that.
● The prototype for fprintf is: int fprintf(FILE *stream, const char *format, ...); ● fprintf takes in a special pointer called a file pointer, signified by FILE *. It then accepts a formatting string and arguments. The only difference between fprintf and printf is that fprintf can redirect output to a particular stream. These streams can be stdout, stderr, or a file stream opened with fopen.
File I/O :: fprintf(3)
● An example: fprintf(stderr, "ERROR: Cannot malloc enough memory.\n"); ● This outputs the error message to standard error.
File I/O :: fscanf(3)
● fscanf(3) is basically a streams version of scanf. ● The prototype for fscanf is: int fscanf( FILE *stream, const char *format, ...);
File I/O :: fflush(3)
● Sometimes it is necessary to forcefully flush a buffer to its stream. If a program crashes, sometimes the stream isn't written. You can do this by using the fflush(3) function. The prototype for fflush is: int fflush(FILE *stream); ● Not very difficult to use, specify the stream to fflush.
File I/O :: fopen(3), fclose(3), and File Pointers
● fopen(3) is used to open streams. This is most often used with opening files for input. ● fopen's prototype is: FILE *fopen (const char *path, const char *mode); ● fopen returns a file pointer and takes in the path to the file as well as the mode to open the file with. Take for example:
    FILE *Fp;
    Fp = fopen("/home/johndoe/input.dat", "r");
File I/O :: fopen(3), fclose(3), and File Pointers
● This will open the file in /home/johndoe/input.dat for reading. Now you can use fscanf commands with Fp. ● For example: fscanf(Fp, "%d", &x); ● This would read in an integer from the input.dat file. If we opened input.dat with the mode "w", then we could write to it using fprintf: fprintf(Fp, "%s\n", "File Streams are cool!");
File I/O :: fopen(3), fclose(3), and File Pointers
● To close the stream, you would use fclose(3). ● The prototype for fclose is: int fclose( FILE *stream ); ● You would just give fclose the stream to close the stream. ● Remember to do this for all file streams, especially when writing to files!
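● A minimal sketch tying fopen, fscanf, and fclose together (the file name is only an example):
    #include <stdio.h>

    int main(void)
    {
        int x;
        FILE *Fp = fopen("input.dat", "r");   /* example file name */

        if (Fp == NULL) {                     /* fopen returns NULL on failure */
            fprintf(stderr, "ERROR: cannot open input.dat\n");
            return 1;
        }
        if (fscanf(Fp, "%d", &x) == 1)
            printf("Read %d from the file\n", x);
        fclose(Fp);                           /* always close the stream */
        return 0;
    }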
I/O and File I/O :: Return Values
● We have been looking at I/O functions without any regard to their return values. This is bad. Very, very bad. So to make our lives easier and to make our programs behave well, let's write some macros!
I/O and File I/O :: Return Values
● Let's write some wrapper macros for these functions. First let's create a meta-wrapper function for all of our printf type functions:
    #define ERR_MSG( fn ) { (void)fflush(stderr); \
        (void)fprintf(stderr, __FILE__ ":%d:" #fn ": %s\n", \
        __LINE__, strerror(errno)); }

    #define METAPRINTF( fn, args, exp ) if( fn args exp ) ERR_MSG( fn )
I/O and File I/O :: Return Values
● This will create an ERR_MSG macro to handle error messages. ● The METAPRINTF is the meta-wrapper for our printf type functions.
I/O and File I/O :: Return Values
● So let's define our printf type macros:
    #define PRINTF(args) METAPRINTF(printf, args, < 0)
    #define FPRINTF(args) METAPRINTF(fprintf, args, < 0)
    #define SCANF(args) METAPRINTF(scanf, args, < 0)
    #define FSCANF(args) METAPRINTF(fscanf, args, < 0)
    #define FFLUSH(args) METAPRINTF(fflush, args, < 0)
I/O and File I/O :: Return Values
● Now we have our wrapper functions. Because args is sent to METAPRINTF, we need two sets of parentheses when we use the PRINTF macro. ● Examples on using the wrapper function:
    PRINTF(("This is so cool!"));
    FPRINTF((stderr, "Error bub!"));
Now you can put this code into a common header file and be able to use these convenient macros and still check for return values.
I/O and File I/O :: Return Values
● Note: We did not write macros for fopen and fclose. You must manually check for return values on those functions.
● Other I/O Functions – There are many other Input/Output functions, such as fputs, getchar, putchar, ungetc.
Command Line Arguments and Parameters :: getopt(3)
● We have already seen getopt, but now let's actually write some code that makes this function useful. Let's see the prototype again:
    int getopt(int argc, char * const argv[], const char *optstring);
    extern char *optarg;
    extern int optind, opterr, optopt;
Command Line Arguments and Parameters :: getopt(3)
● In order for us to utilize argc and argv, we must accept them as parameters of our main() function:
    int main(int argc, char **argv)
● Now that we have everything set up, let's get this show on the road.
Command Line Arguments and Parameters :: getopt(3)
int ich;

while ((ich = getopt (argc, argv, "ab:c")) != EOF) {
    switch (ich) {
    case 'a':
        /* Flags/Code when -a is specified */
        break;
    case 'b':
        /* Flags/Code when -b is specified */
        /* The argument passed in with b is specified by optarg */
        break;
    case 'c':
        /* Flags/Code when -c is specified */
        break;
    default:
        /* Code when there are no parameters */
        break;
    }
}
Command Line Arguments and Parameters :: getopt(3)
if (optind < argc) {
    printf ("non-option ARGV-elements: ");
    while (optind < argc)
        printf ("%s ", argv[optind++]);
    printf ("\n");
}
Command Line Arguments and Parameters :: getopt(3)
● This code might be a bit confusing if taken in all at once.
– So if we had a program called "junk" and we called it from the command prompt as ./junk -b gradient yeehaw, the variables would look like:
    Variable                        Contains
    -----------------               ---------
    argc                            4
    argv[0]                         "./junk"
    argv[1]                         "-b"
    argv[2]                         "gradient"
    argv[3]                         "yeehaw"
    optarg at case 'b'              "gradient"
    optind after the getopt loop    3
Input/Output and File I/O Review
● printf and scanf can be used for Input and Output, while the "f versions" of these can be used to modify streams. Make sure you check the return values!
● You can grab command line arguments and parameters through getopt.
Pointers
● Pointers provide an indirect method of accessing variables. The reason why pointers are hard to understand is because they aren't taught in a manner that is understandable. Think for a minute about a typical textbook. It will usually have a table of contents, some chapters, and an index. What if you were looking in the Weiss book for information on printf? You'd look at the index for printf. The index will tell you where information on printf is located within the book. Conceptually, this is how pointers work! Re-read this analogy if you didn't get it the first time. This is important. You must understand that pointers are indirect references to other variables.
Pointers
● Why don't I just make all variables without the use of pointers? It's because sometimes you can't. What if you needed an array of ints, but didn't know the size of the array beforehand? What if you needed a string, but it grew dynamically as the program ran? What if you need variables that are persistent through function use without declaring them global (remember the swap function)? They are all solved through the use of pointers. Pointers are also essential in creating larger custom data structures, such as linked lists.
Pointers
● So now that you understand how pointers work, let's define them a little better.
– A pointer when declared is just a reference. DECLARING A POINTER DOES NOT CREATE ANY SPACE FOR THE POINTER TO POINT TO. We will tackle this dynamic memory allocation issue later.
– A pointer is a reference to an area of memory in the heap. The heap is an area of memory that is dynamically allocated as the program runs.
Pointers :: Declaration and Syntax
● Pointers are declared by using the * in front of the variable identifier.
● For example: int *ip; float *fp = NULL;
Pointers :: Declaration and Syntax
● This declares a pointer, ip, to an integer. Let's say we want ip to point to an integer. The second line declares a pointer to a float, but initializes the pointer to point to the NULL pointer. ● The NULL pointer points to a place in memory that cannot be accessed. NULL is useful when checking for error conditions and many functions return NULL if they fail.
Pointers :: Declaration and Syntax
int x = 5; int *ip; ip = &x; ● We first encountered the & operator in the I/O section. The & operator yields the address-of x. Thus the pointer ip points to x because it was assigned the address of x. This is important. You must understand this concept.
Pointers :: Declaration and Syntax
● This brings up the question, if pointers contain addresses, then how do I get the actual value of what the pointer is pointing to?
● This is solved through the * operator. The * dereferences the pointer to the value. So, printf("%d %d\n", x, *ip); would print 5 5 to the screen.
Pointers :: Declaration and Syntax
● There is a critical difference between a dereference and a pointer declaration: int x = 0, y = 5, *ip = &y; x = *ip; ● The statement int *ip = &y; is different than x = *ip;. The first statement does not dereference, the * signifies to create a pointer to an int. ● The second statement uses a dereference.
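● A minimal sketch (made-up variable names) showing the two uses of * side by side, plus writing through the pointer:
int x = 0, y = 5;
int *ip = &y;    /* here * is part of the declaration: ip is a pointer to int */

x = *ip;         /* here * dereferences ip, so x becomes 5 */
*ip = 7;         /* writing through the pointer changes y to 7 */
printf("%d %d %d\n", x, y, *ip);   /* prints 5 7 7 */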
Pointers :: Pointers and const Type Qualifier
● The const type qualifier can make things a little confusing when it is used with pointer declarations.
const int * const ip;  /* ip itself is const, and what it points at is const */
int * const ip;        /* the pointer ip itself is const */
const int * ip;        /* what ip points at is const */
int * ip;              /* nothing is const */
● As you can see, you must be careful when specifying the const qualifier when using pointers.
Pointers :: void Pointers
● A void pointer can be assigned any pointer value. It is sometimes necessary to store, copy, or move pointers without regard to the type they reference.
● You cannot dereference a void pointer.
● Functions such as malloc, free, and memset work with void pointers.
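● As a small sketch (the variable names are made up), here is what you can and cannot do with a void pointer; note the conversion back to a typed pointer before dereferencing:
int x = 5;
int *ip;
void *vp;

vp = &x;                 /* any object pointer can be stored in a void * */
/* printf("%d\n", *vp);     will not compile: a void * cannot be dereferenced */
ip = vp;                 /* convert back to a typed pointer first */
printf("%d\n", *ip);     /* prints 5 */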
Pointers :: Pointers to Functions
● Earlier, we said that you can pass functions as parameters into functions. This was essentially a reference, or pointer, passed into the function.
● There is also an explicit syntax for declaring and passing function pointers; a sketch follows below.
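● For example, a sketch with made-up names (apply, add, mult): a parameter declared as int (*op)(int, int) is a pointer to a function that takes two ints and returns an int:
#include <stdio.h>

int add(int a, int b)  { return a + b; }
int mult(int a, int b) { return a * b; }

/* op is a pointer to a function taking two ints and returning an int */
int apply(int (*op)(int, int), int x, int y)
{
    return op(x, y);     /* call through the function pointer */
}

int main(void)
{
    printf("%d %d\n", apply(add, 2, 3), apply(mult, 2, 3));   /* prints 5 6 */
    return 0;
}
● Passing add or mult to apply hands over the function itself; apply then invokes it through the pointer.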
Pointers :: Pointer Arithmetic
● C is one of the few languages that allows pointer arithmetic. In other words, you actually move the pointer reference by an arithmetic operation. For example: int x = 5, *ip = &x; ip++; ● After initialization ip points to x, so *ip is 5. But ip++; advances the pointer by sizeof(int), 4 bytes on a typical 32-bit machine. Whatever happens to sit in those next 4 bytes is what *ip now refers to, and accessing it is undefined behavior.
● Pointer arithmetic is very useful when dealing with arrays, because arrays and pointers share a special relationship in C.
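● For example, a small sketch (the array contents are made up) that walks an array with a pointer instead of an index:
int ia[5] = {10, 20, 30, 40, 50};
int *ip;

for (ip = ia; ip < ia + 5; ip++)   /* ia + 5 is one past the last element */
    printf("%d ", *ip);            /* prints 10 20 30 40 50 */
printf("\n");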
Pointers Review
● Pointers are an indirect reference to something else. They are primarily used to reference items that might dynamically change size at run time.
● Pointers have special operators, & and *. The & operator gives the address-of a pointer. The * dereferences the pointer (when not used in a pointer declaration statement).
● You must be careful when using const type qualifier. You have to also be cautious about the void pointer.
● C allows pointer arithmetic, which gives the programmer the freedom to move the pointer using simple arithmetic. This is very powerful, yet can lead to disaster if not used properly.
Arrays
● Arrays are a collection of items (i.e. ints, floats, chars) whose memory is allocated in a contiguous block of memory.
● Arrays and pointers have a special relationship, because in most expressions an array name decays to a pointer to its first element. Therefore, pointer and array references can usually be used interchangeably.
Arrays :: Declaration and Syntax
● A simple array of 5 ints would look like: int ia[5]; ● This would effectively reserve an area in memory (if available) for ia, which is 5 * sizeof(int). We will discuss sizeof() in detail in Dynamic Memory Allocation. Basically sizeof() returns the size of what is passed to it. On a typical 32-bit machine, sizeof(int) returns 4 bytes, so we would get a total of 20 bytes of memory for our array.
Arrays :: Declaration and Syntax
● How do we reference areas of memory within the array? By using the [ ] we can effectively "dereference" those areas of the array to return values.
Printf("%d ", ia[3]); ● This would print the fourth element in the array to the screen. Why the fourth? This is because array elements are numbered from 0.
Arrays :: Declaration and Syntax
● Note: You cannot size an array using a variable. ANSI C does not allow this. For example: int x = 5; int ia[x]; ● The above example is illegal. ANSI C requires the array size to be a compile-time constant. So is this legal?
● int ia[]; ● No. The array size is not known at compile time.
Arrays :: Declaration and Syntax
● How can we get around this? By using macros we can also make our program more readable!
#define MAX_ARRAY_SIZE 5 /* .... code .... */ int ia[MAX_ARRAY_SIZE]; ● Now if we wanted to change the array size, all we'd have to do is change the define statement!
Arrays :: Declaration and Syntax
● But what if we don't know the size of our array at compile time? That's why we have Dynamic Memory Allocation.
● Can we initialize the contents of the array? Yes!
int ia[5] = {0, 1, 3, 4}; int ia[ ] = {0, 2, 1}; ● Both of these work. The first one, ia, is 20 bytes long with the first 16 bytes initialized to 0, 1, 3, 4 (the remaining element is implicitly 0). The second one is also valid: 12 bytes initialized to 0, 2, 1. (Sizes assume a typical 32-bit machine with 4-byte ints.)
Arrays :: Relationship with Pointers
● So what's up with all this "pointers are related to arrays" junk? This is because an array name is just a pointer to the beginning of the allocated memory space. This causes "problems" in C. Let's take this example and analyze it:
int ia[6] = {0, 1, 2, 3, 4, 5};                      /* 1 */
int *ip;                                             /* 2 */
ip = ia;              /* equivalent to ip = &ia[0]; */   /* 3 */
ip[3] = 32;           /* equivalent to ia[3] = 32; */    /* 4 */
ip++;                 /* ip now points to ia[1] */       /* 5 */
printf("%d ", *ip);   /* prints 1 to the screen */       /* 6 */
ip[3] = 52;           /* equivalent to ia[4] = 52 */     /* 7 */
Arrays :: Relationship with Pointers
● Ok, so what's happening here? Let's break this down one line at a time. Refer to the line numbers on the side: 1. Initialize ia 2. Create ip: a pointer to an int 3. Assign ip pointer to ia. This is effectively assigning the pointer to point to the first position of the array.
Arrays :: Relationship with Pointers
4. Assign the fourth position in the array to 32. But how? ip is just a pointer?!?! But what is ia? Just a pointer! (heh) 5. Use pointer arithmetic to move the pointer over to the next element in memory. Pointer arithmetic automatically scales by the size of the pointed-to type (sizeof(int) here).
6. Prints ia[1] to the screen, which is 1 7. Sets ia[4] to 52. Why the fifth position? Because ip points to ia[1] from the ip++ line.
● Now it should be clear. Pointers and arrays have a special relationship because arrays are actually just a pointer to a block of memory!
Arrays :: Multidimensional
● Sometimes its necessary to declare multidimensional arrays. In C, multidimensional arrays are row major. In other words, the first bracket specifies number of rows. Some examples of multidimensional array declarations: int igrid[2][3] = { {0, 1, 2}, {3, 4, 5} }; int igrid[2][3] = { 0, 1, 2, 3, 4, 5 }; int igrid[ ][4] = { {0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9} }; int igrid[ ][2];
Arrays :: Multidimensional
● The first three examples are valid, the last one is not. As you can see from the first two examples, the braces are optional. The third example shows that the number of rows does not have to be specified in an array initialization.
Arrays :: Multidimensional
● But what if we stored pointers in our arrays? ● This would effectively create a multidimensional array! ● Since reinforcement of material is key to learning it, let's go back to getopt. ● Remember the variable argv? It can be declared in the main function as either **argv or *argv[]. What does **argv mean? ● It looks like we have two pointers or something.
Arrays :: Multidimensional
● This is actually a pointer to a pointer. The *argv[] means the same thing, right? Imagine (pardon the crappy graphics skills):
argv
+---+
| 0 | ---> "./junk"
+---+
| 1 | ---> "-b"
+---+
| 2 | ---> "gradient"
+---+
| 3 | ---> "yeehaw"
+---+
Arrays :: Multidimensional
● So what would argv[0][1] be? ● It would be the character '/'. Why is this? It's because strings are just an array of characters. So in effect, we have a pointer to the actual argv array and a pointer at each argv location to each string. A pointer to a pointer.
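● A minimal sketch tying this together; compile it and run it with a few arguments to see both levels of indirection at work:
#include <stdio.h>

int main(int argc, char **argv)
{
    int i;

    for (i = 0; i < argc; i++)   /* argv[i] is a char *, argv[i][0] is a char */
        printf("argv[%d] = \"%s\"  (first character '%c')\n", i, argv[i], argv[i][0]);
    return 0;
}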
Arrays :: Limitations
● Because the name of an array is just a pointer to the beginning of the array, we have some limitations or "problems." 1. No Array Out of Bounds Checking. For example: int ia[2] = {0, 1}; printf("%d ", ia[2]); – The above code reads an area of memory outside the array's allocation; the behavior is undefined, so it may print garbage or crash with a segfault.
– 2. Array Size Must be Constant or Known at Compile time. – 3. Arrays Cannot be Copied or Compared. Why? Because they are pointers. See Weiss pg. 149 for a more in-depth explanation.
Arrays :: Limitations
● Another limitation comes with arrays being passed into functions. Take for example: void func(int ia[]) void func(int *ia) ● Both are the same declaration (you should know why by now). But why would this cause problems? Because only a pointer to the array is passed in, not the whole array. So what if you mistakenly did a sizeof(ia) inside func? Instead of returning the size of the whole array, it returns only the size of the pointer itself; see the sketch below.
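● The usual workaround is to pass the element count along with the array. The sketch below uses made-up names (func, ia, n):
#include <stdio.h>

/* inside func, sizeof(ia) would be the size of a pointer, not of the whole
   array, so the element count n must be passed in separately */
void func(int *ia, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        printf("%d ", ia[i]);
    printf("\n");
}

int main(void)
{
    int ia[5] = {1, 2, 3, 4, 5};

    func(ia, sizeof(ia) / sizeof(ia[0]));   /* sizeof works here: ia is still a real array */
    return 0;
}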
Arrays Review
● Arrays are, in simple terms, just a pointer! Remember that!
● There isn't much to review. Remember arrays have limitations because they are inherently just a pointer
Dynamic Memory Allocation :: sizeof()
● We have already seen this in the array section. To recap, sizeof() yields the size, as a size_t, of the item passed to it. So on a typical 32-bit machine, sizeof(int) returns 4 bytes. size_t is just an unsigned integer type.
● sizeof() is helpful when using malloc or calloc calls. Note that sizeof() does not always return what you may expect.
Dynamic Memory Allocation :: malloc(3), calloc(3), bzero(3), memset(3)
● The prototype for malloc(3) is: void *malloc(size_t size); ● malloc takes in a size_t and returns a void pointer. Why a void pointer? Because malloc does not care what type the memory will be used for.
Dynamic Memory Allocation :: malloc(3), calloc(3), bzero(3), memset(3)
● Let's see an example of how malloc is used: int *ip; ip = malloc(5 * sizeof(int)); /* .. OR .. */ ip = malloc(5 * sizeof(*ip)); ● Pretty simple. sizeof(int) returns the size of an integer on the machine; multiply by 5 and malloc that many bytes. The second form works because sizeof(*ip) is the size of what ip points to, which is an int. (Note that sizeof(ip), without the *, would give only the size of the pointer itself.)
Dynamic Memory Allocation :: malloc(3), calloc(3), bzero(3), memset(3)
● Wait... we're forgetting something. AH! We didn't check the return value. Here's some modified code:
#define INITIAL_ARRAY_SIZE 5
/* ... code ... */
int *ip;

if ((ip = malloc(INITIAL_ARRAY_SIZE * sizeof(int))) == NULL) {
    (void)fprintf(stderr, "ERROR: Malloc failed\n");
    (void)exit(EXIT_FAILURE); /* or return EXIT_FAILURE; */
}
● Now our program properly prints an error message and exits gracefully if malloc fails.
Dynamic Memory Allocation :: malloc(3), calloc(3), bzero(3), memset(3)
● calloc(3) works like malloc, but initializes the allocated memory to zero. The prototype is: void *calloc(size_t nmemb, size_t size);
Dynamic Memory Allocation :: malloc(3), calloc(3), bzero(3), memset(3)
● bzero(3) sets the first n bytes of the memory that s points to, to zero. Prototype: void bzero(void *s, size_t n); ● If you need to fill with some other value (or just as a general alternative to bzero), you can use memset: void *memset(void *s, int c, size_t n); ● where you specify c as the value to store in each of the first n bytes of the memory s points to.
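● As a small sketch (assume this sits inside some function, with stdlib.h and string.h included), here are two equivalent ways to get zeroed memory:
int *ia, *ib;

/* calloc allocates the memory and zeroes it in one step */
if ((ia = calloc(5, sizeof(int))) == NULL) {
    fprintf(stderr, "ERROR: calloc failed\n");
    exit(EXIT_FAILURE);
}

/* malloc followed by memset gives the same five zeroed ints */
if ((ib = malloc(5 * sizeof(int))) == NULL) {
    fprintf(stderr, "ERROR: malloc failed\n");
    exit(EXIT_FAILURE);
}
memset(ib, 0, 5 * sizeof(int));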
Dynamic Memory Allocation :: realloc(3)
● What if we run out of allocated memory during the run-time of our program and need to give our collection of items more memory?
● Enter realloc(3); its prototype: void *realloc(void *ptr, size_t size); ● realloc takes in a pointer to the original area of memory to enlarge and what the total size should be.
Dynamic Memory Allocation :: realloc(3)
● So let's give it a try: ip = realloc(ip, sizeof(ip) + sizeof(int)*5); ● Ah... Now we have some more space, by simply giving it the sizeof the complete array and then adding 5 spaces for ints.
Dynamic Memory Allocation :: realloc(3)
● First, sizeof(ip) does not give the size of the space originally allocated by malloc (or a previous realloc). Applying sizeof() to a pointer returns only the size of the pointer, which is probably not what you intended.
● Also, what happens if the realloc on ip fails? ip gets set to NULL, and the previously allocated memory to ip now has no pointer to it. Now we have allocated memory just floating in the heap without a pointer. This is called a memory leak. This can happen from sloppy realloc's and not using free on malloc'd space.
Dynamic Memory Allocation :: realloc(3)
● So what is the correct way? Take this code for example:
int *tmp;

if ((tmp = realloc(ip, sizeof(int) * (INITIAL_ARRAY_SIZE + 5))) == NULL) {
    /* Possible free on ip? Depends on what you want */
    fprintf(stderr, "ERROR: realloc failed\n");
} else {
    ip = tmp;
}
Dynamic Memory Allocation :: realloc(3)
● Now we are using a temporary pointer to attempt the realloc. If it fails, it isn't a big problem: ip still points at the original memory space. Also, note that we specified the real size of our original array and are adding 5 more ints (so 4 bytes * (5 + 5) = 40 bytes on a typical 32-bit machine).
Dynamic Memory Allocation :: free(3)
● Now that we can malloc, calloc, and realloc we need to be able to free the memory space if we have no use for it anymore. Like we mentioned above, any memory space that loses its pointer or isn't free'd is a memory leak.
● So what's the prototype for free(3)? Here it is: void free(void *ptr); ● Free simply takes in a pointer to free. Not challenging at all. Note that free can take in NULL, as specified by ANSI.
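● A minimal allocate/use/free sketch (inside some function); setting the pointer to NULL afterwards is just a defensive habit, not something free requires:
int *ip;

if ((ip = malloc(10 * sizeof(int))) == NULL) {
    fprintf(stderr, "ERROR: malloc failed\n");
    exit(EXIT_FAILURE);
}
/* ... use ip[0] through ip[9] ... */
free(ip);
ip = NULL;   /* any later accidental use now fails loudly instead of silently */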
Dynamic Memory Allocation :: Multi-dimensional Structures
● It's nice that we can create a "flat" structure, like an array of 100 doubles. But what if we want to create a 2D array of doubles at runtime? This sounds like a difficult task, but it's actually simple!
● As an example, let's say we are reading x, y, z coordinates from a file of unknown length. The incorrect approach is to create an arbitrarily large 2D array and hope it has enough rows. Instead of leaving our data structure to chance, let's just dynamically allocate, and re-allocate on the fly.
Dynamic Memory Allocation :: Multi-dimensional Structures
● First, let's define a few macros to keep our code looking clean: #define oops(s) { perror((s)); exit(EXIT_FAILURE); } #define MALLOC(s,t) if(((s) = malloc(t)) == NULL) { oops("error: malloc() "); } #define INCREMENT 10 ● The MALLOC macro takes in the pointer (s) and the number of bytes to allocate (t). oops is called when malloc fails; it prints the error reported by perror and exits the program. INCREMENT is the default number of rows to allocate when we run out of allocated space.
Dynamic Memory Allocation :: Multi-dimensional Structures
● On to the dynamic memory allocation!
double **xyz;
int i;

MALLOC(xyz, sizeof(double *) * INCREMENT);
for (i = 0; i < INCREMENT; i++) {
    MALLOC(xyz[i], sizeof(double) * 3);
}
Dynamic Memory Allocation :: Multi-dimensional Structures
● What's going on here? Our double pointer, xyz is our actual storage 2D array. We must use a double pointer, because we are pointing to multiple pointers of doubles! If this sounds confusing, think of it this way. Instead of each array entry having a real double entry, each array position contains a pointer to another array of doubles! Therefore, we have our desired 2D array structure.
Dynamic Memory Allocation :: Multi-dimensional Structures
● The first MALLOC call instructs malloc to create 10 double pointers in the xyz array. So each of these 10 array positions now holds an uninitialized pointer of type pointer-to-double. The for loop goes through each array position and creates a new array of three doubles at each position, because we want to read in x, y, z coordinates for each entry. The total space we just allocated is 10 rows of 3 doubles each, so we've just allocated 30 double slots.
Dynamic Memory Allocation :: Multi-dimensional Structures
What if we run out of space? How do we reallocate?
double **tmp;
int current_size, n;

/* ... other code ... */

if (current_size >= n) {
    if ((tmp = realloc(xyz, sizeof(double *) * (n + INCREMENT))) == NULL) {
        oops("realloc() error! ");
    }
    for (i = n; i < n + INCREMENT; i++) {
        MALLOC(tmp[i], sizeof(double) * 3);
    }
    n += INCREMENT;
    xyz = tmp;
}
Dynamic Memory Allocation :: Multi-dimensional Structures
What's going on here? Suppose our file of x, y, z coordinates is longer than 10 lines. On the 11th line, we'll invoke the realloc(). n is the current number of rows allocated. current_size indicates the number of rows we are working on (in our case, the expression would be 10 >= 10). We instruct realloc to reallocate space for xyz of (double *) type, or double pointers of the current size (n) plus the INCREMENT. This will give us 10 additional entries. Remember NEVER reallocate to the same pointer!!
Dynamic Memory Allocation :: Multi-dimensional Structures
If realloc() succeeds, then we need to allocate space for the double array of size 3 to hold the x, y, z coordinates in the new xyz realloc'd array. Note the for loop, where we start and end. Then we cleanup by providing our new max array size allocated (n) and setting the xyz double pointer to the newly realloc'd and malloc'd space, tmp.
Dynamic Memory Allocation :: Multi-dimensional Structures
● Simple, right? What if we're done with our array? We should free it!
for (i = 0; i < n; i++) { free(xyz[i]); } free(xyz); ● The above code free's each entry in the xyz array (the actual double pointers to real data) and then we free the pointer to a pointer reference. The statements cannot be reversed, because you'll lose the pointer reference to each 3-entry double array!
Dynamic Memory Allocation Review
● You have powerful tools you can use when allocating memory dynamically: sizeof, malloc, calloc, realloc, and free.
● Take precautions when using the actual memory allocation functions for memory leaks, especially with realloc. Remember, always check for NULL with malloc! Your programs will thank you for it.
Strings :: Declaration and Syntax
● Let's see some examples of string declarations: char str[5] = {'l', 'i', 'n', 'u', 'x'}; char str[6] = {'l', 'i', 'n', 'u', 'x', '\0'}; char str[3]; char str[ ] = "linux"; char str[5] = "linux"; char str[9] = "linux";
Strings :: Declaration and Syntax
● The first one is a valid declaration, but will cause major problems because it is not null terminated. ● The second example shows a correct null terminated string. The special escape character \0 denotes string termination. ● The fifth example suffers the same problem as the first. The fourth example, however, does not: the compiler determines the length of the string and automatically sets the last character to a null terminator ('\0').
Strings :: Dynamic Memory Allocation
● You must be careful to allocate one additional space to contain the null-terminator.
char *s; if ((s = malloc(sizeof(char) * 5)) == NULL) { /* ERROR Handling code */ } strcpy(s, "linux"); printf("%s\n", s);
Strings :: Dynamic Memory Allocation
● This would result in a bunch of junk being printed to the screen. The strcpy already writes one byte past the five bytes allocated for s, and printf will keep printing past the allocated memory until it happens to hit a null byte, because there is no null terminator inside the buffer. ● The simple solution is to add 1 to the size in the malloc call.
Strings :: Dynamic Memory Allocation
● You must be particularly careful when using malloc or realloc in combination with strlen. strlen returns the size of a string minus the null-terminator.
● What's wrong with the following code: char s1[ ] = "linux"; char *s2; strcpy(s2, s1); ● Remember that simply declaring a pointer does not create any space for the pointer to point to (remember that?).
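● The fix, sketched below, is to allocate the space first, using strlen(s1) + 1 bytes so there is room for the null terminator (this is essentially what strdup does for you):
char s1[] = "linux";
char *s2;

/* strlen(s1) is 5; the + 1 reserves room for the null terminator */
if ((s2 = malloc(strlen(s1) + 1)) == NULL) {
    fprintf(stderr, "ERROR: malloc failed\n");
    exit(EXIT_FAILURE);
}
strcpy(s2, s1);
printf("%s\n", s2);   /* prints linux */
free(s2);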
Strings :: string.h Library
● You can add support for string operations via the string.h library. ● Below is a listing of prototypes for commonly used functions in string.h: size_t strlen(const char *s); char *strdup(const char *s); char *strcpy(char *dest, const char *src); char *strncpy(char *dest, const char *src, size_t n); char *strcat(char *dest, const char *src);
Strings :: string.h Library
● Below is a listing of prototypes for commonly used functions in string.h: char *strncat(char *dest, const char *src, size_t n); int strcmp(const char *s1, const char *s2); int strncmp(const char *s1, const char *s2, size_t n); int atoi(const char *nptr); double atof(const char *nptr); ● (Note: atoi and atof are actually declared in stdlib.h.)
Strings Review
● Strings are just character arrays. Nothing more, nothing less.
● Strings must be null-terminated if you want to properly use them.
● Remember to take into account null terminators when using dynamic memory allocation.
● The string.h library has many useful functions.
● Most of the I/O involved with strings was covered earlier in the Input/Output and File I/O section.
Structures
● A structure in C is a collection of items of different types. You can think of a structure as a "record" in Pascal, or as a class in Java without methods.
● Structures, or structs, are very useful in creating data structures larger and more complex than the ones we have discussed so far. We will take a cursory look at some more complex ones in the next section.
Structures :: Declaration and Syntax
● So how is a structure declared and initialized? Let's look at an example: struct student { char *first; char *last; char SSN[10]; float gpa; char **classes; }; struct student student_a, student_b;
Structures :: Declaration and Syntax
● Another way to declare the same thing is: struct { char *first; char *last; char SSN[10]; float gpa; char **classes; } student_a, student_b;
Structures :: Declaration and Syntax
The "better" method of initializing structs is: struct student_t { char *first; char *last; char SSN[10]; float gpa; char **classes; } student, *pstudent; ● Now we have created a student_t student and a student_t pointer. The pointer allows us greater flexibility (e.g. Create lists of students).
Structures :: Declaration and Syntax
● How do you go about initializing a struct? You could use a brace-enclosed list, just like an array initialization. But be careful: the pointer members (first, last, classes) do not point to any allocated space until you assign or allocate something for them.
● But how do we access fields inside of the structure? C has a special operator for this, the "member of" operator, denoted by . (period). For example, to assign the SSN of student_a: strcpy(student_a.SSN, "111223333");
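● A few more member assignments, continuing the student_a example (the values are made up; assigning a string literal to the char * members is fine, but copying text into them would first need allocated space):
student_a.gpa = 3.75f;              /* plain members are assigned directly */
student_a.first = "Grace";          /* char * members can point at string literals */
student_a.last  = "Hopper";
strcpy(student_a.SSN, "111223333"); /* SSN is a real array, so strcpy into it works */
printf("%s %s %.2f\n", student_a.first, student_a.last, student_a.gpa);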
Structures :: Pointers to Structs
● Sometimes it is useful to use pointers to structures (this will be evident in the next section with self referential structures). Declaring a pointer to a structure is basically the same as declaring a normal pointer: struct student *student_a; ● But how do we dereference the pointer to the struct and its fields? You can do it in one of two ways, the first way is: printf("%s\n", (*student_a).SSN); This would get the SSN in student_a. Messy, and the parentheses are required because the . operator binds more tightly than the unary *.
Structures :: Pointers to Structs
● To dereference, you can use the infix operator: ->. The above example using the new operator: printf("%s\n", student_a->SSN); ● If we malloc'd space for the structure for *student_a could we start assigning things to pointer fields inside the structure? No. You must malloc space for each individual pointer within the structure that is being pointed to.
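● A sketch of that last point, reusing the student struct from the earlier slide (the sizes and values are made up; every pointer member needs its own allocation before you can write through it):
struct student *student_a;

if ((student_a = malloc(sizeof(struct student))) == NULL) {
    fprintf(stderr, "ERROR: malloc failed\n");
    exit(EXIT_FAILURE);
}
if ((student_a->first = malloc(32)) == NULL) {   /* space for the name itself */
    fprintf(stderr, "ERROR: malloc failed\n");
    exit(EXIT_FAILURE);
}
strcpy(student_a->first, "Ada");
student_a->gpa = 4.0f;
printf("%s %.1f\n", student_a->first, student_a->gpa);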
Structures :: typedef
● There is an easier way to define structs, and you can also "alias" types you create. For example: typedef struct { char *first; char *last; char SSN[10]; float gpa; char **classes; } student; student student_a;
Structures :: typedef
● Now we get rid of those silly struct tags. You can use typedef for non-structs: typedef long int *pint32; pint32 x, y, z; ● x, y and z are all pointers to long ints. typedef is your friend. Use it.
Structures :: Unions
● Unions are declared in the same fashion as structs, but have a fundamental difference. Only one item within the union can be used at any time, because all of the items inside the union share the same memory location. Why, you ask? An example first: struct conditions { float temp; union { float wind_chill; float heat_index; } feels_like; } today;
Structures :: Unions
● As you know, wind_chill is only calculated when it is "cold" and heat_index when it is "hot". There is no need for both. So when you specify the temp in today, feels_like only has one value, either a float for wind_chill or a float for heat_index.
● Types inside of unions are unrestricted, you can even use structs within unions.
Structures :: Enumerated Types
● What if you wanted a series of constants without creating a new type? Enter enumerated types. Say you wanted an "array" of months in a year: enum e_months {JAN=1, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC}; typedef enum e_months month; month currentmonth; currentmonth = JUN; /* same as currentmonth = 6; */ printf("%d\n", currentmonth);
Structures :: Enumerated Types
● We are enumerating the months of the year into a type called month. Under the hood, enumerated values are simply integers, which is why the printf statement uses %d, not %s.
● If you notice the first month, JAN=1 tells C to make the enumeration start at 1 instead of 0.
Structures :: Enumerated Types
● Note: This would be almost the same as using: #define JAN 1 #define FEB 2 #define MAR 3 /* ... etc ... */
Structures :: Abilities and Limitations
● You can create arrays of structs.
● Structs can be copied or assigned.
● The & operator may be used with structs to show addresses.
● Structs can be passed into functions. Structs can also be returned from functions.
● Structs cannot be compared!
Structures Review
● Structures can store non-homogeneous data types in a single collection, much like an array does for data of a single type (except that members are accessed by name rather than by index).
● Pointers to structs have a special infix operator: -> for dereferencing the pointer.
● typedef can help you clear your code up and can help save some keystrokes.
● Enumerated types allow you to have a series of constants much like a series of #define statements.
Linked Lists, Trees, Hash Tables (Data Structures)
● The Data Structures presented here all require pointers to structs, or more specifically they are self-referential structures.
● These self-referential structures contain pointers within the structs that refer to another identical structure.
Data Structures :: Linked Lists
● Linked lists are the most basic self-referential structures. Linked lists allow you to have a chain of structs with related data.
● So how would you go about declaring a linked list? It would involve a struct and a pointer: typedef struct llnode { void *data; struct llnode *next; } llnode;
Data Structures :: Linked Lists
● The data member holds whatever the node stores, and the next member points to the following node in the list.
Data Structures :: Linked Lists
● Note that even when the typedef is specified, the next pointer within the struct must still use the struct tag!
● There are two ways to create the root node of the linked list. One method is to create a head pointer and the other way is to create a dummy node. It's usually easier to create a head pointer.
Data Structures :: Linked Lists
Now that we have a node declaration down, how do we add or remove from our linked list? Simple! Create functions to do additions, removals, and traversals.
● Additions: A sample linked list addition function:
void add(llnode **head, void *data_in)
{
    llnode *tmp;

    if ((tmp = malloc(sizeof(llnode))) == NULL) {
        fprintf(stderr, "ERROR: malloc failed\n");
        exit(EXIT_FAILURE);
    }
    tmp->data = data_in;
    tmp->next = *head;
    *head = tmp;
}
Data Structures :: Linked Lists
} tmp->data = data_in; tmp->next = *head; *head = tmp; /* ... inside some function ... */ llnode *head = NULL;
Data Structures :: Linked Lists
● What's happening here? We created a head pointer, and then sent the address-of the head pointer into the add function which is expecting a pointer to a pointer. We send in the address-of head. Inside add, a tmp pointer is allocated on the heap. The data pointer on tmp is moved to point to the data_in. The next pointer is moved to point to the head pointer (*head). Then the head pointer is moved to point to tmp. Thus we have added to the beginning of the list.
Data Structures :: Linked Lists
● Removals: You traverse the list, querying the next struct in the list for the target. If you get a match, set the previous node's next pointer to the target's next pointer, so the target is unlinked from the chain. Don't forget to free the node you are removing (or you'll get a memory leak)! You also need to consider the case where the target is the first node in the list. There are many ways to do this (e.g. recursively). Think about it!
Data Structures :: Linked Lists
● Traversals: Traversing a list is simple: just query the data part of each node for pertinent information as you move from next to next. There are different methods for traversing trees (see Trees).
Data Structures :: Linked Lists
● What about freeing the whole list? You can't just free the head pointer! You have to free the list. A sample function to free a complete list: void freelist(llnode *head) { llnode *tmp; while (head != NULL) { free(head->data); /* Don't forget to free memory within the list! */ tmp = head->next; free(head); head = tmp; } }
Data Structures :: Stacks
● Stacks are a specific kind of linked list. They are referred to as LIFO or Last In First Out.
● Stacks have specific adds and removes called push and pop. Pushing nodes onto stacks is easily done by adding to the front of the list. Popping is simply removing from the front of the list.
● It would be wise to give return values when pushing and popping from stacks. For example, pop can return the struct that was popped.
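● A sketch of push and pop built on the llnode struct from the linked list slides (error handling abbreviated); pop hands back the stored data pointer so the caller can use or free it:
/* push: add a node to the front of the list */
void push(llnode **top, void *data_in)
{
    llnode *tmp;

    if ((tmp = malloc(sizeof(llnode))) == NULL) {
        fprintf(stderr, "ERROR: malloc failed\n");
        exit(EXIT_FAILURE);
    }
    tmp->data = data_in;
    tmp->next = *top;
    *top = tmp;
}

/* pop: remove the front node and hand back its data (NULL if the stack is empty) */
void *pop(llnode **top)
{
    llnode *tmp;
    void *data;

    if (*top == NULL)
        return NULL;
    tmp = *top;
    data = tmp->data;
    *top = tmp->next;
    free(tmp);
    return data;
}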
Data Structures :: Queues
● Queues are FIFO or First In First Out. Think of a typical (non-priority) printer queue: The first jobs submitted are printed before jobs that are submitted after them.
● Queues aren't more difficult to implement than stacks. By creating a tail pointer you can keep track of both the front and the tail ends of the list.
● So you can enqueue onto the tail of the list, and dequeue from the front of the list!
Data Structures :: Hash Tables
● So what's the problem with linked lists? Their efficiency isn't that great: searching a linked list takes O(n) time in Big-O notation. Is there a way to speed up data structures?
● Enter hash tables. Hash tables provide O(1) average-case access while still being able to grow dynamically. The key to a well-performing hash table is understanding the data that will be inserted into it. By custom tailoring an array of pointers, you can have O(1) access.
Data Structures :: Hash Tables
● But you are asking, how do you know where a certain data piece is within the array? This is accomplished through a key. A key is computed from the data; the simplest keys apply a modulus to some piece of information within the data. The general rule is: if the key function is poor, the hash table performs poorly.
● What about collisions (e.g. the same key for two different pieces of information)? There are many ways to resolve this, but the most popular is separate chaining: you create a linked list at the array position to hold multiple data pieces if collisions occur. A simple key function is sketched below.
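● As an illustration, a simple multiply-and-add key function over a string, reduced with a modulus to an index into a fixed-size table (TABLE_SIZE and the constant 31 are arbitrary choices for this sketch):
#define TABLE_SIZE 101

unsigned int hash(const char *key)
{
    unsigned int h = 0;

    while (*key != '\0')
        h = h * 31 + (unsigned char)*key++;   /* mix in each character */
    return h % TABLE_SIZE;                    /* reduce to an array index */
}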
Data Structures :: Trees
● Another variation of a linked list is a tree. A simple binary tree has two kinds of "next" pointers, a left and a right pointer. By splitting your data into two different paths at each node you can discard half of the remaining data with each comparison, while keeping a uniform data structure. But an unbalanced tree can degrade to linked-list efficiency.
● There are different types of trees; some popular ones are self-balancing. AVL trees are a typical example: they move nodes around so that the heights of any node's two subtrees differ by at most one.
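● For reference, a binary tree node is declared just like a linked list node but with two self-referential pointers (a minimal sketch; the key field is made up):
typedef struct treenode {
    int key;                   /* whatever the tree is ordered on */
    void *data;
    struct treenode *left;     /* keys smaller than key go down here */
    struct treenode *right;    /* keys larger than key go down here */
} treenode;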
Data Structures Review
● Linked lists, stacks, queues, hash tables, trees are all different types of data structures that can help accommodate almost any type of data.
● Other data structures exist such as graphs. That is beyond the scope of this tutorial.
● If you want a more in-depth look at the data structures discussed here, refer to Weiss chapter 10, pg. 257-291 and chapter 11 pg. 311-318 for information on binary search trees.
Make and Makefiles Overview
● Make allows a programmer to easily keep track of a project by maintaining current versions of their programs from separate sources. Make can automate various tasks for you, not only compiling proper branch of source code from the project tree, but helping you automate other tasks, such as cleaning directories, organizing output, and even debugging.
Make and Makefiles :: An Introduction
● If you had a program called hello.c, you could simply call make in the directory, and it would call cc (gcc) with -o hello option. But this isn't why make is such a nice tool for program building and management.
● The power and ease of use of make is facilitated through the use of a Makefile.
● Make parses the Makefile for directives and according to what parameters you give make, it will execute those rules.
Make and Makefiles :: An Introduction
● Rules take the following form:
target_name : prerequisites ...
	command ...
● The target is the parameter you give make. For example, make clean would cause make to carry out the target named clean. If there are any prerequisites, make will process those before proceeding. The commands are then executed under the target.
Make and Makefiles :: An Introduction
● NOTE: The commands listed must be TABBED over!
Make and Makefiles :: An Introduction
● Examples? The following Makefile below is a very simple one taken from the GNU make manual: edit : main.o kbd.o command.o display.o insert.o search.o files.o utils.o
cc -o edit main.o kbd.o command.o display.o insert.o search.o \ files.o utils.o
main.o : main.c defs.h
cc -c main.c
kbd.o : kbd.c defs.h command.h
cc -c kbd.c
command.o : command.c defs.h command.h
cc -c command.c
display.o : display.c defs.h buffer.h
Make and Makefiles :: An Introduction
cc -c display.c
insert.o : insert.c defs.h buffer.h
cc -c insert.c
search.o : search.c defs.h buffer.h
cc -c search.c
files.o : files.c defs.h buffer.h command.h
cc -c files.c
utils.o : utils.c defs.h
cc -c utils.c
clean :
	rm edit main.o kbd.o command.o display.o insert.o search.o \
	 files.o utils.o
Make and Makefiles :: An Introduction
● Now if you change just kbd.c, make will only recompile kbd.c into its object file and then relink all of the object files to create edit. Much easier than recompiling the whole project!
● But that's still too much stuff to write! Use make's smarts to deduce commands. The above example re-written (taken from GNU make manual): objects = main.o kbd.o command.o display.o insert.o search.o files.o utils.o
Make and Makefiles :: An Introduction
edit : $(objects)
	cc -o edit $(objects)
main.o : defs.h
kbd.o : defs.h command.h
command.o : defs.h command.h
display.o : defs.h buffer.h
insert.o : defs.h buffer.h
search.o : defs.h buffer.h
files.o : defs.h buffer.h command.h
utils.o : defs.h
.PHONY : clean
clean :
	rm edit $(objects)
Make and Makefiles :: An Introduction
● So what changed? Now we have a variable, objects, containing all of our object files, so the edit target only needs this variable. You may also notice that all of the .c files are missing from the prerequisite lines. This is because make deduces that the C source is a required part of the target and will automatically compile the .c file associated with each object file. What about the .PHONY target? Let's say you actually have a file called "clean". If you had just a clean target without .PHONY, it would never clean. To avoid this, you can use the .PHONY target. This isn't used that often, because it is rare to have a file called "clean" in the target directory... but who knows, you might have one someday.
Make and Makefiles :: Beyond Simple
● You can include other Makefiles by using the include directive.
● You can create conditional syntax in Makefiles, using ifdef, ifeq, ifndef, ifneq.
● You can create variables inside of Makefiles, like the $(objects) above.
Make and Makefiles :: Beyond Simple
● Let's use a different example. The hypothetical source tree:

                    moo.c
                   /     \
              foo.c       bar.c
              /   \       /    \
          baz.c  loop.h  dood.c  shazbot.c
                                     |
                                   mop.c
Make and Makefiles :: Beyond Simple
● Let's create a more complex, yet easier to maintain Makefile for this project:
# Source, Executable, Includes, Library Defines
INCL = loop.h defs.h
SRC  = moo.c foo.c bar.c baz.c dood.c shazbot.c mop.c woot.c
OBJ  = $(SRC:.c=.o)
LIBS = -lgen
Make and Makefiles :: Beyond Simple
EXE = moolicious

# Compiler, Linker Defines
CC      = /usr/bin/gcc
CFLAGS  = -ansi -pedantic -Wall -O2
LIBPATH = -L.
LDFLAGS = -o $(EXE) $(LIBPATH) $(LIBS)
CFDEBUG = -ansi -pedantic -Wall -g -DDEBUG $(LDFLAGS)
RM      = /bin/rm -f
Make and Makefiles :: Beyond Simple
# Compile and Assemble C Source Files into Object Files
%.o: %.c
	$(CC) -c $(CFLAGS) $*.c

# Link all Object Files with external Libraries into Binaries
$(EXE): $(OBJ)
	$(CC) $(LDFLAGS) $(OBJ)
Make and Makefiles :: Beyond Simple
# Objects depend on these include files
$(OBJ): $(INCL)

# Create a gdb/dbx Capable Executable with DEBUG flags turned on
debug:
	$(CC) $(CFDEBUG) $(SRC)
Make and Makefiles :: Beyond Simple
# Clean Up Objects, Executables, Dumps out of source directory
clean:
	$(RM) $(OBJ) $(EXE) core a.out
Make and Makefiles :: Beyond Simple
● Now we have a clean and readable Makefile that can manage the complete source tree. (Remember to use tabs on the command lines!) ● You can manipulate lots and lots of things with Makefiles. I cannot possibly cover everything in-depth in this short tutorial. You can navigate between directories (which can have separate Makefiles/rules), run shell commands, and perform various other tasks with make.
Debugging Techniques
● Why Use A Debugger?
– Many times our code might seem to work correctly, because we didn't test it under enough scenarios. Other times we know there's a bug, but by just reading the code we don't notice it is there. Thus, we should develop a habit of launching a debugger when we get into trouble. It shouldn't come instead of making an effort to write correct code, to add many tests in the code for invalid function arguments, NULL pointers, etc. But when we're in trouble, it's probably our best shot.
Debugging Techniques :: Non interactive
● You can debug your code by placing #ifdef DEBUG and corresponding #endif statements around debug code. For example: #ifdef DEBUG PRINTF(("Variables Currently Contain: %d, %f, %s\n", *pi, *pf[1], str)); #endif ● You can specify a DEBUG define at compile time by issuing gcc with the -DDEBUG command option.
Debugging Techniques :: Non interactive
● Note: This can be even further simplified into a single command called DPRINTF, so you don't even have to write the #ifdef #endif directives!
Debugging Techniques :: GNU gdb
● gdb is a powerful program in tracking down Segmentation Faults and Core Dumps. It can be used for a variety of debugging purposes though.
● The first thing you must do is compile with the -g option and without any optimization (i.e. no -O2 flag).
● Once you do that, you can run gdb
Debugging Techniques :: GNU gdb
● gdb should be loaded with the executable to run on. Now you can create breakpoints where you want the execution to stop. These can be specified with a line number in the corresponding C source file. For example: break 376 would instruct gdb to stop at line 376.
● You can now run the program by issuing the run command. If your program requires command-line options or parameters, you can specify them with the run command. For example: run 4 -s Doc! where 4, -s, and Doc! are the command-line arguments passed to your program.
Debugging Techniques :: GNU gdb
The program should run until the breakpoint or exit on a failure. If it fails before the breakpoint you need to re-examine where you should specify the break. Repeat the breakpoint step and rerun. If your program stops and shows you the breakpoint line, then you can step into the function.
● To step into a function, use the step command. NOTE: Do not step into system library calls (e.g. printf). You can use the next command to step over these types of calls, or over local function calls you don't wish to step into. You can repeat the last command by simply pressing Enter.
Debugging Techniques :: GNU gdb
● You can use the continue command to tell gdb to continue executing until the next breakpoint or it finishes the program.
● If you want to peek at variables, you can issue the print command on the variable. For example: print mystruct->data.
● You can also set variables using the set command. For example: set mystruct->data = 42.
● The ptype command can tell you what type a particular variable is.
Debugging Techniques :: GNU gdb
● The commands instruction tells gdb to run a list of commands each time a particular breakpoint is hit. For example, commands 1 will let you enter any number of commands (one per line, ended with "end"), and gdb will execute and report them to you every time breakpoint 1 is hit.
● The clear command tells gdb to clear a specified breakpoint.
● The list command can tell you where you are at in the particular code block.
Debugging Techniques :: GNU gdb
● You can specify breakpoints not only with lines but with function names.
● For more information on other commands, you can issue the help command inside gdb.
Debugging Techniques :: GNU gdb
● Invoking the "gdb" Debugger – Before invoking the debugger. make sure you compiled your program (all its modules, as well as during linking) with the "-g" flag. Otherwise, life will be tough. – Lets compile the "debug_me.c" program, and then invoke "gdb" to debug it: gcc -g debug_me.c -o debug_me gdb debug_me
Debugging Techniques :: GNU gdb
● Invoking the "gdb" Debugger – Note that we run the program from the same directory it was compiled in, otherwise gdb won't find the source file, and thus won't be able to show us where in the code we are at a given point. It is possible to ask gdb to search for extra source files in some directory after launching it, but for now, it's easier to just invoke it from the correct directory.
Debugging Techniques :: GNU gdb
● Running A Program Inside The Debugger – Once we invoked the debugger, we can run the program using the command "run". If the program requires command line parameters (like our debug_me program does), we can supply them to the "run" command of gdb. – For example: run "hello, world" "goodbye, world" – Note that we used quotation marks to denote that "hello, world" is a single parameter, and not to separate parameters (the debugger assumes white-space separates the parameters).
Debugging Techniques :: GNU gdb
● Setting Breakpoints – The problem with just running the code is that it keeps on running until the program exits, which is usually too late. – For this, breakpoints are introduced. A break point is a command for the debugger to stop the execution of the program before executing a specific source line.
Debugging Techniques :: GNU gdb
● Setting Breakpoints
– We can set break points using two methods: – 1. Specifying a specific line of code to stop in: break debug_me.c:9 – Will insert a break point right before checking the command line arguments in our program – 2. Specifying a function name, to break every time it is being called: break main – this will set a break point right when starting the program
Debugging Techniques :: GNU gdb
● Stepping A Command At A Time – So lets see, we've invoked gdb, then typed: break main run "hello, world" "goodbye, world"
Debugging Techniques :: GNU gdb
● Stepping A Command At A Time – Then the debugger gave something like the following: – Starting program: /usr/home/choo/work/c-on unix/debug_me – warning: Unable to find dynamic linker breakpoint function.
– warning: GDB will be unable to debug shared library initializers – warning: and track explicitly loaded dynamic code.
Breakpoint 1, main (argc=1, argv=0xbffffba4) at debug_me.c:9
Debugging Techniques :: GNU gdb
● Stepping A Command At A Time – 9 if (argc < 2) { /* 2 - 1 for program name (argv[0]) and one for a param. */ – (gdb)
Debugging Techniques :: GNU gdb
● Stepping A Command At A Time – Note that you won't always get the warnings I got - it just goes to show you how lousy my system setup is. In any case, these warnings are not relevant to our code, as we do not intend to debug any shared libraries.
Debugging Techniques :: GNU gdb
● Stepping A Command At A Time – Now we want to start running the program slowly, step by step. There are two options for that: 1. "next" - causes the debugger to execute the current command, and stop again, showing the next command in the code to be executed.
2. "step" - causes the debugger to execute the current command, and if it is a function call break at the beginning of that function. This is useful for debugging nested code.
Debugging Techniques :: GNU gdb
● Stepping A Command At A Time – Now is your time to experiment with these options with our debug program, and see how they work. It is also useful to read the debuggers help, using the command "help break" and "help breakpoints" to learn how to set several breakpoints, how to see what breakpoints are set, how to delete breakpoints, and how to apply conditions to breakpoints (i.e. make them stop the program only if a certain expression evaluates to "true" when the breakpoint is reached).
Debugging Techniques :: GNU gdb
● Printing Variables And Expressions – Without being able to examine variables contents during program execution, the whole idea of using a debugger is quite lost. You can print the contents of a variable with a command like this: print i – And then you'll get a message like: $1 = 0 – which means that "i" contains the number "0".
Debugging Techniques :: GNU gdb
Printing Variables And Expressions – Note that this requires "i" to be in scope, or you'll get a message such as No symbol "i" in current context. – For example, if you break inside the "print_string" function and try to print the value of "i", you'll get this message.
– You may also try to print more complex expressions, like "i*2", or "argv[3]", or "argv[argc]", and so on. In fact, you may also use type casts, call functions found in the program, and whatever your sick mind could imagine (well, almost). Again, this is a good time to try this out.
Debugging Techniques :: GNU gdb
● Examining The Function Call Stack – Once we got into a break-point and examined some variables, we might also wish to see "where we are". That is, what function is being executed now, which function called it, and so on. This can be done using the "where" command. At the gdb command prompt, just type "where", and you'll see something like this: #0 print_string (num=1, string=0xbffffc9a "hello") at debug_me.c:7 #1 0x80484e3 in main (argc=1, argv=0xbffffba4) at debug_me.c:23
Debugging Techniques :: GNU gdb
● Examining The Function Call Stack – This means the currently executing function is "print_string", at file "debug_me.c", line 7. The function that called it is "main". We also see which arguments each function had received. If there were more functions in the call chain, we'd see them listed in order. This list is also called "a stack trace", since it shows us the structure of the execution stack at this point in the program's life.
Debugging Techniques :: GNU gdb
● Examining The Function Call Stack – Just as we can see contents of variables in the current function, we can see contents of variables local to the calling function, or to any other function on the stack. For example, if we want to see the contents of variable "i" in function "main", we can type the following two commands: frame 1 print i
Debugging Techniques :: GNU gdb
● Examining The Function Call Stack – The "frame" command tells the debugger to switch to the given stack frame ('0' is the frame of the currently executing function). At that stage, any print command invoked will use the context of that stack frame. Of-course, if we issue a "step" or "next" command, the program will continue at the top frame, not at the frame we requested to see. After all, the debugger cannot "undo" all the calls and continue from there.
Debugging Techniques :: GNU gdb
● Debugging A Crashed Program – One of the problems with debugging programs has to do with Murphy's law: a program will crash when least expected. In practice this means that once you ship the program as production code, it will crash, and the bugs won't necessarily be easy to reproduce. Luckily, there is some aid for us, in the form of "core files".
Debugging Techniques :: GNU gdb
● Debugging A Crashed Program – A core file contains the memory image of a process, and (assuming the program within the process contains debug info) its stack trace, contents of variables, and so on. A program is normally set to generate a core file containing its memory image when it crashes due to signals such as SEGV or BUS. Provided that the shell invoking the program was not set to limit the size of this core file, we will find this file in the working directory of the process (either the directory from which it was started, or the directory it last switched to using the chdir system call).
Debugging Techniques :: GNU gdb
● Debugging A Crashed Program – Once we get such a core file, we can look at it by issuing the following command: gdb /path/to/program/debug_me core
Debugging Techniques :: GNU gdb
● Debugging A Crashed Program – This assumes the program was launched using this path, and the core file is in the current directory. If it is not, we can give the path to the core file. When we get the debugger's prompt (assuming the core file was successfully read), we can issue commands such as "print", "where" or "frame X". We can not issue commands that imply execution (such as "next", or the invocation of function calls). In some situations, we will be able to see what caused the crash.
Debugging Techniques :: GNU gdb
● Debugging A Crashed Program – One should note that if the program crashed due to invalid memory address access, this will imply that the memory of the program was corrupt, and thus that the core file is corrupt as well, and thus contains bad memory contents, invalid stack frames, etc. Thus, we should see the core file's contents as one possible past, out of many probable pasts (this makes core file analysis rather similar to quantum theory. Almost).
Debugging Techniques :: dbx
● dbx is a multi-threaded program debugger. This program is great for tracking down memory leaks. dbx is not found on linux machines (it can be found on Solaris or other *NIX machines).
● Run dbx with the executable like gdb. Now you can set arguments with runargs.
● After doing that, issue the check -memuse command. This will check for memory use. If you want to also check for access violations, you can use the check -all command.
Debugging Techniques :: dbx
● Run the program using the run command. If you get any access violations or memory leaks, dbx will report them to you.
● Run the help command if you need to understand other commands or similar gdb commands.
Creating Libraries
● If you have a bunch of files that contain just functions, you can turn these source files into libraries that can be used statically or dynamically by programs. This is good for program modularity, and code re-use. Write Once, Use Many.
● A library is basically just an archive of object files.
Creating Libraries :: Static Library Setup
● First thing you must do is create your C source files containing any functions that will be used. Your library can contain multiple object files.
● After creating the C source files, compile the files into object files.
● To create a library: ar rc libmylib.a objfile1.o objfile2.o objfile3.o
● This will create a static library called libmylib.a. Rename the "mylib" portion of the library name to whatever you want.
Creating Libraries :: Static Library Setup
● Next: ranlib libmylib.a
● This creates an index inside the library. That should be it! If you plan on copying the library, remember to use the -p option with cp to preserve permissions.
Creating Libraries :: Static Library Usage
● Remember to prototype your library function calls so that you do not get implicit declaration errors.
● When linking your program to the libraries, make sure you specify where the library can be found: gcc -o foo -L. -lmylib foo.o
● The -L. piece tells gcc to look in the current directory in addition to the other library directories for finding libmylib.a.
● You can easily integrate this into your Makefile.
Creating Libraries :: Shared Library Setup
● Creating shared or dynamic libraries is simple also. Using the previous example, to create a shared library: gcc -fPIC -c objfile1.c
gcc -fPIC -c objfile2.c
gcc -fPIC -c objfile3.c
gcc -shared -o libmylib.so objfile1.o objfile2.o objfile3.o
Creating Libraries :: Shared Library Setup
● The -fPIC option is to tell the compiler to create Position Independent Code (create libraries using relative addresses rather than absolute addresses because these libraries can be loaded multiple times). The -shared option is to specify that an architecture dependent shared library is being created. However, not all platforms support this flag.
Creating Libraries :: Shared Library Setup
● Now we have to compile the actual program using the libraries: gcc -o foo -L. -lmylib foo.o
● Notice it is exactly the same command as when linking against the static library. Although it is linked in the same way, none of the actual library code is inserted into the executable; hence it is a dynamic/shared library.
● Note: You can automate this process using Makefiles!
Creating Libraries :: Shared Library Usage
● Since programs that use static libraries already have the library code compiled into the program, it can run on its own. Shared libraries dynamically access libraries at run time thus the program needs to know where the shared library is stored. What's the advantage of creating executables using Dynamic Libraries? The executable is much smaller than with static libraries. If it is a standard library that can be installed, there is no need to compile it into the executable at compile time!
Creating Libraries :: Shared Library Usage
● The key to making your program work with dynamic libraries is the LD_LIBRARY_PATH environment variable. To display this variable, at a shell: echo $LD_LIBRARY_PATH ● This will display the variable if it is already defined. If it isn't, you can create a wrapper script for your program to set this variable at run-time. Depending on your shell, simply use the setenv (tcsh, csh) or export (bash, sh, etc.) commands. If you already have LD_LIBRARY_PATH defined, make sure you append to it rather than overwrite it.
Creating Libraries :: Shared Library Usage
● For example: setenv LD_LIBRARY_PATH /path/to/library:${LD_LIBRARY_PATH} ● would be the command you would use if you had tcsh/csh and already had an existing LD_LIBRARY_PATH. If you didn't have it already defined, just remove everything right of the :.
Creating Libraries :: Shared Library Usage
● An example with bash shells: export LD_LIBRARY_PATH=/path/to/library:${LD_LIBRARY_PATH} ● Again, remove the stuff right of the : and the : itself if you don't already have an existing LD_LIBRARY_PATH.
Creating Libraries :: Shared Library Usage
● If you have administrative rights to your computer, you can install the particular library to the /usr/local/lib directory and permanently add an LD_LIBRARY_PATH into your .tcshrc, .cshrc, .bashrc, etc. file.
Programming Tips and Tricks
● Quick Commenting Because C does not allow nested comments, sometimes it is annoying to comment blocks of code. You can utilize the C Preprocessor's #if directive to circumvent this: #if 0 /* This code here is the stuff we want commented */ if (a != 0) { b = 0; } #endif
Quick Debugging Statements
● In the C Preprocessor section, we mentioned that you could turn on and off Debugging statements by using a #define. Expanding on that, it is even more convenient if you write a macro (using the PRINTF() macro from the I/O section): #ifdef DEBUG #define DPRINTF(s) PRINTF(s) #else #define DPRINTF(s) #endif ● Now you can have DPRINTF(("Debugging statement")); for debugging statements! This can be turned on and off using the -DDEBUG gcc flag.
Quick man Lookup in vim or emacs
In vim, move your cursor over the standard function library call you want to lookup, or any other word that might be in the man pages. Press K (capital k).
● In emacs, open up your .emacs file and add this line: (global-set-key [(f1)] (lambda () (interactive) (manual-entry (current-word)))) ● Now you can load up emacs, put the cursor on the word in question, and press the F1 key to load up the man page on it. You can replace the F1 key with anything you wish.