
Hands-on (Crash) Introduction to C++
for (Forensic) Scientists
Outline
• C++ : The standard for high performance scientific
computing
• Why bother learning C++?
• Web resources that help when you get stuck
• Getting a computer to do anything useful
• Compilers (Open-source ones at least…)
• The IDE we’ll use: Netbeans
• Common data types
• Structure of a basic C/C++ program
• Compiling and running your first program
• Useful headers, the STL
Why C++?
• We live in oceans of data. Computers are
essential to record and help analyse it.
• Competent scientists speak C/C++, Java,
MATLAB, Python, Perl, R and/or Mathematica
• Codes can easily be made available for review
• If you can speak C++, you can speak anything
• Sometimes you need the speed of C++
• Using C/C++ with R
is now trivial with Rcpp
Consecutive Matching Striae (CMS)-Space
Toolmarks (screwdriver striation profiles) form database
Biasotti-Murdock Dictionary
1,786,980 “line”
comparisons
Took ~1 week on
“beefy” Mac Pro
Approximate-Consecutive Matching Striae (CMS)-Space
Toolmarks (screwdriver striation profiles) form database
Biasotti-Murdock Dictionary
Took ~20 min on same Mac Pro
• Approximate intensive steps and code in C++
Took ~3 min
• Intensive steps in C++
• Parallel “foreach” in R
Web resources
• Google: “C++ how do I <your question here>”
• Often answered on http://stackoverflow.com/ site
• www.learncpp.com/ (Great tutorials)
• http://www.cplusplus.com/ (Great reference)
• https://www.youtube.com/user/voidrealms
(Nice, clear “bite-sized” video tutorial series)
Getting a computer to do anything useful
• All machines understand is on/off!
• High/low voltage
• High/low current
• High/low charge
• 1/0 binary digits (bits)
• To make a computer do anything, you have to
speak machine language to it:
000000 00001 00010 00110 00000 100000
Add 1 and 2. Store the result. (Source: Wikipedia)
Getting a computer to do anything useful
• Machine language is not intuitive and can vary
a great deal over designs
• The basic operations however are
the same, e.g.:
• Move data here
• Combine these values
• Store this data
• Etc.
• “Human readable” language for basic machine
operations: assembly language
Getting a computer to do anything useful
• Assembly is still cumbersome for (most)
humans
Machine encoding: 10110000 01100001
Assembly: MOV AL, 61h
Move the number 97 (61 in hexadecimal) over to “storage area” AL
Getting a computer to do anything useful
• Better yet is a more “Englishy”, “high-level”
language
• Enter: C, C++, Fortran, Java, …
• Higher level languages like these are translated
(“compiled”) to machine language
• Not exactly true for Java, but it’s something
analogous…
• From now on we will just talk in C++
Programming
• All you need to program is a text editor and a
compiler
• There are both commercial and open-source
compilers
• Open-source compilers are really good and have
been around forever. They are the de-facto standard
in science
• GCC: GNU compiler collection (Unix/Linux, OS X)
• MinGW: a port of GCC to Windows
• Clang: From the LLVM Project (OS X, Unix/Linux)
Programming
• Minimalist programming with just:
• a text editor (people like vim or emacs)
• a compiler
• maybe some unix tools like make …
sucks for beginners
• To make life a tad easier, we’ll use an
integrated development environment (IDE)
called Netbeans.
Programming In Netbeans
Compile button
Compile/Run button
Common Data Types
• Computers are DUMB! We have to be explicit
about the type of data we want to work with
• int: signed integer, typically 32 bit (4 byte)
• double: floating point (decimal) number, typically 64 bit (8 byte)
• std::string: a “string” of characters. Pretty high
level representation however. More on this another
day.
• Characters between double quotes (" ") form a string literal;
a single character between single quotes (' ') is a char.
Structure of a Basic C++ program
Comments are set off by: // (1 line each) -or- /*
A block of text
comments
Blah blah blah blah
*/
“Header” file include section
All stand-alone C++ programs must have an int main() function
Your program executes commands from the “main” code block
Common STL headers
• STL: C++ standard template library
• Template: mechanism to handle data of any type
• Subject of templates goes VERY deep. For another
day…
• Common/handy STL headers:
• <iostream>: Basic input/output
• <vector>: Common container for numbers
• <fstream>: Basic file handling
• <cstdlib>: The old C standard library of functions
• <chrono>: Timers for code performance measurement
Outline
• Pointers and References
• What’s the point??
• Arrays
• Basic memory management trivia and tips
• Basic control structures
• Functions
• Code files .cpp
• Header files .hpp
• Conditional blocks
• Looping
Pointers and References
• (An art-world analogy…..) In computing:
• Our “medium” is data.
• Our “pottery wheel” is hardware.
• We want to get the most utility from our hardware.
• Want to do a lot of work on data
• Don’t want the hardware to work too hard on each task
• Copying data from place to place:
• Is time intensive
• Wasteful of precious memory (RAM)
• We always want to minimize copying!
• Pointers and references deal with memory addresses
• REALLY handy for cutting down on copying
Pointers and References
• Pointers/references refer to (point at) where data is
• Some notation gymnastics:
• double x = 3.0; //A double
• &x; //The address of the data in x
• & is the address operator
• double *a_ptr; //A variable that will
//hold a memory addr
• a_ptr called a pointer
• It DOESN’T point at anything yet!
• a_ptr = &x; //a_ptr “points at” the
//data in x. It holds
//x’s memory address
Pointers and References
• a_ptr = &x; //a_ptr “points at” x.
• *a_ptr; //REFERS to the data in x. It
//is the same thing as x, 3.0
//in this case.
• * operator serves two purposes!
• double *a_ptr DECLARES a pointer to a double.
• *a_ptr REFERS to what a_ptr is pointing at (called dereferencing). NOTE there is no type name in front.
• The C++ kosher thing to do is point at the data as soon as the
pointer is declared:
• double *a_ptr = &x;
Pointers and References
• So we learned about * operator and & operator:
• Getting used to using them requires practice!!
• int y = 3;
• int *a_ptr2 = &y;
• cout << "What gets printed?: " << *a_ptr2 << endl;
• y = 9;
• cout << "Now what?: " << *a_ptr2 << endl;
• *a_ptr2 = 14;
• cout << "Now what?: " << *a_ptr2 << endl;
• Why are these BAD?:
• int *a_ptr3 = 18;
• int &z = 9;
Arrays
• Pointers are great because they refer to data in-place.
• They prevent us having to copy data from place-to-place!
• This is very convenient when working with:
• files
• large vectors and matrices (arrays, STL containers, etc.)
• CAREFUL THOUGH:
• The data a pointer is pointing at can be changed unintentionally!
• Memory should be FREED when you are done with it (more later)
• We usually (in memory) store related data together
• arrays
• STL containers
• For arrays (a little more primitive but sometimes offer a speed
advantage) we will allocate/free with the important operators
• new and delete (new[] and delete[] for arrays).
Arrays
Pointer “arithmetic”:
Indexing!
Output from code above
Arrays
• Matrices are allocated and freed in a similar way:
• NOTE: We’ll not typically do this however. Usually we’ll use STL
containers or the wonderful modern templated C++ linear algebra
libraries: Armadillo (http://arma.sourceforge.net/) and
Eigen (http://eigen.tuxfamily.org/)
Functions
• A simple function called from main():
“arguments” to the function would go here
Define the function ABOVE where it is first used
Use the function in main
Functions
• Here is a function with arguments:
Dummy variables. They will be substituted
with actual argument when the function is
actually called
Functions
• You can define a function after it is used, but care must be
taken:
Explicit dummy variable names are not
necessary.
The function “signature” MUST be defined
before it is used
Define the function
Functions
• The C++ kosher way to organize a function. Use separate
header and implementation files:
Need to #include the header file here
func_header_file.hpp (function signature)
main.cpp
func.cpp (function implementation)
Functions
• If the function implementation isn’t too complicated define it
in the header file (cuts down on the C++ “bureaucracy”).
func_header_file.hpp
main.cpp
Conditional Statement Blocks
• If-else. These are equivalent:
Looping
• Repeat (essentially) the same actions over and over: for-loop
• Fill up a matrix
• This is the “matrix”
• The main loop:
• Fills up A
This is a lot of code to
do a simple thing…
Later we’ll use
libraries to cut down
on the work and clean
up the code!
Looping
• Repeat (essentially) the same actions over and over: for-loop
• Dot product between vectors
• Define a dot product function
• The loop:
• Call the function in main()
Note the return type
• Common C++ lingo
Outline
• Objects
• Classes
• Structure of your average class
• Inheritance/Polymorphism
• Operator overloading
• Basic templates
• Great libraries/Tools/Building blocks:
• Armadillo
• Eigen
• Qt
• Boost
• Rcpp, RInside
Objects
• An object: Pretty much any self contained entity in the
program.
• Common examples of objects:
• Variables (We saw these already)
• Functions (We saw these already)
• Classes (These are new!)
• Templates (These are new!)
• Arbitrarily “typed” versions and combinations of the above!
• Object oriented means we want to build a program out of these
very general objects
• We will learn to think about a program in terms of interacting class
objects
Classes
• A class encapsulates (somehow) related data and functions
(methods) to interact with it.
• Let’s say we collect evidence from a crime scene:
• “Evidence” will be a class
• Associated with the evidence class will be:
• A case number
• Location of collection (a string)
• The evidence type
• Number of items collected
• These are the data of the class
• We need to set and access this data with functions for the user
• These are the methods of the class
Classes
Evidence class
Data members
• Case #
• Location
• Type
• # items
Method members
• Get a data member
• Set a data member
We can decide on the level of
exposure these members have
to other parts of the program
• Public
• Private
• Protected
Gets and Sets are
common class methods
Classes
• What does this look like in C++ code???
class keyword declares a class
Class declaration header file
“Override” the default constructor/destructor by explicitly declaring a new one
Constructors will create instances of the class
Optional initialization list for class member variables
Copy constructor to copy an instance of the class. We can override with a custom one. Explicit declaration is optional.
Class methods. Usually public
Destructor to delete an instantiated class instance
Class variables. Usually private or protected
Classes
evidence:: prefix indicates a method of the class
• :: is the scope operator
evidence.cpp
implementation file
Classes
Using the class
#include header for class
Create an instance of the class
Use public members of the class
When the class instance “goes out of scope”
(here when program ends)
the destructor automatically
deallocates resources for it
Derived Classes
• Let’s say we collect evidence from a crime scene:
• “Evidence” will be a class
• XXXX
• XXXX