Structured programming 4
Day 34
LING 681.02
Computational Linguistics
Harry Howard
Tulane University
Course organization
http://www.tulane.edu/~ling/NLP/
16-Nov-2009
LING 681.02, Prof. Howard, Tulane University
Structured programming
NLPP §4
Today's topics
Defensive programming
Debugging
Algorithm design
Defensive programming
Brainstorm with pseudo-code
Careful naming conventions
Bottom-up construction
Functional decomposition
Comment, comment, comment
Regression testing
Brainstorm with pseudocode
Before you write the first line of Python
code, write what your program does as
pseudocode.
That is to say, before writing a program that
Python understands, write it in a way that
people understand.
An example of pseudocode
SPOT, move forward about 10 inches, turn
left 90 degrees, and start moving forward,
then start looking for a black object with
your ultrasonic sensor, because I want you
to stop when you find a black object, then
turn right 90 degrees, and move backward 2
feet, OK?
What is good or bad about this example?
A different phrasing of
the example
SPOT, move forward about 10 inches and stop.
Now turn left 90 degrees.
Start moving forward, and turn on your ultrasonic
sensor.
Stop when you find a black object.
Turn right 90 degrees and stop.
Move backward 2 feet and stop.
What is good or bad about this example?
Pseudo and real code
The main advantage of the second phrasing
is that we can match up the commands in
each line to elements in the programming
language.
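As a rough sketch of that matching, each line of the second phrasing can become one call, with the pseudocode kept as a comment above it. The Robot class and its method names are hypothetical stand-ins that only record commands; they are not a real robotics API.

```python
class Robot:
    """Hypothetical stand-in for a robot API: it only records commands."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def move(self, direction, distance=None):
        self.log.append(("move", direction, distance))

    def turn(self, direction, degrees):
        self.log.append(("turn", direction, degrees))

    def sensor_on(self, sensor):
        self.log.append(("sensor_on", sensor))

    def stop_when(self, condition):
        self.log.append(("stop_when", condition))


def run_spot():
    spot = Robot("SPOT")
    # SPOT, move forward about 10 inches and stop.
    spot.move("forward", distance="10 in")
    # Now turn left 90 degrees.
    spot.turn("left", 90)
    # Start moving forward, and turn on your ultrasonic sensor.
    spot.move("forward")
    spot.sensor_on("ultrasonic")
    # Stop when you find a black object.
    spot.stop_when("black object found")
    # Turn right 90 degrees and stop.
    spot.turn("right", 90)
    # Move backward 2 feet and stop.
    spot.move("backward", distance="2 ft")
    return spot.log
```

The first phrasing is much harder to treat this way, because one long sentence mixes several commands with the reason for them.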
Careful naming conventions
Choose meaningful variable and function
names.
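A small illustration, with both functions invented for this comparison: the two versions do exactly the same thing, but only the second one can be read without tracing the code.

```python
# Hard to read: what are f, x, and d?
def f(x):
    d = {}
    for i in x:
        d[i] = d.get(i, 0) + 1
    return d


# Same logic, with names that say what each thing is.
def count_words(words):
    word_counts = {}
    for word in words:
        word_counts[word] = word_counts.get(word, 0) + 1
    return word_counts
```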
Bottom-up construction
Instead of writing a 20-line program and
then testing it,
build and test smaller units,
and then combine them.
In general, these smaller units should be
functions.
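A minimal sketch of building bottom-up: two small functions are tested on their own before a third combines them. The function names and the crude tokenizer are assumptions made for this example, not code from NLPP.

```python
def tokenize(text):
    """Split text into lowercase tokens on whitespace (deliberately crude)."""
    return text.lower().split()


def strip_punctuation(tokens):
    """Remove leading and trailing punctuation, dropping empty tokens."""
    stripped = [token.strip(".,;:!?\"'") for token in tokens]
    return [token for token in stripped if token]


def vocabulary(text):
    """Combine the two tested units into a sorted list of distinct words."""
    return sorted(set(strip_punctuation(tokenize(text))))


# Test each small unit first, then the combination.
assert tokenize("The cat") == ["the", "cat"]
assert strip_punctuation(["cat.", "!", "dog"]) == ["cat", "dog"]
assert vocabulary("The cat saw the dog.") == ["cat", "dog", "saw", "the"]
```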
NLP pipeline
Fig. 3.1
Commenting
Add comments to every line,
unless what a line does is so obvious that a
comment would get in the way.
Your pseudocode could become the
comments on your real code.
Regression testing
Keep a suite of test cases.
As your program gets bigger, it should still work
on previous test cases.
If it stops working, it has 'regressed'.
A change in code has the (unintended) side effect of
breaking something that used to work.
Python's doctest module supports regression testing.
It runs the interactive-session examples written in a docstring
and checks that the output still matches.
See the doctest documentation.
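A minimal sketch of a doctest-based test suite. The plural function and its rules are an assumption made for illustration; doctest.testmod() is the standard call that runs every docstring example in the current module.

```python
import doctest


def plural(noun):
    """Return a naive English plural of noun.

    >>> plural("cat")
    'cats'
    >>> plural("fox")
    'foxes'
    >>> plural("fly")
    'flies'
    """
    if noun.endswith(("s", "x", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and noun[-2:-1] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"


if __name__ == "__main__":
    # Re-run this after every change: a failure here means the code has regressed.
    print(doctest.testmod())
```

Each time a new case is handled, its example is added to the docstring, so the earlier examples keep guarding the earlier behaviour.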
Debugging topics
Check your assumptions
Exception > stack trace
Interactive debugging
Python's debugger
Prediction
Debugging
"Most code errors result from the programmer
making incorrect assumptions." (NLPP:158)
When you find an error, first check your
assumptions.
Add print statements to show
values of variables and
how far the program progresses.
Reduce input to smallest amount needed to cause
the error.
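A small sketch of these steps, written with Python 3's print() function. The function and the DEBUG messages are invented for illustration: the assert states the assumption explicitly, and the prints show both the values and how far execution got.

```python
def average_word_length(words):
    # Show what actually arrived, rather than what we assume arrived.
    print("DEBUG: received", len(words), "words")
    # State the assumption so that a violation fails loudly and early.
    assert len(words) > 0, "expected at least one word"
    total = sum(len(word) for word in words)
    # Reaching this line shows the program got past the summation.
    print("DEBUG: total characters =", total)
    return total / len(words)


# Smallest input that still exercises the code.
print(average_word_length(["ab", "abcd"]))
```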
Stack trace
A runtime error (Python exception) gives a
stack trace that pinpoints the location of
program execution at the time of the error.
But the error may actually be upstream.
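A sketch of an error that lies upstream of where the exception is raised. All three functions are hypothetical: the stack trace points at total_count, but the bad value was created earlier, in read_counts.

```python
import traceback


def read_counts():
    # The real bug is here: one count is a string, not an int.
    return {"the": 3, "cat": "1"}


def total_count(counts):
    # The exception is raised here, downstream of the bug.
    return sum(counts.values())


def report():
    try:
        return total_count(read_counts())
    except TypeError:
        # Capture the stack trace as text instead of letting the program stop.
        return traceback.format_exc()


trace = report()
print(trace)
```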
Python's debugger
Invoke it (mymodule and test() stand for your own module and function):
import pdb
import mymodule
pdb.run('mymodule.test()')
It lets you
monitor execution of program,
specify line numbers where program should stop
(breakpoints), and
step through the sections of code inspecting values of
variables.
Prediction
Try to predict the effect of a potential bugfix
before re-running the program.
"If the bug isn't fixed, don't fall into the trap of
blindly changing the code in the hope that it will
magically start working again." (NLPP:159)
For each change, try to articulate what is wrong
and how the change will fix the problem.
Undo the change if it doesn't work.
"Programs don't magically work; they magically
don't work." (Robert Goldman)
Algorithm design
NLPP 4.7
Algorithms
Divide and conquer
Start with something that works
Iteration
Recursion
Divide and conquer
Divide a problem of size n into two
problems of size n/2.
Binary search - dictionary example.
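A minimal sketch of binary search over a sorted word list, in the spirit of looking a word up in a dictionary: each comparison discards half of the remaining entries. The function name and the sample words are assumptions for this example.

```python
def binary_search(sorted_words, target):
    """Return the index of target in sorted_words, or -1 if it is absent."""
    low, high = 0, len(sorted_words) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_words[mid] == target:
            return mid
        if sorted_words[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


words = ["apple", "banana", "cherry", "grape", "melon", "peach"]
print(binary_search(words, "grape"))  # prints 3
print(binary_search(words, "kiwi"))   # prints -1
```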
Start with something that works
Transform task into something that already
works.
To find duplicates in a list,
first sort the list,
then check whether any two adjacent items are equal.
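The same idea as a short sketch, relying on Python's built-in sorted(). The function name and the choice to report each duplicate only once are assumptions for this example.

```python
def find_duplicates(items):
    """Return each value that occurs more than once, in sorted order."""
    ordered = sorted(items)  # sorting puts equal items next to each other
    duplicates = []
    for previous, current in zip(ordered, ordered[1:]):
        already_reported = bool(duplicates) and duplicates[-1] == current
        if previous == current and not already_reported:
            duplicates.append(current)
    return duplicates


print(find_duplicates(["the", "cat", "the", "dog", "cat", "the"]))
# prints ['cat', 'the']
```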
Iteration vs. recursion
For some function ƒ…
Iteration
Repeat ƒ some number of times.
Calling ƒ in a for loop.
Recursion
ƒ calls itself some number of times:
NP → the N PP.
PP → P NP.
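A sketch contrasting the two. The function names are hypothetical, and the depth cutoff is an added assumption: the two rules as written never terminate, so the recursion needs some base case to stop.

```python
def repeat_iteratively(f, value, times):
    """Iteration: call f in a for loop, a fixed number of times."""
    for _ in range(times):
        value = f(value)
    return value


def noun_phrase(depth):
    """NP -> the N PP, cut off after depth levels of embedding."""
    if depth == 0:
        return "the N"  # assumed base case, not part of the two rules
    return "the N " + prepositional_phrase(depth)


def prepositional_phrase(depth):
    """PP -> P NP: calls back into noun_phrase, closing the recursive loop."""
    return "P " + noun_phrase(depth - 1)


print(repeat_iteratively(lambda n: n * 2, 1, 3))  # prints 8
print(noun_phrase(2))  # prints: the N P the N P the N
```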
Next time
Start NLPP §6
Learning to classify text