Perl - University of Crete
Perl Tutorial
Why Perl?
Perl is built around regular expressions
REs are good for string processing
Therefore Perl is a good scripting language
Perl is especially popular for CGI scripts
Perl makes full use of the power of UNIX
Short Perl programs can be very short
“Perl is designed to make the easy jobs easy, without making the difficult jobs impossible.” -Larry Wall, Programming Perl
HY439
Autumn 2005
Why not Perl?
Perl is very UNIX-oriented
Perl is available on other platforms...
...but isn’t always fully implemented there
However, Perl is often the best way to get some
UNIX capabilities on less capable platforms
Perl does not scale well to large programs
Weak subroutines, heavy use of global variables
Perl’s syntax is not particularly appealing
What is a scripting language?
Operating systems can do many things
copy, move, create, delete, compare files
execute programs, including compilers
schedule activities, monitor processes, etc.
A command-line interface gives you access to
these functions, but only one at a time
A scripting language is a “wrapper” language
that integrates OS functions
Major scripting languages
UNIX has sh, Perl
Macintosh has AppleScript, Frontier
Windows has no major scripting languages
probably due to the weaknesses of DOS
Generic scripting languages include:
Perl (most popular)
Tcl (easiest for beginners)
Python (new, Java-like, best for large programs)
Perl Example 1
#!/usr/local/bin/perl
#
# Program to do the obvious
#
print 'Hello world.';     # Print a message
Comments on “Hello, World”
Comments are # to end of line
But the first line, #!/usr/local/bin/perl, tells where to find the Perl interpreter on your system
Perl statements end with semicolons
Perl is case-sensitive
Perl is compiled and run in a single operation
Perl Example 2
#!/usr/bin/perl
# Remove blank lines from a file
# Usage: singlespace < oldfile > newfile
while ($line = <STDIN>) {
    if ($line eq "\n") { next; }
    print "$line";
}
More Perl notes
On the UNIX command line:
    < filename means to get input from this file
    > filename means to send output to this file
In Perl, <STDIN> reads from the input file; STDOUT is the output file
Scalar variables start with $
Scalar variables hold strings or numbers, and they are interchangeable
Examples:
    $priority = 9;
    $priority = '9';
Array variables start with @
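A minimal sketch of what "interchangeable" means in practice. The variable names other than $priority are illustrative and do not come from the slides.

```perl
use strict;
use warnings;

my $priority = 9;      # a number
my $label    = '9';    # a string holding the same digit

my $sum    = $label + 1;        # the string is used as a number
my $joined = $priority . '1';   # the number is used as a string

print "$sum\n";      # prints 10
print "$joined\n";   # prints 91
```

Perl converts between the two forms according to the operator: + forces numeric context, while . forces string context.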
Perl Example 3
#!/usr/local/bin/perl
# Usage: fixm <filenames>
# Replace \r with \n -- replaces input files
foreach $file (@ARGV) {
    print "Processing $file\n";
    if (-e "fixm_temp") { die "*** File fixm_temp already exists!\n"; }
    if (! -e $file) { die "*** No such file: $file!\n"; }
    open DOIT, "| tr \'\\015' \'\\012' < $file > fixm_temp"
        or die "*** Can't: tr '\\015' '\\012' < $file > fixm_temp\n";
    close DOIT;
    open DOIT, "| mv -f fixm_temp $file"
        or die "*** Can't: mv -f fixm_temp $file\n";
    close DOIT;
}
Comments on example 3
In # Usage: fixm <filenames>, the angle brackets just mean to supply a
list of file names here
In UNIX text editors, the \r (carriage return) character usually shows up
as ^M (hence the name fixm_temp)
The UNIX command tr '\015' '\012' replaces all \015 characters (\r) with
\012 (\n) characters
The format of the open and close commands is:
open fileHandle, fileName
close fileHandle
"| tr \'\\015' \'\\012' < $file > fixm_temp" says: Take input from $file,
pipe it to the tr command, put the output on fixm_temp
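The same character translation can also be done inside Perl, without calling the external tr command. The following is only a sketch of that alternative, not the approach the example program takes. The helper name cr_to_lf is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $text with every \r (octal 015)
# replaced by \n (octal 012), using Perl's built-in tr operator.
sub cr_to_lf {
    my ($text) = @_;
    $text =~ tr/\015/\012/;
    return $text;
}

print cr_to_lf("line one\rline two\r");
```

Working on a string in memory avoids the temporary file, at the cost of reading the whole input first.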
Arithmetic in Perl
$a = 1 + 2;     # Add 1 and 2 and store in $a
$a = 3 - 4;     # Subtract 4 from 3 and store in $a
$a = 5 * 6;     # Multiply 5 and 6
$a = 7 / 8;     # Divide 7 by 8 to give 0.875
$a = 9 ** 10;   # Nine to the power of 10, that is, 9^10
$a = 5 % 2;     # Remainder of 5 divided by 2
++$a;           # Increment $a and then return it
$a++;           # Return $a and then increment it
--$a;           # Decrement $a and then return it
$a--;           # Return $a and then decrement it
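The difference between the prefix and postfix forms only shows when the returned value is used. A small sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $n = 5;
my $pre  = ++$n;   # $n becomes 6 first, so $pre is 6
my $post = $n++;   # $post gets the old value 6, then $n becomes 7

my $quotient  = 7 / 8;   # 0.875: division is not truncated to an integer
my $remainder = 5 % 2;   # 1

print "$pre $post $n $quotient $remainder\n";   # prints 6 6 7 0.875 1
```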
String and assignment operators
$a = $b . $c;   # Concatenate $b and $c
$a = $b x $c;   # $b repeated $c times
$a = $b;        # Assign $b to $a
$a += $b;       # Add $b to $a
$a -= $b;       # Subtract $b from $a
$a .= $b;       # Append $b onto $a
Single and double quotes
$a = 'apples';
$b = 'bananas';
print $a . ' and ' . $b;
    prints: apples and bananas
print '$a and $b';
    prints: $a and $b
print "$a and $b";
    prints: apples and bananas
Arrays
@food = ("apples", "bananas", "cherries");
But…
print $food[1];
    prints "bananas" (a single element is a scalar, so it is written with $; indices start at 0)
@morefood = ("meat", @food);
    @morefood is now ("meat", "apples", "bananas", "cherries")
($a, $b, $c) = (5, 10, 20);
push and pop
push adds one or more things to the end of a list
push(@food, "eggs", "bread");
push returns the new length of the list
pop removes and returns the last element
$sandwich = pop(@food);
$len = @food; # $len gets length of @food
$#food # returns index of last element
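Putting these operations together, as a short sketch that reuses the @food list from the Arrays slide:

```perl
use strict;
use warnings;

my @food = ("apples", "bananas", "cherries");

my $len      = push(@food, "eggs", "bread");   # new length: 5
my $sandwich = pop(@food);                     # removes and returns "bread"
my $count    = @food;                          # array in scalar context: 4
my $last     = $#food;                         # index of last element: 3

print "$len $sandwich $count $last\n";   # prints 5 bread 4 3
```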
foreach
# Visit each item in turn and call it $morsel
foreach $morsel (@food)
{
    print "$morsel\n";
    print "Yum yum\n";
}
Tests
“Zero” is false. This includes:
0, '0', "0", '', ""
Anything not false is true
Use == and != for numbers, eq and ne for strings
&&, ||, and ! are and, or, and not, respectively.
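A sketch of how these rules play out. The helper truth is hypothetical and exists only to label each value.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in a test.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

print truth(0),     "\n";   # false
print truth('0'),   "\n";   # false
print truth(''),    "\n";   # false
print truth('0.0'), "\n";   # true: as a string, "0.0" is not "0"
print truth(' '),   "\n";   # true: a blank is not the empty string

# == compares as numbers, eq and ne compare as strings
print "numerically equal\n" if '1.0' == 1;
print "different strings\n" if '1.0' ne '1';
```

The '0.0' case shows why picking the right comparison operator matters: the same two values can be equal as numbers and different as strings.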
for loops
for loops are just as in C or Java
for ($i = 0; $i < 10; ++$i)
{
    print "$i\n";
}
while loops
#!/usr/local/bin/perl
print "Password? ";
$a = <STDIN>;
chop $a;                  # Remove the newline at end
while ($a ne "fred")
{
    print "sorry. Again? ";
    $a = <STDIN>;
    chop $a;
}
do..while and do..until loops
#!/usr/local/bin/perl
do
{
    print "Password? ";
    $a = <STDIN>;
    chop $a;
}
while ($a ne "fred");
if statements
if ($a)
{
    print "The string is not empty\n";
}
else
{
    print "The string is empty\n";
}
if - elsif statements
if (!$a)
{ print "The string is empty\n"; }
elsif (length($a) == 1)
{ print "The string has one character\n"; }
elsif (length($a) == 2)
{ print "The string has two characters\n"; }
else
{ print "The string has many characters\n"; }
Why Perl?
Two factors make Perl important:
Pattern matching/string manipulation
    Based on regular expressions (REs)
    REs are similar in power to those in Formal Languages…
    …but have many convenience features
Ability to execute UNIX commands
    Less useful outside a UNIX environment
Basic pattern matching
$sentence =~ /the/
    True if $sentence contains "the"
$sentence = "The dog bites.";
if ($sentence =~ /the/) # is false
    …because Perl is case-sensitive
!~ is "does not contain"
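A runnable sketch of the three cases. The /i modifier used here is the case-insensitive flag that also appears on the substitution slide. The variable names are illustrative.

```perl
use strict;
use warnings;

my $sentence = "The dog bites.";

my $exact  = ($sentence =~ /the/)  ? 1 : 0;   # 0: "The" is not "the"
my $nocase = ($sentence =~ /the/i) ? 1 : 0;   # 1: /i ignores case
my $absent = ($sentence !~ /cat/)  ? 1 : 0;   # 1: no "cat" in the sentence

print "$exact $nocase $absent\n";   # prints 0 1 1
```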
RE special characters
.    # Any single character except a newline
^    # The beginning of the line or string
$    # The end of the line or string
*    # Zero or more of the last character
+    # One or more of the last character
?    # Zero or one of the last character
RE examples
^.*$       # matches the entire string
hi.*bye    # matches from "hi" to "bye" inclusive
x +y       # matches x, one or more blanks, and y
^Dear      # matches "Dear" only at beginning
bags?      # matches "bag" or "bags"
hiss+      # matches "hiss", "hisss", "hissss", etc.
Square brackets
[qjk]      # Either q or j or k
[^qjk]     # Neither q nor j nor k
[a-z]      # Anything from a to z inclusive
[^a-z]     # Any character that is not a lower case letter
[a-zA-Z]   # Any letter
[a-z]+     # Any non-zero sequence of lower case letters
More examples
[aeiou]+        # matches one or more vowels
[^aeiou]+       # matches one or more nonvowels
[0-9]+          # matches an unsigned integer
[0-9A-F]        # matches a single hex digit
[a-zA-Z]        # matches any letter
[a-zA-Z0-9_]+   # matches identifiers
More special characters
\n   # A newline
\t   # A tab
\w   # Any word character (alphanumeric or underscore); same as [a-zA-Z0-9_]
\W   # Any non-word char; same as [^a-zA-Z0-9_]
\d   # Any digit. The same as [0-9]
\D   # Any non-digit. The same as [^0-9]
\s   # Any whitespace character
\S   # Any non-whitespace character
\b   # A word boundary, outside [] only
\B   # No word boundary
Quoting special characters
\|   # Vertical bar
\[   # An open square bracket
\)   # A closing parenthesis
\*   # An asterisk
\^   # A caret symbol
\/   # A slash
\\   # A backslash
Alternatives and parentheses
jelly|cream   # Either jelly or cream
(eg|le)gs     # Either eggs or legs
(da)+         # Either da or dada or dadada or...
The $_ variable
Often we want to process one string repeatedly
The $_ variable holds the current string
If a subject is omitted, $_ is assumed
Hence, the following are equivalent:
if ($sentence =~ /under/) …
$_ = $sentence; if (/under/) ...
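$_ is most useful in loops, where each item is placed in it automatically. A minimal sketch, in which the list contents and variable names are made up for illustration:

```perl
use strict;
use warnings;

my @lines = ("under the bed", "over the hill", "down under");
my $hits  = 0;

foreach (@lines) {        # no loop variable given, so each item lands in $_
    $hits++ if /under/;   # no subject given, so this tests $_
}

print "$hits\n";   # prints 2
```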
Case-insensitive substitutions
s/london/London/i
    case-insensitive substitution; will replace london, LONDON, London, LoNDoN, etc.
You can combine global substitution with case-insensitive substitution
s/london/London/gi
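The effect of adding g can be seen by running both forms on the same text. This is a sketch with an illustrative input string; each substitution works on its own copy so the results can be compared.

```perl
use strict;
use warnings;

my $text = "london, LONDON and LoNDoN";

# Copy $text into a new variable, then substitute in the copy.
(my $first = $text) =~ s/london/London/i;    # first match only
(my $all   = $text) =~ s/london/London/gi;   # every match

print "$first\n";   # prints London, LONDON and LoNDoN
print "$all\n";     # prints London, London and London
```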
Remembering patterns
Any part of the pattern enclosed in parentheses is assigned to the special variables $1, $2, $3, …, $9
Numbers are assigned according to the left (opening) parentheses
"The moon is high" =~ /The (.*) is (.*)/
Afterwards, $1 = "moon" and $2 = "high"
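The numbering rule is easiest to see with nested parentheses. A sketch, in which the date string and variable names are illustrative:

```perl
use strict;
use warnings;

my ($subject, $state);
if ("The moon is high" =~ /The (.*) is (.*)/) {
    ($subject, $state) = ($1, $2);   # copy the captures before the next match
}

# In list context a match returns its captures, numbered by opening parenthesis:
# group 1 is the outer pair, groups 2 and 3 sit inside it, group 4 follows.
my @parts = ("2005-10-31" =~ /((\d+)-(\d+))-(\d+)/);

print "$subject / $state\n";    # prints moon / high
print join(" | ", @parts), "\n";   # prints 2005-10 | 2005 | 10 | 31
```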
Dynamic matching
During the match, an early part of the match that
is tentatively assigned to $1, $2, etc. can be
referred to by \1, \2, etc.
Example:
\b\w+\b matches a single word
/(\b\w+\b) \1/ matches repeated words
"Now is the the time" =~ /(\b\w+\b) \1/
Afterwards, $1 = "the"
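A runnable sketch of repeated-word detection. It uses \w+ for the word and adds a trailing \b so that a pair such as "the then" is not reported. Both details are choices made for this sketch.

```perl
use strict;
use warnings;

my $text     = "Now is the the time";
my $repeated = '';

# \1 must match exactly the text that group 1 captured.
if ($text =~ /\b(\w+) \1\b/) {
    $repeated = $1;
}

print "$repeated\n";   # prints the
```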
tr
tr does character-by-character translation
tr returns the number of substitutions made
$sentence =~ tr/abc/edf/;
    replaces a with e, b with d, c with f
$count = ($sentence =~ tr/*/*/);
    counts asterisks
tr/a-z/A-Z/;
    converts to all uppercase
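The three uses can be tried on small strings. This is a sketch with illustrative inputs; each translation runs on a copy so the original is left unchanged.

```perl
use strict;
use warnings;

my $sentence = "a*b*c";

my $stars = ($sentence =~ tr/*/*/);          # counts asterisks: 2
(my $upper = $sentence) =~ tr/a-z/A-Z/;      # "A*B*C"
(my $swapped = "cabbage") =~ tr/abc/edf/;    # a->e, b->d, c->f

print "$stars $upper $swapped\n";   # prints 2 A*B*C feddege
```

Translating * to itself changes nothing in the string, which is why it can serve as a counter.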
split
split breaks a string into parts
$info = "Caine:Michael:Actor:14, Leafy Drive";
@personal = split(/:/, $info);
Now @personal is ("Caine", "Michael", "Actor", "14, Leafy Drive")
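A runnable version of the example, as a sketch. The names $surname and $first are illustrative.

```perl
use strict;
use warnings;

my $info     = "Caine:Michael:Actor:14, Leafy Drive";
my @personal = split(/:/, $info);

# The pieces can also be assigned straight to scalars.
my ($surname, $first) = @personal[0, 1];

print scalar(@personal), "\n";     # prints 4
print "$first $surname\n";         # prints Michael Caine
print "$personal[3]\n";            # prints 14, Leafy Drive
```

Only the colon is a separator, so the comma and space inside the address stay in one field.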
Associative arrays
Associative arrays allow lookup by name rather
than by index
Associative array names begin with %
Example:
%fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");
Now, $fruit{"bananas"} returns "yellow"
Note: braces, not parentheses
Associative Arrays II
Can be converted to normal arrays:
@food = %fruit;
You cannot index an associative array, but you
can use the keys and values functions:
foreach $f (keys %fruit)
{
    print ("The color of $f is " .
           $fruit{$f} . "\n");
}
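keys returns the names in no particular order, so output that needs to be stable is usually sorted. A sketch using the %fruit data from the previous slide, written with Perl 5's => as a more readable comma between each key and its value:

```perl
use strict;
use warnings;

my %fruit = ("apples" => "red", "bananas" => "yellow", "cherries" => "red");

# Sort the keys so the lines come out in a predictable order.
foreach my $f (sort keys %fruit) {
    print "The color of $f is $fruit{$f}\n";
}

my @colors = sort values %fruit;   # ("red", "red", "yellow")
print "@colors\n";                 # prints red red yellow
```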
Calling subroutines
Assume you have a subroutine printargs that
just prints out its arguments
Subroutine calls:
printargs("perly", "king");
    Prints: "perly king"
printargs("frog", "and", "toad");
    Prints: "frog and toad"
Defining subroutines
Here's the definition of printargs:
sub printargs
{ print "@_\n"; }
Where are the parameters?
Parameters are put in the array @_ which has
nothing to do with $_
Returning a result
The value of a subroutine is the value of the last
expression that was evaluated
sub maximum
{
    if ($_[0] > $_[1])
    { $_[0]; }
    else
    { $_[1]; }
}
$biggest = maximum(37, 24);
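The same subroutine is often written with named parameters and an explicit return, which some readers find easier to follow. This is a sketch of that style, not a replacement for the version above. The names $x and $y are illustrative.

```perl
use strict;
use warnings;

sub maximum {
    my ($x, $y) = @_;            # copy the arguments out of @_
    return $x > $y ? $x : $y;    # explicit return of the larger value
}

my $biggest = maximum(37, 24);
print "$biggest\n";   # prints 37
```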
Local variables
@_ is local to the subroutine, and…
…so are $_[0], $_[1], $_[2], …
local creates local variables
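local gives a global variable a temporary value that is also seen by any subroutine called in the meantime. Perl 5 adds my, which creates a variable visible only inside the enclosing block. The sketch below contrasts the two. All names in it are hypothetical.

```perl
use strict;
use warnings;

our $level = "global";   # a package (global) variable

sub show { return $level; }   # always reads the global $level

sub with_local {
    local $level = "local";   # temporarily changes the global itself
    return show();            # show() sees "local"
}

sub with_my {
    my $level = "my";         # a new variable, private to this block
    return show();            # show() still sees "global"
}

print with_local(), "\n";   # prints local
print with_my(), "\n";      # prints global
print show(), "\n";         # prints global: the local value was restored
```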
Example subroutine
sub inside
{
    local($a, $b);                 # Make local variables
    ($a, $b) = ($_[0], $_[1]);     # Assign values
    $a =~ s/ //g;                  # Strip spaces from
    $b =~ s/ //g;                  #   local variables
    ($a =~ /$b/ || $b =~ /$a/);    # Is $b inside $a
                                   #   or $a inside $b?
}
inside("lemon", "dole money");     # true
Perl V
There are only a few differences between Perl 4
and Perl 5
Perl 5 has modules
Perl 5 modules can be treated as classes
Perl 5 has “auto” variables
The End