Perl - University of Crete


Perl Tutorial
Why Perl?

Perl is built around regular expressions
 REs are good for string processing
 Therefore Perl is a good scripting language
 Perl is especially popular for CGI scripts
Perl makes full use of the power of UNIX
Short Perl programs can be very short
“Perl is designed to make the easy jobs easy, without making the difficult jobs impossible.” -Larry Wall, Programming Perl
HY439
Autumn 2005
Why not Perl?

Perl is very UNIX-oriented
 Perl is available on other platforms...
 ...but isn’t always fully implemented there
 However, Perl is often the best way to get some UNIX capabilities on less capable platforms
Perl does not scale well to large programs
 Weak subroutines, heavy use of global variables
Perl’s syntax is not particularly appealing
What is a scripting language?

Operating systems can do many things
 copy, move, create, delete, compare files
 execute programs, including compilers
 schedule activities, monitor processes, etc.
A command-line interface gives you access to these functions, but only one at a time
A scripting language is a “wrapper” language that integrates OS functions
Major scripting languages



UNIX has sh, Perl
Macintosh has AppleScript, Frontier
Windows has no major scripting languages
 probably due to the weaknesses of DOS
Generic scripting languages include:
 Perl (most popular)
 Tcl (easiest for beginners)
 Python (new, Java-like, best for large programs)
Perl Example 1
#!/usr/local/bin/perl
#
# Program to do the obvious
#
print 'Hello world.';    # Print a message
Comments on “Hello, World”

Comments are # to end of line
 But the first line, #!/usr/local/bin/perl, tells where to find the Perl interpreter on your system
Perl statements end with semicolons
Perl is case-sensitive
Perl is compiled and run in a single operation
Perl Example 2
#!/usr/bin/perl
# Remove blank lines from a file
# Usage: singlespace < oldfile > newfile
while ($line = <STDIN>) {
    if ($line eq "\n") { next; }
    print "$line";
}
More Perl notes

On the UNIX command line:
 < filename means to get input from this file
 > filename means to send output to this file
In Perl, <STDIN> reads from the input file, and STDOUT is the output file
Scalar variables start with $
Scalar variables hold strings or numbers, and they are interchangeable
Examples:
 $priority = 9;
 $priority = '9';
Array variables start with @
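
To illustrate how strings and numbers are interchangeable in a scalar, here is a minimal sketch. The names $next and $label are made up for the illustration, and the use strict, use warnings and my declarations are Perl 5 habits that the slides themselves do not use.

```perl
use strict;
use warnings;

my $priority = '9';             # stored as a string
my $next     = $priority + 1;   # used as a number: 10
my $label    = $priority . 1;   # used as a string: "91"

print "$next $label\n";         # prints: 10 91
```

The operator, not the variable, decides whether the value is treated as a number or a string.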
Perl Example 3
#!/usr/local/bin/perl
# Usage: fixm <filenames>
# Replace \r with \n -- replaces input files
foreach $file (@ARGV) {
    print "Processing $file\n";
    if (-e "fixm_temp") { die "*** File fixm_temp already exists!\n"; }
    if (! -e $file) { die "*** No such file: $file!\n"; }
    open DOIT, "| tr \'\\015' \'\\012' < $file > fixm_temp"
        or die "*** Can't: tr '\\015' '\\012' < $file > fixm_temp\n";
    close DOIT;
    open DOIT, "| mv -f fixm_temp $file"
        or die "*** Can't: mv -f fixm_temp $file\n";
    close DOIT;
}
Comments on example 3




In # Usage: fixm <filenames>, the angle brackets just mean to supply a list of file names here
In UNIX text editors, the \r (carriage return) character usually shows up as ^M (hence the name fixm_temp)
The UNIX command tr '\015' '\012' replaces all \015 characters (\r) with \012 (\n) characters
The format of the open and close commands is:
 open fileHandle, fileName
 close fileHandle
"| tr \'\\015' \'\\012' < $file > fixm_temp" says: Open a pipe to the tr command, which takes its input from $file and puts its output on fixm_temp
Arithmetic in Perl
$a = 1 + 2;      # Add 1 and 2 and store in $a
$a = 3 - 4;      # Subtract 4 from 3 and store in $a
$a = 5 * 6;      # Multiply 5 and 6
$a = 7 / 8;      # Divide 7 by 8 to give 0.875
$a = 9 ** 10;    # Nine to the power of 10, that is, 9^10
$a = 5 % 2;      # Remainder of 5 divided by 2
++$a;            # Increment $a and then return it
$a++;            # Return $a and then increment it
--$a;            # Decrement $a and then return it
$a--;            # Return $a and then decrement it
String and assignment operators
$a = $b . $c; # Concatenate $b and $c
$a = $b x $c; # $b repeated $c times
$a = $b;       # Assign $b to $a
$a += $b;      # Add $b to $a
$a -= $b;      # Subtract $b from $a
$a .= $b;      # Append $b onto $a
Single and double quotes



$a = 'apples';
$b = 'bananas';
print $a . ' and ' . $b;
 prints: apples and bananas
print '$a and $b';
 prints: $a and $b
print "$a and $b";
 prints: apples and bananas
Arrays



@food = ("apples", "bananas", "cherries");
But… (a single element is a scalar, so it starts with $, and indices start at 0)
print $food[1];
 prints "bananas"
@morefood = ("meat", @food);
 @morefood == ("meat", "apples", "bananas", "cherries");
($a, $b, $c) = (5, 10, 20);
push and pop

push adds one or more things to the end of a list
 push (@food, "eggs", "bread");
 push returns the new length of the list
pop removes and returns the last element
 $sandwich = pop(@food);
$len = @food; # $len gets length of @food
$#food # returns index of last element
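
The list operations above can be tried together in one short script. This is only a sketch: the variable names $count and $last are invented for the example, and the my declarations are a Perl 5 convenience not used in the slides.

```perl
use strict;
use warnings;

my @food = ("apples", "bananas", "cherries");

my $count    = push(@food, "eggs", "bread");   # new length: 5
my $sandwich = pop(@food);                     # "bread"
my $len      = @food;                          # 4 (array used as a scalar)
my $last     = $#food;                         # 3 (index of last element)

print "$count $sandwich $len $last\n";         # prints: 5 bread 4 3
```

Because indices start at 0, $#food is always one less than the length.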
foreach
# Visit each item in turn and call it $morsel
foreach $morsel (@food)
{
    print "$morsel\n";
    print "Yum yum\n";
}
Tests




“Zero” is false. This includes:
 0, '0', "0", '', ""
Anything not false is true
Use == and != for numbers, eq and ne for strings
&&, ||, and ! are and, or, and not, respectively.
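
A small sketch of these rules, using a few near-miss strings chosen for illustration ('00', '0.0' and a single space) that are not in the slide. It also contrasts == with eq on the same pair of values.

```perl
use strict;
use warnings;

# The first three values are false; the rest are true strings.
my @verdicts;
foreach my $v (0, '0', '', '00', '0.0', ' ', 'a') {
    push(@verdicts, $v ? "true" : "false");
}
print "@verdicts\n";   # prints: false false false true true true true

# == compares as numbers, eq compares as strings.
my $num_equal = ('1.0' == 1)   ? 1 : 0;   # 1: numerically equal
my $str_equal = ('1.0' eq '1') ? 1 : 0;   # 0: different strings
print "$num_equal $str_equal\n";          # prints: 1 0
```

Note that only the exact strings '0' and '' count as false; '0.0' is true as a string even though it is numerically zero.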
for loops

for loops are just as in C or Java

for ($i = 0; $i < 10; ++$i)
{
    print "$i\n";
}
while loops
#!/usr/local/bin/perl
print "Password? ";
$a = <STDIN>;
chomp $a;                    # Remove the newline at end
while ($a ne "fred")
{
    print "sorry. Again? ";
    $a = <STDIN>;
    chomp $a;
}
do..while and do..until loops
#!/usr/local/bin/perl
do
{
    print "Password? ";
    $a = <STDIN>;
    chomp $a;                # Remove the newline at end
}
while ($a ne "fred");
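
The slide title also mentions do..until, which is the same loop with the test inverted. The following sketch assumes a canned list of attempts (the hypothetical @attempts array) in place of interactive input so that it runs unattended; it uses chomp, which removes only a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical canned input, standing in for lines typed at the prompt.
my @attempts = ("wilma\n", "barney\n", "fred\n");
my $tries = 0;
my $guess;

do {
    $guess = shift @attempts;   # take the next attempt
    chomp $guess;               # remove the newline at the end
    $tries++;
} until ($guess eq "fred");     # same as: while ($guess ne "fred")

print "Accepted after $tries tries\n";   # prints: Accepted after 3 tries
```

In both forms the body runs once before the condition is first tested.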
if statements
if ($a)
{
    print "The string is not empty\n";
}
else
{
    print "The string is empty\n";
}
if - elsif statements
if (!$a)
{ print "The string is empty\n"; }
elsif (length($a) == 1)
{ print "The string has one character\n"; }
elsif (length($a) == 2)
{ print "The string has two characters\n"; }
else
{ print "The string has many characters\n"; }
Why Perl?

Two factors make Perl important:
 Pattern matching/string manipulation
  Based on regular expressions (REs)
  REs are similar in power to those in Formal Languages…
  …but have many convenience features
 Ability to execute UNIX commands
  Less useful outside a UNIX environment
Basic pattern matching

$sentence =~ /the/
 True if $sentence contains "the"
$sentence = "The dog bites.";
if ($sentence =~ /the/) # is false
 …because Perl is case-sensitive
!~ is "does not contain"
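
As a quick check of both operators, here is a minimal sketch built on the same sentence. The result variables ($has_the, $has_dog, $no_cat) are invented for the example.

```perl
use strict;
use warnings;

my $sentence = "The dog bites.";

my $has_the = ($sentence =~ /the/) ? "yes" : "no";   # no: "The" has a capital T
my $has_dog = ($sentence =~ /dog/) ? "yes" : "no";   # yes
my $no_cat  = ($sentence !~ /cat/) ? "yes" : "no";   # yes: "cat" is absent

print "$has_the $has_dog $no_cat\n";   # prints: no yes yes
```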
RE special characters
.    # Any single character except a newline
^    # The beginning of the line or string
$    # The end of the line or string
*    # Zero or more of the last character
+    # One or more of the last character
?    # Zero or one of the last character
RE examples
^.*$       # matches the entire string
hi.*bye    # matches from "hi" to "bye" inclusive
x +y       # matches x, one or more blanks, and y
^Dear      # matches "Dear" only at beginning
bags?      # matches "bag" or "bags"
hiss+      # matches "hiss", "hisss", "hissss", etc.
Square brackets
[qjk]       # Either q or j or k
[^qjk]      # Neither q nor j nor k
[a-z]       # Anything from a to z inclusive
[^a-z]      # Any character except a lower case letter
[a-zA-Z]    # Any letter
[a-z]+      # Any non-empty sequence of lower case letters
More examples
[aeiou]+         # matches one or more vowels
[^aeiou]+        # matches one or more nonvowels
[0-9]+           # matches an unsigned integer
[0-9A-F]         # matches a single hex digit
[a-zA-Z]         # matches any letter
[a-zA-Z0-9_]+    # matches identifiers
More special characters
\n    # A newline
\t    # A tab
\w    # Any alphanumeric; same as [a-zA-Z0-9_]
\W    # Any non-word char; same as [^a-zA-Z0-9_]
\d    # Any digit. The same as [0-9]
\D    # Any non-digit. The same as [^0-9]
\s    # Any whitespace character
\S    # Any non-whitespace character
\b    # A word boundary, outside [] only
\B    # No word boundary
Quoting special characters
\|    # Vertical bar
\[    # An open square bracket
\)    # A closing parenthesis
\*    # An asterisk
\^    # A caret symbol
\/    # A slash
\\    # A backslash
Alternatives and parentheses
jelly|cream    # Either jelly or cream
(eg|le)gs      # Either eggs or legs
(da)+          # Either da or dada or dadada or...
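
To see a few of these patterns in action, the sketch below tests a hand-picked word list against three of them. The word list and the @results array are assumptions made for the example. The ^ and $ anchors are added so that the whole word has to match, not just part of it.

```perl
use strict;
use warnings;

# ^ and $ anchor each pattern so the whole string must match.
my @results;
foreach my $word ("bag", "bags", "eggs", "legs", "pegs", "dada", "dad") {
    my $kind = "no match";
    if    ($word =~ /^bags?$/)     { $kind = "bags?"; }
    elsif ($word =~ /^(eg|le)gs$/) { $kind = "(eg|le)gs"; }
    elsif ($word =~ /^(da)+$/)     { $kind = "(da)+"; }
    push(@results, $word . ": " . $kind);
}
print join("\n", @results), "\n";
```

Without the anchors, "dad" would match (da)+ because the pattern is satisfied by the "da" inside it.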
The $_ variable




Often we want to process one string repeatedly
The $_ variable holds the current string
If a subject is omitted, $_ is assumed
Hence, the following are equivalent:
 if ($sentence =~ /under/) …
 $_ = $sentence; if (/under/) ...
Case-insensitive substitutions

s/london/London/i
 case-insensitive substitution; will replace london, LONDON, London, LoNDoN, etc.
You can combine global substitution with case-insensitive substitution
 s/london/London/gi
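
The difference between the i and gi forms shows up when the target string contains more than one match. A minimal sketch, with sample text and variable names made up for the purpose:

```perl
use strict;
use warnings;

my $first_only = "london calling, LONDON calling";
my $all        = $first_only;

$first_only =~ s/london/London/i;              # replaces the first match only
my $changed = ($all =~ s/london/London/gi);    # g: every match; returns the count

print "$first_only\n";   # prints: London calling, LONDON calling
print "$all\n";          # prints: London calling, London calling
print "$changed\n";      # prints: 2
```

The substitution is applied to the variable on the left of =~, or to $_ when no variable is given.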
Remembering patterns



Any part of the pattern enclosed in parentheses is assigned to the special variables $1, $2, $3, …, $9
Numbers are assigned according to the left (opening) parentheses
"The moon is high" =~ /The (.*) is (.*)/
 Afterwards, $1 = "moon" and $2 = "high"
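
The numbering rule is easiest to see with nested parentheses. The sketch below repeats the moon example and then adds a nested pattern on an invented string ("2005-10"); all variable names are illustrative.

```perl
use strict;
use warnings;

my ($subject, $state) = ("", "");
if ("The moon is high" =~ /The (.*) is (.*)/) {
    ($subject, $state) = ($1, $2);   # copy before another match overwrites them
}
print "$subject / $state\n";         # prints: moon / high

# Nested groups are numbered by their opening parenthesis.
my ($outer, $inner) = ("", "");
if ("2005-10" =~ /((\d+)-\d+)/) {
    ($outer, $inner) = ($1, $2);     # $1 = "2005-10", $2 = "2005"
}
print "$outer / $inner\n";           # prints: 2005-10 / 2005
```

Testing the match in an if before reading $1 and $2 avoids picking up stale values from an earlier successful match.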
Dynamic matching


During the match, an early part of the match that is tentatively assigned to $1, $2, etc. can be referred to by \1, \2, etc.
Example:
 \b.+\b matches one or more characters that start and end at word boundaries, such as a single word
 /(\b.+\b) \1/ matches repeated words
 "Now is the the time" =~ /(\b.+\b) \1/
 Afterwards, $1 = "the"
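
A slightly stricter variant of the same idea uses \w+ so that the captured text cannot contain spaces. This is a sketch of an alternative pattern, not the one from the slide, and the variable names are invented.

```perl
use strict;
use warnings;

my $text = "Now is the the time";
my $repeated = "";
if ($text =~ /\b(\w+) \1\b/) {
    $repeated = $1;                  # the doubled word
}
print "$repeated\n";                 # prints: the

# The same backreference can drive a substitution that removes the repeat.
(my $fixed = $text) =~ s/\b(\w+) \1\b/$1/;
print "$fixed\n";                    # prints: Now is the time
```

Inside the pattern the earlier group is written \1; in the replacement part of s/// it is written $1.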
tr



tr does character-by-character translation
tr returns the number of substitutions made
$sentence =~ tr/abc/edf/;
 replaces a with e, b with d, c with f
$count = ($sentence =~ tr/*/*/);
 counts asterisks
tr/a-z/A-Z/;
 converts to all uppercase
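
The three uses of tr can be compared on one sample string. In this sketch the string and the names $stars, $swapped and $upper are assumptions; each translation works on a copy so that the original stays intact.

```perl
use strict;
use warnings;

my $sentence = "a*b*c cab";

my $stars = ($sentence =~ tr/*/*/);          # count asterisks; string unchanged
(my $swapped = $sentence) =~ tr/abc/edf/;    # a->e, b->d, c->f
(my $upper   = $sentence) =~ tr/a-z/A-Z/;    # lower case to upper case

print "$stars\n";     # prints: 2
print "$swapped\n";   # prints: e*d*f fed
print "$upper\n";     # prints: A*B*C CAB
```

Translating a character to itself, as in tr/*/*/, leaves the string alone, which is why it works as a counter.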
split

split breaks a string into parts
$info = "Caine:Michael:Actor:14, Leafy Drive";
@personal = split(/:/, $info);
 @personal = ("Caine", "Michael", "Actor", "14, Leafy Drive");
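
Since the first argument to split is a pattern, the separator need not be a single fixed character. A minimal sketch, where the second input string and the names $fields and @words are invented for the example:

```perl
use strict;
use warnings;

my $info = "Caine:Michael:Actor:14, Leafy Drive";
my @personal = split(/:/, $info);

my $fields = @personal;                 # 4 fields
print "$fields\n";                      # prints: 4
print "$personal[1] $personal[0]\n";    # prints: Michael Caine

# The separator is a pattern, so it can be more than a single character.
my @words = split(/, */, "apples,bananas,  cherries");
print "$words[2]\n";                    # prints: cherries
```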
Associative arrays



Associative arrays allow lookup by name rather than by index
Associative array names begin with %
Example:
 %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");
 Now, $fruit{"bananas"} returns "yellow"
 Note: braces, not parentheses
Associative Arrays II


Can be converted to normal arrays:
 @food = %fruit;
You cannot index an associative array, but you can use the keys and values functions:
foreach $f (keys %fruit)
{
    print ("The color of $f is " .
           $fruit{$f} . "\n");
}
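
One practical detail: keys returns the names in no particular order, so a script that needs stable output usually sorts them. The sketch below adds sort (not covered in the slides) and shows what the conversion to a normal array produces; the names @lines, $pairs and @flat are illustrative.

```perl
use strict;
use warnings;

my %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");

# keys returns the names in no particular order, so sort them for stable output.
my @lines;
foreach my $f (sort keys %fruit) {
    push(@lines, "The color of $f is $fruit{$f}");
}
print join("\n", @lines), "\n";

my $pairs    = keys %fruit;    # 3: number of key/value pairs
my @flat     = %fruit;         # keys and values alternating
my $flat_len = @flat;          # 6
print "$pairs $flat_len\n";    # prints: 3 6
```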
Calling subroutines


Assume you have a subroutine printargs that just prints out its arguments
Subroutine calls:
 printargs("perly", "king");
  Prints: "perly king"
 printargs("frog", "and", "toad");
  Prints: "frog and toad"
Defining subroutines

Here's the definition of printargs:
 sub printargs
 { print "@_\n"; }
Where are the parameters?
Parameters are put in the array @_ which has nothing to do with $_
Returning a result

The value of a subroutine is the value of the last expression that was evaluated
sub maximum
{
    if ($_[0] > $_[1])
    { $_[0]; }
    else
    { $_[1]; }
}
$biggest = maximum(37, 24);
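
The two ideas, arguments arriving in @_ and the last expression becoming the result, can be combined in one small subroutine. The helper describe_args below is hypothetical and exists only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an argument count followed by the arguments.
sub describe_args {
    my $count = @_;              # number of arguments passed
    $count . ": @_";             # last expression is the return value
}

my $one = describe_args("perly", "king");
my $two = describe_args("frog", "and", "toad");
print "$one\n";   # prints: 2: perly king
print "$two\n";   # prints: 3: frog and toad
```

Because @_ is an ordinary array, the same subroutine accepts any number of arguments.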
Local variables



@_ is local to the subroutine, and…
…so are $_[0], $_[1], $_[2], …
local creates local variables
Example subroutine
sub inside
{
    local($a, $b);                 # Make local variables
    ($a, $b) = ($_[0], $_[1]);     # Assign values
    $a =~ s/ //g;                  # Strip spaces from
    $b =~ s/ //g;                  #   local variables
    ($a =~ /$b/ || $b =~ /$a/);    # Is $b inside $a
                                   #   or $a inside $b?
}
inside("lemon", "dole money");     # true
Perl V

There are only a few differences between Perl 4 and Perl 5
 Perl 5 has modules
 Perl 5 modules can be treated as classes
 Perl 5 has “auto” variables
The End