Introduction to Perl scripting
Part 1: Basic Perl

What is Perl?
Scripting language
Practical Extraction and Report Language
Pathologically Eclectic Rubbish Lister

How do I use Perl?
    $ vi hello.pl
    print "hello world\n";
    $ perl hello.pl
    hello world
    $ vi add.pl
    print $ARGV[0] + $ARGV[1], "\n";
    $ perl add.pl 17 25
    42

Why Perl?
FAST text processing
Simple scripting language
Cross-platform
Many extensions for biological data

TMTOWTDI
Motto: TMTOWTDI (There's More Than One Way To Do It)
This can be frustrating to new users
Focus on understanding what you are doing; don't worry about all the other ways yet.

Getting started
Primitives
  – String - "string", 'string'
  – Numeric - 10, 12e4, 1e-3, 120.0123
Data types
  – scalars - $var = "a"; $num = 10;
  – lists - @lst = ('apple', 'orange');
  – hashes - %hash = (1 => 'apple', 2 => 'orange');
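
The three data types can be tried together in one short script. This is a minimal sketch; the variable names ($name, @fruits, %fruit_for) are chosen for illustration and are not from the slides. Note that a single element of a list or hash is a scalar, so it is read with the $ prefix.

```perl
use strict;
use warnings;

# A scalar holds a single value (a string or a number).
my $name  = 'apple';
my $count = 10;

# A list variable (array) holds an ordered sequence of scalars.
my @fruits = ('apple', 'orange');

# A hash maps keys to values; each pair is written key => value.
my %fruit_for = (1 => 'apple', 2 => 'orange');

# One element is a scalar, so it is accessed with $ and [] or {}.
my $first  = $fruits[0];       # first item of the list
my $second = $fruit_for{2};    # value stored under key 2

print "$count x $name; first=$first, second=$second\n";
print "number of fruits: ", scalar(@fruits), "\n";
```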

Starter Code
    # assign a variable
    $var = 12;
    print "var is $var\n";
    # concatenate strings
    $x = "Alice";
    $y = $x . " & Alex are cousins\n";
    print $y;
    # print can print lists of variables
    print $y, "var is ", $var, "\n";

Tidbits
To print to screen
  – print "string"
Special chars
  – newline - "\n"
  – tab - "\t"
Strings and numeric conversion automatic
All about context
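
Automatic conversion and context are easiest to see side by side. In this small sketch (variable names are illustrative), the operator decides whether a value is treated as a number or a string, and the left-hand side of an assignment decides whether a list is read in scalar or list context.

```perl
use strict;
use warnings;

my $text = "17";          # a string that looks like a number
my $sum  = $text + 25;    # + asks for numbers: "17" becomes 17
my $cat  = $text . 25;    # . asks for strings: 25 becomes "25"

my @items   = ('a', 'b', 'c');
my $n       = @items;     # scalar context: a list variable gives its length
my ($first) = @items;     # list context: the first element is assigned

print "$sum $cat $n $first\n";
```

Run as written, this prints 42 1725 3 a.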

Math
Standard arithmetic: +, -, *, /
mod operator %
  – 4 % 2 = 0; 5 % 2 = 1
Operate in place: $num += 3
Increment or decrement a variable: $a++, $a--
power **
  – 2^5 = 2**5
sqrt(9)
log
  – log_e(5) = log(5)
  – log_10(100) = log(100) / log(10)

Precision
Truncate toward zero: int($x)
Round up: POSIX::ceil($x)
Round down: POSIX::floor($x)
Formatted printing: printf/sprintf
  – %d, %f, %5.2f, %g, %e
  – More coverage later on
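
The difference between these functions only shows up with negative numbers, so a quick comparison helps. The sketch below uses the POSIX module that ships with Perl; the sample values 2.7 and -2.7 are arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

for my $x (2.7, -2.7) {
    # int() drops the fraction, floor() goes down, ceil() goes up,
    # and sprintf("%.0f") rounds to the nearest whole number.
    printf "x=%5.2f int=%d floor=%d ceil=%d rounded=%s\n",
        $x, int($x), floor($x), ceil($x), sprintf("%.0f", $x);
}
```

For -2.7, int gives -2 while floor gives -3, which is why int is better described as truncation than as rounding down.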

Some Math Code
    # Pythagorean theorem
    my $a = 3;
    my $b = 4;
    my $c = sqrt($a**2 + $b**2);
    # what's left over from the division
    my $x = 22;
    my $y = 6;
    my $div = int( $x / $y );
    my $mod = $x % $y;
    print $div, " ", $mod, "\n";
output:
    3 4

Logic & Equality
if / unless / elsif / else
    if( TEST ) {
        DO SOMETHING
    } elsif( TEST ) {
        SOMETHING ELSE
    } else {
        DO SOMETHING ELSE IN CASE
    }
Equality: == (numbers) and eq (strings)
Less/Greater than: <, <=, >, >=
  – lt, le, gt, ge for string (lexical) comparisons

Testing equality
    $str1 = "mumbo";
    $str2 = "jumbo";
    if( $str1 eq $str2 ) {
        print "strings are equal\n";
    }
    if( $str1 lt $str2 ) {
        print "less\n";
    } else {
        print "more\n";
    }
    if( $y >= $x ) {
        print "y is greater or equal\n";
    }

Boolean Logic
AND: && and
OR: || or
NOT: ! not
    if( $a > 10 && $a <= 20 ) { }

Loops
    while( TEST ) { }
    until( ! TEST ) { }
    for( $i = 0; $i < 10; $i++ ) { }
    foreach $item ( @list ) { }
    for $item ( @list ) { }

Using logic
    for( $i = 0; $i < 20; $i++ ) {
        if( $i == 0 ) {
            print "$i is 0\n";
        } elsif( $i % 2 == 0 ) {
            print "$i is even\n";
        } else {
            print "$i is odd\n";
        }
    }

What is truth?
True
    if( "zero" ) { }
    if( 23 || -1 || ! 0 ) { }
    $x = "0 or none";
    if( $x ) { }
False
    if( 0 || undef || '' || "0" ) { }
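
The rules above can be checked directly. This sketch uses a small helper, truth(), which is a hypothetical name introduced only for this illustration. It also shows two edge cases worth knowing: only the exact string "0" is false, so strings such as "0.0" and "00" count as true.

```perl
use strict;
use warnings;

# Hypothetical helper for this illustration: names the truth value of its argument.
sub truth { return $_[0] ? 'true' : 'false' }

print 'number 0     : ', truth(0),      "\n";   # false
print 'string "0"   : ', truth("0"),    "\n";   # false
print 'empty string : ', truth(''),     "\n";   # false
print 'undef        : ', truth(undef),  "\n";   # false
print 'string "zero": ', truth("zero"), "\n";   # true
print 'number -1    : ', truth(-1),     "\n";   # true
print 'string "0.0" : ', truth("0.0"),  "\n";   # true: not exactly "0"
print 'string "00"  : ', truth("00"),   "\n";   # true: not exactly "0"
```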

Special variables
This is why many people dislike Perl
Too many little silly things to remember
perldoc perlvar for detailed info

Some special variables
$! - error messages here
$, - separator between items when doing print @array; (an interpolated "@array" uses $" instead)
$/ - record delimiter ("\n" usually)
$a, $b - used in sorting
$_ - implicit variable
perldoc perlvar for more info
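
A short sketch makes the separators and $! concrete. It prints into an in-memory string ($out) only so the result is easy to inspect; that choice, the variable names, and the file path are assumptions for illustration. The path is deliberately one that should not exist.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

# Print into a string instead of the screen so the result can be inspected.
my $out = '';
open(my $fh, '>', \$out) or die $!;
{
    # local restores the old values when this block ends.
    local $, = '-';        # goes between the items of a print list
    local $" = ':';        # goes between elements of an interpolated list
    print $fh @array;      # a-b-c
    print $fh "\n";
    print $fh "@array";    # a:b:c
    print $fh "\n";
}
close($fh);
print $out;

# $! holds the system error message after a failed call.
open(my $missing, '<', '/no/such/dir/input.txt')
    or print "open failed: $!\n";   # exact wording depends on the system
```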

The Implicit variable
Implicit variable is $_
    for ( @list ) { print $_ }
    while(

Input/Output: Getting and Writing Data

Getting Data from Files
    open(HANDLE, "filename") || die $!;
    $line1 = <HANDLE>;
    while( <HANDLE> ) { }
    @slurp = <HANDLE>;
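
The three ways of reading can be tried without a real file. The sketch below is an illustration under stated assumptions: it opens an in-memory string ($data) in place of a file on disk, and uses lexical filehandles ($in, $in2) with three-argument open rather than the bareword HANDLE style shown on the slide.

```perl
use strict;
use warnings;

# Stand-in for a real file: a string opened for reading,
# so the example runs without anything on disk.
my $data = "line one\nline two\nline three\n";
open(my $in, '<', \$data) or die "cannot open: $!";

my $line1 = <$in>;            # scalar context: read one line
chomp $line1;                 # remove the trailing newline

my $count = 0;
while (my $line = <$in>) {    # read the remaining lines one at a time
    chomp $line;
    $count++;
    print "read: $line\n";
}
close($in);

# List context reads every line at once ("slurp").
open(my $in2, '<', \$data) or die "cannot open: $!";
my @slurp = <$in2>;
close($in2);

print "first: $line1; then $count more; slurped ", scalar(@slurp), " lines\n";
```

Reading line by line keeps memory use small for large files; slurping is convenient when the whole file is needed at once.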

Data from Streams
while(

Can pass data into a program
while(

Writing out data
    open(OUT, ">outname") || die $!;
    print OUT "sequence report\n";
    close(OUT);
    # appending with >>
    open(OUT, ">>outname") || die $!;
    print OUT "appended this\n";
    close(OUT);

Filehandles as variables
    $var = \*STDIN;
    open($fh, ">report.txt") || die $!;
    print $fh "line 1\n";
    open($fh2, "report") || die $!;
    $fh = $fh2;
    while( <$fh> ) { }

String manipulation

Some string functions
. - concatenate strings
  – $together = $one . " " . $two;
reverse - reverse a string (or array)
length - get length of a string
uc - uppercase or lc - lowercase a string

split/join
split: separate a string into a list based on a delimiter
  – @lst = split("-", "hello-there-mr-frog");
join: make string from list using delimiter
  – $str = join(" ", @lst);
  – Solves fencepost problem nicely (want to put something between each pair of items in a list)
    print join("\t", @lst), "\n";
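
A typical use is taking a tab-delimited record apart, computing something, and writing it back out with a different delimiter. The record below (gene1, 100, 250, +) is made-up data for illustration, and the field names are assumptions.

```perl
use strict;
use warnings;

# A made-up tab-delimited record: name, start, end, strand.
my $line = "gene1\t100\t250\t+";

my @fields = split(/\t/, $line);
my ($name, $start, $end, $strand) = @fields;

# Length of the feature when both ends are included.
my $len = $end - $start + 1;

# join puts the delimiter only between items, never at the ends.
my $csv = join(",", @fields, $len);
print "$csv\n";
```

This prints gene1,100,250,+,151.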

index
index(STRING, SUBSTRING, [STARTINGPOS])
Find the position of a substring within a string (left to right scanning)
    $codon = 'ATG';
    $str = 'AGCGCATCGCATGGCGATGCAGATG';
    $first = index($str, $codon);
    $second = index($str, $codon, $first + length($codon));
rindex
Same as index, but right to left scanning
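
Repeating the $first/$second pattern in a loop finds every occurrence. This sketch relies on index returning -1 when nothing more is found; the @positions list is a name introduced here for illustration.

```perl
use strict;
use warnings;

my $codon = 'ATG';
my $str   = 'AGCGCATCGCATGGCGATGCAGATG';

my @positions;
my $pos = index($str, $codon);
while ($pos != -1) {                 # index returns -1 when there is no match
    push @positions, $pos;
    # continue searching just past the match that was found
    $pos = index($str, $codon, $pos + length($codon));
}

print "found at: @positions\n";
print "last one: ", rindex($str, $codon), "\n";
```

The positions are zero-based, so for this sequence the matches are reported at 10, 16 and 22, and rindex returns 22.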

substr
substr(STRING, START, [LENGTH], [REPLACE]);
Extract a substring from a larger string
    $orf = substr($str, 10, 40);
    $end = substr($str, 40);  # get end
Replace string
    substr($str, 21, 10, 'NNNNNNNNNN');

Zero based economy...
1st number is '0' for an index or 1st character in a string
  – in most programming languages
Biologists often number 1st base in a sequence as '1' (GenBank, BioPerl)
Interbase coordinates (Kent-UCSC, Chado-GMOD)

Coordinate systems
Zero based, interbase coordinates
     A T G G G T A G A
    0 1 2 3 4 5 6 7 8 9
1 based coordinates
    A T G G G T A G A
    1 2 3 4 5 6 7 8 9
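
Converting between the two systems is a common source of off-by-one errors, so a worked example may help. The sketch below extracts the same three bases from the sequence shown above using both conventions; the variable names and the chosen range (bases 4 to 6) are assumptions for illustration.

```perl
use strict;
use warnings;

my $seq = 'ATGGGTAGA';

# 1-based, inclusive coordinates: bases 4 through 6.
my ($start1, $end1) = (4, 6);
my $sub1 = substr($seq, $start1 - 1, $end1 - $start1 + 1);

# The same range in interbase coordinates: positions between bases.
# Only the start changes; the end keeps the same number.
my ($start0, $end0) = ($start1 - 1, $end1);
my $sub0 = substr($seq, $start0, $end0 - $start0);

print "1-based $start1..$end1 = $sub1\n";
print "interbase $start0..$end0 = $sub0\n";
```

Both lines report GGT. With interbase coordinates the length is simply end minus start, with no +1 correction.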

Arrays and Lists
Lists are sets of items
Can be mixed types of scalars (numbers, strings, floats)
Perl uses lists extensively
Variables are prefixed by @

List operations
reverse - reverse list order
$list[$n] - get the $n-th item
  – $two = $list[2];
scalar - get length of array
  – $len = scalar @list;
  – $last_index = $#list;
delete $list[10] - delete entry

Autovivification
Automatically allocate space for an item
    $array[0] = 'apple';
    print scalar @array, " ";
    $array[4] = 'elephant';
    $array[25] = 'zebra fish';
    print scalar @array, " ";
    delete $array[25];
    print scalar @array, "\n";
output:
    1 26 5

pop, push, shift, unshift
    # remove last item
    $last = pop @list;
    # remove first item
    $first = shift @list;
    # add to end of list
    push @list, $last;
    # add to beginning of list
    unshift @list, $first;

splicing an array
    splice ARRAY,OFFSET,LENGTH,LIST
    splice ARRAY,OFFSET,LENGTH
    splice ARRAY,OFFSET
    splice ARRAY
    @list = ('alice','chad','rod');
    ($x,$y) = splice(@list,1,2);   # $x = 'chad', $y = 'rod'; @list is now ('alice')
    splice(@list, 1,0, ('marvin','alex'));
newlist: ('alice','marvin','alex')
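
Since splice can remove, insert, or replace depending on its arguments, it may help to see each use on a fresh copy of the list. This is an illustrative sketch; the names @guests and @renamed and the value 'alicia' are made up for the example.

```perl
use strict;
use warnings;

# Remove: take 2 items starting at offset 1.
my @list    = ('alice', 'chad', 'rod');
my @removed = splice(@list, 1, 2);            # @list is now ('alice')

# Insert: LENGTH 0 removes nothing and adds the new items at offset 1.
my @guests = ('alice', 'chad', 'rod');
splice(@guests, 1, 0, 'marvin', 'alex');

# Replace: remove 1 item at offset 0 and put a new one in its place.
my @renamed = ('alice', 'chad', 'rod');
splice(@renamed, 0, 1, 'alicia');

print "removed: @removed; left: @list\n";
print "after insert: @guests\n";
print "after replace: @renamed\n";
```

Inserting into the untouched three-item list gives alice marvin alex chad rod, because nothing was removed first.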

Sorting with sort
    @list = ('tree','frog', 'log');
    @sorted = sort @list;
    # reverse order
    @sorted = sort { $b cmp $a } @list;
    # sort based on numerics
    @list = (25,21,12,17,9,8);
    @sorted = sort { $a <=> $b } @list;
    # reverse order of sort
    @revsorted = sort { $b <=> $a } @list;

How would you sort based on part of string in list?
    @list = ('E1','F3','A2');
    @sorted = sort @list;  # sort lexical
    @sorted = sort { substr($a,1,1) <=> substr($b,1,1) } @list;

Filter with grep
    @list = ('aardvark', 'baboon', 'cat', 'dog','lamb','kangaroo');
    @sl = grep { length($_) == 3 } @list;
    @oo = grep { index($_,"oo") >= 0 } @list;
    # use it to count
    my $ct = grep { substr($_,1,1) eq 'a' } @list;
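
Transforming with map
    @list = ('aardvark', 'baboon', 'cat', 'dog','lamb','kangaroo');
    @lens = map { length($_) } @list;
    # 4-argument substr edits $_ in place and returns the OLD text,
    # so $_ must be the last value in the block (this also capitalizes @list)
    @upper = map { $fch = substr($_,0,1); substr($_,0,1,uc($fch)); $_ } @list;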
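
Because $_ inside map is an alias for each list element, changing it changes the original list. A variant that leaves the input untouched is sketched below; it swaps in the built-in ucfirst function instead of the substr approach, and uses a shortened list for brevity.

```perl
use strict;
use warnings;

my @list = ('aardvark', 'baboon', 'cat');

# map returns one result per item; the block's last value is the result.
my @lens  = map { length($_) } @list;

# ucfirst returns a capitalized copy, so @list itself is not modified.
my @upper = map { ucfirst($_) } @list;

print "lengths: @lens\n";
print "upper:   @upper\n";
print "input:   @list\n";
```

The lengths come out as 8 6 3, and the input list is still all lowercase afterwards.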

More list action
    @list = ('aardvark', 'baboon', 'cat', 'dog','lamb','kangaroo');
    for $animal ( @list ) {
        if( length($animal) <= 3 ) {
            print "$animal is noisy\n";
        } else {
            print "$animal is quiet\n";
        }
    }

Sort complicated stuff
    # want to sort these by gene number
    @list = ('CG1000.1', 'CG0789.1', 'CG0321.1', 'CG1227.2');
    @sorted = sort {
        ($locus_a) = split(/\./,$a);
        ($locus_b) = split(/\./,$b);
        substr($locus_a,0,2,'');
        substr($locus_b,0,2,'');
        $locus_a cmp $locus_b;
    } @list;
    print "sorted are ", join(",",@sorted), "\n";
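
The same sort can be written with the extraction step pulled out into a small subroutine, which keeps the sort block short. This is one possible variant, not the slide's method: gene_number() is a hypothetical helper named for this example, and it compares with <=> so the result stays correct even if the numbers are not zero-padded to the same width.

```perl
use strict;
use warnings;

my @list = ('CG1000.1', 'CG0789.1', 'CG0321.1', 'CG1227.2');

# Hypothetical helper: returns the digits between the "CG" prefix and the ".".
sub gene_number {
    my ($id) = @_;
    my ($locus) = split(/\./, $id);   # 'CG1000.1' -> 'CG1000'
    return substr($locus, 2);         # 'CG1000'   -> '1000'
}

# <=> compares the extracted parts as numbers.
my @sorted = sort { gene_number($a) <=> gene_number($b) } @list;

print "sorted are ", join(",", @sorted), "\n";
```

This prints sorted are CG0321.1,CG0789.1,CG1000.1,CG1227.2.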

Scope
The section of program a variable is valid for
Defined by braces { }
use strict;
Use 'my' to declare variables
    #!/usr/bin/perl -w
    use strict;
    my $var = 10;
    my $var2 = 'monkey';
    print "(outside) var is $var\n".
          "(outside) var2 is $var2\n";
    {
        my $var;
        $var = 20;
        print "(inside) var is $var\n";
        $var2 = 'ape';
    }
    print "(outside) var is $var\n".
          "(outside) var2 is $var2\n";

Good practices
Declare variables with 'my'
Always 'use strict'
'use warnings' to get warnings

Let's practice (old code)
    @list = ('aardvark', 'baboon', 'cat', 'dog','lamb','kangaroo');
    for $animal ( @list ) {
        if( length($animal) <= 3 ) {
            print "$animal is noisy\n";
        } else {
            print "$animal is quiet\n";
        }
    }

Let's practice
    #!/usr/bin/perl
    use warnings;
    use strict;
    my @list = ('aardvark', 'baboon', 'cat', 'dog','lamb','kangaroo');
    for my $animal ( @list ) {
        if( length($animal) <= 3 ) {
            print "$animal is noisy\n";
        } else {
            print "$animal is quiet\n";
        }
    }

Editors
vi filename – begin by using this editor

Make a perl script
    $ pico hello.pl
    #!/usr/bin/perl
    print "hello world\n";
    [Control-O, enter, Control-X, enter]
    $ perl hello.pl
    hello world
    $ chmod +x hello.pl
    $ ./hello.pl