Lecture 6 Lecture 7 Regular grep Expressions Why Regular Expressions?  Regular expressions are used to describe text patterns/filters  Unix commands/utilities that support regular expressions:     grep(fgrep, egrep) - search.

Download Report

Transcript Lecture 6 Lecture 7 Regular grep Expressions Why Regular Expressions?  Regular expressions are used to describe text patterns/filters  Unix commands/utilities that support regular expressions:     grep(fgrep, egrep) - search.

Lecture 6 and Lecture 7: Regular Expressions and grep
Why Regular Expressions?
- Regular expressions are used to describe text patterns/filters.
- Unix commands/utilities that support regular expressions:
    - grep (and its variants egrep and fgrep) - search a file for a string or regular expression; fgrep searches for fixed strings only
    - sed - stream editor
    - awk (nawk) - pattern scanning and processing language
- There are some minor differences between the regular expressions supported by these programs.
- We will cover the general matching operators first.
Character Class
- [] matches any one of the enclosed characters
    - [abc] matches a single a, b or c
    - [a-z] matches any one of abcdef…xyz
    - [^A-Za-z] matches a single character as long as it is not a letter
- Example: [Dd][Aa][Vv][Ee]
    - Matches "Dave" or "dave" or "dAVE"
    - Does not match "ave" or "da"
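
The character-class example can be tried directly at the prompt. This is a minimal sketch, assuming a POSIX-style grep; LC_ALL=C is set so that ranges such as [A-Z] mean plain ASCII ranges, and the test strings are the ones from the slide plus a few made-up ones.

```shell
# Each candidate string is fed to grep on its own line.
# "ave" and "da" are too short to contain all four letters.
printf '%s\n' Dave dave dAVE ave da |
    LC_ALL=C grep '[Dd][Aa][Vv][Ee]'

# A negated class: keep only lines containing at least one non-letter.
printf '%s\n' abc a1c 'a c' |
    LC_ALL=C grep '[^A-Za-z]'
```

The first command prints Dave, dave and dAVE; the second prints a1c and "a c", because abc consists of letters only.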
Regular Expression Operators
Any character (except a metacharacter!) matches itself.
.    Matches any single character except newline.
*    Matches 0 or more instances of the immediately preceding R.E.
?    Matches 0 or 1 instances of the immediately preceding R.E.
+    Matches 1 or more instances of the immediately preceding R.E.
^    Anchors the R.E. that follows it to the beginning of the line.
$    Anchors the R.E. that precedes it to the end of the line.
|    Matches the R.E. specified before or after this symbol.
\    Turns off the special meaning of the character that follows it.
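Examples of R.E.
x[abc]?x      matches "xax" or "xx"
[abc]*        matches "aaaaa" or "acbca"
0*10          matches "010" or "0000010" or "10"
^(dog)$       matches lines that consist of exactly "dog"
[\t ]*        matches any run of tabs and spaces, including an empty one (where \t denotes a tab)
(A|a)+b*c?    matches one or more "A" or "a", then zero or more "b", then an optional "c"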
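
These examples can be checked with grep. The following is a minimal sketch, assuming a grep that accepts the POSIX options -E (extended syntax, needed for ?, + and |), -x (the whole line must match) and -q (quiet). The helper names matches and report are made up for this illustration, as are the extra test strings xabx, 0110 and bc.

```shell
# matches PATTERN STRING (hypothetical helper): succeeds when the whole
# STRING matches PATTERN.
matches() {
    printf '%s\n' "$2" | LC_ALL=C grep -E -x -q -- "$1"
}

# report PATTERN STRING... (hypothetical helper): print one verdict per string.
report() {
    pattern=$1
    shift
    for s in "$@"; do
        if matches "$pattern" "$s"; then
            echo "$pattern  $s: match"
        else
            echo "$pattern  $s: no match"
        fi
    done
}

report 'x[abc]?x'   xax xx xabx
report '0*10'       010 0000010 10 0110
report '(A|a)+b*c?' Aab aAbbc bc
```

The -x option matters here: without it grep reports a line as soon as any part of it matches, so 0110 would be reported for 0*10 because of its trailing "10".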
Grouping with parens
- If you put a subpattern inside parens, you can apply +, * and ? to the entire subpattern.
- a(bc)*d matches "ad" and "abcbcd"
- a(bc)*d does not match "abcxd" or "bcbcd"
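
A quick way to see the grouping at work, as a sketch that assumes grep -E for the extended syntax in which bare parentheses group:

```shell
# The * applies to the whole group (bc), not just to the letter c.
printf '%s\n' ad abcbcd abcxd bcbcd | grep -E 'a(bc)*d'
```

Only ad and abcbcd are printed. In abcxd the x interrupts the run of "bc" groups before the d, and bcbcd has no leading a.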
Example
1. Christian Scott lives here and will put on a Christmas party
2. There are around 30 to 35 people invited.
3. They are:
4. Tom
5. Dan
6. Rhonda Savage
7. Nicky and Kimberly.
8. Steve, Suzanne, Ginger and Larry
Patterns to try against these lines:
^[A-Z]..$
^[A-Z][a-z]*3[0-5]
^ *[A-Z][a-z][a-z]$
^[A-Z][a-z]*[^,][A-Za-z]*$
[a-z]*\.
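
One way to work through this exercise is to let grep report which lines each pattern selects. This is a sketch under two assumptions: the numbers 1 to 8 are slide labels rather than part of the text, and the lines have no leading or trailing spaces. The scratch file created with mktemp is arbitrary.

```shell
# Write the eight sample lines to a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Christian Scott lives here and will put on a Christmas party
There are around 30 to 35 people invited.
They are:
Tom
Dan
Rhonda Savage
Nicky and Kimberly.
Steve, Suzanne, Ginger and Larry
EOF

# Try each pattern; -n prefixes every hit with its line number.
for pattern in '^[A-Z]..$' '^[A-Z][a-z]*3[0-5]' '^ *[A-Z][a-z][a-z]$' \
               '^[A-Z][a-z]*[^,][A-Za-z]*$' '[a-z]*\.'; do
    echo "== $pattern"
    LC_ALL=C grep -n -- "$pattern" "$sample" || echo "(no matching lines)"
done
```

Under those assumptions the first and third patterns select lines 4 and 5, the second selects nothing (the space after "There" stops [a-z]* before it can reach the 30), the fourth selects lines 4, 5 and 6, and the last selects lines 2 and 7, the only ones that contain a period.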
Review: Metacharacters for filename abbreviation
*          Matches anything: ls Test*.doc
?          Matches any single character: ls Test?.doc
[abc…]     Matches any of the enclosed characters: ls T[eE][sS][tT].doc
[a-z]      Matches any character in a range: ls [a-zA-Z]*
[!abc…]    Matches any character except those listed: ls [!0-9]*
Difference !!
- Although there are similarities to the metacharacters used in filename expansion, we are talking about something different!
    - Filename expansion is done by the shell.
    - Regular expressions are used by commands (programs).
- However, because of this overlap, be careful when specifying an RE on the command line.
    - It is a good idea to always quote an RE that contains special characters ('' or "") on the command line.
    - Example:
      % grep '[a-z]*' chap[12]*
    - Note: the filename mask chap[12]* is not quoted, so the shell expands it; the quoted RE is passed to grep unchanged.
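
The difference between shell expansion and a quoted pattern is easy to observe with echo. This is a small sketch; the scratch directory and the file names chap1, chap2.txt and notes are made up for the illustration.

```shell
# Scratch directory with hypothetical files, so the mask has something
# to expand to.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'lower case text\n' > chap1
printf 'UPPER CASE TEXT\n' > chap2.txt
printf 'not a chapter\n'   > notes

# Unquoted: the shell replaces the mask with the matching file names
# before the command runs.
echo chap[12]*        # prints: chap1 chap2.txt

# Quoted: the characters reach the command unchanged.
echo 'chap[12]*'      # prints: chap[12]*

# Combined: grep receives the RE as written plus the expanded file list.
# -l lists the files that contain at least one lowercase letter.
LC_ALL=C grep -l '[a-z][a-z]*' chap[12]*   # prints: chap1
```

If the RE were left unquoted, the shell would try to expand it as a filename mask too, and grep might be handed a file name where it expected a pattern.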
grep - search for a string
grep [-bchilnsvw] PATTERN [filename...]
- Reads files or standard/redirected input
- Searches for the specified pattern in each line
- Sends results to the standard output
Examples:
% grep '^X11' *          search all files for lines starting with the string "X11"
% grep -v text file      print lines that do not match "text"
Regular expressions for grep
c         any non-special character
\c        turn off any special meaning of character c
^         beginning of line
$         end of line
.         any single character
[...]     any single character in the range ...
[^...]    any single character not in the range ...
r*        zero or more occurrences of r
Regular Expressions for grep
\<        beginning-of-word anchor
          \<abc matches "abcd" but not "dabc"
\>        end-of-word anchor
          abc\> matches "dabc" but not "abcd"
\(…\)     stores the text matched by the pattern …
          \(abc\)def matches "abcdef" and stores abc in \1.
          So \(abc\)def\1 matches "abcdefabc".
          Up to 9 matches can be stored (\1 to \9).
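
The anchors and the stored pattern can be exercised as follows. This is a sketch that assumes GNU grep or another grep that supports \< and \>, which are extensions rather than part of the POSIX basic syntax; the string abcdefxyz is an added counterexample.

```shell
# \< anchors at the start of a word, \> at the end of one.
printf '%s\n' abcd dabc | grep '\<abc'    # prints: abcd
printf '%s\n' abcd dabc | grep 'abc\>'    # prints: dabc

# \(abc\) saves the matched text; \1 must then match the same text again.
printf '%s\n' abcdefabc abcdefxyz | grep '\(abc\)def\1'   # prints: abcdefabc
```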
grep - options
Some useful options
-c    print only a count of the matching lines
-h    do not display the filename
-l    list only the names of files with matching lines
-v    display lines that do not match
-n    print line numbers
File db
northwest    NW    Charles Main     3.0    .98    3    34
western      WE    Sharon Gray      5.3    .97    5    23
southwest    SW    Lewis Dalsass    2.7    .8     2    18
southern     SO    Suan Chin        5.1    .95    4    15
southeast    SE    Patricia Heme    4.0    .7     4    17
eastern      EA    TB Savage        4.4    .84    5    20
northeast    NE    AM Main Jr.      5.1    .94    3    13
north        NO    Margot Webber    4.5    .89    5    9
central      CT    Ann Stephens     5.7    .94    5    13
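
The options listed earlier can be tried against this file. The sketch below recreates db in a scratch file, assuming the columns are separated by single spaces (the original spacing is not recoverable from the slide); the patterns are chosen only for illustration.

```shell
# Recreate the db file from the slide in a scratch location.
db=$(mktemp)
cat > "$db" <<'EOF'
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Heme 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Webber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
EOF

grep -c 'north' "$db"              # -c: count of matching lines, here 3
grep -n '^south' "$db"             # -n: hits prefixed with line numbers 3, 4, 5
grep -v '^[ns]' "$db"              # -v: lines NOT starting with n or s
grep -l 'Savage' "$db" /dev/null   # -l: only the names of files with a hit
```

The -v command prints the western, eastern and central lines, and the -l command prints only the path of the scratch file, since /dev/null contains nothing to match.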
grep with pipes
- Remember, we can use pipes when a file is expected:
  ls -l | grep '\<Feb.*3\>'
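
Because ls -l output differs from system to system, the sketch below feeds grep a hypothetical listing instead. The function name listing and all four listing lines are invented for the illustration, and \< and \> are assumed to be supported (as in GNU grep).

```shell
# listing (hypothetical): stands in for the output of ls -l.
listing() {
    cat <<'EOF'
-rw-r--r-- 1 user staff 120 Feb  3 10:15 notes.txt
-rw-r--r-- 1 user staff 300 Feb 14 09:00 plan.txt
-rw-r--r-- 1 user staff  88 Mar  3 12:00 todo.txt
-rw-r--r-- 1 user staff  64 Feb 20 08:30 chap3
EOF
}

# grep reads the listing from the pipe instead of from a named file.
listing | grep '\<Feb.*3\>'
```

This prints the notes.txt line and also the chap3 line: the pattern only requires a word ending in 3 somewhere after "Feb", so a file name ending in 3 qualifies as well as a day of the month.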
egrep
- Extended grep
    - allows for more kinds of regular expressions
    - unfortunately, egrep regular expressions are not a superset of grep regular expressions
        - some of grep's regular expressions are not available in egrep
grep vs. egrep
new to egrep:
f+       matches one or more occurrences of f
f?       matches zero or one occurrences of f
f|g      matches f or g
(ab)     groups characters a and b together
only in grep:
\( … \), \<, \>
Final Note: Different versions of grep/egrep may
support different expressions. Make sure to
check the man pages.
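
The effect of the two syntaxes can be seen by running the same pattern both ways. This sketch uses grep -E, the POSIX spelling of egrep, and assumes a grep that treats + and | as ordinary characters in the basic syntax, as POSIX specifies.

```shell
# Basic syntax: + is an ordinary character, so only the literal "f+" matches.
printf '%s\n' ff 'f+' | grep 'f+'       # prints: f+

# Extended syntax: + means "one or more", so both lines match.
printf '%s\n' ff 'f+' | grep -E 'f+'    # prints: ff and f+

# Alternation exists only in the extended syntax.
printf '%s\n' f g h | grep -E 'f|g'     # prints: f and g
```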
Recommended Reading
- Chapter 3
- Chapter 4, sections 4.1 – 4.5