Transcript Chapter 5

CS 3240 – Chapter 5

Language: Regular | Context-Free | Recursively Enumerable
Machine: Finite Automaton | Pushdown Automaton | Turing Machine
Grammar: Regular Expression, Regular Grammar | Context-Free Grammar | Unrestricted Phrase Structure Grammar

CS 3240 - Introduction

5.1: Context-Free Grammars
  Derivations
  Derivation Trees
5.2: Parsing and Ambiguity
5.3: CFGs and Programming Languages
  Precedence
  Associativity
  Expression Trees

CS 3240 - Context-Free Languages

S → aaSa | λ
It is not right-linear or left-linear, so it is not a “regular grammar”
But it is linear: only one variable on the right-hand side
What is its language?


S → aSb | λ

Deriving aaabbb: S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
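The same derivation can be replayed in code. This is a small sketch, where the function name `derive_anbn` is made up for illustration: it applies S → aSb n times and then S → λ, recording each sentential form.

```python
def derive_anbn(n):
    """Return the sentential forms in the derivation of a^n b^n."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb")   # apply S -> aSb
        steps.append(form)
    form = form.replace("S", "")          # apply S -> λ
    steps.append(form)
    return steps

print(" ⇒ ".join(derive_anbn(3)))
# S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
```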

  

Variables
  aka “non-terminals”
Letters from some alphabet, Σ
  aka “terminals”
Rules (“substitution rules”)
  of the form V → s
    where s is any string of letters and variables, or λ
  Rules are often called productions
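One possible way to hold these pieces in a program is sketched below. The class name `Grammar`, its field names, and the use of an empty tuple for λ are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Grammar:
    variables: set     # non-terminals
    terminals: set     # letters of the alphabet Σ
    rules: dict        # variable -> list of right-hand sides (tuples of symbols)
    start: str         # start variable

    def is_well_formed(self):
        """Every rule has one variable on the left and only known symbols on the right."""
        symbols = self.variables | self.terminals
        return (self.start in self.variables
                and all(lhs in self.variables for lhs in self.rules)
                and all(sym in symbols
                        for alts in self.rules.values()
                        for rhs in alts for sym in rhs))

# S -> aSb | λ, with the empty tuple standing for λ
anbn = Grammar({"S"}, {"a", "b"}, {"S": [("a", "S", "b"), ()]}, "S")
print(anbn.is_well_formed())   # True
```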

       

a^n c b^n
a^n b^(2n)
a^n b^m, where 0 ≤ n ≤ m ≤ 2n
a^n b^m, n ≠ m
Palindrome (start with a recursive definition)
Non-Palindrome
Equal
a^n b^n a^m

S → aSbSbS | bSaSbS | bSbSaS | λ

Trace ababbb

When building CFGs, remember that the start variable (S) represents a string in the language. So, for example, if S has twice as many b’s as a’s, then so does aSbSbS, etc.
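A quick way to sanity-check a grammar like this one is to enumerate every short string it generates. The sketch below is not from the slides. It explores leftmost sentential forms breadth-first and drops any form with more than `max_len` terminals, using the empty string for λ.

```python
from collections import deque

RULES = ["aSbSbS", "bSaSbS", "bSbSaS", ""]   # "" stands for λ

def generate(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    found, seen, queue = set(), {"S"}, deque(["S"])
    while queue:
        form = queue.popleft()
        i = form.find("S")                 # leftmost variable
        if i < 0:
            found.add(form)                # no variables left: a string in the language
            continue
        for rhs in RULES:
            new = form[:i] + rhs + form[i + 1:]
            terminals = len(new) - new.count("S")
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

strings = generate(6)
print("ababbb" in strings)                                       # True
print(all(s.count("b") == 2 * s.count("a") for s in strings))    # True
```

The search terminates because every non-λ rule adds three terminals, so only a bounded number of them can be applied before the limit is exceeded.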

A derivation is a sequence of applications of grammatical rules, eventually yielding a string in the language
A CFG can have multiple variables on the right-hand side of a rule
  Giving a choice of which variable to expand first
By convention, we usually use a leftmost derivation

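[Grammar rules; the variable names were not preserved in this transcript]
→ the sings | eats
→ cat | song | canary

[Leftmost derivation; the variables in each step were not preserved]
⇒ ⇒ the ⇒ the canary ⇒ the canary ⇒ the canary sings ⇒ the canary sings the ⇒ the canary sings the song

The strings along the way are “sentential forms”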
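A leftmost derivation of this sentence can be sketched in code. Only the terminal words come from the slide. The variable names (`<sentence>`, `<subject>`, and so on) and the exact shape of the rules are assumptions chosen to fit the surviving text. At each step the function expands the leftmost variable and backtracks when the terminal prefix stops matching.

```python
# Hypothetical variable names and rule shapes; only the words come from the slide.
RULES = {
    "<sentence>": [["<subject>", "<predicate>"]],
    "<subject>": [["the", "<noun>"]],
    "<predicate>": [["<verb>", "<object>"]],
    "<verb>": [["sings"], ["eats"]],
    "<object>": [["the", "<noun>"]],
    "<noun>": [["cat"], ["song"], ["canary"]],
}

def leftmost_derivation(target):
    """Leftmost derivation of the word list `target`, or None if there is none."""
    def expand(form):
        # Find the leftmost variable; everything before it is terminal.
        i = next((k for k, sym in enumerate(form) if sym in RULES), None)
        if i is None:
            return [form] if form == target else None
        if form[:i] != target[:i] or len(form) > len(target):
            return None                     # prefix mismatch, or already too long
        for rhs in RULES[form[i]]:
            rest = expand(form[:i] + rhs + form[i + 1:])
            if rest is not None:
                return [form] + rest
        return None
    return expand(["<sentence>"])

for form in leftmost_derivation("the canary sings the song".split()):
    print(" ".join(form))
```

Under these assumed rules the derivation takes seven steps, one per rule application.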

A graphical representation of a derivation
The start symbol is the root
Each symbol in the right-hand side of the rule is a child node at the same level
Continue until the leaves are all terminals
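As an illustration, here is a sketch of a derivation tree for the earlier grammar S → aSb | λ. The nested-tuple representation and the function names are assumptions. Each node is a variable with its children listed left to right, and reading the leaves left to right gives back the derived string.

```python
def tree_anbn(n):
    """Derivation tree for a^n b^n under S -> aSb | λ, as nested tuples."""
    if n == 0:
        return ("S", ["λ"])                       # S -> λ
    return ("S", ["a", tree_anbn(n - 1), "b"])    # S -> aSb

def leaves(node):
    """Read the leaves left to right (the yield of the tree)."""
    if isinstance(node, str):
        return "" if node == "λ" else node
    _, children = node
    return "".join(leaves(child) for child in children)

def show(node, depth=0):
    """Print one symbol per line, indented by its depth in the tree."""
    if isinstance(node, str):
        print("  " * depth + node)
    else:
        print("  " * depth + node[0])
        for child in node[1]:
            show(child, depth + 1)

tree = tree_anbn(2)
show(tree)
print(leaves(tree))   # aabb
```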


Note how there was only one parse tree for the string “the canary sings the song”
  And only one leftmost derivation
This is not true of all grammars!

Some grammars allow different choices of rules to generate the same string
  Or equivalently, there is more than one parse tree for the same string
Such a grammar is ambiguous
  Not easy to process programmatically

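[The variable name was not preserved in this transcript; the right-hand sides are shown as extracted]
+ | * | () | a | b | c

Two leftmost derivations of a + b * c [variables not preserved]:
⇒ + ⇒ a + ⇒ a + * ⇒ a + b * ⇒ a + b * c
⇒ * ⇒ + * ⇒ a + * ⇒ a + b * ⇒ a + b * c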

Which one is “correct”?
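The ambiguity can be made concrete by listing every parse tree. This is a sketch under assumptions: the single variable is written E (a hypothetical name, since the original did not survive), the grammar is taken to be E → E+E | E*E | a | b | c, and parentheses are left out to keep the code short.

```python
def parse_trees(tokens):
    """All parse trees of `tokens` under E -> E+E | E*E | a | b | c (parentheses omitted)."""
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0] in "abc" else []
    trees = []
    for i, tok in enumerate(tokens):
        if tok in "+*":                    # try every operator as the root
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, tok, right))
    return trees

for tree in parse_trees(list("a+b*c")):
    print(tree)
# ('a', '+', ('b', '*', 'c'))
# (('a', '+', 'b'), '*', 'c')
```

The two trees group the operands differently, so they would give different results if evaluated.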


Parsing: the process of determining if a string is generated by a grammar
  And often we want the parse tree
    So that we know the order of operations
Top-down Parsing
  Easiest conceptually
Bottom-up Parsing
  Most efficient (used by commercial compilers)
  We will use a simple one in Chapter 6

Try to match a string, w, to a grammar
If there is a rule S → w, we’re done!
  Fat chance :-)
Try to find rules that match the first character
  A “look-ahead” strategy
  This is what we do “in our heads” anyway
Repeat on the rest of the string…
Very “brute force”


S → SS | aSb | bSa | λ

Parse “aabb”:
Candidate rules: 1) S → SS, 2) S → aSb
  1) SS ⇒ SSS, SS ⇒ aSbS
  2) aSb ⇒ aSSb, aSb ⇒ aaSbb
Answer, via candidate 2: S ⇒ aSb ⇒ aaSbb ⇒ aabb

Not a well-defined algorithm (yet)!

A top-down parsing technique
Grammar Requirements:
  no ambiguity
  no lambdas
  no left-recursion (e.g., A -> Ab)
  … and some other stuff
Create a function for each variable
Check first character to choose a rule
Start by calling S( )

Grammar: S -> aSb | ab

Function S:
  if length == 2, check to see if it is “ab”
  otherwise, consume outer ‘a’ and ‘b’, then call S on what’s left
See parseanbn.py, parseanbn2.py
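The description of function S translates almost directly into code. The sketch below follows the two bullet points. It is an independent illustration and does not reproduce the contents of parseanbn.py.

```python
def S(s):
    """Return True if s is generated by S -> aSb | ab."""
    if len(s) == 2:
        return s == "ab"                       # S -> ab
    if len(s) > 2 and s[0] == "a" and s[-1] == "b":
        return S(s[1:-1])                      # S -> aSb: strip the outer a...b
    return False

for w in ["ab", "aaabbb", "aabbb", ""]:
    print(repr(w), S(w))
# 'ab' True
# 'aaabbb' True
# 'aabbb' False
# '' False
```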

Grammar:
  A -> BA | a
  B -> bB | b
See parsebstara.cpp
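With two variables there is one function per variable, each choosing a rule from the next character. This is a Python sketch of that idea. It is an independent illustration rather than a translation of parsebstara.cpp, and the helper names `parse` and `peek` are made up.

```python
def parse(s):
    """Return True if s is generated by A -> BA | a, B -> bB | b."""
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else None

    def A():
        nonlocal pos
        if peek() == "b":          # only A -> BA can start with b
            return B() and A()
        if peek() == "a":          # A -> a
            pos += 1
            return True
        return False

    def B():
        nonlocal pos
        if peek() != "b":
            return False
        pos += 1                   # consume the b that starts either B rule
        if peek() == "b":          # B -> bB: more b's follow
            return B()
        return True                # B -> b

    return A() and pos == len(s)   # the whole input must be consumed

for w in ["a", "bba", "bb", "ab"]:
    print(w, parse(w))
# a True
# bba True
# bb False
# ab False
```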

Lambda rules can cause sentential forms to shrink
  Then they can grow, and shrink again
  And grow, and shrink, and grow, and shrink…
How then can we know if the string isn’t in the language?
  That is, how do we know when we’re done so we can stop and reject the string?

A rule of the form A → B (a unit rule) doesn’t increase the size of the sentential form
Once again, we could spend a long time cycling through unit rules before the sentential form reaches length |w|
We prefer a method that always strictly grows to |w|, so we can stop and answer “yes” or “no” efficiently
So, we will remove lambda and unit rules
  In Chapter 6
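To see why strict growth makes rejection possible, here is a sketch of an exhaustive top-down search. It assumes a hand-written variant of the earlier grammar with no lambda or unit rules, S → SS | aSb | bSa | ab | ba, which generates the same strings except λ. Every rule lengthens the sentential form, so any form longer than w can be discarded and the search always ends.

```python
from collections import deque

# Assumed λ-free, unit-free variant of S -> SS | aSb | bSa | λ
RULES = ["SS", "aSb", "bSa", "ab", "ba"]

def accepts(w):
    """Exhaustive leftmost search; forms longer than w are dropped."""
    seen, queue = {"S"}, deque(["S"])
    while queue:
        form = queue.popleft()
        if form == w:
            return True
        i = form.find("S")                      # leftmost variable
        if i < 0 or form[:i] != w[:i]:
            continue                            # wrong terminal string, or prefix mismatch
        for rhs in RULES:
            new = form[:i] + rhs + form[i + 1:]
            if len(new) <= len(w) and new not in seen:
                seen.add(new)
                queue.append(new)
    return False                                # search space exhausted: reject

print(accepts("aabb"), accepts("abba"), accepts("aab"))
# True True False
```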

Precedence
Associativity

The expression grammar was ambiguous because it treated all operators equally
  But multiplication should have higher precedence than addition
So we introduce a new variable for multiplicative expressions
  And place it further down in the rules
  Because we want it to appear further down in the parse tree

[Variable names were not preserved in this transcript; the right-hand sides are shown as extracted]
+ |
* |
→ () | a | b | c

Now only one leftmost derivation for a + b * c [variables not preserved]:
⇒ + ⇒ + ⇒ + ⇒ a + ⇒ a + * ⇒ a + * ⇒ a + b * ⇒ a + b * c
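The layered grammar can be tried out with a small parser. This is a sketch under assumptions: the three variables are written E, T, and F (hypothetical names), giving E → E+T | T, T → T*F | F, F → (E) | a | b | c. Because a function-per-variable parser cannot call itself on a left-recursive rule, each left-recursive rule is replaced here by a loop that builds the same left-leaning tree.

```python
def parse(text):
    """Parse with E -> E+T | T, T -> T*F | F, F -> (E) | a | b | c; return a nested tuple."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def E():
        node = T()
        while peek() == "+":                 # loop stands in for the left-recursive E -> E+T
            take()
            node = (node, "+", T())
        return node

    def T():
        node = F()
        while peek() == "*":                 # loop stands in for the left-recursive T -> T*F
            take()
            node = (node, "*", F())
        return node

    def F():
        if peek() == "(":
            take()
            node = E()
            if peek() != ")":
                raise SyntaxError("expected )")
            take()
            return node
        if peek() in ("a", "b", "c"):
            return take()
        raise SyntaxError(f"unexpected {peek()!r}")

    tree = E()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree

print(parse("a + b * c"))   # ('a', '+', ('b', '*', 'c'))
print(parse("a + b + c"))   # (('a', '+', 'b'), '+', 'c')
```

The multiplication ends up deeper in the tree than the addition, and repeated additions group to the left.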


Derive the parse tree for a + b + c …
  Note how you get (a + b) + c, in effect
Left-recursion gives left associativity
  Analogously for right associativity
Exercise:
  Add a right-associative power (exponentiation) operator (^, with a new variable) to the grammar with the proper precedence