PowerPoint presentation
Formal grammars
A formal grammar is a system for defining
the syntax of a language by specifying
sequences of symbols or sentences that are
considered grammatical.
The set of grammatical sentences of a language
may be very large or infinite, so it is
usually defined recursively.
Definition of the formal grammar G
G = < V, Σ, P, σ >
V – set of terminal symbols
Σ – set of nonterminal symbols with the restriction
that V and Σ are disjoint
σ – start symbol
P – set of production rules of the form:
A –> B
where:
A – a sequence of symbols containing at least one
nonterminal,
B – the sequence obtained from A by replacing one of its
nonterminal symbols with a sequence of symbols
(possibly empty) from V and Σ
Small subset of English grammar
V = {“the”, “a”, “cat”, “dog”, “saw”, “chased”}
Σ = {S, NP, VP, D, N, V}
S – sentence
D – determiner
NP – noun phrase
N – noun
VP – verb phrase
V – verb
σ = S
P =
{
S –> NP VP,
NP –> D N,
VP –> V NP,
D –> “the”,
D –> “a”,
N –> “cat”,
N –> “dog”,
V –> “saw”,
V –> “chased”
}
Derivation
Example of a leftmost derivation:
S –> NP VP
  –> D N VP
  –> “the” N VP
  –> “the” “cat” VP
  –> “the” “cat” V NP
  –> “the” “cat” “chased” NP
  –> “the” “cat” “chased” D N
  –> “the” “cat” “chased” “a” N
  –> “the” “cat” “chased” “a” “dog”
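The derivation above can be reproduced mechanically. The following Python sketch is only an illustration and is not part of the original slides: the names GRAMMAR and leftmost_derivation are hypothetical, and the alternative used at each step is passed in explicitly rather than chosen by the program.

```python
# Productions of the small English grammar: each nonterminal maps to
# the list of its alternatives (hypothetical representation).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
    "D": [["the"], ["a"]],
    "N": [["cat"], ["dog"]],
    "V": [["saw"], ["chased"]],
}


def leftmost_derivation(start, choices):
    """Return all sentential forms obtained by always rewriting the
    leftmost nonterminal; choices[i] selects the alternative at step i."""
    form = [start]
    steps = [form]
    for choice in choices:
        # Index of the leftmost symbol that is still a nonterminal.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:index] + GRAMMAR[form[index]][choice] + form[index + 1:]
        steps.append(form)
    return steps


# Alternatives used in the nine steps of the derivation shown above.
steps = leftmost_derivation("S", [0, 0, 0, 0, 0, 1, 0, 1, 1])
for form in steps:
    print(" ".join(form))
```

Each printed line corresponds to one sentential form of the derivation, starting with S and ending with the complete sentence.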
Parse trees
S
  NP
    D
      “the”
    N
      “cat”
  VP
    V
      “chased”
    NP
      D
        “a”
      N
        “dog”
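A parse tree like the one above can be built by a small top-down parser. The sketch below is an assumption-laden illustration, not the method of the slides: parse, parse_sentence and show are hypothetical names, the tree is stored as nested (symbol, children) tuples, and the parser simply takes the first alternative that matches, which is sufficient for this tiny grammar but not for grammars in general.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
    "D": [["the"], ["a"]],
    "N": [["cat"], ["dog"]],
    "V": [["saw"], ["chased"]],
}


def parse(symbol, words, pos):
    """Try to derive a prefix of words[pos:] from symbol.
    Returns (tree, next_position) on success, None on failure."""
    if symbol not in GRAMMAR:  # terminal: must equal the next word
        if pos < len(words) and words[pos] == symbol:
            return symbol, pos + 1
        return None
    for alternative in GRAMMAR[symbol]:
        children, cur = [], pos
        for part in alternative:
            result = parse(part, words, cur)
            if result is None:
                break  # this alternative fails, try the next one
            child, cur = result
            children.append(child)
        else:
            return (symbol, children), cur
    return None


def parse_sentence(text):
    """Return the parse tree of text, or None if it is not grammatical."""
    words = text.split()
    result = parse("S", words, 0)
    if result is None or result[1] != len(words):
        return None
    return result[0]


def show(tree, depth=0):
    """Print the tree with two spaces of indentation per level."""
    if isinstance(tree, str):
        print("  " * depth + f'"{tree}"')
    else:
        symbol, children = tree
        print("  " * depth + symbol)
        for child in children:
            show(child, depth + 1)


tree = parse_sentence("the cat chased a dog")
show(tree)
```

A word sequence that the grammar cannot derive, such as "cat the chased", makes parse_sentence return None.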
Backus notation for production rules
::= – is defined as
| – separates alternatives
<> – denotes nonterminal symbols
Production rules for the small subset of English grammar
P=
{
<S> ::= <NP> <VP>,
<NP> ::= <D> <N>,
<VP> ::= <V> <NP>,
<D> ::= “the” | “a”,
<N> ::= “cat” | “dog”,
<V> ::= “saw” | “chased”
}
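Rules written in this notation are easy to read into a program. The following sketch is a minimal illustration under stated assumptions: parse_bnf, BNF_RULES and TOKEN are hypothetical names, each rule is assumed to occupy one line, terminals are written in straight double quotes, and no terminal contains the character "|".

```python
import re

BNF_RULES = '''
<S> ::= <NP> <VP>
<NP> ::= <D> <N>
<VP> ::= <V> <NP>
<D> ::= "the" | "a"
<N> ::= "cat" | "dog"
<V> ::= "saw" | "chased"
'''

# Matches either a nonterminal <NAME> or a quoted terminal "text".
TOKEN = re.compile(r'<(\w+)>|"([^"]*)"')


def parse_bnf(text):
    """Turn BNF rules into a dict: nonterminal -> list of alternatives,
    each alternative being a list of symbol names."""
    grammar = {}
    for line in text.strip().splitlines():
        head, body = line.split("::=")
        name = head.strip()[1:-1]  # drop the angle brackets
        alternatives = []
        for alt in body.split("|"):
            # Exactly one of the two groups is non-empty for each match.
            symbols = [nonterminal or terminal
                       for nonterminal, terminal in TOKEN.findall(alt)]
            alternatives.append(symbols)
        grammar[name] = alternatives
    return grammar


grammar = parse_bnf(BNF_RULES)
for name, alternatives in grammar.items():
    print(name, "->", alternatives)
```

In the resulting dictionary the keys are the nonterminal symbols; every symbol that is not a key is a terminal.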
Classification of formal grammars
Type 3: Regular grammars (finite state grammars)
  Production rules: A –> xB, C –> y
    A, B, C – nonterminal symbols
    x, y – terminal symbols
  Recognizing automaton: finite state automaton
  Storage required: finite storage
  Parsing complexity: O(n)
Type 2: Context free grammars
  Production rules: A –> BC…D
    A – a nonterminal symbol
    BC…D – any sequence of terminal or nonterminal symbols
  Recognizing automaton: pushdown automaton
  Storage required: pushdown stack
  Parsing complexity: O(n^3)
Type 1: Context sensitive grammars
  Production rules: aAz –> aBC…Dz
    A – a nonterminal symbol
    a, z – sequences of zero or more terminal or nonterminal symbols
    BC…D – any nonempty sequence of terminal or nonterminal symbols
  Recognizing automaton: linear bounded automaton (non-deterministic Turing machine)
  Storage required: tape whose length is a linear multiple of the input length
  Parsing complexity: PSPACE-complete
Type 0: Unrestricted grammars (general rewrite grammars)
  Production rules: any sequence of symbols may be transformed into
    any other sequence of symbols. To convert a context-sensitive grammar
    into an unrestricted grammar, replacement of a nonterminal symbol A
    with an empty sequence needs to be allowed.
  Recognizing automaton: Turing machine
  Storage required: infinite tape
  Parsing complexity: undecidable
Classification of formal grammars
Unrestricted grammars ⊃ Context sensitive grammars ⊃ Context free grammars ⊃ Regular grammars
Generality and difficulty of parsing increase from regular grammars to unrestricted grammars.
Parsing methods
Top-down parsing approach:
  LL parsers – Left to right scan, Leftmost derivation
Bottom-up parsing approach:
  LR parsers – Left to right scan, Rightmost derivation (traced in reverse)
  LALR parsers – Look-Ahead LR (use of lookahead symbols to aid the parsing process)
  GLR parsers – Generalized LR (multiple parsing threads in order to resolve ambiguities)
Grammar of Simple Arithmetic Expressions
V = {“a”, “b”, “d”, “+”, “*”, “(”, “)”}
Σ = {E, C, F}
E – expression
C – component
F – factor
σ = E
P = {
<E> ::= <C> | <E> “+” <C> | <C> “+” <E>
<C> ::= <F> | <F> “*” <C> | <C> “*” <F>
<F> ::= “(” <E> “)” | “a” | “b” | “d”
}
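A top-down parser for this grammar can be sketched as follows. This is an illustration under explicit assumptions, not the parser of the slides: the names Parser and parse_expression are hypothetical, every token is a single character, and only the right-recursive alternatives (<E> ::= <C> | <C> "+" <E> and <C> ::= <F> | <F> "*" <C>) are implemented, because the left-recursive alternatives would send a naive recursive-descent parser into endless recursion.

```python
class Parser:
    """Recursive-descent parser; one method per nonterminal."""

    def __init__(self, text):
        # Every non-blank character is one token (assumption of this sketch).
        self.tokens = [ch for ch in text if not ch.isspace()]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expression(self):  # <E> ::= <C> | <C> "+" <E>
        left = self.component()
        if self.peek() == "+":
            self.pos += 1
            return ("+", left, self.expression())
        return left

    def component(self):  # <C> ::= <F> | <F> "*" <C>
        left = self.factor()
        if self.peek() == "*":
            self.pos += 1
            return ("*", left, self.component())
        return left

    def factor(self):  # <F> ::= "(" <E> ")" | "a" | "b" | "d"
        token = self.peek()
        if token == "(":
            self.pos += 1
            inner = self.expression()
            if self.peek() != ")":
                raise SyntaxError(f"expected ')' at position {self.pos}")
            self.pos += 1
            return inner
        if token in ("a", "b", "d"):
            self.pos += 1
            return token
        raise SyntaxError(f"unexpected {token!r} at position {self.pos}")


def parse_expression(text):
    """Return a nested tuple (operator, left, right) for the expression."""
    parser = Parser(text)
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r} at position {parser.pos}")
    return tree


print(parse_expression("(a + b) * d"))
print(parse_expression("a + b * d"))
```

Because <C> sits below <E> in the grammar, multiplication binds more tightly than addition in the resulting tree, and parentheses override that order.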
Grammar of Reverse Polish Notation
V = {“a”, “b”, “d”, “+”, “*”}
Σ = {W, Z, X, O}
σ = W
P= {
<W> ::= <Z> <Z> <O> | <Z>
<Z> ::= <X> <X> <O> | <X>
<X> ::= “a” | “b” | “d”
<O> ::= “+” | “*”
}
(a + b) * d in RPN: a b + d * or d a b + *
Parse tree of a b + d *:
<W>
  <Z>
    <X>
      “a”
    <X>
      “b”
    <O>
      “+”
  <Z>
    <X>
      “d”
  <O>
    “*”
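Since none of the rules of this RPN grammar is recursive, the language it generates is finite and can be enumerated. The sketch below is an illustration only; RPN_GRAMMAR, sentences and LANGUAGE are hypothetical names, and the enumeration would not terminate for a recursive grammar.

```python
from itertools import product

RPN_GRAMMAR = {
    "W": [["Z", "Z", "O"], ["Z"]],
    "Z": [["X", "X", "O"], ["X"]],
    "X": [["a"], ["b"], ["d"]],
    "O": [["+"], ["*"]],
}


def sentences(symbol):
    """Return the set of terminal strings derivable from symbol.
    Terminates only because this grammar has no recursive rules."""
    if symbol not in RPN_GRAMMAR:  # terminal symbol
        return {symbol}
    result = set()
    for alternative in RPN_GRAMMAR[symbol]:
        parts = [sentences(part) for part in alternative]
        # Combine every choice for each position of the alternative.
        for combination in product(*parts):
            result.add(" ".join(combination))
    return result


LANGUAGE = sentences("W")
print("a b + d *" in LANGUAGE)
print("d a b + *" in LANGUAGE)
print("a b + d * d +" in LANGUAGE)
```

Both RPN forms of (a + b) * d are in the language. The last string, although a valid RPN expression, is not, because the grammar allows at most two levels of nesting.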
Algorithm for evaluating arithmetic
expressions in RPN
Example input: 2, 3, 4, 5, +, *, +
START
1. Read a symbol (from left to right).
2. Is the symbol a parameter?
   Yes: put the parameter on the top of the stack and go to step 1.
   No: go to step 3.
3. Is the symbol an operator?
   Yes: get parameters from the stack, execute the operation,
   put the result on the top of the stack and go to step 1.
   No: go to step 4.
4. Has the end of the input been reached?
   Yes: END.
   No: ERROR.
Input    Stack
2        2
3        2, 3
4        2, 3, 4
5        2, 3, 4, 5
+        2, 3, 9
*        2, 27
+        29
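The stack algorithm traced above can be sketched in a few lines of Python. This is a minimal illustration under assumptions not stated in the slides: evaluate_rpn and OPERATORS are hypothetical names, parameters are non-negative integers, only the operators + and * are supported, and the symbols arrive as a list of strings.

```python
import operator

# Supported binary operators (assumption: only + and *, as in the grammar).
OPERATORS = {"+": operator.add, "*": operator.mul}


def evaluate_rpn(symbols):
    """Evaluate an RPN expression given as a list of symbols."""
    stack = []
    for symbol in symbols:  # read symbols from left to right
        if symbol.isdigit():
            # Parameter: put it on the top of the stack.
            stack.append(int(symbol))
        elif symbol in OPERATORS:
            if len(stack) < 2:
                raise ValueError(f"not enough parameters for {symbol!r}")
            # Operator: take two parameters, apply it, push the result.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[symbol](left, right))
        else:
            raise ValueError(f"unknown symbol {symbol!r}")
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


result = evaluate_rpn(["2", "3", "4", "5", "+", "*", "+"])
print(result)
```

For the example input the intermediate stacks are the same as in the table above, and the final value left on the stack is 29.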