PowerPoint presentation


Formal grammars
A formal grammar is a system for defining
the syntax of a language by specifying which
sequences of symbols (sentences) are
considered grammatical.
The set of grammatical sentences of a language
may be very large or infinite, therefore it is
usually specified by a recursive definition.
Definition of the formal grammar G
G = < V, Σ, P, σ >
V – set of terminal symbols
Σ – set of nonterminal symbols, with the restriction
that V and Σ are disjoint
σ – start symbol (an element of Σ)
P – set of production rules of the form:
A –> B
where:
A – a sequence of symbols from V and Σ containing
at least one nonterminal,
B – the sequence of symbols from V and Σ (possibly
empty) that replaces A
Small subset of English grammar
V = {“the”, “a”, “cat”, “dog”, “saw”, “chased”}
Σ = {S, NP, VP, D, N, V}
S – sentence
D – determiner
NP – noun phrase
N – noun
VP – verb phrase
V – verb (as a nonterminal; not to be confused with the terminal set V)
σ = S
P=
{
S –> NP VP,
NP –> D N,
VP –> V NP,
D –> “the”,
D –> “a”,
N –> “cat”,
N –> “dog”,
V –> “saw”,
V –> “chased”
}
Derivation
Example of a leftmost derivation:
S –> NP VP
–> D N VP
–> “the” N VP
–> “the” “cat” VP
–> “the” “cat” V NP
–> “the” “cat” “chased” NP
–> “the” “cat” “chased” D N
–> “the” “cat” “chased” “a” N
–> “the” “cat” “chased” “a” “dog”
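The derivation above can be replayed mechanically. The following is a minimal sketch, assuming the production rules are stored in a dictionary; the names `RULES`, `leftmost_derivation` and `choices` are illustrative and do not come from the slides. At every step it rewrites the leftmost nonterminal, and `choices` selects which alternative of that nonterminal to apply.

```python
# Production rules of the small English grammar (illustrative encoding).
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
    "D": [["the"], ["a"]],
    "N": [["cat"], ["dog"]],
    "V": [["saw"], ["chased"]],
}


def leftmost_derivation(start, choices):
    """Return every sentential form, always rewriting the leftmost nonterminal.

    choices[i] is the index of the alternative applied at step i.
    """
    form = [start]
    steps = [list(form)]
    for choice in choices:
        # Position of the leftmost symbol that still has production rules.
        pos = next(i for i, symbol in enumerate(form) if symbol in RULES)
        form = form[:pos] + RULES[form[pos]][choice] + form[pos + 1:]
        steps.append(list(form))
    return steps


steps = leftmost_derivation("S", [0, 0, 0, 0, 0, 1, 0, 1, 1])
for sentential_form in steps:
    print(" ".join(sentential_form))
```

With the choices shown, the printed forms follow the same nine steps as the derivation above, ending with the sentence: the cat chased a dog.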
Parse trees
S
  NP
    D – “the”
    N – “cat”
  VP
    V – “chased”
    NP
      D – “a”
      N – “dog”
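A parse tree like the one above can be built by a small top-down parser. This is only a sketch under stated assumptions: the tree is represented as nested tuples, and `LEXICON`, `parse_sentence` and the helper names are hypothetical. It covers only the rules S –> NP VP, NP –> D N and VP –> V NP.

```python
# Maps each terminal to the nonterminal that produces it (illustrative).
LEXICON = {"the": "D", "a": "D", "cat": "N", "dog": "N",
           "saw": "V", "chased": "V"}


def parse_sentence(words):
    """Return the parse tree of `words` as nested tuples, or raise ValueError."""
    pos = 0

    def word(category):
        # Consume one word and check that it belongs to `category`.
        nonlocal pos
        if pos >= len(words) or LEXICON.get(words[pos]) != category:
            raise ValueError(f"expected {category} at position {pos}")
        leaf = (category, words[pos])
        pos += 1
        return leaf

    def noun_phrase():          # NP -> D N
        return ("NP", word("D"), word("N"))

    def verb_phrase():          # VP -> V NP
        return ("VP", word("V"), noun_phrase())

    tree = ("S", noun_phrase(), verb_phrase())   # S -> NP VP
    if pos != len(words):
        raise ValueError("unexpected trailing words")
    return tree


tree = parse_sentence("the cat chased a dog".split())
print(tree)
```

The nested tuples mirror the tree: the first element of each tuple is the node label and the remaining elements are its children.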
Backus-Naur notation (BNF) for production rules
::= – is defined as
| – separates alternatives
<> – denotes nonterminal symbols
Production rules for the small subset of English grammar
P=
{
<S> ::= <NP> <VP>,
<NP> ::= <D> <N>,
<VP> ::= <V> <NP>,
<D> ::= “the” | “a”,
<N> ::= “cat” | “dog”,
<V> ::= “saw” | “chased”
}
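Rules written in this notation are easy to read into a data structure. The sketch below is an assumption-laden illustration, not a full BNF reader: `parse_bnf` and `BNF_TEXT` are hypothetical names, straight quotes replace the typographic ones, and a terminal that itself contains "|" or whitespace is not supported.

```python
def parse_bnf(text):
    """Turn lines of the form '<A> ::= x y | z' into {head: [alternatives]}."""
    rules = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        head, body = line.split("::=", 1)
        # A trailing comma (as used on the slide) is tolerated and dropped.
        body = body.strip().rstrip(",")
        rules[head.strip()] = [alt.split() for alt in body.split("|")]
    return rules


BNF_TEXT = '''
<S> ::= <NP> <VP>,
<NP> ::= <D> <N>,
<VP> ::= <V> <NP>,
<D> ::= "the" | "a",
<N> ::= "cat" | "dog",
<V> ::= "saw" | "chased"
'''

rules = parse_bnf(BNF_TEXT)
for head, alternatives in rules.items():
    print(head, alternatives)
```

Each nonterminal maps to a list of alternatives, and each alternative is a list of symbols in the order they appear on the right-hand side.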
Classification of formal grammars
Type 3 – Regular grammars (finite state grammars)
Production rules: A –> xB, C –> y
A, B, C – nonterminal symbols
x, y – terminal symbols
Recognizing automaton: finite state automaton
Storage required: finite storage
Parsing complexity: O(n)

Type 2 – Context free grammars
Production rules: A –> BC…D
A – a nonterminal symbol
BC…D – any sequence of terminal or nonterminal symbols
Recognizing automaton: pushdown automaton
Storage required: pushdown stack
Parsing complexity: O(n^3)

Type 1 – Context sensitive grammars
Production rules: aAz –> aBC…Dz
A – a nonterminal symbol
a, z – sequences of zero or more terminal or nonterminal symbols
BC…D – any non-empty sequence of terminal or nonterminal symbols
Recognizing automaton: linear bounded automaton (a non-deterministic Turing machine with bounded tape)
Storage required: tape being a linear multiple of the input length
Parsing complexity: PSPACE-complete

Type 0 – Unrestricted grammars (general rewrite grammars)
Production rules: allowed to transform any sequence of symbols into any other sequence of symbols.
To convert a context-sensitive grammar into an unrestricted grammar, replacement of any nonterminal symbol A with an empty sequence needs to be allowed.
Recognizing automaton: Turing machine
Storage required: infinite tape
Parsing complexity: undecidable
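To illustrate why type 3 grammars need only finite storage and linear time, here is a minimal sketch that treats the nonterminals of a regular grammar as the states of a finite automaton. The grammar itself (S –> aS | bB, B –> bB | b) is a hypothetical example in the A –> xB / C –> y form, and the names `REGULAR_RULES`, `ACCEPT` and `accepts` are assumptions made for this illustration.

```python
# Hypothetical type 3 grammar: S -> a S | b B ; B -> b B | b
# Each rule is (terminal, next nonterminal); None marks a rule C -> y.
REGULAR_RULES = {
    "S": [("a", "S"), ("b", "B")],
    "B": [("b", "B"), ("b", None)],
}
ACCEPT = "ACCEPT"   # extra state reached by a rule of the form C -> y


def accepts(word, start="S"):
    """Scan `word` once, tracking the set of states the automaton may be in."""
    states = {start}
    for ch in word:
        next_states = set()
        for state in states:
            for terminal, target in REGULAR_RULES.get(state, []):
                if terminal == ch:
                    next_states.add(ACCEPT if target is None else target)
        states = next_states
    return ACCEPT in states


for sample in ["aabb", "bbb", "ab", "abba", ""]:
    print(repr(sample), accepts(sample))
```

The set of possible states never grows beyond the number of nonterminals plus one, so the work per input symbol is bounded by a constant that depends only on the grammar.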
Classification of formal grammars
Unrestricted grammars
⊃ Context sensitive grammars
⊃ Context free grammars
⊃ Regular grammars
Generalization and difficulty of parsing increase from regular grammars up to unrestricted grammars.
Parsing methods
Top-down parsing approach:
LL parsers – Left-to-right scan, Leftmost derivation
Bottom-up parsing approach:
LR parsers – Left-to-right scan, Rightmost derivation (constructed in reverse)
LALR parsers – Look-Ahead LR (use of lookahead symbols to aid the parsing process)
GLR parsers – Generalized LR (multiple parsing threads pursued in parallel in order to handle ambiguities)
Grammar of Simple Arithmetic Expressions
V = {“a”, “b”, “d”, “+”, “*”, “(”, “)”}
Σ = {E, C, F}
E – expression
C – component
F – factor
σ = E
P= {
<E> ::= <C> | <E> “+” <C> | <C> “+” <E>
<C> ::= <F> | <F> “*” <C> | <C> “*” <F>
<F> ::= “(” <E> “)” | “a” | “b” | “d”
}
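A top-down (recursive-descent) parser cannot use the left-recursive alternatives <E> “+” <C> and <C> “*” <F> directly, because it would call itself without consuming input. The sketch below therefore assumes only the right-recursive alternatives of the grammar above (<E> ::= <C> | <C> “+” <E>, <C> ::= <F> | <F> “*” <C>), which accept the same strings. As an illustration it emits the expression in Reverse Polish Notation; `to_rpn` and its helper names are hypothetical.

```python
def to_rpn(text):
    """Parse an infix expression over a, b, d, +, *, ( ) and return RPN tokens."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0
    out = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(symbol):
        nonlocal pos
        if peek() != symbol:
            raise SyntaxError(f"expected {symbol!r} at position {pos}")
        pos += 1

    def expression():           # <E> ::= <C> | <C> "+" <E>
        component()
        if peek() == "+":
            expect("+")
            expression()
            out.append("+")     # operator is emitted after both operands

    def component():            # <C> ::= <F> | <F> "*" <C>
        factor()
        if peek() == "*":
            expect("*")
            component()
            out.append("*")

    def factor():               # <F> ::= "(" <E> ")" | "a" | "b" | "d"
        nonlocal pos
        if peek() == "(":
            expect("(")
            expression()
            expect(")")
        elif peek() in ("a", "b", "d"):
            out.append(peek())
            pos += 1
        else:
            raise SyntaxError(f"unexpected symbol at position {pos}")

    expression()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected symbol at position {pos}")
    return out


print(" ".join(to_rpn("(a + b) * d")))
print(" ".join(to_rpn("a + b * d")))
```

Each parsing function corresponds to one nonterminal, and the single symbol returned by `peek()` is enough to choose between alternatives, which is the idea behind LL parsing with one symbol of lookahead.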
Grammar of Reverse Polish Notation
V = {“a”, “b”, “d”, “+”, “*”}
Σ = {W, Z, X, O}
σ = W
P= {
<W> ::= <Z> <Z> <O> | <Z>
<Z> ::= <X> <X> <O> | <X>
<X> ::= “a” | “b” | “d”
<O> ::= “+” | “*”
}
(a + b) * d is written in RPN as a b + d * or, equivalently, d a b + *
<W>
  <Z>
    <X> – “a”
    <X> – “b”
    <O> – “+”
  <Z>
    <X> – “d”
  <O> – “*”
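None of the four rules of this RPN grammar refers back to <W> or <Z> recursively, so it generates only a finite set of shallow expressions. A minimal sketch that enumerates that set and checks membership follows; the names `concat`, `LANGUAGE` and `is_generated` are illustrative assumptions.

```python
from itertools import product

X = [["a"], ["b"], ["d"]]        # <X> ::= "a" | "b" | "d"
O = [["+"], ["*"]]               # <O> ::= "+" | "*"


def concat(*parts):
    """All concatenations that take one alternative from each part."""
    return [sum(combo, []) for combo in product(*parts)]


Z = concat(X, X, O) + X          # <Z> ::= <X> <X> <O> | <X>
W = concat(Z, Z, O) + Z          # <W> ::= <Z> <Z> <O> | <Z>
LANGUAGE = {" ".join(sentence) for sentence in W}


def is_generated(text):
    """True if the whitespace-separated symbols form a sentence of the grammar."""
    return " ".join(text.split()) in LANGUAGE


for sample in ["a b + d *", "d a b + *", "a b + d * a +"]:
    print(sample, "->", is_generated(sample))
```

Both RPN forms of (a + b) * d are sentences of the grammar, while a longer chain such as a b + d * a + is not, because it would need a deeper nesting than <W> and <Z> allow.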
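Algorithm for evaluating arithmetic
expressions in RPN
Example input: 2, 3, 4, 5, +, *, +, followed by the end of input
START
1. Read a symbol (from the left to the right).
2. Is it a parameter? Yes: put the parameter on the top of the stack and go back to step 1. No: go to step 3.
3. Is it an operator? Yes: get parameters from the stack, execute the operation, put the result on the top of the stack and go back to step 1. No: go to step 4.
4. Is it the end of the input? Yes: END. No: ERROR.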
Input   Stack
2       2
3       2, 3
4       2, 3, 4
5       2, 3, 4, 5
+       2, 3, 9
*       2, 27
+       29
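The algorithm and the trace above can be sketched in a few lines. This is a minimal illustration, assuming integer parameters and only the operators + and *; the function name `evaluate_rpn` and the optional `trace` argument are hypothetical.

```python
def evaluate_rpn(tokens, trace=None):
    """Evaluate RPN tokens with a stack; optionally record the stack after each symbol."""
    operators = {"+": lambda x, y: x + y, "*": lambda x, y: x * y}
    stack = []
    for token in tokens:
        if token in operators:
            if len(stack) < 2:
                raise ValueError(f"operator {token!r} needs two parameters")
            right = stack.pop()
            left = stack.pop()
            stack.append(operators[token](left, right))
        else:
            try:
                stack.append(int(token))      # parameter: push it on the stack
            except ValueError:
                raise ValueError(f"unknown symbol: {token!r}") from None
        if trace is not None:
            trace.append(list(stack))
    if len(stack) != 1:
        raise ValueError("expression does not reduce to a single result")
    return stack[0]


trace = []
result = evaluate_rpn("2 3 4 5 + * +".split(), trace)
for snapshot in trace:
    print(", ".join(str(value) for value in snapshot))
print("result:", result)
```

The recorded snapshots reproduce the Stack column of the table, and the single value left on the stack at the end of the input is the result.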