COMP4031 2006-7 Artificial Intelligence for Games and

Parser construction tools: YACC
• Yacc is available on Unix systems; it creates LALR parsers in C
• The yacc specification may '#include' a lexical analyzer produced by Lex, or by other means
• The ly library (linked with -ly) contains the LALR parser which uses the parsing table built by yacc and calls the lexer 'yylex'
• Build pipeline: yacc specification → yacc → y.tab.c; then y.tab.c, the ly library and more of your C source → C compiler → executable → output
http://csiweb.ucd.ie/staff/acater/comp30330.html
Compiler Construction
The three parts of a yacc specification
1. declarations
   – ordinary C, enclosed in %{ … %}, copied verbatim into y.tab.c
   – declarations for use by yacc, such as %token, %left, %right, %nonassoc
2. separator – %%
3. grammar rules. Each one has
   – a nonterminal name followed by a colon
   – productions separated by vertical bars, each possibly with additional semantic actions and precedence information
   – a final semicolon
4. separator – %%
5. supporting C routines
   – there must at least be a lexical analyser named yylex
   – commonly accomplished by writing #include "lex.yy.c" where the lex program has been used to build the lexer, but yylex can also be hand-written
Simple Desk-Calculator example
%{
#include <ctype.h>   /* declares isdigit among others */
#include <stdio.h>   /* declares printf and getchar */
%}
%token DIGIT         /* declares the token DIGIT for use in grammar rules and also in lexer code */
%%
line   : expr '\n'         { printf("%d\n", $1); }
       ;
expr   : expr '+' term     { $$ = $1 + $3; }   /* a semantic rule */
       | term              /* default semantic rule $$ = $1 is useful for single productions */
       ;
term   : term '*' factor   { $$ = $1 * $3; }
       | factor
       ;
factor : '(' expr ')'      { $$ = $2; }
       | DIGIT
       ;
%%
/* #include "lex.yy.c" here to use the yylex routine built by Lex */
/* lexer uses C variable 'yylval' to communicate attribute value */
int yylex(void) {
    int c = getchar();
    if (isdigit(c)) { yylval = c - '0'; return DIGIT; }
    return c;
}
Ambiguous grammars in Yacc
• Yacc declarations allow for shift/reduce and reduce/reduce conflicts to be resolved using operator precedence and operator associativity information
– Yacc does have default methods for resolving conflicts but it is considered wise to find out (using the -v option) what conflicts arose and how they were resolved.
– The declarations provide a way to override Yacc's defaults
– Productions have the precedence of their rightmost terminal, unless otherwise specified by a %prec element
• the declaration keywords %left, %right and %nonassoc inform Yacc that the tokens following are to be treated as left-associative (as binary +, -, * and / commonly are), right-associative (as exponentiation and assignment operators often are), or non-associative (as binary < and > often are)
• the order of declarations informs yacc that the tokens should be accorded increasing precedence
    %left '+' '-'
    %left '*' '/'
  effect is that * has higher precedence than +, so x+y*z is grouped like x+(y*z)
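The way precedence and associativity settle a shift/reduce conflict can be sketched in C. This is only an illustration, not yacc's actual source: the names Assoc, Action and resolve_conflict are hypothetical, and precedence levels are assumed to be positive integers that grow with each later declaration line.

```c
#include <assert.h>

/* Associativity as declared by %left, %right or %nonassoc. */
typedef enum { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC } Assoc;

/* Possible outcomes when the parser could either shift or reduce. */
typedef enum { ACT_SHIFT, ACT_REDUCE, ACT_ERROR } Action;

/*
 * rule_level:  precedence of the production that could be reduced
 *              (that of its rightmost terminal, or of its %prec token).
 * token_level: precedence of the lookahead token that could be shifted.
 * token_assoc: associativity of the lookahead token; it only matters
 *              when both levels are equal.
 * Levels are hypothetical integers: 1 for the first declaration line,
 * 2 for the second, and so on.
 */
Action resolve_conflict(int rule_level, int token_level, Assoc token_assoc)
{
    assert(rule_level > 0 && token_level > 0);  /* both need a declared precedence */

    if (token_level > rule_level) return ACT_SHIFT;   /* token binds tighter */
    if (token_level < rule_level) return ACT_REDUCE;  /* rule binds tighter */

    /* Same level: associativity decides. */
    switch (token_assoc) {
    case ASSOC_LEFT:  return ACT_REDUCE;  /* x-y-z groups as (x-y)-z */
    case ASSOC_RIGHT: return ACT_SHIFT;   /* groups to the right instead */
    default:          return ACT_ERROR;   /* x<y<z is rejected */
    }
}
```

With '+' at level 1 and '*' at level 2, resolve_conflict(1, 2, ASSOC_LEFT) returns ACT_SHIFT, which is why x+y*z is grouped like x+(y*z).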
Semantic actions in Yacc
• Each time the lexer returns a token, it can also produce an attribute value in the variable named yylval
• Attribute values for nonterminals can also be produced by semantic actions
– several C statements enclosed in { … }
– $$ refers to attribute value for lhs nonterminal
– $1, $2 etc refer to attribute values for successive rhs grammar symbols
• The Desk Calculator example uses only simple arithmetic operations. True compilers can have much more complex code in their productions' semantic actions
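To make $$, $1 and $3 more concrete, here is a minimal C sketch of what a reduction by expr : expr '+' term { $$ = $1 + $3; } does to a stack of attribute values. The ValueStack type and the functions vs_push, reduce_add and demo_add are hypothetical names invented for this illustration; the code yacc really generates is organised differently.

```c
#include <assert.h>

#define STACK_MAX 64

/* Hypothetical value stack: one attribute value per grammar symbol
 * currently on the parser's stack. */
typedef struct {
    int vals[STACK_MAX];
    int top;              /* number of values currently stored */
} ValueStack;

void vs_push(ValueStack *s, int v)
{
    assert(s->top < STACK_MAX);
    s->vals[s->top++] = v;
}

/* Reduce by  expr : expr '+' term  { $$ = $1 + $3; } */
void reduce_add(ValueStack *s)
{
    assert(s->top >= 3);
    int *rhs = &s->vals[s->top - 3];  /* rhs[0] is $1, rhs[1] is $2, rhs[2] is $3 */
    int result = rhs[0] + rhs[2];     /* the semantic action: $$ = $1 + $3 */
    s->top -= 3;                      /* pop the three right-hand-side symbols */
    vs_push(s, result);               /* push $$ as the value of the lhs expr */
}

/* Simulate the values present just before and after the reduction. */
int demo_add(int left, int right)
{
    ValueStack s = { {0}, 0 };
    vs_push(&s, left);    /* value of the expr already parsed */
    vs_push(&s, '+');     /* slot for the '+' token; its value is not used */
    vs_push(&s, right);   /* value of the term */
    reduce_add(&s);
    return s.vals[s.top - 1];
}
```

Calling demo_add(2, 3) leaves 5 on top of the stack, the value the parent production would then see as one of its own $n.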
Bigger Desk-Calculator example
%{
#include <ctype.h>
#include <stdio.h>
#define YYSTYPE double /* double type for Yacc stack */
%}
%token NUMBER
%left '+' '-'
%left '*' '/'
%right UMINUS
%%
lines : lines expr '\n'        { printf("%g\n", $2); }
      | lines '\n'
      | /* empty */
      | error '\n'             { yyerror("reenter previous line"); yyerrok; }
      ;
expr  : expr '+' expr          { $$ = $1 + $3; }
      | expr '-' expr          { $$ = $1 - $3; }
      | expr '*' expr          { $$ = $1 * $3; }
      | expr '/' expr          { $$ = $1 / $3; }
      | '(' expr ')'           { $$ = $2; }
      | '-' expr %prec UMINUS  { $$ = -$2; }
      | NUMBER
      ;
%%
#include "lex.yy.c"
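The slide leaves the lexer to Lex. As a rough idea of what a hand-written yylex for this grammar could look like, here is a sketch that reads from a string rather than from standard input so that it can be tried in isolation. The input buffer, set_input, and the token code 257 for NUMBER are assumptions made for the example; in a real build yacc assigns the token code in y.tab.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define NUMBER 257   /* assumed token code; normally generated by yacc */

double yylval;                  /* attribute value of the latest token (YYSTYPE is double) */
static const char *input = "";  /* hypothetical input buffer used instead of stdin */

/* Point the lexer at a new piece of text. */
void set_input(const char *text)
{
    assert(text != NULL);
    input = text;
}

int yylex(void)
{
    /* Skip blanks and tabs, but not '\n': the grammar uses it as a token. */
    while (*input == ' ' || *input == '\t')
        input++;

    if (*input == '\0')
        return 0;               /* 0 tells the parser that input has ended */

    if (isdigit((unsigned char)*input) || *input == '.') {
        char *end;
        yylval = strtod(input, &end);   /* convert the longest numeric prefix */
        if (end != input) {
            input = end;
            return NUMBER;
        }
    }

    /* Any other character is its own token, e.g. '+', '(' or '\n'. */
    return (unsigned char)*input++;
}
```

After set_input("3.5 * 2\n"), successive calls to yylex return NUMBER (with yylval set to 3.5), '*', NUMBER (yylval 2), '\n', and finally 0.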
Error handling in Yacc-generated parsers
• Rules may include error productions for selected nonterminals
– stmt : β {…} | γ {…} | δ {…} | error α
– error is a Yacc reserved word
• If the parser has no action for a combination of {state, input token}, then
1. it scans its stack for a state with an error production among its items
2. it pushes "error" onto its symbol stack
3. it scans the input stream for a sequence reducible to α
– which may be empty
4. it pushes all of α onto its symbol stack
5. it reduces according to the error production
– which may cause semantic actions to be carried out
– often involving routines yyerror(msg) and yyerrok
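The effect of a rule such as error '\n' can be pictured with a much simplified C sketch: on a syntax error, input is discarded up to the synchronising newline and parsing resumes with the next line. This does not model yacc's state stack or the steps listed above in detail. The function parse_line is a hypothetical stand-in for a real parser (it accepts only lines made of digits), and Summary, skip_past and parse_all are likewise names invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Counts reported by parse_all. */
typedef struct {
    int good;     /* lines parsed without error */
    int errors;   /* lines abandoned after a syntax error */
} Summary;

/* Hypothetical stand-in for a parser: a line is valid only if it is a
 * nonempty run of digits ended by '\n'.  Returns a pointer just past the
 * line, or NULL on a syntax error. */
static const char *parse_line(const char *p)
{
    if (!isdigit((unsigned char)*p))
        return NULL;
    while (isdigit((unsigned char)*p))
        p++;
    return (*p == '\n') ? p + 1 : NULL;
}

/* Discard input up to and including the synchronising character. */
static const char *skip_past(const char *p, char sync)
{
    while (*p != '\0' && *p != sync)
        p++;
    return (*p == sync) ? p + 1 : p;
}

/* Parse every line; after an error, resynchronise at the next '\n'. */
Summary parse_all(const char *text)
{
    assert(text != NULL);
    Summary s = { 0, 0 };
    while (*text != '\0') {
        const char *next = parse_line(text);
        if (next != NULL) {
            s.good++;
            text = next;
        } else {
            s.errors++;                    /* where yyerror(msg) would be called */
            text = skip_past(text, '\n');  /* recover, then carry on parsing */
        }
    }
    return s;
}
```

For the input "12\n1x\n7\n" the second line is abandoned at the 'x', yet the third line is still parsed, giving two good lines and one error.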
Some other free parser generators
see e.g. www.thefreecountry.com/programming/compilerconstruction.html

Name   | Languages                             | Parser Type       | Lexers | Impl. Lang | Yacc Compatible | Some Features
Antlr  | C, C++, Java, C#, Objective C, Python | Recursive Descent | Y      | Java?      | N               | Also ASTs (Abstract Syntax Trees)
JavaCC | Java                                  | Recursive Descent | Y      | Java       | N               |
Bison  | C                                     | LALR              | N      | C          | Y               |
Yacc   | C                                     | LALR              | N      | C          | Y!              |
YaYacc | C++                                   | LALR              | N      |            | Y               | Facilitates multiple parsers in one program; FreeBSD