COMP4031 2006-7 Artificial Intelligence for Games and
Parser construction tools: YACC
• Yacc is available on Unix systems; it creates LALR parsers in C
The yacc specification may ‘#include’ a lexical analyzer produced by Lex, or by other means.
The ly library (linked with -ly) supplies default main and yyerror routines; the LALR parser itself, yyparse, is generated into y.tab.c, uses the parsing table built by yacc, and calls the lexer ‘yylex’.
http://csiweb.ucd.ie/staff/acater/comp30330.html
[Figure: yacc specification → yacc → y.tab.c → C compiler (together with the ly library and more of your C source) → a. → output]
Compiler Construction
The three parts of a yacc specification
1. declarations
   – ordinary C, enclosed in %{ … %}, copied verbatim into y.tab.c
   – declarations for use by yacc, such as %token, %left, %right, %nonassoc
2. separator – %%
3. grammar rules. Each one has
   – a nonterminal name followed by a colon
   – productions separated by vertical bars, each possibly with additional semantic actions and precedence information
   – a final semicolon
4. separator – %%
5. supporting C routines
   – there must at least be a lexical analyser named yylex
   – commonly accomplished by writing #include "lex.yy.c" where the lex program has been used to build the lexer, but the lexer can also be hand-written
Simple Desk-Calculator example
%{
#include <ctype.h>      /* declares isdigit among others */
#include <stdio.h>      /* declares printf and getchar */
%}
%token DIGIT            /* declares the token DIGIT for use in grammar rules
                           and also in lexer code */
%%
line    : expr '\n'         { printf("%d\n", $1); }   /* a semantic rule */
        ;
expr    : expr '+' term     { $$ = $1 + $3; }
        | term              /* default semantic rule $$ = $1
                               is useful for single productions */
        ;
term    : term '*' factor   { $$ = $1 * $3; }
        | factor
        ;
factor  : '(' expr ')'      { $$ = $2; }
        | DIGIT
        ;
%%
/* #include "lex.yy.c" here to use the yylex routine built by Lex */
/* lexer uses C variable 'yylval' to communicate attribute value */
int yylex(void) { int c; c = getchar();
    if (isdigit(c)) { yylval = c - '0'; return DIGIT; }
    return c; }
Ambiguous grammars in Yacc
• Yacc declarations allow shift/reduce conflicts to be resolved using operator precedence and operator associativity information
  – Yacc does have default methods for resolving conflicts (shift on a shift/reduce conflict; reduce by the earlier production on a reduce/reduce conflict), but it is considered wise to find out (using the -v option) what conflicts arose and how they were resolved
  – The declarations provide a way to override Yacc’s defaults
  – Productions have the precedence of their rightmost terminal, unless otherwise specified by a %prec element
• the declaration keywords %left, %right and %nonassoc inform Yacc that the tokens following are to be treated as left-associative (as binary +, -, * and / commonly are), right-associative (as exponentiation and assignment often are), or non-associative (as binary < & > often are)
• the order of declarations informs yacc that the tokens should be accorded increasing precedence
    %left '+' '-'
    %left '*' '/'
effect is that * has higher precedence than +, so x+y*z is grouped like x+(y*z)
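The way precedence and associativity settle a shift/reduce conflict can be sketched as a small C function. This is a simplified illustration only, not yacc's actual source: the names Assoc, Action and resolve_conflict are hypothetical, and precedence levels are assumed to be numbered 1, 2, 3, ... in the order of the %left/%right/%nonassoc lines.

```c
#include <assert.h>

/* Hypothetical types for illustration; yacc's internals differ. */
typedef enum { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONE } Assoc;
typedef enum { ACT_SHIFT, ACT_REDUCE, ACT_ERROR } Action;

/*
 * Decide a shift/reduce conflict between a production (whose precedence is
 * that of its rightmost terminal, or of its %prec token) and the lookahead
 * token. Larger numbers mean higher precedence, i.e. a later declaration.
 */
Action resolve_conflict(int rule_prec, int token_prec, Assoc token_assoc)
{
    /* Both sides must have a declared precedence for this rule to apply. */
    assert(rule_prec > 0 && token_prec > 0);

    if (token_prec > rule_prec) return ACT_SHIFT;   /* lookahead binds tighter */
    if (token_prec < rule_prec) return ACT_REDUCE;  /* production binds tighter */

    /* Equal precedence: associativity decides. */
    switch (token_assoc) {
    case ASSOC_LEFT:  return ACT_REDUCE;  /* x-y-z groups as (x-y)-z */
    case ASSOC_RIGHT: return ACT_SHIFT;   /* groups to the right instead */
    default:          return ACT_ERROR;   /* %nonassoc: x<y<z is rejected */
    }
}
```

With ‘+’ at level 1 and ‘*’ at level 2, the parser that has seen x+y and is looking at * calls resolve_conflict(1, 2, ASSOC_LEFT), which yields a shift and hence the grouping x+(y*z).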
Semantic actions in Yacc
• Each time the lexer returns a token, it can also produce an attribute value in the variable named yylval
• Attribute values for nonterminals can also be produced by semantic actions
  – several C statements enclosed in { … }
  – $$ refers to attribute value for lhs nonterminal
  – $1, $2 etc refer to attribute values for successive rhs grammar symbols
• Desk Calculator example uses only simple arithmetic operations. True compilers can have much more complex code in their productions’ semantic actions
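The $$ and $n notation can be read as operations on a value stack kept in parallel with the parser's state stack. The following is a minimal sketch under that reading, assuming int-valued attributes; vstack, push_value and reduce_add are hypothetical names, and the code that yacc actually generates manages its stack differently in detail.

```c
#include <assert.h>

#define STACK_MAX 64

static int vstack[STACK_MAX];   /* attribute values of symbols on the parse stack */
static int vtop = 0;            /* number of values currently stored */

/* Called when a symbol is shifted: its attribute (e.g. yylval) is pushed. */
void push_value(int v)
{
    assert(vtop < STACK_MAX);
    vstack[vtop++] = v;
}

/*
 * Reduction by  expr : expr '+' term  with action { $$ = $1 + $3; }.
 * The three right-hand-side values are the top three stack entries;
 * they are popped and replaced by the single value $$.
 */
int reduce_add(void)
{
    assert(vtop >= 3);
    int d1 = vstack[vtop - 3];   /* $1: value of expr */
    int d3 = vstack[vtop - 1];   /* $3: value of term ($2 belongs to '+') */
    int result = d1 + d3;        /* $$ */
    vtop -= 3;                   /* pop the right-hand side */
    vstack[vtop++] = result;     /* push the value of the lhs nonterminal */
    return result;
}
```

For the input 2+5, pushing the values for 2, ‘+’ and 5 and then calling reduce_add() leaves the single value 7 on the stack, which is what a later rule sees as its own $1.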
Bigger Desk-Calculator example
%{
#include <ctype.h>
#include <stdio.h>
#define YYSTYPE double   /* double type for Yacc stack */
%}
%token NUMBER
%left '+' '-'
%left '*' '/'
%right UMINUS
%%
lines   : lines expr '\n'       { printf("%g\n", $2); }
        | lines '\n'
        | /* empty */
        | error '\n'            { yyerror("reenter previous line");
                                  yyerrok; }
        ;
expr    : expr '+' expr         { $$ = $1 + $3; }
        | expr '-' expr         { $$ = $1 - $3; }
        | expr '*' expr         { $$ = $1 * $3; }
        | expr '/' expr         { $$ = $1 / $3; }
        | '(' expr ')'          { $$ = $2; }
        | '-' expr %prec UMINUS { $$ = -$2; }
        | NUMBER
        ;
%%
#include "lex.yy.c"
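Where the lexer is hand-written rather than built by Lex, a routine along the following lines could stand in for lex.yy.c. It is only a sketch under stated assumptions: the token code 257 for NUMBER is assumed (yacc normally assigns it), set_input is a hypothetical helper, and the input is read from a string rather than standard input so that the fragment is self-contained.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define NUMBER 257        /* assumed token code; normally assigned by yacc */
#define YYSTYPE double    /* matches the double-valued Yacc stack */

YYSTYPE yylval;                 /* attribute value of the token just returned */
static const char *src = "";    /* hypothetical input buffer */

/* Hypothetical helper: point the lexer at a string to scan. */
void set_input(const char *text)
{
    assert(text != NULL);
    src = text;
}

/* Returns NUMBER (with its value in yylval), a single character, or 0 at end. */
int yylex(void)
{
    while (*src == ' ' || *src == '\t')     /* skip blanks, but keep '\n' */
        src++;
    if (*src == '\0')
        return 0;                           /* 0 tells the parser: end of input */
    if (isdigit((unsigned char)*src) || *src == '.') {
        char *end;
        yylval = strtod(src, &end);         /* convert the longest numeric prefix */
        if (end != src) {
            src = end;
            return NUMBER;
        }
    }
    return (unsigned char)*src++;           /* operators, parentheses, newline */
}
```

After set_input("3.5+2\n"), successive calls to yylex() return NUMBER (with yylval 3.5), ‘+’, NUMBER (with yylval 2), the newline character, and finally 0.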
Error handling in Yacc-generated parsers
• Rules may include error productions for selected nonterminals
  – stmt : β {…} | γ {…} | δ {…} | error α
  – error is a Yacc reserved word
• If the parser has no action for a combination of {state, input token}, then
  1. it scans its stack for a state with an error production among its items
  2. it pushes “error” onto its symbol stack
  3. it scans the input stream for a sequence reducible to α
     – which may be empty
  4. it pushes all of α onto its symbol stack
  5. it reduces according to the error production
     – which may cause semantic actions to be carried out
     – often involving routines yyerror(msg) and yyerrok
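Two of the recovery steps listed above, finding a stack state that can accept “error” and skipping input up to a synchronising token, can be sketched in C. This is a much simplified illustration, not the code yacc generates: pop_to_error_state, skip_past_sync and the shifts_error table are hypothetical, and the reduction itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Pop parser states until the top one can accept the "error" symbol.
 * shifts_error[s] is assumed to be true when state s has an error
 * production among its items. Returns the new stack height, or 0 if
 * no such state exists (the parse would then be abandoned).
 */
size_t pop_to_error_state(const int *states, size_t height,
                          const bool *shifts_error)
{
    assert(states != NULL && shifts_error != NULL);
    while (height > 0 && !shifts_error[states[height - 1]])
        height--;
    return height;
}

/*
 * For a rule such as  lines : error '\n'  the parser discards tokens
 * until the synchronising token appears. Returns the index just past
 * that token, or n if the input runs out first.
 */
size_t skip_past_sync(const int *tokens, size_t n, size_t pos, int sync)
{
    assert(tokens != NULL);
    while (pos < n && tokens[pos] != sync)
        pos++;
    return pos < n ? pos + 1 : n;
}
```

With the token sequence 1 + + 2 newline 3 newline and an error detected at the second ‘+’, skip_past_sync resumes at the token 3, so the rest of the faulty line is dropped and the next line is parsed normally.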
Some other free parser generators
see e.g. www.thefreecountry.com/programming/compilerconstruction.html

Name    | Languages                             | Parser Type       | Lexers | Impl. Lang | Yacc Compatible | Some Features
Antlr   | C, C++, Java, C#, Objective C, Python | Recursive Descent | Y      | Java?      | N               | Also ASTs (Abstract Syntax Trees)
JavaCC  | Java                                  | Recursive Descent | Y      | Java       | N               |
Bison   | C                                     | LALR              | N      | C          | Y               |
Yacc    | C                                     | LALR              | N      | C          | Y!              |
YaYacc  | C++                                   | LALR              | N      |            | Y               | Facilitates multiple parsers in one program; FreeBSD