High Level Languages
Java (Object Oriented)
This Course
Jython in Java
Relation
ASP
RDF (Horn Clause Deduction, Semantic Web)
Dr. Philip Cannata
Programming Languages
Lexical and Syntactic Analysis
• Chomsky Grammar Hierarchy
• Lexical Analysis – Tokenizing
• Syntactic Analysis – Parsing
• Hmm Concrete Syntax
• Hmm Abstract Syntax
[Photo: Noam Chomsky]
Chomsky Hierarchy
• Regular grammar – used for tokenizing
• Context-free grammar (BNF) – used for parsing
• Context-sensitive grammar – not really used for programming languages
Regular Grammar
• Simplest; least powerful
• Equivalent to:
– Regular expression (think of Perl)
– Finite-state automaton
• Right regular grammar:
ω ∈ Terminal*,
A and B ∈ Nonterminal
A → ωB
A → ω
• Example:
Integer → 0 Integer | 1 Integer | ... | 9 Integer | 0 | 1 | ... | 9
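The equivalence between a right regular grammar and a regular expression can be checked on the Integer example. The sketch below is only an illustration: the function names `derives_integer` and `matches_regex` are made up for this example, and the regular expression `[0-9]+` is assumed to be the counterpart of the Integer grammar.

```python
import re

DIGITS = "0123456789"

def derives_integer(s: str) -> bool:
    """Follow the right regular productions, consuming one terminal per step.

    Integer -> d Integer   (a digit followed by more Integer)
    Integer -> d           (a final digit)
    """
    if s == "" or s[0] not in DIGITS:
        return False  # no production derives the empty string or a non-digit
    # Either the string ends here (Integer -> d) or we keep deriving the rest.
    return len(s) == 1 or derives_integer(s[1:])

def matches_regex(s: str) -> bool:
    """The same language written as a regular expression."""
    return re.fullmatch(r"[0-9]+", s) is not None

# Both recognizers agree on every sample, accepted or rejected.
for text in ["352", "0", "", "35a"]:
    assert derives_integer(text) == matches_regex(text)
```

Because the only nonterminal on a right-hand side sits at the far right, the recognizer never needs to remember anything except where it is in the input, which is what a finite-state automaton does.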
Regular Grammar
• Less powerful than context-free grammars
• The following is not a regular language
{ aⁿ bⁿ | n ≥ 1 }
i.e., cannot balance: ( ), { }, begin end
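To see why balancing is out of reach for a finite-state automaton, it helps to look at what a recognizer for { aⁿ bⁿ | n ≥ 1 } has to remember. The sketch below (the name `is_anbn` is hypothetical) uses an integer counter that can grow without bound; a machine with a fixed number of states has no such counter.

```python
def is_anbn(s: str) -> bool:
    """Accept strings of the form a^n b^n with n >= 1, using an unbounded counter."""
    count = 0
    i = 0
    # Count the leading a's.
    while i < len(s) and s[i] == "a":
        count += 1
        i += 1
    if count == 0:
        return False  # n must be at least 1
    # Cancel one a for every b that follows.
    while i < len(s) and s[i] == "b":
        count -= 1
        i += 1
    # Accept only if all input was consumed and the counts matched.
    return i == len(s) and count == 0
```

The same counting idea applies to ( ), { } and begin end: the nesting depth is unbounded, so it cannot be encoded in a fixed set of states.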
Regular Expressions
x            a character x
\x           an escaped character, e.g., \n
{ name }     a reference to a name
M|N          M or N
MN           M followed by N
M*           zero or more occurrences of M
M+           one or more occurrences of M
M?           zero or one occurrence of M
[aeiou]      the set of vowels
[0-9]        the set of digits
.            any single character
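Most of these operators can be tried directly with Python's `re` module, as in the sketch below. Two caveats are assumptions of this example rather than part of the slide: `{ name }` is lexer-generator notation with no direct `re` equivalent (it is emulated here by composing pattern strings), and in Python `.` does not match a newline unless a flag is set.

```python
import re

# (pattern, sample text that the whole pattern should match)
examples = [
    (r"x", "x"),          # a character x
    (r"\n", "\n"),        # an escaped character
    (r"a|b", "b"),        # M or N
    (r"ab", "ab"),        # M followed by N
    (r"a*", ""),          # zero or more occurrences
    (r"a+", "aaa"),       # one or more occurrences
    (r"a?", ""),          # zero or one occurrence
    (r"[aeiou]", "e"),    # the set of vowels
    (r"[0-9]", "7"),      # the set of digits
    (r".", "q"),          # any single character
]
for pattern, text in examples:
    assert re.fullmatch(pattern, text), (pattern, text)

# a+ needs at least one occurrence; "." skips newlines by default in Python.
assert re.fullmatch(r"a+", "") is None
assert re.fullmatch(r".", "\n") is None

# Emulating { name }: define a named sub-pattern once and reuse it.
DIGIT = r"[0-9]"
INTEGER = f"(?:{DIGIT})+"
assert re.fullmatch(INTEGER, "352")
```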
Finite State Automaton for Identifiers
(S, a2i$) ├ (I, 2i$)
├ (I, i$)
├ (I, $)
├ (F, )
Thus: (S, a2i$) ├* (F, )
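The trace above can be reproduced with a small simulator. This is a sketch under assumptions: the state names S, I and F and the `$` end marker come from the trace, but the transition table itself is not in the transcript, so the one below (a letter starts an identifier, letters or digits continue it, `$` ends it) is a guess at the intended automaton.

```python
from typing import List, Optional, Tuple

def step(state: str, ch: str) -> Optional[str]:
    """Assumed transition function; None means no move is possible."""
    if state == "S" and ch.isalpha():
        return "I"                      # first character must be a letter
    if state == "I" and (ch.isalpha() or ch.isdigit()):
        return "I"                      # later characters: letters or digits
    if state == "I" and ch == "$":
        return "F"                      # end marker moves to the final state
    return None

def trace(text: str) -> Tuple[bool, List[Tuple[str, str]]]:
    """Run the automaton and record every (state, remaining input) configuration."""
    state = "S"
    configs = [(state, text)]
    while text:
        nxt = step(state, text[0])
        if nxt is None:
            return False, configs       # stuck: reject
        state, text = nxt, text[1:]
        configs.append((state, text))
    return state == "F", configs

accepted, configs = trace("a2i$")
print(" ├ ".join(f"({s}, {rest})" for s, rest in configs))
```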
Context-Free Grammar
Production:
α→β
α ∈ Nonterminal
β ∈ (Nonterminal ∪ Terminal)*
i.e., the left-hand side is a single nonterminal, and the right-hand side is a string of nonterminals and/or terminals (possibly empty).
Context-Sensitive Grammar
Production:
α→β
|α| ≤ |β|
α, β ∈ (Nonterminal ∪ Terminal)*
i.e., the left-hand side can be a string of terminals and nonterminals; however, the number of symbols on the left must not exceed the number of symbols on the right.
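The three production shapes seen so far differ only in what is allowed on each side of the arrow, which can be expressed as simple checks. The sketch below is illustrative: productions are represented as tuples of symbol names, the set of nonterminals is passed in explicitly, and requiring a non-empty left-hand side in the context-sensitive check is an assumption added here.

```python
from typing import Sequence, Set

def is_context_free(lhs: Sequence[str], rhs: Sequence[str], nonterminals: Set[str]) -> bool:
    """Left-hand side is exactly one nonterminal; right-hand side is unrestricted."""
    return len(lhs) == 1 and lhs[0] in nonterminals

def is_right_regular(lhs: Sequence[str], rhs: Sequence[str], nonterminals: Set[str]) -> bool:
    """A -> wB or A -> w, where w is a string of terminals."""
    if not is_context_free(lhs, rhs, nonterminals):
        return False
    # Drop one trailing nonterminal, if present; everything else must be terminal.
    body = rhs[:-1] if rhs and rhs[-1] in nonterminals else rhs
    return all(sym not in nonterminals for sym in body)

def is_context_sensitive(lhs: Sequence[str], rhs: Sequence[str], nonterminals: Set[str]) -> bool:
    """Length condition |alpha| <= |beta| (left side assumed non-empty)."""
    return 1 <= len(lhs) <= len(rhs)

N = {"Integer", "Expr", "Term", "A", "B"}
print(is_right_regular(("Integer",), ("0", "Integer"), N))
print(is_context_free(("Expr",), ("Expr", "+", "Term"), N))
print(is_context_sensitive(("A", "B"), ("B", "A"), N))
```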
Syntax
• The syntax of a programming language is a precise description of all its grammatically correct programs.
• Precise syntax was first used with Algol 60, and has been used ever since.
• Three levels:
– Lexical syntax - all the basic symbols of the language (names, values, operators, etc.)
– Concrete syntax - rules for writing expressions, statements and programs.
– Abstract syntax - internal representation of the program, favoring content over form.
Grammars
Grammars: Metalanguages used to define the concrete syntax of a language.
Backus Normal Form – Backus Naur Form (BNF)
• Stylized version of a context-free grammar (cf. Chomsky hierarchy)
• First used to define syntax of Algol 60
• Now used to define syntax of most major languages
Production:
α→β
α ∈ Nonterminal
β ∈ (Nonterminal ∪ Terminal)*
i.e., the left-hand side is a single nonterminal, and β is a string of nonterminals and/or terminals (possibly empty).
• Example
Integer → Digit | Integer Digit
Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Extended BNF (EBNF)
Additional metacharacters
{ }      a series of zero or more
( )      must pick one from a list
[ ]      pick none or one from a list
Example
Expression -> Term { ( + | - ) Term }
IfStatement -> if ( Expression ) Statement [ else Statement ]
EBNF is no more powerful than BNF, but its production rules are often simpler and clearer.
JavaCC EBNF
( … )*   a series of zero or more
( … )+   a series of one or more
[ … ]    optional
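One practical benefit of the EBNF form is that the repetition braces map directly onto a loop in a hand-written parser. The sketch below handles only `Expression -> Term { ( + | - ) Term }`; restricting Term to an integer literal and passing the input as a pre-split token list are simplifying assumptions for the example, and `parse_expression` is a made-up name.

```python
from typing import List

def parse_expression(tokens: List[str]) -> int:
    """Parse and evaluate Expression -> Term { ( + | - ) Term }."""
    pos = 0

    def term() -> int:
        # Term is assumed to be a single integer literal in this sketch.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected an integer at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    value = term()
    # The { ... } in the production becomes a while loop.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1
        right = term()
        value = value + right if op == "+" else value - right
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return value

print(parse_expression("5 - 4 + 3".split()))
```

Because the loop folds each new Term into the running value, the operators come out left-associative, so `5 - 4 + 3` is treated as `(5 - 4) + 3`.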
For more details, see Chapter 2 of “Programming Language Pragmatics, Third Edition” by Michael L. Scott.
Instance of a Programming Language:
int main ()
{
  return 0 ;
}
Internal Parse Tree: [figure]
Abstract Syntax:
Program (abstract syntax):
  Function = main; Return type = int
  params =
  Block:
    Return:
      Variable: return#main, LOCAL addr=0
      IntValue: 0
Now we’ll focus on the internal parse tree
Parse Trees
Integer → Digit | Integer Digit
Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Parse Tree for 352 as an Integer
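Since the tree itself is a figure, here is one way to rebuild it. The sketch below constructs the parse tree for a digit string under the left-recursive Integer grammar, using nested tuples as a hypothetical tree representation; `integer_tree` and `show` are names invented for this example.

```python
def integer_tree(s: str):
    """Build the parse tree for Integer -> Digit | Integer Digit."""
    if not s or any(c not in "0123456789" for c in s):
        raise ValueError(f"not an Integer: {s!r}")
    digit = ("Digit", s[-1])
    if len(s) == 1:
        return ("Integer", digit)                  # Integer -> Digit
    return ("Integer", integer_tree(s[:-1]), digit)  # Integer -> Integer Digit

def show(node, depth=0):
    """Render the tree as indented lines, one node per line."""
    label, *children = node
    if label == "Digit":
        return ["  " * depth + f"Digit {children[0]}"]
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(show(child, depth + 1))
    return lines

print("\n".join(show(integer_tree("352"))))
```

The left recursion makes the tree grow down its left edge: the last digit, 2, hangs off the root, while 3 sits deepest.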
Arithmetic Expression Grammar
Expr → Expr + Term | Expr - Term | Term
Term → 0 | ... | 9 | ( Expr )
Parse of 5 - 4 + 3
Associativity and Precedence
• A grammar can be used to define associativity and precedence among the operators in an expression.
E.g., + and - are left-associative operators in mathematics; * and / have higher precedence than + and - .
• Consider the following grammar:
Expr -> Expr + Term | Expr - Term | Term
Term -> Term * Factor | Term / Factor | Term % Factor | Factor
Factor -> Primary ** Factor | Primary
Primary -> 0 | ... | 9 | ( Expr )
Associativity and Precedence
Parse of 4**2**3 + 5 * 6 + 7
Associativity and Precedence
Precedence   Associativity   Operators
3            right           **
2            left            * / %
1            left            + -
Note: These relationships are shown by the structure of the parse tree: highest precedence at the bottom, and left-associativity on the left at each level.
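The Expr/Term/Factor/Primary grammar can be turned into a small parser to check that it yields this table. The sketch below is one possible rendering, not the course's implementation: left-recursive rules are written as loops (a standard recursive-descent workaround), the right-recursive Factor rule stays recursive, trees are nested tuples, and treating `/` as integer division is an assumption.

```python
import operator
import re

def tokenize(text):
    # "**" must be tried before "*"; characters outside the grammar are ignored.
    return re.findall(r"\*\*|[-+*/%()]|[0-9]", text)

def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():                      # Expr -> Expr (+|-) Term | Term
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())    # earlier operands nest on the left
        return node

    def term():                      # Term -> Term (*|/|%) Factor | Factor
        node = factor()
        while peek() in ("*", "/", "%"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():                    # Factor -> Primary ** Factor | Primary
        base = primary()
        if peek() == "**":
            advance()
            return ("**", base, factor())  # recursion nests on the right
        return base

    def primary():                   # Primary -> 0 | ... | 9 | ( Expr )
        tok = peek()
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return node
        if tok is not None and tok.isdigit():
            return int(advance())
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.floordiv, "%": operator.mod, "**": operator.pow}

def evaluate(node):
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))

tree = parse("4**2**3 + 5 * 6 + 7")
print(tree)
print(evaluate(tree))
```

In the resulting tree, `**` appears deepest and groups to the right (`2**3` is computed first), while the two `+` operators group to the left, matching the table.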
Ambiguous Grammars
• A grammar is ambiguous if one of its strings has two or more different parse trees.
• Example:
Expr -> Expr Op Expr | ( Expr ) | Integer
Op -> + | - | * | / | % | **
• Equivalent to the previous grammar, but ambiguous
Ambiguous Grammars
Ambiguous Parse of 5 – 4 + 3
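The ambiguity matters because the two trees mean different things. The sketch below enumerates every parse of a token list under `Expr -> Expr Op Expr | Integer` (parentheses are left out to keep it short) and evaluates each one; `all_parses` and the tuple tree shape are inventions of this example.

```python
def all_parses(tokens):
    """Return every parse tree for tokens under Expr -> Expr Op Expr | Integer."""
    if len(tokens) == 1:
        return [int(tokens[0])]
    trees = []
    # Any operator position may serve as the root of the tree.
    for i in range(1, len(tokens), 2):
        for left in all_parses(tokens[:i]):
            for right in all_parses(tokens[i + 1:]):
                trees.append((tokens[i], left, right))
    return trees

def evaluate(node):
    """Evaluate a tree built from + and - only."""
    if isinstance(node, int):
        return node
    op, left, right = node
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) - evaluate(right)

for tree in all_parses(["5", "-", "4", "+", "3"]):
    print(tree, "=", evaluate(tree))
```

One tree groups the input as 5 - (4 + 3) and the other as (5 - 4) + 3, so the same string evaluates to -2 or 4 depending on which parse is chosen.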
Dangling Else Ambiguous Grammars
IfStatement -> if ( Expression ) Statement | if ( Expression ) Statement else Statement
Statement -> Assignment | IfStatement | Block
Block -> { Statements }
Statements -> Statements Statement | Statement
With which ‘if’ does the following ‘else’ associate?
if (x < 0)
if (y < 0) y = y - 1;
else y = 0;
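One common resolution, used by C and Java, is to attach each else to the nearest unmatched if. A recursive-descent parser gets this behavior almost for free, as the sketch below suggests. The token list is a hand-simplified stand-in for the statement above (conditions and assignments are single tokens), and `parse_statement` is a hypothetical name.

```python
def parse_statement(tokens, pos=0):
    """Parse Statement -> if Cond Statement [ else Statement ] | Assignment.

    Returns (tree, next position). An else is consumed greedily, so it binds
    to the innermost if that is still waiting for one.
    """
    if tokens[pos] == "if":
        cond = tokens[pos + 1]
        then_part, pos = parse_statement(tokens, pos + 2)
        else_part = None
        if pos < len(tokens) and tokens[pos] == "else":
            else_part, pos = parse_statement(tokens, pos + 1)
        return ("if", cond, then_part, else_part), pos
    return tokens[pos], pos + 1      # anything else is treated as an assignment

tokens = ["if", "x<0", "if", "y<0", "y=y-1", "else", "y=0"]
tree, _ = parse_statement(tokens)
print(tree)
```

The inner call for `if y<0` sees the else first and takes it, so the outer if ends up with no else branch.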
Hmm BNF (i.e., Concrete Syntax)
Program : {[ Declaration ]|retType Identifier Function | MyClass | MyObject}
Function : ( ) Block
MyClass: Class Identifier { {retType Identifier Function} Constructor {retType Identifier Function} }
MyObject: Identifier Identifier = create Identifier callArgs
Constructor: Identifier ([{ Parameter } ]) block
Declaration : Type Identifier [ [Literal] ]{ , Identifier [ [ Literal ] ] }
Type : int|bool| float | list |tuple| object | string | void
Statements : { Statement }
Statement : ; | Declaration| Block |ForEach| Assignment
|IfStatement|WhileStatement|CallStatement|ReturnStatement
Block : { Statements }
ForEach: for( Expression <- Expression ) Block
Assignment : Identifier [ [ Expression ] ]= Expression ;
Parameter : Type Identifier
IfStatement: if ( Expression ) Block [elseifStatement| Block ]
WhileStatement: while ( Expression ) Block
Hmm BNF (i.e., Concrete Syntax)
Expression : Conjunction {|| Conjunction }
Conjunction : Equality {&&Equality }
Equality : Relation [EquOp Relation ]
EquOp: == | !=
Relation : Addition [RelOp Addition ]
RelOp: <|<= |>|>=
Addition : Term {AddOp Term }
AddOp: + | -
Term : Factor {MulOp Factor }
MulOp: * | / | %
Factor : [UnaryOp]Primary
UnaryOp: - | !
Primary : callOrLambda | IdentifierOrArrayRef | Literal | subExpressionOrTuple | ListOrListComprehension | ObjFunction
callOrLambda : Identifier callArgs|LambdaDef
callArgs : ([Expression |passFunc { ,Expression |passFunc}] )
passFunc : Identifier (Type Identifier { Type Identifier } )
LambdaDef : (\\ Identifier { ,Identifier } -> Expression)
Hmm BNF (i.e., Concrete Syntax)
IdentifierOrArrayRef : Identifier [ [Expression] ]
subExpressionOrTuple : ([ Expression [,[ Expression { , Expression } ] ] ] )
ListOrListComprehension: [ Expression {, Expression } ] | | Expression[<- Expression ] {, Expression[<- Expression ] } ]
ObjFunction: Identifier . Identifier . Identifier callArgs
Identifier : (a |b|…|z| A | B |…| Z){ (a |b|…|z| A | B |…| Z )|(0 | 1 |…| 9)}
Literal : Integer | True | False | ClFloat | ClString
Integer : Digit { Digit }
ClFloat: 0 | 1 |…| 9 {0 | 1 |…| 9}.{0 | 1 |…| 9}
ClString: " {~["] } "
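The lexical rules at the bottom of this grammar (Identifier, Integer, ClFloat, ClString) are regular, so they can be tokenized with regular expressions. The sketch below is a rough approximation rather than the actual Hmm lexer: the token class names, the operator list, and the choice to treat keywords such as True and False as plain identifiers are all assumptions.

```python
import re

# Order matters: FLOAT must be tried before INTEGER so "3.14" is not split.
TOKEN_SPEC = [
    ("FLOAT",   r"[0-9]+\.[0-9]*"),
    ("INTEGER", r"[0-9]+"),
    ("IDENT",   r"[A-Za-z][A-Za-z0-9]*"),
    ("STRING",  r'"[^"]*"'),
    ("OP",      r"==|!=|<=|>=|<-|&&|\|\||[-+*/%<>=!;,(){}\[\].]"),
    ("SKIP",    r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("z = x + 2 * y;"))
```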
Associativity and Precedence for Hmm
Clite Operator    Associativity
Unary - !         none
* /               left
+ -               left
< <= > >=         none
== !=             none
&&                left
||                left
Hmm Parse Tree Example
z = x + 2 * y;
Now we’ll focus on the Abstract Syntax
Hmm Parse Tree
z = x + 2 * y;
Very Approximate Hmm Abstract Syntax
Assignment = Variable target; Expression source
Expression = VariableRef | Value | Binary | Unary
VariableRef = Variable | ArrayRef
Variable = String id
ArrayRef = String id; Expression index
Value = IntValue | BoolValue | FloatValue | CharValue
Binary = Operator op; Expression term1, term2
Unary = UnaryOp op; Expression term
Operator = ArithmeticOp | RelationalOp | BooleanOp
IntValue = Integer intValue
…
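Each rule of the abstract syntax reads naturally as a record type. The sketch below renders a few of them as Python dataclasses and builds the tree for `z = x + 2 * y`. It is only an approximation of the idea: the class layout, the use of a plain string for the operator, the `evaluate` helper, and the sample environment are assumptions, and only integer values are covered.

```python
from dataclasses import dataclass
from typing import Dict, Union

@dataclass
class Variable:          # Variable = String id
    id: str

@dataclass
class IntValue:          # IntValue = Integer intValue
    int_value: int

@dataclass
class Binary:            # Binary = Operator op; Expression term1, term2
    op: str
    term1: "Expression"
    term2: "Expression"

Expression = Union[Variable, IntValue, Binary]

@dataclass
class Assignment:        # Assignment = Variable target; Expression source
    target: Variable
    source: Expression

# Abstract syntax for: z = x + 2 * y
tree = Assignment(
    Variable("z"),
    Binary("+", Variable("x"), Binary("*", IntValue(2), Variable("y"))),
)

def evaluate(expr: Expression, env: Dict[str, int]) -> int:
    """Evaluate an expression tree against a variable environment."""
    if isinstance(expr, IntValue):
        return expr.int_value
    if isinstance(expr, Variable):
        return env[expr.id]
    left = evaluate(expr.term1, env)
    right = evaluate(expr.term2, env)
    return left + right if expr.op == "+" else left * right

print(evaluate(tree.source, {"x": 1, "y": 3}))
```

Note that the tree carries no parentheses, semicolons, or precedence rules: the nesting of Binary nodes already records that `2 * y` is computed before the addition.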
Hmm Abstract Syntax – Binary Example
z = x + 2 * y
=
  Binary
    Operator +
    Variable x
    Binary
      Operator *
      Value 2
      Variable y