Supercompilation of an XSLT

Transcript Supercompilation of an XSLT

Program specialization by supercompilation.
(The supercompiler Scp4.)
Andrei P. Nemytykh
Program System Institute, Russian Academy of Sciences,
Senior Researcher
http://www.botik.ru/PSI/RCMS/APP/nemytykh/nemytykh.html
http://www.botik.ru/pub/local/scp/refal5/nemytykh_cv.html
State of art of program
transformation
 During the last three decades the following most
promising program transformation methods
have been under development:
1. generalized partial computation – Yoshiko Futamura (Japan), 1971
2. supercompilation – Valentin Turchin (Russia-USA), 1972
3. partial evaluation – Neil Jones (Denmark), 1983
4. deforestation – Phillip Wadler (Great Britain-USA), 1985
and some others.
Reasons for program
optimization
•
The methods can do almost nothing with programs that are already
carefully optimized by a professional programmer in a lower-level
language. The methods can clean a program of natural inefficiencies if
the program has been developed “inefficiently” in a structured way,
using various “high-level” techniques like
1.
2.
interpretation of specialized application-oriented languages and
component programming from libraries of general re-usable software,
etc.
Thus, the methods are directed to provide degrees of freedom for new
software technologies rather than to optimizing programs written in an
old style.
Specialization: the main idea
Let human be a program with two parameters knowledge and problem.
Then creating a specialist humanknowledge from human and knowledge is
a good example of specialization:
humanknowledge(problem) = human(knowledge,problem)
Specialist humanknowledge can solve problems much quicker than
an ordinary human when the problems are covered by his specific
knowledge.
A number of tasks for specialization.
<F x0, y>
The first argument of a program F is given, while the second is unknown.
<F <G x, y> , z >
Let two programs F and G be given, specialize a composition of
applications of the programs. That is specialization with respect to a context of
application.
<IntL ( Program )
e.data
<GO
>
>
An interpreter IntL of a programming language L is specialized with respect to
a given program. IntL is written in a language M, while the program is written in
L, so we expect an optimal program written in M as a result of specialization.
Thus a specializer may be used as a compiler from L into M, where M is the
subject language of the specializer.
What is supercompilation ?
• Supercompilation is a technique of specialization of programs written in
a functional programming language. The technique was introduced in the
1970s by V. F. Turchin. He proposed a task of creating tools to observe
operational semantics of a program, when a function F that is to be computed
by the program is fixed. As a result of such observations a new algorithmic
definition of an extension of the function F must be constructed. His ideas were
studied by a number of authors for a long time.
• The main aim of a supercompiler is to perform as many actions of a given
parameterized application ( a task ) of a program uniformly on the parameters
as possible.
• It is interesting that supercompilers can be used
• as simple theorem provers;
• for simplification of definitions written in a programming language;
• as compilers;
• as transformers of genuine recursions into tail-recursions.
The Supercompiler SCP4
• is an experimental specializer for a functional language Refal-5.
(There are no special restrictions on the input language.) Scp4 has been
implemented once again using Refal-5. Sources of the supercompiler,
executable modules and sources of Refal-5 are available for immediate
free download: http://www.botik.ru/pub/local/scp/refal5/
(by Andrei P. Nemytykh and Valentin F. Turchin).
Windows 98
Windows NT/2000/XP
Linux (Intel)
• a user manual on the supercompiler and reports on several interesting
experiments with Scp4 by Alexandr V. Korlyukov (in Russian).
• Alexandr P. Konyshev implemented a compiler from the intermediate
Scp4’s language into the language C.
An introduction to REFAL
(data)
REFAL (by V.F. Turchin) is a first-order strict functional language, where
concatenation is associative and is used in infix notation. The language
has another constructor for composing of tree structures. The semantics of
the language is based on pattern matching. Unlike LISP the language is
based on the model of computation known as Markov's algorithms.
REFAL data are defined by the following grammar:
d
::= (d1) | d1 d2 | SYMBOL | empty
empty ::=
Example:
/* nothing */
(A + B) * (C – D)
*
A
+
B
C
-
D
The language REFAL
(the first example definition)
* Concatenation is associative.
$ENTRY Go { e.xs = <repl B A (<repl A B (e.xs)>)>; }
-- A and B are constants
repl {
-- pattern matching
s.a s.b ( ) = ;
-- two types of variables:
s.a s.b (s.a e.xs) = s.b <repl s.a s.b (e.xs)>;
-- s.- can take symbols
}
-- e.- can take expressions
-- all functions are unary
The above given REFAL program is translation of the following program:
Go(xs) = repl(B,A, repl(A,B,xs))
repl(a,b,xs) = if Null(xs) then []
else if Atom(car(xs))
then if a =? car(xs) then b : repl(a,b,cdr(xs))
An introduction to REFAL
(associative concatenation)
Example 1:
* The palindrome predicate:
Palindrome {
= True;
s.1 = True; -- application constr.
s.1 e.2 s.1 = <Palindrome e.2>;
e.1 = False;
}
/* Pattern may be split from both
sides: a symbol-variable s.1 is
split from the left and right sides.
*/
Example 2: (Refal style reversing)
* Reversing a list of terms:
rev {
t.x e.ys = <rev e.ys> t.x;
=;
-- empty expression
}
-- on the both sides.
/* Reversing a list of terms:
we may concatenate a term taken
by the variable t.x right on the end
of the list.
The blank is used to denote the
concatenation.
*/
The language REFAL
(data processing)
* Concatenation is associative.
$ENTRY Go { s.a e.xs = <replace s.a s.a (e.xs)>; }
replace {
-- the empty expression is argument of the ( ) constructor
s.a s.b ( ) = ;
-- the empty expression on the right side
s.a s.b (s.a e.xs) = s.b <replace s.a s.b (e.xs)>;
s.a s.b (s.x e.xs) = s.x <replace s.a s.b (e.xs)>;
s.a s.b ((e.ys) e.xs) = (<replace s.a s.b (e.ys) >) <replace s.a s.b (e.xs)>;
}
The above given REFAL program is translation of the following program:
Go(xs) = replace(a,a, xs)
replace(a,b, xs) =
if Null(xs) then []
else if Atom(car(xs)) then if a =? car(xs) then b : replace(a,b,cdr(xs))
else car(xs) : replace(a,b,cdr(xs))
else replace(a,b, car(xs)) : replace(a,b,cdr(xs))
What is supercompilation ?
• Supercompilation is a technique of specialization of programs written in
a functional programming language. The technique was introduced in the
1970s by V. F. Turchin. He suggested a task of creating tools to observe
operational semantics of a program, when a function F that is to be computed
by the program is fixed. As a result of such observations a new algorithmic
definition of an extension of the function F must be constructed. His ideas were
studied by a number of authors for a long time.
• The main aim of a supercompiler is to perform as many actions of a given
parameterized application ( a task ) of a program uniformly on the parameters
as possible.
• It is interesting that supercompilers can be used
• as simple theorem provers;
• for simplification of definitions written in a programming language;
• as a compiler;
General structure of the Refal
interpreter
Entry config.
Refal-5 Source prog.
Internal task
final
language inp.-prog.
result := ;
/* undefined */
do {
Refal-step;
} while( the task is not solved )
/* still is not completely evaluated */
if( the task is determined )
{ result := task; }
Refal-5
Result
General structure of SCP4
Parameterized entry config.-s
Internal language
Refal-5 Source program
tasks
final input program
residual := ;
/* empty */
do { current-task := head(tasks);
driving;
folding;
if( the current-task is solved )
{ global-analysis;
/* folded */
specialization w.r.t the global properties;
propagation of the global inform. over the tasks;
if( the current-task is not empty )
{ residual := residual current-task; }
tasks := tail(tasks);
}
} while( tasks )
dead-code-analysis;
Refal-5
parameterized entry points, residual program
C
Driving
The driving is an extension of the interpretation of one Refal-graph
(its input subset) step on the parameterized sets of the input entries.
The purpose is to perform as many actions of the Refal-graph machine
uniformly on the parameter values as possible.
Schoolboy driving:
ax+b=0
driving
a =? 0
a x+b=0
b=0
b =? 0
b ? 0
a ? 0
x = -b/a; a  0
any x ℝ
Ø
Folding
It folds the meta-tree of all possible computations in a finite graph and is launched
immediately after the driving. I.e. the folding tries to wrap the part of the path,
which is not folded to a given moment, from the meta-tree root to a node,
referred to as current node, in the current driving cluster. The folding is divided
in two logically closed tools: reducing and generalization. Both tools use not only
syntactical but also semantic properties of the function stack.
..
..
..
previous
node
..
current node
Generalization
(Obninsk condition (V.F. Turchin, 1989))
The definition reflects an assumed development of the stack as follows
Top; Middlen; Context (condition for approximation of a loop).
Current top
Previous top
Context
Middle
Context
Given two stacks labeled by the times - Previous and Current, we say that
this pair indicates a loop if the stacks can be represented in the form:
Previous = PreviousTop; Context;
Current = CurrentTop; Middle; Context;
where the bottom of the stacks is the length-most common part, i.e. this part
nowise took part in the process developing of the function stack along the
path from the previous node to the current, and the "tops" of the two stacks
( PreviousTop = F1ptime_1, …, FNptime_n
CurrentTop = F1ctime_1, …, FNctime_n ) coincide modulo the creation
times. The Context determines computations immediately after the loop.
Obninsk condition
(example)
Consider a definition of the unary factorial.
$ENTRY IncFact {e.n = <Plus (I) (<Fact (e.n)>)>; }
Fact { ( ) = I;
(e.n) = <Times (e.n) (<Fact (<Minus (e.n) (I)>)>)>; }
Times { (e.u) ( )
=;
(e.u) (I e.v) = <Plus (<Times (e.u) (e.v)>) (e.u)>; }
Plus { (e.u) ( )
= e.u;
(e.u) (I e.v) = I <Plus (e.u) (e.v)>; }
Minus { (I e.u) (I e.v) = <Minus (e.u) (e.v)>;
(e.u) ( ) = e.u; }
"Lazy" development of the stack along the main branch yields the following
sequence: [1]: IncFact1;
[2]: Plus2;
A hypothesis is that the stack
[3]: Fact4; Plus3;
will further have the form
[4]: Times5; Plus3;
Fact; Timesn; Plus3;
[5]: Fact7; Times6; Plus3;
…
The stack formed on the fifth step is "similar" to the stack created on the third step.
Homeomorphic embedding
condition
• Homeomorphic embedding condition specifies a "similarity" relation
on the parts of the parameterized environments , which provides
"positive" information of data. This relation is a variant of a relation
originating by Higman and Kruskal and is a specification of the
following term relation.
• Let two terms Previous and Current be written on a blackboard by a
chalk. We say the term Current is not less complex compared to the
Previous iff the Previous can be obtained from the Current by erasing
some basic terms and constructors (see also Sorensen&Glueck:1995 ,
Leuschel:1998).
• If the set of the basic terms is reasonable enough, then any infinite
term sequence tn  positive-pd has a pair ti, tk such that k > i and tk is
not less complex compared to ti.
Global analysis - I
(derivation of the output formats)
• Derivation of the output formats is critical when the length of the function
stack of a program to be transformed is not uniformly bounded on the input
data. Without such derivation interesting transformations are not happening.
$ENTRY Go {e.n = <Fact (e.n)>; }
Fact { ( ) = (I);
(e.n) = (<Times (e.n) <Fact (<Minus (e.n) (I)>)>>); }
Times { (e.u) ( )
=;
(e.u) (I e.v) = <Plus (<Times (e.u) (e.v)>) (e.u)>; }
Plus { (e.u) ( ) = e.u;
(e.u) (I e.v) = I <Plus (e.u) (e.v)>; }
Minus { (I e.u) (I e.v) = <Minus (e.u) (e.v)>;
(e.u) ( ) = e.u; }
<F7 e.1>
(<F7 e.1>)
(Input Format: e.1)
F7 {
F7 {
= ( I );
= I;
I e.1 = ( <F16 e.1, e.3> ),
I e.1 = <F16 e.1, <F7 e.1>>;
where <F7 e.1>: (e.3);
}
} (Output Format: (e.6) )
Global analysis - II
(derivation of the output formats)
Example: Specialization of the following function Plus w.r.t. the second
argument is not a surprise.
Plus { (e.u) ( ) = e.u;
(e.u) (I e.v) = I <Plus (e.u) (e.v)>; }
Let us specialize the function w.r.t. the first argument by the supercompiler.
We input the entry point
$ENTRY Go { (e.n) (e.m) = <Plus (I e.n) (e.m)>; }
to Scp4. The residual program looks as follows.
$ENTRY Go { (e.n) (e.m) = I <F7 (e.n) e.m>; }
F7 {
(e.1)
= e.1;
(e.1) I e.2 = I <F7 (e.1) e.2>; }
An simple example:
supercompilation manually
Given the following program rev1 and the parameterized entry to the program:
<rev1 e.1 A ()>
rev1 {
t.x e.ys (e.res) = <rev1 e.ys (t.x e.res)>;
(e.res) = e.res;
}
e.4 := e.1; e.5 := ;
<rev1 e.1 A ()>
e.1→t.2 e.3;
A <rev2 e.1 ()>
rev2 {
t.6 e.7 (e.5) = <rev2 e.7 (t.6 e.5)>;
(e.5) = e.5;
}
e.1 → ;
<rev1 (A)>
<rev1 e.4 A (e.5)>
e.4→t.6 e.7;
<rev1 e.3 A (t.2)>
generalization
e.4 → ;
<rev1 (A e.5)>
<rev1 e.7 A (t.6 e.5)>
e.4 := e.7;
e.5 := t.6 e.5;
reducing
A e.5
Scp4 as a simple theorem prover
Problem after L. F. Magnitckij: A flock of geese is flying, and it meets a single
gander. The gander says “How do you do, one hundred geese?” The flock leader
answers – “we are not one hundred”. “Well, if one will count all of us, and as
many again, and a half-many again, and a quarter of us, and also you, Gander, then
the amount will be exactly one hundred.”
The question: How many geese are in the flock?
Here 'gg' := 'g' 'g'
$ENTRY Go { e.n = <Eq (<Counter e.n>) (<Hundred>)>; }
Eq { ( ) ( ) = TRUE;
('g' e.xs) ('g' e.ys) = <Eq (e.xs) (e.ys)>;
(e.xs) (e.ys) = FALSE; }
Hundred { = 'gggggggggggggggggggggggggggggggggggggggggggggggggg'
'gggggggggggggggggggggggggggggggggggggggggggggggggg' ;
}
Counter { 'gggg' e.n = 'gggg' 'gggg' 'gg' 'g' <Counter e.n>;
= 'g'; }
The question:
How many geese are in the flock?
The answer given by Scp4:
/* use geese.ref for demonstration */
$ENTRY Go {
'gggggggggggggggggggggggggggggggggggggggg' e.n = FALSE ;
* The following string has 36 of 'g' (of geese).
'gggggggggggggggggggggggggggggggggggg' = TRUE ;
'gggggggggggggggggggggggggggggggg' = FALSE ;
* The length of the (n+3)-th left side is 36-4(n+1),
* while the right side is FALSE.
...
= FALSE ;
}
We not only have found the answer but also proved that no other solution exists.
Mathematical induction on e.n happens during supercompilation and
the recursions written in the original program were eliminated.
More adequate method for the Counter.
Counter { e.n = <All e.n> <All e.n> <Half e.n> <Quoter e.n> 'g'; }
All { e.n = e.n; }
Half {
=;
'gg' e.n = 'g' <Half e.n>;
}
Quoter {
=;
'gggg' e.n = 'g' <Quoter e.n>;
}
The residual program looks akin to the one given above and has one
hundred sentences.
SCP4 as a simple theorem prover. (II)
The task: /* use palindrome.ref for demonstration */
$ENTRY Go { e.ls = <palindrome e.ls <rev e.ls>>; }
palindrome { = True;
s.x = True;
s.x e.ls s.x = <palindrome e.ls>;
s.x e.ls s.y = False;
}
rev {
=;
t.x e.ls = <rev e.ls> t.x;
}
The result of specialization: A simple theorem has been proven.
$ENTRY Go {
= True ;
s.102 e.41 = True ;
}
Using of SCP4 for theorem proving and
simplification of definitions.
The problem: Resolve the following equation 'a' X = X 'a', where the variable X
ranges on strings. /* use refal.ref for demonstration */
$ENTRY Go { e.x = <Eq ('a' e.x) (e.x 'a')>; }
Eq {
(s.1 e.x) (s.1 e.y) = <Eq (e.x) (e.y)>;
((e.1) e.x) ((e.2) e.y) = <Eq (e.1 e.x) (e.2 e.y)>;
()
()
= True;
(e.x)
(e.y) = False;
}
The simplified definition: given by SCP4. A simple theorem has been proven.
$ENTRY Go { e.41 = <F7 e.41 > ; }
F7 {
'a' e.41 = <F7 e.41 > ;
= True ;
e.41 = False ;
}
Specialization of composition
and removing recursion.
The program: /* use replace.ref for demonstration */
repl { s.a s.b ( ) = ;
s.a s.b (s.a e.xs) = s.b <repl s.a s.b (e.xs)>;
s.a s.b (s.c e.xs) = s.c <repl s.a s.b (e.xs)>;
}
The task #1:
$ENTRY Go {e.xs = <repl B C (<repl A B (e.xs)>)>; }
The task #2:
$ENTRY Go {
(s.b s.c) (s.a s.b) e.xs = <repl s.b s.c (<repl s.a s.b (e.xs)>)>;}
The task #3:
$ENTRY Go {
(s.a s.a) (s.a s.a) e.xs = <repl s.a s.a (<repl s.a s.a (e.xs)>)>;
}
Specialization of composition. (I)
The program: /* use replace.ref for demonstration */
repl { s.a s.b ( ) = ;
s.a s.b (s.a e.xs) = s.b <repl s.a s.b (e.xs)>;
s.a s.b (s.c e.xs) = s.c <repl s.a s.b (e.xs)>;
}
$ENTRY Go {e.xs = <repl B C (<repl A B (e.xs)>)>; }
The result: contains just one loop over the string.
$ENTRY Go { e.41 = <F7 e.41 > ; }
F7 {
=;
A e.41 = C <F7 e.41 > ;
B e.41 = C <F7 e.41 > ;
s.101 e.41 = s.101 <F7 e.41 > ;
}
Specialization of composition. (II)
The program: /* use replace.ref for demonstration */
repl { s.a s.b ( ) = ;
s.a s.b (s.a e.xs) = s.b <repl s.a s.b (e.xs)>;
s.a s.b (s.c e.xs) = s.c <repl s.a s.b (e.xs)>;
}
$ENTRY Go {(s.b s.c) (s.a s.b) e.xs = <repl s.b s.c (<repl s.a s.b (e.xs)>)>;}
The result: again contains just one loop over the string.
$ENTRY Go {
(s.102 s.103 ) (s.105 s.102 ) e.41 = <F7 (e.41 ) s.102 s.103 s.105 > ;
}
F7 {
()
s.102 s.103 s.105 = ;
(s.105 e.41 ) s.102 s.103 s.105 = s.103 <F7 (e.41 ) s.102 s.103 s.105 >;
(s.102 e.41 ) s.102 s.103 s.105 = s.103 <F7 (e.41 ) s.102 s.103 s.105>;
(s.107 e.41 ) s.102 s.103 s.105 = s.107 <F7 (e.41 ) s.102 s.103 s.105>; }
Specialization of composition. (III)
The program: /* use replace.ref for demonstration */
repl { s.a s.b ( ) = ;
s.a s.b (s.a e.xs) = s.b <repl s.a s.b (e.xs)>;
s.a s.b (s.c e.xs) = s.c <repl s.a s.b (e.xs)>;
}
$ENTRY Go {(s.a s.a) (s.a s.a) e.xs = <repl s.a s.a (<repl s.a s.a (e.xs)>)>;}
The result: contains no loops at all. The time complexity was decreased.
$ENTRY Go {
(s.102 s.102 ) (s.102 s.102 ) e.41 = e.41 ;
}
"The House That Jack Built"
by Jonathan Swift.
The program: exploits the recursion given in the poetry. The idea belongs to
Alexandr Korljukov. /* use swift.ref for demonstration */
$ENTRY Go { = <Title> <JonathanSwift>; }
Title { =
<Paragraph>
' The House That Jack Built' <NL>
'
Jonathan Swift'
<NL>; }
Paragraph { = <NL> <NL>; }
NL { = '\n'; }
Scene { = "This is"
'the farmer sowing the corn,' " "
<NL>
"That" "kept" 'the cock that crowed in the morn,' <NL>
"That" "waked" 'the priest all shaven and shorn,' <NL>
"That" "married" 'the man all tattered and torn,'
<NL>
"That" "kissed" 'the maiden all forlorn,'
<NL>
"That" "milked" 'the cow with the crumpled horn,' <NL>
"That" "tossed" 'the dog,'
<NL>
"That" "worried" 'the cat,'
<NL>
"That" "killed" 'the rat,'
<NL>
"That" "ate" 'the malt'
<NL>
"That" "lay" 'in the house that Jack built.' <NL>; }
JonathanSwift { = <Loop <Scene>>; }
Loop { "This is" 'in the house that Jack built.' e.1 = <Paragraph>
"This is" 'the house that Jack built.' e.1;
e.Couplet = <Next e.Couplet> <Paragraph> e.Couplet; }
Next { s.This '\n' s.That s.verb e.text = <Loop s.This e.text>;
s.This s.Char e.text = <Next s.This e.text>;
s.This
=;}
Ackerman’s function
Consider a definition of Ackermann’s function for the unary arithmetic.
* <Ack (e.m) (e.n)>
Ack {
()
(e.n) = I e.n;
(I e.m) ()
= <Ack (e.m) (I)>;
(I e.m) (I e.n) = <Ack (e.m) (<Ack (I e.m) (e.n)>)>;
}
Specialization by the supercompiler Scp4 gives:
*With respect to: $ENTRY Go { e.n = <Ack () (e.n)>; }
$ENTRY Go { e.n = I e.n ; }
*With respect to: $ENTRY Go { e.n = <Ack (I) (e.n)>; }
$ENTRY Go { e.n = I I e.n ; }
*With respect to: Go { e.n = <Ack (I I) (e.n)>; }
$ENTRY Go { e.n = I I I <F7 e.n > ; }
F7 { = ;
I e.1 = I I <F7 e.1 > ; }
Specialization of interpreters.
<IntL ( Program )
e.data
<GO
>
>
An interpreter IntL of a programming language L is specialized with respect to
a given program. IntL is written in a language M, while the program is written in
L, so we expect an optimal program written in M as a result of specialization.
Thus a specializer may be used as a compiler from L into M, where M is the
subject language of the specializer.
Self-description
Syntax of an algorithmically full subset of the Refal language.
Program ::= $ENTRY definition+
definition ::= function-name { sentence;+ }
sentence ::= pattern = expr
expr ::= empty | term expr1 | function-call expr1
function-call ::= <function-name expr>
pattern ::= empty | term pattern1
term ::= SYMBOL | var | (expr)
var ::= e.name | t.name | s.name
empty ::=
/* nihil */
There are two additional restrictions:
1) the set of the variables of the right side of a sentence is a subset of the variable
set of the left side of the sentence;
2) two e-variables are not allowed on the same parenthesis structure in the
patterns. (For example, the pattern (e.1) e.2 (e.3) is allowed, while (e.1 A e.2)
e.3 is not.)
Specialization of the interpreter (I)
Example 1:
The task: <Int Palindrome e.data>
Palindrome {
= True;
s.1 = True;
s.1 e.2 s.1 = <Palindrome e.2>;
e.1 = False;
}
The residual program:
* InputFormat: <F27 e.data >
F27 {
= True ;
s.5 = True ;
s.4 e.1 s.4 = <F27 e.1 > ;
e.1 = False ;
}
Example 2: (Refal style reversing)
The task: <Int rev e.data>
rev {
t.x e.ys = <rev e.ys> t.x;
=;
}
The residual program:
* InputFormat: <F27 e.data >
F27 {
t.4 e.1 = <F27 e.1 > t.4 ;
=;
}
Specialization of the interpreter (II)
Example 3: (Lisp style reversing)
The task: <Int Go-rev1 e.data>
$ENTRY Go { e.ys = <rev1 e.ys ()>; }
rev1 {
t.x e.ys (e.res) = <rev1 e.ys (t.x e.res)>;
(e.res) = e.res; }
The residual program:
$ENTRY Go {
e.data = <F80 () e.data> ;
}
* InputFormat: <F80 (e.1) e.2>
F80 {
(e.1) t.6 e.2 = <F80 (t.6 e.1 ) e.2> ;
(e.1) = e.1 ;
}
Specialization of the interpreter (III)
Example 4: (The following program replaces every occurrence of the identifier
Lisp with the identifier Refal in an arbitrary Refal datum.)
The task: <Int Go-Replace e.data>
$ENTRY Go { e.where = <Replace (Lisp Refal) e.where>; }
Replace {
(s.what e.value) = ;
(s.what e.value) s.what e.where = e.value <Replace (s.what e.value) e.where>;
(s.what e.value) s.y
e.where = s.y <Replace (s.what e.value) e.where>;
(s.what e.value) (e.y) e.where
= (<Replace (s.what e.value) e.y>) <Replace (s.what e.value) e.where>; }
$ENTRY Go { e.data = <F42 e.data> ; }
* InputFormat: <F42 e.1>
F42 {
=;
Lisp e.1 = Refal <F42 e.1> ;
s.4
e.1 = s.4 <F42 e.1> ;
('*' e.2) e.1 = ('*' <F42 e.2>) <F42 e.1> ; }
The residual program.
Interpreter of the Turing Machine
written in Refal (I)
* <Turing (e.Program) (s.CurrentState) (e.LeftPartOfTape)
*
(s.CurrentSymbol)(e.RightPartOfTape)>
Turing {
(e.instr) (stop) (e.left) (s.symbol) (e.right) = (e.left) (s.symbol) (e.right);
(e.instr) (s.q) (e.left) (s.symbol) (e.right)
= <Turing (e.instr) <Turing1
<Search (s.q s.symbol) (e.instr)>
(e.left) (s.symbol) (e.right)> >;
}
Interpreter of the Turing Machine
written in Refal (II)
Turing1 {
(s.c s.r left) ( )
(s.symbol) (e.right) = (s.r) ( ) (B) (s.c e.right) ;
(s.c s.r left) (e.left s.a) (s.symbol) (e.right) = (s.r) (e.left) (s.a) (s.c e.right);
(s.c s.r right) (e.left) (s.symbol) ( ) = (s.r) (e.left s.c) (B) ( ) ;
(s.c s.r right) (e.left) (s.symbol) (s.a e.right) = (s.r) (e.left s.c) (s.a) (e.right) ; }
Search {
(s.key1 s.key2) ((s.key1 s.key2 e.value) e.table) = (e.value);
(s.key1 s.key2) ((e.row) e.table) = <Search (s.key1 s.key2) (e.table)>;
}
Program for the Turing Machine
The program DoublePQ replaces an array of the characters P with another array,
where each of the Ps was changed with two Qs.
We are going to specialize the interpreter of the TM with respect to this program.
The first Futamura projection by the
supercompiler Scp4 (I)
$ENTRY Go {
(e.Left) (s.Symbol) (e.Right) = <F6 (e.Left) s.Symbol e.Right>; }
F59 {
() () Q = (Q Q Q B) (B) () ;
() (s.3 e.4) Q = <F6 (Q Q Q) s.3 e.4 > ;
(e.1 s.5) (e.4) Q = <F59 (e.1) (Q e.4) s.5 > ;
(e.1) () B = (e.1 Q Q B) (B) () ;
(e.1) (s.3 e.4) B = <F6 (e.1 Q Q) s.3 e.4 > ; }
F6 {
(e.1) B = (e.1 B) (B) () ;
(e.1) B s.5 e.4 = (e.1 B) (s.5) (e.4) ;
(e.1) Q = (e.1 Q B) (B) () ;
(e.1) Q s.5 e.4 = <F6 (e.1 Q) s.5 e.4 > ;
() P = (Q Q B) (B) () ;
() P s.5 e.4 = <F6 (Q Q) s.5 e.4 > ;
(e.1 s.6) P e.4 = <F59 (e.1) (e.4) s.6 > ; }
The first Futamura projection by the
supercompiler Scp4 (II)
• For our experiment, to see the speedup, we input 512 of the Ps to the
given TM.
• The original program takes 2 634 778 Refal-steps (15.242 seconds) to
run, the result of the supercompilation takes 525 319 Refal-steps
(2.053 seconds).
• StepSpeedup = 5.016
• TimeSpeedup = 7.388 (for this example and this input data).
• All our experiments were performed under the operating system
Windows2000 with Intel Pentium-3 CPU, 262 144 KB RAM, 500
MHz.
The first Futamura projection by the
supercompiler Scp4 (III)
* The tape is infinite in both sides.
$ENTRY Go { (e.Left) (s.Symb) (e.Right) = <F6 (e.Left) s.Symb e.Right > ; }
F6 {
(e.1) B s.3 e.4 = (e.1 B) (s.3) (e.4) ;
(e.1) Q s.3 e.4 = <F6 (e.1 Q) s.3 e.4 > ;
(e.1 s.2) P e.4 = <F27 (e.1) (e.4) s.2 > ;
}
F27 {
(e.1 s.2) (e.4) Q = <F27 (e.1) (Q e.4) s.2 > ;
(e.1) (s.3 e.4) B = <F6 (e.1 Q Q) s.3 e.4 > ; }
F6 corresponds to the inner state start, while F27 corresponds to the inner state
moveleft. Each sentence corresponds to one instruction of the TM. The number
of the Refal-steps is equal to the number of the TM's steps. The output of the
supercompiler is an optimal Refal-program, i.e. a compilation of the program
for the TM into a Refal-program took place.
Supercompilation of an XSLTinterpreter (I)
• The first Futamura projection:
1. Specialization of an XSLT-interpreter written in
Refal-5 with respect to a given XSLT-program.
2. The given program is an interpreter of the
Turing Machine written in the language XSLT.
3. We supercompile the task when concrete Turing
Machines are given.
Supercompilation of an XSLTinterpreter (II)
References
[1] Nemytykh A.P., and Turchin V.F. The Supercompiler Scp4: sources, on-line
demonstration. ( Free ) http://www.botik.ru/pub/local/scp/refal5/ ,2000.
[2] Turchin, V. F., The concept of a supercompiler, ACM Transactions on Programming
Languages and Systems, 1986, 8:292-325.
[3] Nemytykh A.P., The Supercompiler Scp4: General Structure., LNCS vol. 2890,
pp.162-170, 2003.
[4] Nemytykh A.P., A Note on Elimination of Simplest Recursions. In Proceedings of
the ACM SIGPLAN Asia-PEPM'02, 138-146. ACM Press, 2002.
[5] Korlyukov A.V., and Nemytykh A.P., Supercompilation of Double Interpretation.
(How One Hour of the Machine's Time Can Be Turned to One Second). (In English),
Vestnik natcional’nogo tekhnicheskogo universiteta “Khar’kovskogo politekhnicheskogo
instituta”, Khar’kov, No. 1, 2004.
[6] Nemytykh A.P., Playing on REFAL., In Proceedings of the International Workshop on
Program Understanding, pp:29-39, 2003, A.P. Ershov Institute of Informatics Systems,
Siberian Branch of Russian Academy of Sciences, Novosibirsk – Altai Mountains, Russia.
Thank you!

Supercompilation of an XSLT

Transcript Supercompilation of an XSLT

Directory