Refactoring Functional Programs Huiqing Li Claus Reinke Simon Thompson Computing Lab, University of Kent www.cs.kent.ac.uk/projects/refactor-fp/ Writing a program -- format a list of Strings, one per line table.
Download
Report
Transcript Refactoring Functional Programs Huiqing Li Claus Reinke Simon Thompson Computing Lab, University of Kent www.cs.kent.ac.uk/projects/refactor-fp/ Writing a program -- format a list of Strings, one per line table.
Refactoring Functional
Programs
Huiqing Li
Claus Reinke
Simon Thompson
Computing Lab, University of Kent
www.cs.kent.ac.uk/projects/refactor-fp/
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format []
= []
format [x]
= [x]
format (x:xs)
= (x ++ "\t") : fomrat xs
Durham Feb 04
2
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format []
= []
format [x]
= [x]
format (x:xs)
= (x ++ "\t") : fomrat xs
Durham Feb 04
3
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format []
= []
format [x]
= [x]
format (x:xs)
= (x ++ "\t") : format xs
Durham Feb 04
4
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format []
= []
format [x]
= [x]
format (x:xs)
= (x ++ "\t") : format xs
Durham Feb 04
5
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format []
= []
format [x]
= [x]
format (x:xs)
= (x ++ "\t") : format xs
Durham Feb 04
6
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format []
= []
format [x]
= [x]
format (x:xs)
= (x ++ “\t") : format xs
Durham Feb 04
7
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format []
= []
format [x]
= [x]
format (x:xs)
= (x ++ "\n") : format xs
Durham Feb 04
8
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
appNL
:: [String] -> [String]
appNL
[]
= []
appNL
[x]
= [x]
appNL
(x:xs)
= (x ++ "\n") : appNL
Durham Feb 04
xs
9
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
appNL
:: [String] -> [String]
appNL
[]
= []
appNL
[x]
= [x]
appNL
(x:xs)
= (x ++ "\n") : appNL
Durham Feb 04
xs
10
Writing a program
-- appNL
a list of Strings, one per line
table :: [String] -> String
table = concat . appNL
appNL
:: [String] -> [String]
appNL
[]
= []
appNL
[x]
= [x]
appNL
(x:xs)
= (x ++ "\n") : appNL
Durham Feb 04
xs
11
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . appNL
appNL
:: [String] -> [String]
appNL
[]
= []
appNL
[x]
= [x]
appNL
(x:xs)
= (x ++ "\n") : appNL
Durham Feb 04
xs
12
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . appNL
where
appNL
:: [String] -> [String]
appNL
[]
= []
appNL
[x]
= [x]
appNL
(x:xs)
= (x ++ "\n") : appNL
Durham Feb 04
xs
13
Refactoring
Refactoring means changing the design of
program …
… without changing its behaviour.
Refactoring comes in many forms
• micro refactoring as a part of program development,
• major refactoring as a preliminary to revision,
• as a part of debugging, …
As programmers, we do it all the time.
Durham Feb 04
14
Not just programming
Paper or presentation
moving sections about; amalgamate sections; move
inline code to a figure; animation; …
Proof
introduce lemma; remove, amalgamate hypotheses, …
Program
the topic of the lecture
Durham Feb 04
15
Overview of the talk
Example refactorings … what do we learn?
Refactoring functional programs
Generalities
Tooling: demo, rationale, design.
What comes next?
Conclusions
Durham Feb 04
16
Refactoring Functional Programs
• 3-year EPSRC-funded project
Explore the prospects of refactoring functional programs
Catalogue useful refactorings
Look into the difference between OO and FP refactoring
A real life refactoring tool for Haskell programming
A formal way to specify refactorings … and a set of proofs
that the implemented refactorings are correct.
• Currently mid-project: the second HaRe
release is module-aware.
Durham Feb 04
17
Refactoring functional programs
Semantics: can articulate preconditions and …
… verify transformations.
Absence of side effects makes big changes
predictable and verifiable …
… unlike OO.
Language support: expressive type system,
abstraction mechanisms, HOFs, …
Opens up other possibilities … proof …
Durham Feb 04
18
Rename
f x y = …
findMaxVolume x y = …
Name may be too specific,
if the function is a
candidate for reuse.
Make the specific purpose
of the function clearer.
Needs scope information: just change this f and not all
fs (e.g. local definitions or variables).
Needs module information: change f wherever it is
imported.
Durham Feb 04
19
Lift / demote
f x y = … h …
f x y = … (h y) …
where
h = …
h y = …
Hide a function which is
clearly subsidiary to f; clear
up the namespace.
Makes h accessible to the
other functions in the
module and beyond.
Needs free variable information: which of the parameters
of f is used in the definition of h?
Need h not to be defined at the top level, … , DMR.
Durham Feb 04
20
Lessons from the first examples
Changes are not limited to a single point or even
a single module: diffuse and bureaucratic …
… unlike traditional program transformation.
Many refactorings bidirectional …
… as there is never a unique correct design.
Durham Feb 04
21
How to apply refactoring?
By hand, in a text editor
Tedious
Error-prone
Depends on extensive testing
With machine support
Reliable
Low cost: easy to make and un-make large changes.
Exploratory … a full part of the programmer’s toolkit.
Durham Feb 04
22
Machine support invaluable
Current practice: editor + type checker (+ tests).
Our project: automated support for a repertoire
of refactorings …
… integrated into the existing development
process: Haskell IDEs such as vim and emacs.
Durham Feb 04
23
Demonstration of HaRe, hosted in vim.
Durham Feb 04
24
Proof of concept …
To show proof of concept it is enough to:
• build a stand-alone tool,
• work with a subset of the language,
• ‘pretty print’ the refactored source code in a
standard format.
Durham Feb 04
25
… or a useful tool?
To make a tool that will be used we must:
• integrate with existing program development
tools: the program editors emacs and vim: only
add to their capabilities;
• work with the complete Haskell 98 language;
• preserve the formatting and comments in the
refactored source code;
• allow users to extend and script the system.
Durham Feb 04
26
Refactorings implemented in HaRe
Rename
Delete
Lift (top level / one level)
Demote
Introduce definition
Remove definition
Unfold
Generalise
Add and remove parameters
All these refactorings are module aware.
Durham Feb 04
27
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Durham Feb 04
28
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Identify the definition of sq to be promoted
Durham Feb 04
29
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Is sq defined at top level, here or in importing
modules; is sq imported from elsewhere?
Durham Feb 04
30
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Does sq use anything defined locally to
sumSquares ?
Durham Feb 04
31
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq pow x + sq pow y
where sq :: Int->Int->Int
sq pow x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
If so, generalise to add these as parameters, and
change type signature.
Durham Feb 04
32
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq pow x + sq pow y
where pow = 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Finally, move the definition to top level.
Durham Feb 04
33
The Implementation of Hare
Information
gathering
Pre-condition
checking
Program
transformation
Program
rendering
Durham Feb 04
34
Information needed
Syntax: replace the function called sq, not the
variable sq …… parse tree.
Static semantics: replace this function sq, not all
the sq functions …… scope information.
Module information: what is the traffic between
this module and its clients …… call graph.
Type information: replace this identifier when it
is used at this type …… type annotations.
Durham Feb 04
35
Infrastructure
To achieve this we chose to:
• build a tool that can interoperate with emacs,
vim, … yet act separately.
• leverage existing libraries for processing
Haskell 98, for tree transformation, yet …
… modify them as little as possible.
• be as portable as possible, in the Haskell space.
Durham Feb 04
36
The Haskell background
Libraries
• parser:
• type checker:
• tree transformations:
many
few
few
Difficulties
• Haskell98 vs. Haskell extensions.
• Libraries: proof of concept vs. distributable.
• Source code regeneration.
• Real project
Durham Feb 04
37
Programatica
Project at OGI to build a Haskell system …
… with integral support for verification at various
levels: assertion, testing, proof etc.
The Programatica project has built a Haskell
front end in Haskell, supporting syntax, static,
type and module analysis …
… freely available under BSD licence.
Durham Feb 04
38
The Implementation of Hare
Information
gathering
Pre-condition
checking
Program
transformation
Program
rendering
Durham Feb 04
39
First steps … lifting and friends
Use the Haddock parser … full Haskell given in
500 lines of data type definitions.
Work by hand over the Haskell syntax: 27 cases
for expressions …
Code for finding free variables, for instance …
Durham Feb 04
40
Finding free variables … 100 lines
instance FreeVbls HsExp where
freeVbls (HsVar v) = [v]
freeVbls (HsApp f e)
= freeVbls f ++ freeVbls e
freeVbls (HsLambda ps e)
= freeVbls e \\ concatMap paramNames ps
freeVbls (HsCase exp cases)
= freeVbls exp ++ concatMap freeVbls cases
freeVbls (HsTuple _ es)
= concatMap freeVbls es
… etc.
Durham Feb 04
41
This approach
Boiler plate code …
… 1000 lines for 100 lines of significant code.
Error prone: significant code lost in the noise.
Want to generate the boiler plate and the tree
traversals …
… DriFT: Winstanley, Wallace
… Strafunski: Lämmel and Visser
Durham Feb 04
42
Strafunski
Strafunski allows a user to write general (read
generic), type safe, tree traversing programs …
… with ad hoc behaviour at particular points.
Traverse through the tree accumulating free
variables from component parts, except in the
case of lambda abstraction, local scopes, …
Strafunski allows us to work within Haskell …
other options are under development.
Durham Feb 04
43
Rename an identifier
rename:: (Term t)=>PName->HsName->t->Maybe t
rename oldName newName = applyTP worker
where
worker = full_tdTP (idTP ‘adhocTP‘ idSite)
idSite
idSite
| v
idSite
Durham Feb 04
:: PName -> Maybe PName
v@(PN name orig)
== oldName = return (PN newName orig)
pn = return pn
44
The coding effort
Transformations with Strafunski are
straightforward …
… the chore is implementing conditions that
guarantee that the transformation is meaningpreserving.
This is where the bulk of our code lies.
Durham Feb 04
45
The Implementation of Hare
Information
gathering
Pre-condition
checking
Program
transformation
Program
rendering
Durham Feb 04
46
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Durham Feb 04
47
Program rendering example
module Main where
sumSquares x y
= sq pow x + sq pow y where pow
= 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Using a pretty printer: comments lost and layout
quite different.
Durham Feb 04
48
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Durham Feb 04
49
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq pow x + sq pow y
where pow = 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Layout and comments preserved.
Durham Feb 04
50
Rendering: our approach
White space and comments in the token stream.
2 views of the program: token stream and AST.
Modification of the AST guides the modification
of the token stream.
After a refactoring, the program source is
extracted from the token stream not the AST.
Use heuristics to associate comments with
semantic entities.
Durham Feb 04
51
Production tool (version 0)
Programatica
parser and
type checker
Durham Feb 04
Refactor
using a
Strafunski
engine
Render code
from the
token stream
and
syntax tree.
52
Production tool (version 1)
Programatica
parser and
type checker
Refactor
using a
Strafunski
engine
Render code
from the
token stream
and
syntax tree.
Pass lexical
information to
update the
syntax tree
and so avoid
reparsing
Durham Feb 04
53
Module awareness: example
Move a top-level definition f from module A to B.
-- Is f defined at the top-level of B?
-- Are the free variables in f accessible within module B?
-- Will the move require recursive modules?
-- Remove the definition of f from module A.
-- Add the definition to module B.
-- Modify the import/export in module A, B and the client
modules of A and B if necessary.
-- Change uses of A.f to B.f or f in all affected modules.
-- Resolve ambiguity.
Durham Feb 04
54
What have we learned?
Emerging Haskell libraries make it a practical
platform.
Efficiency issues … type checking large systems.
Limitations of IDE interactions in vim and emacs.
Reflections on Haskell itself.
Durham Feb 04
55
Reflecting on Haskell
Cannot hide items in an export list (though you
can on import).
The formal semantics of pattern matching is
problematic.
‘Ambiguity’ vs. name clash.
‘Tab’ is a nightmare!
Correspondence principle fails …
Durham Feb 04
56
Correspondence
Operations on definitions and operations on
expressions can be placed in correspondence
(R.D.Tennent, 1980)
Durham Feb 04
57
Correspondence
Definitions
Expressions
where
let
f x y = e
\x y -> e
f x
| g1
| g2
f x = if g1 then e1
else if g2 …
Durham Feb 04
=
=
e1
e2
58
Where do we go next?
• Larger-scale examples: ADTs, monads, …
• An API for do-it-yourself refactorings, or …
• … a language for composing refactorings
• Detecting ‘bad smells’
• Evolving the evidence: GC6.
Durham Feb 04
59
What do users want?
Find and remove duplicate code.
Argument permutations.
Data refactorings.
More traditional program transformations.
Monadification.
Durham Feb 04
60
Monadification (cf Erwig)
r = f e1 e2
do
v1 <- e1
v2 <- e2
r
<- f v1 v2
return r
Durham Feb 04
61
Larger-scale examples
More complex examples in the functional
domain; often link with data types.
Dawning realisation that can some refactorings
are pretty powerful.
Bidirectional … no right answer.
Durham Feb 04
62
Algebraic or abstract type?
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
Tr
Leaf
Node
flatten :: Tr a -> [a]
flatten (Leaf x) = [x]
flatten (Node s t)
= flatten s ++
flatten t
Durham Feb 04
63
Algebraic or abstract type?
Tr
isLeaf
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
isLeaf = …
isNode = …
isNode
flatten :: Tr a -> [a]
leaf
left
flatten t
right
| isleaf t = [leaf t]
mkLeaf
| isNode t
mkNode
= flatten (left t)
++ flatten (right t)
…
Durham Feb 04
64
Algebraic or abstract type?
Pattern matching syntax is
more direct …
Allows changes in the
implementation type
without affecting the client:
e.g. might memoise
… but can achieve a
considerable amount with
field names.
Other reasons? Simplicity
(due to other refactoring
steps?).
Durham Feb 04
Problematic with a primitive
type as carrier.
Allows an invariant to be
preserved.
65
Outside or inside?
Tr
isLeaf
isNode
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
isLeaf = …
…
Durham Feb 04
flatten :: Tr a -> [a]
leaf
left
flatten t
right
| isleaf t = [leaf t]
mkLeaf
| isNode t
mkNode
= flatten (left t)
++ flatten (right t)
66
Outside or inside?
Tr
isLeaf
isNode
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
leaf
left
right
mkLeaf
isLeaf = …
mkNode
…
flatten
flatten = …
Durham Feb 04
67
Outside or inside?
If inside and the type is
reimplemented, need to
reimplement everything in
the signature, including
flatten.
If inside can modify the
implementation to memoise
values of flatten, or to give
a better implementation
using the concrete type.
The more outside the
better, therefore.
Layered types possible: put
the utilities in a privileged
zone.
Durham Feb 04
68
API
Refactorings
Refactoring
utilities
Strafunski
Haskell
Durham Feb 04
69
DSL
Combining forms
Refactorings
Refactoring
utilities
Strafunski
Haskell
Durham Feb 04
70
Detecting
‘bad
smells’
Work by
Chris Ryder
Durham Feb 04
71
Evolving the evidence
Dependable System Evolution is the software
engineering grand challenge.
Build systems with evidence of their
dependability …
… but this begs the question of how to evolve
the evidence in line with the system.
Refactoring proofs, test coverage data etc.
Durham Feb 04
72
Teaching and learning design
Exciting prospect of using a refactoring tool as
an integral part of an elementary programming
course.
Learning a language: learn how you could
modify the programs that you have written …
… appreciate the design space, and
… the features of the language.
Durham Feb 04
73
Conclusions
Refactoring + functional programming: good fit.
Practical tool … not ‘yet another type tweak’.
Leverage from available libraries … with work.
We have begun to use the tool in building itself!
Much more to do than we have time for.
Martin Fowler’s ‘Rubicon’: ‘extract definition’ … in
HaRe version 1 … fp productivity.
Durham Feb 04
74
www.cs.kent.ac.uk/projects/refactor-fp/