Refactoring Functional Programs Huiqing Li Claus Reinke Simon Thompson Computing Lab, University of Kent www.cs.kent.ac.uk/projects/refactor-fp/ Writing a program -- format a list of Strings, one per line table.

Download Report

Transcript Refactoring Functional Programs Huiqing Li Claus Reinke Simon Thompson Computing Lab, University of Kent www.cs.kent.ac.uk/projects/refactor-fp/ Writing a program -- format a list of Strings, one per line table.

Refactoring Functional Programs

Huiqing Li Claus Reinke Simon Thompson

Computing Lab, University of Kent

www.cs.kent.ac.uk/projects/refactor-fp/

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format format :: [String] -> String format [] = [] format [x] = [x] format (x:xs) = (x ++ "\t") : fomrat xs

Manchester 04 2

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format format :: [String] -> String format [] = [] format [x] = [x] format (x:xs) = (x ++ "\t") : fomrat xs

Manchester 04 3

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format format :: [String] -> String format [] = [] format [x] = [x] format (x:xs) = (x ++ "\t") : format xs

Manchester 04 4

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format format :: [String] -> String format [] = [] format [x] = [x] format (x:xs) = (x ++ "\t") : format xs

Manchester 04 5

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format format :: [String] -> [String] format [] = [] format [x] = [x] format (x:xs) = (x ++ "\t") : format xs

Manchester 04 6

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format format :: [String] -> [String] format [] = [] format [x] = [x] format (x:xs) = (x ++ “ \t ") : format xs

Manchester 04 7

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format format :: [String] -> [String] format [] = [] format [x] = [x] format (x:xs) = (x ++ "\n") : format xs

Manchester 04 8

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format appNL :: [String] -> [String] appNL [] = [] appNL [x] = [x] appNL (x:xs) = (x ++ "\n") : appNL xs

Manchester 04 9

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . format appNL :: [String] -> [String] appNL [] = [] appNL [x] = [x] appNL (x:xs) = (x ++ "\n") : appNL xs

Manchester 04 10

Writing a program

- appNL a list of Strings, one per line table :: [String] -> String table = concat . appNL appNL :: [String] -> [String] appNL [] = [] appNL [x] = [x] appNL (x:xs) = (x ++ "\n") : appNL xs

Manchester 04 11

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . appNL appNL :: [String] -> [String] appNL [] = [] appNL [x] = [x] appNL (x:xs) = (x ++ "\n") : appNL xs

Manchester 04 12

Writing a program

-- format a list of Strings, one per line table :: [String] -> String table = concat . appNL where appNL :: [String] -> [String] appNL [] = [] appNL [x] = [x] appNL (x:xs) = (x ++ "\n") : appNL xs

Manchester 04 13

Refactoring

Refactoring means changing the design program … of … without changing its behaviour .

Refactoring comes in many forms • micro refactoring as a part of program development, • major refactoring as a preliminary to revision, • as a part of debugging, … As programmers, we do it all the time.

Manchester 04 14

Not just programming

Paper or presentation moving sections about; amalgamate sections; move inline code to a figure; animation; … Proof introduce lemma; remove, amalgamate hypotheses, … Program the topic of the lecture Manchester 04 15

Overview of the talk

Example refactorings … what do we learn?

Refactoring functional programs Generalities Tooling: demo, rationale, design.

What comes next?

Conclusions Manchester 04 16

Refactoring Functional Programs

• 3-year EPSRC-funded project   Explore the prospects of refactoring functional programs Catalogue useful refactorings   Look into the difference between OO and FP refactoring A real life refactoring tool for Haskell programming  A formal way to specify refactorings … and a set of proofs that the implemented refactorings are correct.

• Currently mid-project: the latest HaRe release is module-aware and has module refactorings.

Manchester 04 17

Refactoring

functional

programs

Semantics: can articulate preconditions and … … verify transformations.

Absence of side effects makes big changes predictable and verifiable … … unlike OO. Language support: expressive type system , abstraction mechanisms, HOFs , … Opens up other possibilities … proof … Manchester 04 18

Rename

f x y = … findMaxVolume x y = …

 Name may be too specific, if the function is a candidate for reuse.

 Make the specific purpose of the function clearer.

Needs scope information: just change this

f

s (e.g. local definitions or variables).

f

and not all Needs module imported.

information: change

f

wherever it is Manchester 04 19

Lift / demote

f x y = … h … where h = …

 Hide a function which is clearly subsidiary to

f

; clear up the namespace.

f x y = … (h y) … h y = …

 Makes

h

accessible to the other functions in the module and beyond.

Needs of

f

free variable information: which of the parameters is used in the definition of

h

?

Need

h

not to be defined at the top level, … , DMR.

Manchester 04 20

Lessons from the first examples

Changes are not limited to a single point or even a single module: diffuse and bureaucratic … … unlike traditional program transformation.

Many refactorings bidirectional … … as there is never a unique correct design.

Manchester 04 21

How to apply refactoring?

By hand, in a text editor Tedious Error-prone Depends on extensive testing With machine support Reliable Low cost: easy to make and un-make large changes.

Exploratory … a full part of the programmer’s toolkit.

Manchester 04 22

Machine support invaluable

Current practice: editor + type checker (+ tests).

Our project: automated support for a repertoire of refactorings … … integrated into the existing development process: Haskell IDEs such as vim and emacs.

Manchester 04 23

Demonstration of HaRe, hosted in vim.

Manchester 04 24

Proof of concept …

To show proof of concept it is enough to: • build a stand-alone tool, • work with a subset of the language, • ‘pretty print’ the refactored source code in a standard format.

Manchester 04 25

… or a useful tool?

To make a tool that will be used we must: • integrate with existing program development tools: the program editors emacs and vim: only add to their capabilities ; • work with the complete Haskell 98 language ; • preserve the formatting and comments in the refactored source code; • allow users to extend and script the system.

Manchester 04 26

Refactorings implemented in HaRe

Rename Delete Lift (top / one level) Demote Introduce definition Remove definition Unfold Generalise Add / remove params Move def between modules Delete /add to exports Clean imports Make imports explicit All these refactorings are module aware.

Manchester 04 27

The Implementation of Hare

Information gathering Pre-condition checking Program transformation Program rendering Manchester 04 28

Information needed

Syntax: replace the function called

sq

, not the variable

sq

…… parse tree .

Static semantics: the

sq

replace this function

sq

, not all functions …… scope information .

Module information: what is the traffic between this module and its clients …… call graph .

Type information: replace this identifier when it is used at this type …… type annotations .

Manchester 04 29

Infrastructure

To achieve this we chose to: • build a tool that can vim, … yet act interoperate with emacs, separately .

• leverage existing libraries for processing Haskell 98, for tree transformation, yet … … modify them as little as possible.

• be as portable as possible, in the Haskell space.

Manchester 04 30

The Haskell background

Libraries • parser: • type checker: • tree transformations: many few few Difficulties • Haskell98 vs. Haskell extensions.

• Libraries: proof of concept vs. distributable.

• Source code regeneration.

• Real project Manchester 04 31

Programatica

Project at OGI to build a Haskell system … … with integral support for verification at various levels: assertion, testing, proof etc.

The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis … … freely available under BSD licence.

Manchester 04 32

The Implementation of Hare

Information gathering Pre-condition checking Program transformation Program rendering Manchester 04 33

First steps … lifting and friends

Use the Haddock parser … full Haskell given in 500 lines of data type definitions.

Work by hand over the Haskell syntax: 27 cases for expressions … Code for finding free variables, for instance … Manchester 04 34

Finding free variables … 100 lines

instance FreeVbls HsExp where freeVbls (HsVar v) = [v] freeVbls (HsApp f e) = freeVbls f ++ freeVbls e freeVbls (HsLambda ps e) = freeVbls e \\ concatMap paramNames ps freeVbls (HsCase exp cases) = freeVbls exp ++ concatMap freeVbls cases freeVbls (HsTuple _ es) = concatMap freeVbls es … etc.

Manchester 04 35

This approach

Boiler plate code … … 1000 lines for 100 lines of significant code.

Error prone: significant code lost in the noise.

Want to generate the boiler plate and the tree traversals … … DriFT : Winstanley, Wallace … Strafunski : Lämmel and Visser Manchester 04 36

Strafunski

Strafunski allows a user to write general (read generic ), type safe, tree traversing programs … … with ad hoc behaviour at particular points.

Traverse through the tree accumulating variables from component parts, except free in the case of lambda abstraction, local scopes, … Strafunski allows us to work within Haskell … other options are under development.

Manchester 04 37

Rename an identifier

rename:: (Term t)=>PName->HsName->t->Maybe t rename oldName newName = applyTP worker where worker = full_tdTP ( idTP ‘ adhocTP ‘ idSite) idSite :: PName -> Maybe PName idSite v@(PN name orig) | v == oldName = return (PN newName orig) idSite pn = return pn

Manchester 04 38

The coding effort

Transformations with Strafunski are straightforward … … the chore is implementing guarantee that the transformation is meaning preserving.

conditions that This is where much of our code lies.

Manchester 04 39

The Implementation of Hare

Information gathering Pre-condition checking Program transformation Program rendering Manchester 04 40

Program rendering example

-- This is an example module Main where sumSquares x y = sq x + sq y where sq :: Int->Int sq x = x ^ pow pow = 2 :: Int main = sumSquares 10 20

Promote the definition of

sq

to top level Manchester 04 41

Program rendering example

module Main where sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int sq :: Int->Int->Int sq pow x = x ^ pow main = sumSquares 10 20

Using a pretty printer: comments lost and layout quite different.

Manchester 04 42

Program rendering example

-- This is an example module Main where sumSquares x y = sq x + sq y where sq :: Int->Int sq x = x ^ pow pow = 2 :: Int main = sumSquares 10 20

Promote the definition of

sq

to top level Manchester 04 43

Program rendering example

-- This is an example module Main where sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int sq :: Int->Int->Int sq pow x = x ^ pow main = sumSquares 10 20

Layout and comments preserved.

Manchester 04 44

Rendering: our approach

White space and comments in the token stream.

2 views of the program: token stream and AST.

Modification of the AST guides the modification of the token stream.

After a refactoring, the program source is extracted from the token stream not the AST.

Use heuristics to associate comments with semantic entities.

Manchester 04 45

Production tool (version 0)

Programatica parser and type checker Refactor using a Strafunski engine Render code from the token stream and syntax tree.

Manchester 04 46

Production tool (version 1)

Programatica parser and type checker Refactor using a Strafunski engine Render code from the token stream and syntax tree.

Pass lexical information to update the syntax tree and so avoid reparsing Manchester 04 47

Module awareness: example

Move a top-level definition f from module A to B .

-- Is f defined at the top-level of B ?

-- Are the free variables in f accessible within module B ?

-- Will the move require recursive modules?

-- Remove the definition of f from module A .

-- Add the definition to module B .

-- Modify the import/export lists in module A, B and the client modules of A and B if necessary. -- Change uses of A.f to B.f

or f in all affected modules.

-- Resolve ambiguity.

Manchester 04 48

What have we learned?

Emerging Haskell libraries make it a practical platform.

Efficiency issues … type checking large systems.

Limitations of IDE interactions in vim and emacs.

Reflections on Haskell itself.

Manchester 04 49

Reflecting on Haskell

Cannot hide items in an export list (though you can on import).

The formal semantics of pattern matching is problematic.

‘Ambiguity’ vs. name clash.

‘Tab’ is a nightmare!

Correspondence principle fails … Manchester 04 50

Correspondence

Operations on definitions and operations on expressions can be placed in correspondence (R.D.Tennent, 1980) Manchester 04 51

Correspondence

Definitions

where

Expressions

let f x y = e f x | g1 = e1 | g2 = e2

Manchester 04

\x y -> e f x = if g1 then e1 else if g2 …

52

Where do we go next?

• Larger-scale examples: ADTs, monads, … • An API for do-it-yourself refactorings, or … • … a language for composing refactorings • Detecting ‘bad smells’ • Evolving the evidence: GC6.

Manchester 04 53

What do users want?

Find and remove duplicate code.

Argument permutations.

Data refactorings.

More traditional program transformations.

Monadification.

Manchester 04 54

Monadification (cf Erwig)

r = f e1 e2 do v1 <- e1 v2 <- e2 r <- f v1 v2 return r

Manchester 04 55

Larger-scale examples

More complex examples in the functional domain; often link with data types.

Dawning realisation that can some refactorings are pretty powerful.

Bidirectional … no right answer.

Manchester 04 56

Algebraic or abstract type?

data Tr a = Leaf a | Node a (Tr a) (Tr a) Tr Leaf Node flatten :: Tr a -> [a] flatten (Leaf x) = [x] flatten (Node s t) = flatten s ++ flatten t

Manchester 04 57

Algebraic or abstract type?

data Tr a = Leaf a | Node a (Tr a) (Tr a) isLeaf = … isNode = … … Tr isLeaf isNode leaf left right mkLeaf mkNode flatten :: Tr a -> [a] flatten t | isleaf t = [leaf t] | isNode t = flatten (left t) ++ flatten (right t)

Manchester 04 58

Algebraic or abstract type?

 Pattern matching syntax is more direct … … but can achieve a considerable amount with field names. Other reasons? Simplicity (due to other refactoring steps?).

 Allows changes in the implementation type without affecting the client: e.g. might memoise Problematic with a primitive type as carrier.

Allows an invariant preserved.

to be Manchester 04 59

Outside or inside?

data Tr a = Leaf a | Node a (Tr a) (Tr a) isLeaf = … … Tr isLeaf isNode leaf left right mkLeaf mkNode flatten :: Tr a -> [a] flatten t | isleaf t = [leaf t] | isNode t = flatten (left t) ++ flatten (right t)

Manchester 04 60

Outside or inside ?

data Tr a = Leaf a | Node a (Tr a) (Tr a) isLeaf = … … flatten = … Tr isLeaf isNode leaf left right mkLeaf mkNode flatten

Manchester 04 61

Outside or inside?

 If inside and the type is reimplemented , need to reimplement everything in the signature, including

flatten

. The more outside the better, therefore.

 If inside can implementation to memoise values of modify

flatten

the , or to give a better implementation using the concrete type.

Layered types the utilities in a privileged zone.

possible: put Manchester 04 62

API

Refactorings Refactoring utilities Strafunski Haskell Manchester 04 63

DSL

Combining forms Refactorings Refactoring utilities Strafunski Haskell Manchester 04 64

Detecting ‘bad smells’

Work by Chris Ryder Manchester 04 65

Evolving the evidence

Dependable System Evolution engineering grand challenge.

is the software Build systems with evidence of their dependability … … but this begs the question of how to evolve the evidence in line with the system.

Refactoring proofs, test coverage data etc. Manchester 04 66

Teaching and learning design

Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.

Learning a language: learn how you could modify the programs that you have written … … appreciate the design space , and … the features of the language .

Manchester 04 67

Conclusions

Refactoring + functional programming: good fit.

Practical tool … not ‘yet another type tweak’.

Leverage from available libraries … with work.

We have begun to use the tool in building itself!

Much more to do than we have time for.

Martin Fowler’s ‘Rubicon’: ‘extract definition’ … in HaRe version 1 … fp productivity.

Manchester 04 68

www.cs.kent.ac.uk/projects/refactor-fp/