Scrap your boilerplate: generic programming in Haskell
Ralf Lämmel, Vrije University
Simon Peyton Jones, Microsoft Research
The problem: boilerplate code
[Figure: company tree. Company with Dept “Research” (Manager “Fred”) and Dept “Production” (Manager “Bill”), salaries £10k and £15k, sub-departments “Devt” and “Manuf”, and Employee “Fred” (£10k)]
Find all people in tree and increase their salary by 10%
The problem: boilerplate code
data Company  = C [Dept]
data Dept     = D Name Manager [SubUnit]
data SubUnit  = PU Employee | DU Dept
data Employee = E Person Salary
data Person   = P Name Address
data Salary   = S Float
type Manager  = Employee
type Name     = String
type Address  = String
incSal :: Float -> Company -> Company
The problem: boilerplate code
incSal :: Float -> Company -> Company
incSal k (C ds) = C (map (incD k) ds)
incD :: Float -> Dept -> Dept
incD k (D n m us) = D n (incE k m) (map (incU k) us)
incU :: Float -> SubUnit -> SubUnit
incU k (PU e) = PU (incE k e)
incU k (DU d) = DU (incD k d)
incE :: Float -> Employee -> Employee
incE k (E p s) = E p (incS k s)
incS :: Float -> Salary -> Salary
incS k (S f) = S (k*f)
Boilerplate is bad
Boilerplate is tedious to write
Boilerplate is fragile: needs to be changed when data type changes (“schema evolution”)
Boilerplate obscures the key bits of code
Getting rid of boilerplate
Use an un-typed language, with a fixed collection of data types
Convert to a universal type and write (untyped) traversals over that
Use “reflection” to query types and traverse child nodes
Getting rid of boilerplate
Generic (aka polytypic) programming: define function by induction over the (structure of the) type of its argument
generic inc<t> :: Float -> t -> t
inc<1>   k Unit    = Unit
inc<a+b> k (Inl x) = Inl (inc<a> k x)
inc<a+b> k (Inr y) = Inr (inc<b> k y)
inc<a*b> k (x, y)  = (inc<a> k x, inc<b> k y)
PhD required. Elegant only for “totally generic” functions (read, show, equality)
Our solution
Generic programming for the rest of us
Typed language
Works for arbitrary data types: parameterised, mutually recursive, nested...
No encoding to/from some other type
Very modest language support
Elegant application of Haskell's type classes
Our solution
incSal :: Float -> Company -> Company
incSal k = everywhere (mkT (incS k))
incS :: Float -> Salary -> Salary
incS k (S f) = S (k*f)
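The two-line solution above can be tried out with nothing but GHC's base library. What follows is a minimal sketch, not the library's own code: it defines `mkT` and `everywhere` locally (following the definitions given later in the talk) on top of `Data.Data`, and derives the `Data` instances with `DeriveDataTypeable`. The sample value `genCom`, including its names and addresses, is hypothetical.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
module Main where

import Data.Data (Data, gmapT)
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- The company schema from the slides, with compiler-derived Data instances.
data Company  = C [Dept]                 deriving (Show, Eq, Data)
data Dept     = D Name Manager [SubUnit] deriving (Show, Eq, Data)
data SubUnit  = PU Employee | DU Dept    deriving (Show, Eq, Data)
data Employee = E Person Salary          deriving (Show, Eq, Data)
data Person   = P Name Address           deriving (Show, Eq, Data)
data Salary   = S Float                  deriving (Show, Eq, Data)
type Manager  = Employee
type Name     = String
type Address  = String

-- Type extension: behave like f where the types match, like id elsewhere.
mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = fromMaybe id (cast f)

-- Bottom-up traversal that applies f at every node.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

incS :: Float -> Salary -> Salary
incS k (S f) = S (k * f)

incSal :: Float -> Company -> Company
incSal k = everywhere (mkT (incS k))

-- Hypothetical sample data, not taken from the slides.
genCom :: Company
genCom =
  C [ D "Research" (E (P "Fred" "London") (S 10000))
        [PU (E (P "Bill" "Paris") (S 15000))] ]

main :: IO ()
main = print (incSal 2 genCom)
```

Adding a constructor or a whole new type to the schema needs no change to `incSal`; only the `deriving` clauses grow.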
Two ingredients
incSal :: Float -> Company -> Company
incSal k = everywhere (mkT (incS k))
1. Build the function to apply to every node, from incS
2. Apply a function to every node in the tree
incS :: Float -> Salary -> Salary
incS k (S f) = S (k*f)
Type classes
member :: a -> [a] -> Bool
member x []                 = False
member x (y:ys) | x==y      = True
                | otherwise = member x ys
No! member is not truly polymorphic: it does not work for any type a, only for those on which equality is defined.
Type classes
member :: Eq a => a -> [a] -> Bool
member x []                 = False
member x (y:ys) | x==y      = True
                | otherwise = member x ys
The class constraint "Eq a" says that member only works on types that belong to class Eq.
Type classes
class Eq a where
  (==) :: a -> a -> Bool
instance Eq Int where
  (==) i1 i2 = eqInt i1 i2
instance (Eq a) => Eq [a] where
  (==) []     []     = True
  (==) (x:xs) (y:ys) = (x == y) && (xs == ys)
  (==) xs     ys     = False
member :: Eq a => a -> [a] -> Bool
member x []                 = False
member x (y:ys) | x==y      = True
                | otherwise = member x ys
Implementing type classes
Class witnessed by a “dictionary” of methods
data Eq a = MkEq (a->a->Bool)
eq (MkEq e) = e
Instance declarations create dictionaries
dEqInt :: Eq Int
dEqInt = MkEq eqInt
dEqList :: Eq a -> Eq [a]
dEqList (MkEq e) = MkEq el
  where el []     []     = True
        el (x:xs) (y:ys) = x `e` y && xs `el` ys
        el xs     ys     = False
Overloaded functions take extra dictionary parameter(s)
member :: Eq a -> a -> [a] -> Bool
member d x []                  = False
member d x (y:ys) | eq d x y   = True
                  | otherwise  = member d x ys
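The dictionary translation can be written out as ordinary Haskell. This is a small sketch of the idea rather than what GHC actually emits: the dictionary type is renamed `EqD` (an assumed name) so it does not clash with the Prelude's `Eq` class, and `dEqInt` borrows the Prelude's `(==)` on `Int` in place of the slide's `eqInt`.

```haskell
module Main where

-- A class becomes a record ("dictionary") of its methods.
newtype EqD a = MkEq (a -> a -> Bool)

-- Method selection is just projection from the dictionary.
eq :: EqD a -> a -> a -> Bool
eq (MkEq e) = e

-- An instance declaration becomes a dictionary value.
dEqInt :: EqD Int
dEqInt = MkEq (==)

-- An instance with a context becomes a function between dictionaries.
dEqList :: EqD a -> EqD [a]
dEqList (MkEq e) = MkEq el
  where
    el []     []     = True
    el (x:xs) (y:ys) = x `e` y && xs `el` ys
    el _      _      = False

-- An overloaded function takes the dictionary as an extra argument.
member :: EqD a -> a -> [a] -> Bool
member _ _ [] = False
member d x (y:ys)
  | eq d x y  = True
  | otherwise = member d x ys

main :: IO ()
main = do
  print (member dEqInt 3 [1, 2, 3])                     -- True
  print (member (dEqList dEqInt) [1, 2] [[1], [2, 1]])  -- False
```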
Ingredient 1: type extension
(mkT f) is a function that
  behaves just like f on arguments whose type is compatible with f's,
  behaves like the identity function on all other arguments
So applying (mkT (incS k)) to all nodes in the tree will do what we want.
Type safe cast
cast :: (Typeable a, Typeable b)
=> a -> Maybe b
ghci> (cast 'a') :: Maybe Char
Just 'a'
ghci> (cast 'a') :: Maybe Bool
Nothing
ghci> (cast True) :: Maybe Bool
Just True
Type extension
mkT :: (Typeable a, Typeable b)
=> (a->a) -> (b->b)
mkT f = case cast f of
          Just g  -> g
          Nothing -> id
ghci> (mkT not) True
False
ghci> (mkT not) 'a'
'a'
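The GHCi sessions on the last two slides can be reproduced as a standalone program. A minimal sketch, assuming only `cast` from `Data.Typeable` in base; `mkT` is defined locally, with `fromMaybe id` standing in for the explicit `case`.

```haskell
module Main where

import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Cast the function itself: it succeeds only when (a -> a) and (b -> b)
-- are the same type, and falls back to the identity otherwise.
mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = fromMaybe id (cast f)

main :: IO ()
main = do
  print (cast 'a' :: Maybe Char)   -- Just 'a'
  print (cast 'a' :: Maybe Bool)   -- Nothing
  print (cast True :: Maybe Bool)  -- Just True
  print (mkT not True)             -- False
  print (mkT not 'a')              -- 'a'
```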
Implementing cast
data TypeRep  -- An Int, perhaps
instance Eq TypeRep
mkRep :: String -> [TypeRep] -> TypeRep
class Typeable a where
  typeOf :: a -> TypeRep  -- Guaranteed not to evaluate its argument
instance Typeable Int where
  typeOf i = mkRep "Int" []
Implementing cast
class Typeable a where
typeOf :: a -> TypeRep
instance (Typeable a, Typeable b)
=> Typeable (a,b) where
typeOf p = mkRep "(,)" [ta,tb]
where
ta = typeOf (fst p)
tb = typeOf (snd p)
Implementing cast
cast :: (Typeable a, Typeable b)
=> a -> Maybe b
cast x = r
  where
    r = if typeOf x == typeOf (get r)
          then Just (unsafeCoerce x)
          else Nothing
    get :: Maybe a -> a
    get x = undefined
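The same idea can be checked against today's `Data.Typeable`. The sketch below is a variant, not the library's definition: instead of the `get r` trick it names the target type with a `Proxy` (which needs `ScopedTypeVariables`), and the function is called `castSketch` (an assumed name) to keep it apart from the real `cast`.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

import Data.Proxy (Proxy (..))
import Data.Typeable (Typeable, typeOf, typeRep)
import Unsafe.Coerce (unsafeCoerce)

-- Compare the run-time type representations of source and target.
-- The coercion is only performed after the two representations agree.
castSketch :: forall a b. (Typeable a, Typeable b) => a -> Maybe b
castSketch x
  | typeOf x == typeRep (Proxy :: Proxy b) = Just (unsafeCoerce x)
  | otherwise                              = Nothing

main :: IO ()
main = do
  print (castSketch 'a' :: Maybe Char)   -- Just 'a'
  print (castSketch 'a' :: Maybe Bool)   -- Nothing
  print (castSketch True :: Maybe Bool)  -- Just True
```

The soundness argument is the one from the slides: `unsafeCoerce` is safe here only because `Typeable` instances are trustworthy, which is why they are best generated by the compiler.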
Implementing cast
In GHC:
  Typeable instances are generated automatically by the compiler for any data type
  The definition of cast is in a library
Then cast is sound
Bottom line: cast is best thought of as a language extension, but it is an easy one to implement. All the hard work is done by type classes
Two ingredients
incSal :: Float -> Company -> Company
incSal k = everywhere (mkT (incS k))
1. Build the function to apply to every node, from incS
2. Apply a function to every node in the tree
incS :: Float -> Salary -> Salary
incS k (S f) = S (k*f)
Ingredient 2: traversal
Step 1: implement one-layer traversal
Step 2: extend one-layer traversal to recursive traversal of the entire tree
One-layer traversal
class Typeable a => Data a where
  gmapT :: (forall b. Data b => b -> b)
        -> a -> a
(gmapT f x) applies f to the IMMEDIATE CHILDREN of x
instance Data Int where
  gmapT f x = x
instance (Data a, Data b)
      => Data (a,b) where
  gmapT f (x,y) = (f x, f y)
One-layer traversal
class Typeable a => Data a where
  gmapT :: (forall b. Data b => b -> b)
        -> a -> a
gmapT's argument is a polymorphic function; so gmapT has a rank-2 type
instance (Data a) => Data [a] where
  gmapT f []     = []
  gmapT f (x:xs) = f x : f xs  -- !!!
Step 2: Now traversals are easy!
everywhere :: Data a
           => (forall b. Data b => b -> b)
           -> a -> a
everywhere f x = f (gmapT (everywhere f) x)
Many different traversals!
everywhere, everywhere'
  :: Data a
  => (forall b. Data b => b -> b)
  -> a -> a
everywhere f x = f (gmapT (everywhere f) x)    -- Bottom up
everywhere' f x = gmapT (everywhere' f) (f x)  -- Top down
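The two traversal orders give different answers as soon as a rewrite can expose a new redex. The sketch below illustrates this with a hypothetical `Expr` type and a `simplify` rule that cancels double negation; neither appears in the talk, and `mkT`, `everywhere` and `everywhere'` are again defined locally over `Data.Data` from base.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
module Main where

import Data.Data (Data, gmapT)
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Hypothetical expression type used only for this illustration.
data Expr = Lit Int | Neg Expr deriving (Show, Eq, Data)

mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = fromMaybe id (cast f)

everywhere, everywhere' :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere  f x = f (gmapT (everywhere f) x)    -- children first, then the node
everywhere' f x = gmapT (everywhere' f) (f x)   -- the node first, then its children

-- Cancel one double negation at the root of the given expression.
simplify :: Expr -> Expr
simplify (Neg (Neg e)) = e
simplify e             = e

fourNegs :: Expr
fourNegs = Neg (Neg (Neg (Neg (Lit 1))))

main :: IO ()
main = do
  print (everywhere  (mkT simplify) fourNegs)  -- Lit 1
  print (everywhere' (mkT simplify) fourNegs)  -- Neg (Neg (Lit 1))
```

Bottom-up rewriting sees each node after its children have already been simplified, so both pairs of negations cancel; top-down rewriting visits the inner `Neg (Neg (Lit 1))` only through its child `Neg (Lit 1)` and never sees it as a whole.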
More perspicuous types
everywhere :: Data a
           => (forall b. Data b => b -> b)
           -> a -> a
everywhere :: (forall b. Data b => b -> b)
-> (forall a. Data a => a -> a)
type GenericT = forall a. Data a => a -> a
everywhere :: GenericT -> GenericT
Aha!
What is "really going on"?
inc :: Data t => Float -> t -> t
The magic of type classes passes an extra argument to inc that contains:
  The function gmapT
  The function typeOf
A call of (mkT incS), done at every node in tree, entails a comparison of the TypeRep returned by the passed-in typeOf with a fixed TypeRep for Salary; this is precisely a dynamic type check
Summary so far
Solution consists of:
  A little user-written code
  Mechanically generated instances for Typeable and Data for each data type
  A library of combinators (cast, mkT, everywhere, etc)
Language support:
  cast
  rank-2 types
Efficiency is so-so (factor of 2-3 with no effort)
Summary so far
Robust to data type evolution
Works easily for weird data types
data Rose a = MkR a [Rose a]
instance (Data a) => Data (Rose a) where
gmapT f (MkR x rs) = MkR (f x) (f rs)
data Flip a b = Nil | Cons a (Flip b a)
-- Etc...
Generalisations
With this same language support, we can do much more
  generic queries
  generic monadic operations
  generic folds
  generic zips (e.g. equality)
Generic queries
Add up the salaries of all the employees in the tree
salaryBill :: Company -> Float
salaryBill = everything (+) (0 `mkQ` billS)
billS :: Salary -> Float
billS (S f) = f
1. Build the function to apply to every node, from billS
2. Apply the function to every node in the tree, and combine results with (+)
Type extension again
mkQ :: (Typeable a, Typeable b)
=> d -> (b->d) -> a -> d
(d `mkQ` q) a = case cast a of
                  Just b  -> q b
                  Nothing -> d
Apply 'q' if its type fits, otherwise return 'd'
ghci> (22 `mkQ` ord) 'a'
97
ghci> (22 `mkQ` ord) True
22
ord :: Char -> Int
Traversal again
class Typeable a => Data a where
gmapT :: (forall b. Data b => b -> b)
-> a -> a
gmapQ :: forall r.
(forall b. Data b => b -> r)
-> a -> [r]
Apply a function to all children of this node, and collect the results in a list
Traversal again
class Typeable a => Data a where
gmapT :: (forall b. Data b => b -> b)
-> a -> a
gmapQ :: forall r.
(forall b. Data b => b -> r)
-> a -> [r]
instance Data Int where
gmapQ f x = []
instance (Data a,Data b)
=> Data (a,b) where
gmapQ f (x,y) = [f x, f y]
The query traversal
everything :: Data a
           => (r->r->r)
           -> (forall b. Data b => b -> r)
           -> a -> r
everything k f x
  = foldl k (f x) (gmapQ (everything k f) x)
Note that foldr vs foldl is in the traversal, not gmapQ
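Query extension and the query traversal fit together as follows. This is a sketch under the same assumptions as before: `mkQ` and `everything` are defined locally on top of `cast` and `gmapQ` from base, the company schema is the one from the opening slides, and the sample value `genCom` is hypothetical.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
module Main where

import Data.Char (ord)
import Data.Data (Data, gmapQ)
import Data.Typeable (Typeable, cast)

data Company  = C [Dept]                 deriving (Show, Data)
data Dept     = D Name Manager [SubUnit] deriving (Show, Data)
data SubUnit  = PU Employee | DU Dept    deriving (Show, Data)
data Employee = E Person Salary          deriving (Show, Data)
data Person   = P Name Address           deriving (Show, Data)
data Salary   = S Float                  deriving (Show, Data)
type Manager  = Employee
type Name     = String
type Address  = String

-- Apply q if the argument has the type q expects, otherwise return d.
mkQ :: (Typeable a, Typeable b) => d -> (b -> d) -> a -> d
mkQ d q a = maybe d q (cast a)

-- Query every node and combine the results with k.
everything :: Data a => (r -> r -> r) -> (forall b. Data b => b -> r) -> a -> r
everything k f x = foldl k (f x) (gmapQ (everything k f) x)

billS :: Salary -> Float
billS (S f) = f

salaryBill :: Company -> Float
salaryBill = everything (+) (0 `mkQ` billS)

-- Hypothetical sample data, not taken from the slides.
genCom :: Company
genCom =
  C [ D "Research" (E (P "Fred" "London") (S 10000))
        [PU (E (P "Bill" "Paris") (S 15000))] ]

main :: IO ()
main = do
  print ((22 `mkQ` ord) 'a')   -- 97
  print ((22 `mkQ` ord) True)  -- 22
  print (salaryBill genCom)    -- 25000.0
```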
Looking for one result
By making the result type be (Maybe r), we can find the first (or last) satisfying value [laziness]
findDept :: String -> Company -> Maybe Dept
findDept s = everything orElse (Nothing `mkQ` findD s)
findD :: String -> Dept -> Maybe Dept
findD s d@(D s' _ _)
  = if s==s' then Just d else Nothing
Monadic transforms
class Typeable a => Data a where
gmapT :: (forall b. Data b => b -> b)
-> a -> a
gmapQ :: forall r.
(forall b. Data b => b -> r)
-> a -> [r]
gmapM :: Monad m
=> (forall b. Data b => b -> m b)
-> a -> m a
Uh oh! Where do we stop?
Where do we stop?
Happily, we can generalise all three gmaps into one
data Employee = E Person Salary
instance Data Employee where
  gfoldl k z (E p s) = (z E `k` p) `k` s
We can define gmapT, gmapQ, gmapM in terms of (suitably parameterised) gfoldl
The type of gfoldl hurts the brain (but the definitions are all easy)
Where do we stop?
class Typeable a => Data a where
gfoldl :: (forall a b. Data a => c (a -> b)
-> a -> c b)
-> (forall g. g -> c g)
-> a -> c a
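To make "suitably parameterised" concrete, here is one way to recover the one-layer transformation from `gfoldl`, sketched against the `gfoldl` exported by `Data.Data` in base. The names `Id` and `gmapT'` are chosen for this example; the idea is to instantiate the type constructor `c` to an identity wrapper, so that folding simply rebuilds the constructor application with `f` applied to each child.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Data (Data, gfoldl)
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Identity wrapper: instantiating c to Id makes gfoldl a plain rebuild.
newtype Id x = Id { unId :: x }

-- One-layer transformation expressed through gfoldl.
gmapT' :: Data a => (forall b. Data b => b -> b) -> a -> a
gmapT' f x = unId (gfoldl k Id x)
  where
    -- Apply the partially built constructor to the transformed child.
    k :: Data d => Id (d -> r) -> d -> Id r
    k (Id c) y = Id (c (f y))

mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = fromMaybe id (cast f)

main :: IO ()
main =
  -- Only immediate children of type Int change; the nested list is untouched.
  print (gmapT' (mkT ((+ 1) :: Int -> Int)) (1 :: Int, 'c', [2 :: Int]))
  -- prints (2,'c',[2])
```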
But we still can't do show!
Want show :: Data a => a -> String
show :: Data a => a -> String
show t = ??? ++ concat (gmapQ show t)
show the children and concatenate the results
But how to show the constructor?
Add more to class Data
class Data a where
  toConstr :: a -> Constr
data Constr  -- abstract
conString :: Constr -> String
conFixity :: Constr -> Fixity
Very like typeOf :: Typeable a => a -> TypeRep except only for data types, not functions
So here is show
show :: Data a => a -> String
show t = conString (toConstr t)
         ++ concat (gmapQ show t)
Simple refinements to deal with parentheses, infix constructors etc
toConstr on a primitive type (like Int) yields a Constr whose conString displays the value
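A runnable version of this generic show, with the simplest of the refinements (every subterm is parenthesised), might look like the sketch below. It assumes the names in today's `Data.Data`, where the slide's `conString` is called `showConstr`; the function is named `gshow` to avoid the Prelude's `show`, and the `Tree` type is a hypothetical example.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Data (Data, gmapQ, showConstr, toConstr)

-- Hypothetical example type; no Show instance is needed.
data Tree = Leaf Int | Node Tree Tree deriving Data

-- Show the constructor, then each child preceded by a space, all in parentheses.
gshow :: Data a => a -> String
gshow t =
  "(" ++ showConstr (toConstr t) ++ concatMap (' ' :) (gmapQ gshow t) ++ ")"

main :: IO ()
main = putStrLn (gshow (Node (Leaf 1) (Leaf 2)))
-- prints (Node (Leaf (1)) (Leaf (2)))
```

The explicit type signature on `gshow` matters: the recursive call is used at a different type for each child, which is polymorphic recursion and cannot be inferred.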
Further generic functions
read :: Data a => String -> a
toBin :: Data a => a -> [Bit]
fromBin :: Data a => [Bit] -> a
testGen :: Data a => RandomGen -> a
class Data a where
  toConstr   :: a -> Constr
  fromConstr :: Constr -> a
  dataTypeOf :: a -> DataType
data DataType  -- Abstract
stringCon :: DataType -> String -> Maybe Constr
indexCon :: DataType -> Int -> Constr
dataTypeCons :: DataType -> [Constr]
Conclusions
“Simple”, elegant
Modest language extensions
  Rank-2 types
  Auto-generation of Typeable, Data instances
Fully implemented in GHC
Shortcomings:
  Stop conditions
  Types are a bit uninformative
Paper: http://research.microsoft.com/~simonpj