# Recursion schemes.

# Folding a list.

Say we have a data type,

`data List a = Nil | Cons a (List a)`

and we would like to fold over this data structure:

```
foldr :: forall a z. z -> (a -> z -> z) -> List a -> z
foldr nil cons = go where
go Nil = nil
go (Cons h t) = cons h (go t)
```

Let’s look at the type of `foldr`

and make some isomorphic transformations:

```
∀ a z. z -> (a -> z -> z) -> List a -> z
~ ∀ a z. (1 -> z) -> (a * z -> z) -> List a -> z
~ ∀ a z. ((1 + a * z) -> z) -> List a -> z
-- define ListF a z = 1 + a * z
~ ∀ a z. (ListF a z -> z) -> List a -> z
```

We can define `ListF`

as:

`data ListF a z = NilF | ConsF a z`

Armed with this new data type, and the new type signature for `foldr`

, we rewrite it as:

```
foldr :: forall a z. (ListF a z -> z) -> List a -> z
foldr alg = go where
go Nil = alg NilF
go (Cons h t) = alg (ConsF h (go t))
```

#### Trying a different ADT

Let’s start with a different ADT to see if we can use the same technique as we did for lists:

```
data Expr = Lit Int | Add Expr Expr | Mul Expr Expr
foldExpr :: forall z. (Int -> z) -> (z -> z -> z) -> (z -> z -> z) -> Expr -> z
foldExpr lit add mul = go where
go (Lit i) = lit i
go (Add l r) = add (go l) (go r)
go (Mul l r) = mul (go l) (go r)
```

```
∀ z. (Int -> z) -> (z -> z -> z) -> (z -> z -> z) -> Expr -> z
~ ∀ z. (Int -> z) -> (z * z -> z) -> (z * z -> z) -> Expr -> z
~ ∀ z. ((Int + z * z + z * z) -> z) -> Expr -> z
-- define ExprF z = Int + z * z + z * z
~ ∀ z. (ExprF z -> z) -> Expr -> z
```

We can define `ExprF`

as:

`data ListF z = LitF Int | AddF z z | MulF z z`

#### Back to lists

Let’s continue our `List`

transformations, perhaps we can notice something else:

```
∀ a z. (ListF a z -> z) -> List a -> z
~ ∀ a. List a -> (∀ z. (ListF a z -> z) -> z)
~ ∀ a. List a -> (μ z. ListF a z)
-- define List' a = μ z. ListF a z
~ List :~> List'
```

We see that `foldr`

’s type tells us it is isomorphic to a natural transformation from `List`

to a structure of nested `ListF`

s. Note that our specific instance (`foldr`

) of that type is actually a natural isomorphism, or at least one part of it. What is the other part?

It’s signature,

```
List' :~> List
~ forall a. (μ z. ListF a z) -> List a
~ forall a. (forall z. (ListF a z -> z) -> z) -> List a
```

is not particularly insightful. But its implementation,

```
transform :: forall a. (forall z. (ListF a z -> z) -> z) -> List a
transform go = go $ \case
NilF -> Nil
ConsF a z -> Cons a z
```

is rather interesting.

What happens is that given a `forall z. (ListF a z -> z) -> z`

, we first substitute `z = List a`

, obtaining `(ListF a (List a) -> List a) -> List a`

and then pass it a function that *embeds* a pattern functor `ListF a (List a)`

into `List a`

. Let’s call this function `embed`

:

```
embed :: ListF a (List a) -> List a
embed NilF = Nil
embed (ConsF a z) = Cons a z
```

And now we can write `transform`

as:

`transform go = go embed`

Ahh, this is quite satisfying.

Let’s now go back to `foldr`

:

```
foldr :: forall a z. (ListF a z -> z) -> List a -> z
foldr alg = go where
go Nil = alg NilF
go (Cons h t) = alg (ConsF h (go t))
```

A reader with a minor OCD might notice that if `embed`

was perfectly symmetric, the second case of `go`

here is not symmetric. Ideally we would like to have `ConsF h t :: ListF a (List a)`

and then somehow transform it into `ConsF h (go t) :: ListF a z`

. What does it remind you of? An `fmap`

!

```
instance Functor (ListF a) where
fmap f NilF = NilF
fmap f (Cons a x) = Cons a (f x)
```

Now we can rewrite `foldr`

as:

```
foldr :: forall a z. (ListF a z -> z) -> List a -> z
foldr alg = go where
go Nil = alg NilF
go (Cons h t) = alg $ fmap go $ (ConsF h t)
```

And factoring out `project :: List a -> ListF a (List a)`

:

```
project :: List a -> ListF a (List a)
project Nil = NilF
project (Cons a z) = ConsF a z
foldr :: forall a z. (ListF a z -> z) -> List a -> z
foldr alg = go where
go = alg . fmap go . project
```

Note the symmetry of `project`

and `embed`

. In particular, it is easy to show that `project . embed = id`

and `embed . project = id`

, so they are two parts of an isomorphism between `ListF a (List a)`

and `List a`

.

What happens if we pass `embed`

into `foldr`

?

```
foldr embed = go where
go = embed . fmap go . project
```

What does `go`

do? I suggest taking a few moments or minutes to convince yourself that `go`

above (specialized for `embed`

) is just an identity function on lists.

#### Generalizing

By following the same exact methodology as above, we can similarly derive:

```
embedExpr :: ExprF Expr -> Expr
embedExpr (LitF i) = Lit i
embedExpr (AddF l r) = Add l r
embedExpr (MulF l r) = Mul l r
projectExpr :: Expr -> ExprF Expr
projectExpr (Lit i) = LitF i
projectExpr (Add l r) = AddF l r
projectExpr (Mul l r) = MulF l r
foldExpr :: forall z. (ExprF z -> z) -> Expr -> z
foldExpr alg = go where
go = alg . fmap go . projectExpr
```

This suggests that we can generalize folding over different data types by using a typeclass:

```
type Algebra f a = f a -> a
newtype Mu f = Mu (forall a. Algebra f a -> a)
class Functor f => Recursive t f | t -> f where
project :: t -> f t
embed :: f t -> t
cata :: Algebra f a -> t -> a
cata alg = go where
go = alg . fmap go . project
embed' :: Mu f -> t
embed' (Mu go) = go embed
```

Note that I added two new type definitions, `Algebra f a`

.

```
Algebra f a
~ f a -> a
```

which is also known as F-Algebra, and `Mu f`

:

```
Mu f
~ forall a. Algebra f a -> a
~ forall a. (f a -> a) -> a
~ μ a. f a
```

`foldr`

became `cata`

, which stands for catamorphism.

Further note that `Recursive t f`

is usually defined entirely in terms of `project`

(so there is no `embed`

nor `embed'`

), but this is not completely satisfactory as there is always an `embed`

when there is `project`

, since together they form an isomorphism between `t`

and `f t`

.

# Folding with location

In this section we’ll discuss how to fold over a data structure while having some notion of location: e.g. if we are folding over a tree, we would like to know which path we took to arrive to a node. To motivate this problem, consider how you would accurately report that one of the tree nodes is erroneous.

#### Taking `List`

’s derivative

First, let’s see what happens if we take a derivative of `List a`

with respect to `a`

:

```
∂ₐ(List a)
= ∂ₐ(1 + a * List a)
= ∂ₐ(a) * List a + a * ∂ₐ(List a)
= List a + a * ∂ₐ(List a)
= List a * List a
```

On the other hand, as we have shown before, there is an isomorphism between `List a`

and `μ x . ListF a x`

, so

```
∂ₐ(List a)
= ∂ₐ(μ x . ListF a x)
= ∂ₐ(ListF a (μ x . ListF a x))
= ∂ₐ(ListF a x)[x=μ x . ListF a x] +
∂ₓ(ListF a x)[x=μ x . ListF a x] * ∂ₐ(μ x . ListF a x)
= μ w . ∂ₐ(ListF a x)[x=μ x . ListF a x] +
∂ₓ(ListF a x)[x=μ x . ListF a x] * w
= ∂ₐ(ListF a x)[x=μ x . ListF a x] *
List ∂ₓ(ListF a x)[x=μ x . ListF a x]
= x[x=μ x . ListF a x] *
List a[x=μ x . ListF a x]
= List a *
List a
```

Let’s put `List a`

and `∂ₐ(List a)`

side by side:

```
List a = μ w . ListF a w
= μ w . 1 + a * w
∂ₐ(List a) = μ w . ∂ₐ(ListF a x)[x=μ x . ListF a x] +
∂ₓ(ListF a x)[x=μ x . ListF a x] * w
= μ w . List a + a * w
```

What do both branches of the recursive structure mean in each case? In the case of `List a`

, `1`

means that we are at the end of the list and `a * w`

means that we have another element of the list, `a`

, and the rest of the list `w`

. In the case of `∂ₐ(List a)`

, `List a`

means that we are focused on the first element of the list and `a * w`

means that we are descending further into the structure, passing an `a`

as we go in; `w`

represents descending in.

#### Not quite there yet

Note that when folding a data structure, we don’t really observe one `a`

at a time, but rather one layer `ListF a z`

at a time, with `z`

representing the already folded part of the structure. This means that `∂ₐ(List a) = ∂ₐ(μ x . ListF a x)`

is not a good notion of location in a data structure.

We can improve the situation by using `ListF a (z, lz : List a)`

, where `z`

is the result of folding `lz`

(this is known as paramorphism). This tells us how what subtree `z`

was produced from, but this doesn’t give us information about the path we took in the data structure to arrive to a particular layer `ListF a z`

.

We can represent that path as a list of `∂ₓ(ListF a x)[x=List a] ~ a`

(we’ll use `[]`

instead of `List`

to indicate that it does not depend on the particular data structure we are considering): each element of the path is a particular choice of the direction we take as we descend down into the structure.

Generalizing to an arbitrary recursive data type, we get:

```
histCata :: Recursive t f => ([∂ₓ(f x)[x=t]] -> f (z, t) -> z) -> [∂ₓ(f x)[x=t]] -> List a -> z
-- We pass [∂ₓ(f x)[x=t]] as an argument to histCata
-- to represent the path we have taken before we started
-- folding.
```

We have a slight problem, `∂ₓ(f x)`

can not be represented as a type constructor / type lambda, and furthermore we don’t have any operations consuming and producing values of type `∂ₓ(f x)[x=t]`

yet.

#### Derivative of `ExprF`

```
∂ₓ(ExprF x)
~ ∂ₓ(LitF Int | AddF x x | MulF x x)
~ ∂ₓ(Int + x * x + x * x)
~ 0 + (∂ₓ(x) * x + x * ∂ₓ(x)) + (∂ₓ(x) * x + x * ∂ₓ(x))
~ (x + x) + (x + x)
~ (AddL x | AddR x) | (MulL x | MulR x)
```

Let’s define it as a data type:

`data ExprFD x = AddL x | AddR x | MulL x | MulR x`

Now let’s look at the definition of `project`

for expressions:

```
project :: Expr -> ExprF Expr
project (Lit i) = LitF i
project (Add l r) = AddF l r
project (Mul l r) = MulF l r
```

Notice that when we project, every `Expr`

inside of `ExprF Expr`

has a particular location in `ExprF`

that could be represented by `∂ₓ(ExprF x)`

:

```
projectD :: Expr -> ExprF (ExprFD Expr, Expr)
projectD (Lit i) = LitF i
projectD (Add l r) = AddF (AddL r, l) (AddR l, r)
projectD (Mul l r) = MulF (MulL r, l) (MulR l, r)
```

Note that `projectD`

is *not* an isomorphism. And in the opposite direction there is a function of type `(ExprFD Expr, Expr) -> Expr`

(also not an isomorphism):

```
embedD :: ExprFD Expr -> Expr -> Expr
embedD (AddL r) l = Add l r
embedD (AddR l) r = Add l r
embedD (MulL r) l = Mul l r
embedD (MulR l) r = Mul l r
```

Now we can express `histCata`

for expressions as:

```
histCata :: ([ExprFD Expr] -> ExprF (z, Expr) -> z) -> [ExprFD Expr] -> List a -> z
histCata alg = go where
go hist = alg hist . fmap (\(d, e) -> (go (d : hist) e, e)) . projectD
```

#### Generalizing

To generalize `histCata`

, we’ll need to modify our typeclass by adding another type parameter for the derivative:

```
class Functor f => Recursive t f d | t -> f, t -> d where
project :: t -> f t
embed :: f t -> t
-- and other methods
projectD :: t -> f (d t, t)
embedD :: (d t, t) -> t
histCata :: ([d t] -> f (z, t) -> z) -> [d t] -> t -> z
histCata alg = go where
go hist = alg hist . fmap (\(d, e) -> (go (d : hist) e, e)) . projectD
```

# Folding from the left

Consider the difference between `foldl`

and `foldr`

as seen through the lens of recursion schemes. Folding from the right is folding bottom-up, we start with leaves (non-recursive data constructors), fold those into `z`

, then all data constructors above the leaves, etc. Folding from the left is folding top-down, we peel one layer of a data structure at a time and then proceed to the next layer until we reach the leaves. But this makes sense only if there is just a single direction to move! What if we have binary recursive data constructors like `Add`

in `Expr`

?

We can generalize `foldl`

to arbitrary data structures in a few ways:

```
-- descend down all possible paths summing all values at the leaves
topDown1 :: Recursive t f d, Monoid z => z -> (z -> f () -> f z) -> z
-- preorder traversal
topDown2 :: Recursive t f d, Traversable f => z -> (z -> f () -> f z) -> (d z -> z) -> z
-- ??
```