I’m making this post because after spending months learning about all the different properties of monoids, functors, applicatives, and monads, I thought it would be nice to collect everything I’ve learned from different tutorials, StackOverflow questions, and reddit posts into a neat beginner’s introduction to these topics. I’ll break up this post into a few Main Ideas and say a little about each.

## Main Idea #1: Monoids, functors, applicatives, and monads are all different algebras

Types like `IO a`

are not monads themselves, they have some properties about how they can be combined together that means they follow the monad algebra. Just like we can use the natural numbers to add or multiply, we can use `IO`

in a way that is monadic or applicative. In some cases, `IO`

values can also be combined using the monoid algebra, and they can always be transformed using properties of functors. Lists are a common example of functors as well, but you can also combine them in other ways that follow a variety of other algebras. But long story short: there’s no such thing as the “`IO`

monad” or the “`List`

monad”, just the ability to use certain properties of `IO`

and `List`

to combine values in useful ways.

## Main Idea #2: Monoids, functors, applicatives, and monads are ideas about computation, and are not just specific to Haskell

Because lists, tuples, and a variety of similar types can be found in almost all high-level programming languages, you will see examples of where values are combined or transformed in a way that makes you go, “hey, applicative!”. All of these algebras are ideas from category theory that are also useful in programming. However, Haskell is particularly expressive in its ability to specify which types follow which sorts of algebras, and so you hear Haskellers talk about them a lot.

## Main Idea #3: Functors transform values inside containers, while applicatives and monads combine values inside containers

In Haskell, functions of type `a -> b`

transform values of type `a`

into values of type `b`

. For example, the `(+1) :: Int -> Int`

function increments a number by 1, and the `show :: Int -> String`

turns a number into its string representation. But what if your value was enclosed in some sort of a container? For example, if you know how to turn an integer into a string, do you know how to turn a list of integers into a list of strings? Yes, that’s the map function!

```
map (+1) [1, 2, 3, 4] = [2, 3, 4, 5]
map show [1, 2, 3, 4] = ["1", "2", "3", "4"]
```

That is the idea behind the functor algebra. If something is a functor, then there is a function called `fmap`

that will lift transformations from `a -> b`

to `f a -> f b`

. `fmap`

specialized to lists, for example, has the type `(a -> b) -> ([a] -> [b])`

, while `fmap`

specialized to the `Maybe`

has the type `(a -> b) -> (Maybe a -> Maybe b)`

We can also have a container with function(s) of type `a -> b`

in it and another container with value(s) of type `a`

in it, and try to combine the two to get a container with value(s) of type `b`

in it. An example specialized to lists is the following:

`[(*2), (+1)] <*> [1, 2, 3] = [2, 4, 6, 2, 3, 4]`

Here, the `<*>`

function just combines each pair of functions from the first list and values from the second list in a way reminiscent of list comprehensions. For a container of type `Maybe`

, you might use the `<*>`

function like this:

```
Just (+1) <*> Just 3 = Just 4
Nothing <*> Just 3 = Nothing
Just (+1) <*> Nothing = Nothing
```

The idea we’re expressing here is that if one computation fails (if one of the values is a `Nothing`

), then the result also fails. That is why the `Maybe a`

type is used when we want to deal with computations that can fail (e.g., error handling).

Finally, we also have a function for creating a container with a single value inside it, called `pure`

. `pure`

can be used in the following ways:

```
pure "hi" = ["hi"] -- for lists
pure "hi" = Just "hi" -- for Maybe types
pure "hi" = return "hi" -- for IO values
-- ... and so on for every other container type
```

The functions `<*>`

and `pure`

together constitute the applicative algebra.

## Main Idea #4: Things that are monads can be combined using a special type of function composition

We talked about how things that are applicatives can be combined together using the `<*>`

function, and later we’ll learn about how monoids have their own rules about how to combine things together. Monads also have to do with combining containers together, and they do so using basically a special type of function composition.

Function composition has two main ideas:

- If you have two functions,
`f :: a -> b`

and`g :: b -> c`

, then you can create a new function`g . f :: a -> c`

simply by taking the input value, running`f`

on it first, and then running`g`

on the result of that. As you can see, function composition is related to the “sequence” in which you run computations, which will be a key idea with monads. - There is an identity function,
`id :: a -> a`

, that simply outputs what you give it.

It also comes with two laws:

- Function composition is associative:
`f . (g . h) = (f . g) . h`

. Therefore, there is no use in typing out parentheses, so we often just write`f . g . h`

when we mean one or the other. - Composition with the identity function has no effect:
`id . f = f . id`

.

Let’s look at the types of `(.)`

and `id`

:

```
(.) :: (b -> c) -> (a -> b) -> (a -> c)
id :: a -> a
```

Now, suppose that we’re dealing with a different space (a different “category”) of functions: functions that take in ordinary values but return values inside containers. For example, if we’re dealing with computations that might fail, all our functions might be of the type `a -> Maybe b`

for some `a`

and `b`

. Turns out we also have two analogous functions for dealing with this space of functions:

```
(<=<) :: (b -> f c) -> (a -> f b) -> (a -> f c)
return :: a -> f a
```

The `return`

function is the same as the `pure`

function the applicative world: it puts a value inside a container.

These two functions define the monad algebra.

We can use the monad algebra, for example, when we want to compose different `IO`

actions together. The `IO`

function `readFile :: FilePath -> IO String`

reads text from the file at a given path, and the `putStr :: String -> IO ()`

function prints out some text to standard output. If we wanted to combine these functions together into some `printFile :: FilePath -> IO ()`

function that reads a file at the given path and prints out its contents, we could try to use ordinary function composition, but the types wouldn’t quite work out. The function composition operator knows how to feed the output value of one function as the input of another, but it doesn’t know how to deal with the side effects resulting from running each `IO`

action. But monadic function composition does know how to compose the side effects of each action, and the types work out too. Therefore, we can define:

`printFile = putStr <=< readFile`

And it does what we expect.

## Main Idea #5: You can put values inside a container but you can’t always take them out

For both the monad and the applicative algebra, you have a function that takes a value and wraps it in a specific container type. This is always a total function, and can be done without any loss of information. However, you can’t always take a value out of its container (`f a -> a`

) without some loss of information or loss of generality. For `Maybe Int`

types, for example, it is easy to extract `3`

when you are given `Just 3`

, but how can you extract a value out of `Nothing`

? With `[a]`

, you lose some information when you have to extract a single value out of a list with multiple values in it, and you fail when you try to extract a value from an empty list. With `IO a`

values, Haskell prohibits us from ever removing the “IO” tag because we care about distinguishing computations with side effects from pure computations. Therefore, when dealing with values inside a particular container type, we cannot extract a value from that container, but if the container follows the different algebras we discussed, we are guaranteed that there are four types of operations we can do that will not fail:

- We can transform the value inside the container using
`fmap`

(functors) - We can combine a container of functions and a container of values together using
`<*>`

(applicatives) - We can compose functions that produce containers of values together in the order we specify using
`<=<`

(monads) - And we can take a single value and wrap it inside a container using
`pure`

(applicatives) or`return`

(monads)

## Main Idea #6: Applicatives and monads offer an elegant way to manage side effects and state

Although we have briefly touched upon the notion of failure, we’ve mostly talked about applicatives and monads as the common algebra of different container types. In a different viewpoint, we can say that applicatives and monads are the algebra of things that have side effects. It is hard to define what a side effect is, but generally, if a function produces a side effect, it means it produces something “extra” in addition to its output value. A typical example is a function of the form `a -> Maybe b`

, which can sometimes produce an ordinary value (e.g., `Just 's'`

) and sometimes an error value (`Nothing`

). Lists provide the side effect of having an arbitrary number of return values, and other container types like trees provide a similar side effect except the return values also have some structural relationships between them.

The side-effects viewpoint also allows us to make sense of the idea of “state”. Sometimes, a function will produce a value that depends on what the current state of a program is. If the state of a program can be represented by the type `State`

, then the type `State -> a`

represents something whose value depends on the current state of the program. If the `State`

is one thing, then it will return one value, and if the `State`

is something else, then it will return another value. Then, if a function has the type `a -> (State -> b)`

, that means it takes in some input value and returns an output value that depends on the state of the program.

We can also model things that both read the state of the program and modify it with the `State -> (a, State)`

type. This will read the state of the program, and return the return value as well as a new state for the program. This sort of side effect is very useful, and when combined with the error-handling capabilities of `Maybe`

, is essentially a model of what `IO`

does: it returns a value dependent on the state of the world at the time it is executed, but it can also modify the state of the world and fail for different reasons.

Both of these state-dependent types follow the applicative and monad algebras because applicatives and monads both capture the idea that side effects can often be composed. With the type `State -> (a, State)`

, for example, if we want to compose two functions `f`

and `g`

with stateful side effects together in a monadic way, we can simply take the state that the `f`

produces and feed it as the state into g, creating the illusion that `f`

really does modify the state of the program before `g`

is run. Applicatives and monads therefore abstract away handling side effects into the background. This allows us to easily build anything that uses state, from random number generators to parsers.

The `<*>`

function from the applicative side of things also composes actions together, though not as obviously (we noted earlier that in the case of `Maybe`

, if one computation fails, the whole computation fails). You can think of the difference between applicatives and monads in the following way:

- Applicatives compose actions of the form
`f (a -> b)`

and`f (b -> c)`

together to form an action of the form`f (a -> c)`

- Monads compose actions of the form
`a -> f b`

and`b -> f c`

together to form an action of the form`a -> f c`

## Main Idea #7: Applicatives and monads both model running computations in sequence, but monads are more powerful

By composing side effects together, both applicatives and monads are able to run computations in the order that you specify. However, there is an important difference: the way applicatives compose these hybrid side-effect-containers is insensitive to the values inside the containers, while the way monads compose them together depends on those values. When `<*>`

combines a list with 4 items and a list with 7 items, the result will always be a list with 28 items, no matter what items are inside the list. The `(<=<) :: (b -> f c) -> (a -> f b) -> (a -> f c)`

function, however, allows the input to the second function (`b`

) to depend on what is inside the output to the first function (`f b`

), and so we can never be certain as to what the result of composing computations monadically will look like. Allowing this extra dependency makes monads inherently more powerful: while with applicatives you can sequence computations, with monads, you can sequence computations with the additional property that the result of one computation can depend on the result of another computation. In fact, monads have the full power of a Turing machine while applicatives only have the power of a pushdown automaton. However, with applicatives, you can be certain when a computation will fail (it will fail if any of the components fail), while with monads, you may have many, many dependencies and determining when a computation will fail can be literally impossible.

## Main Idea #8: Applicatives and monads come with nice syntax

Suppose you have two lists and you want to take the product of every pair of values (where one value in the pair comes from one list and the other value comes from the other list). With applicatives, you can do it like so:

`products = (*) <$> [1, 2, 3] <*> [2, 6, 10]`

You can put together an arbitrary number of things together using some pure function:

`result = (\w x y z -> x + y - z * w) <$> Just 3 <*> Just 10 <*> Nothing <*> Just 25`

This works because the `<$>`

function is actually a combination of `pure`

and `<*>`

characteristic of the applicative algebra:

```
(<$>) :: (a -> b) -> (f a -> f b)
f <$> x = pure f <*> x
```

And so what we are really saying is:

`result = (((pure (\w x y z -> x + y - z * w) <*> Just 3) <*> Just 10) <*> Nothing) <*> Just 25`

It is the wrapping of a pure function inside its own container followed by repeated partial application using the `<*>`

operator.

With monads, you can use the special “do notation”:

```
products = do
x <- [1, 2, 3]
y <- [2, 6, 10]
return (x * y)
```

The do notation also makes it obvious that the result of one computation can depend on the result of another:

```
products = do
x <- [1, 2, 3]
y <- [1 .. x]
return (x * y)
```

Internally, it uses the `(>>=) :: f a -> (a -> f b) -> f b`

operator, which can be defined using the `(<=<)`

monadic composition operator:

```
(>>=) :: f a -> (a -> f b) -> f b
x >>= f = f <=< (\() -> x) $ ()
```

Intuitively, this operator, called “bind”, takes a container of some value(s) and passes it into a monadic function, and the result will be another container of some value(s). For example, if we had some `Maybe a`

value and some `a -> Maybe b`

function, we could combine the two using the bind operator to produce a `Maybe b`

value. Notice how this again emphasizes that monads can sequence actions in which the next computation can depend on the result of the previous.

Using this function, we can desugar the above code example using the do notation into:

`products = [1, 2, 3] >>= \x -> [1 .. x] >>= \y -> return (x * y)`

Which is exactly what Haskell will do internally.

## Main Idea #9: Monoids are a general algebra for combining things in a certain way

Monoids are the most basic way of combining values together. Unlike functors, applicatives, and monads, they don’t necessarily have to combine containers of values: they can combine any values with certain properties. Monoids technically are an algebra over a set of elements and a certain binary operation (`(<>) :: a -> a -> a`

) such that:

- The binary operation is associative
- There exists an identity element (
`mempty`

) for which the following identity law holds:`mempty <> x = x <> mempty = x`

Lists are monoids: the operation to combine them is simply concatenation. So are many other types of containers, but interestingly, so are monads.

## Main Idea #10: Monads are monoids in the category of endofunctors

There is one last way to think about monads: the ability to join together a container of containers of values into a container of values. To see why this, along with `return`

, is sufficient for something to follow the monad algebra, suppose we had a function, `join :: f (f a) -> f a`

. Then, we could implement the monadic composition operation like so:

```
(<=<) :: (b -> f c) -> (a -> f b) -> (a -> f c)
g <=< f = \x -> join (fmap g (f x))
```

Which is sufficient for something to be a monad. Often, the `join`

function is easier to think about than the weird container-side-effect-special-monadic function composition operator. For example, to join a list of lists into a list, we can simply fold concatenation over the upper list. To join a tree of trees into a tree, we simply anneal the leaves of the upper tree with the roots of the lower trees. For types having to do with state and side effects, however, the function composition view is often more intuitive.

Now, let’s consider a specific container type, like `List`

, and the `join`

binary operation, which turns a `List . List`

into a `List`

. We can see that `join`

is something that combines two functors into one, and we can see that `return`

is some sort of identity such that `join . return = id`

. It turns out that the `join`

operation is also associative: if you have three or more layers of functors over values, then it doesn’t matter which order you merge the layers in to reduce it down to a single layer of a functor over a value. If we squint our eyes a little bit, we notice this reminds us of the monoid algebra a bit, where `join`

is the operation that combines, in this case, two layers of a functor into a single layer of a functor. That is what is meant by probably the most terse description of monads: “Monads are monoids in the category of endofunctors”.

## Where to next?

Hopefully this post gave a nice overview of the different properties of the four algebras. From here, it will be useful to learn about how to use the monads, applicatives, etc. over different types, like:

`Maybe`

`IO`

`List`

`Either`

`State`

`Parser`

These four algebras come up over and over again in Haskell, so while it is nice to understand what they are generally speaking, in practical usage you will need to know how to use the algebras for each different type.

*Please leave comments below and let me know if you have any corrections or need any clarifications.*

Would be great to have sources and further reading. I’m particularly curious about the bit in #7 about Turing Machines and pushdown automata.

LikeLike

Hi Carl,

The source for #7 is the following StackOverflow answer: http://stackoverflow.com/a/16447511. Unfortunately there isn’t too much information about this topic that I’ve found online, but I remember there was a discussion on reddit about this as well. There is also a related discussion about the power of applicative vs monadic parsers, where applicative parsers can only parse context-free grammars while monadic parsers can parse context-sensitive grammars (analogous to the turing machine vs pushdown automaton idea), but there is a subtlety to that discussed in this blog post: https://byorgey.wordpress.com/2012/01/05/parsing-context-sensitive-languages-with-applicative/.

For #4, the source is the blog post at http://www.haskellforall.com/2012/08/the-category-design-pattern.html, and for #10 the source is the StackOverflow answer at http://stackoverflow.com/a/3870310. I can’t quite pin down the sources for the rest of the Main Ideas, but the three sources I provided are in my opinion the most formative in understanding monads.

Cheers,

Siddharth

LikeLike

fmap specialized to lists, for example, has the type (a -> b) -> ([a] -> [b]), while fmap specialized to the Maybe has the type (a -> b) -> (Maybe a -> Maybe b) —-> is this a typo? and should be something like: map (not fmap) specializeS to lists while fmap specializeS to (remove the) Maybe

LikeLike

flip101,

Not a typo, I meant it as “when fmap is specialized to lists, it has the type (a -> b) -> ([a] -> [b]). When fmap is specialized to lists, it has the type (a -> b) -> (Maybe a -> Maybe b).”, which is basically what you’re saying.

LikeLike

Great compilation and post around these ideas. Thanks for taking the time to put this together!

LikeLike

Thanks, I appreciate it!

LikeLike

This was a great read. Took me a bit to read. But if you have any further resources of the practical usages of monad algebra. I’m looking to get acquainted with some problems in order to truly understand the “beauty” of Monads and Applicatives uses to combine values; And more about uses for the composition of the monadic operator. I want to be more immersed with the math of it all. Sorry if this was not a well thought out comment. thanks again

LikeLike

Hi Damian,

If you are familiar with Haskell, I would recommend reading the sections on different types of monads and monad transformers on Wikibooks Haskell. I would mostly just recommend learning about how to use each particular type that I listed above. There are also more advanced monadic types like Continuation and Free that you can learn about when you feel comfortable.

As for the mathematical treatment of monads, I can’t pretend to know that much about it, but the Catsters channel on youtube has a good explanation on monads in category theory.

Hope that helped,

Siddharth

LikeLike

Nice primer on category theory, when already having knowledge of Haskell.

ps. Under idea 9, the `mempty x = x mempty = x` is not very nicely formatted (maybe consider breaking the line)

LikeLike

Thanks for your input, it has been fixed.

LikeLike

This is such a nice clean and above all useful way to explain all this. Thank you very much.

LikeLike

I’m glad it was useful to you!

LikeLike

Good stuff. Minor correction : “Composition with the identity function has no affect” -> effect. Otherwise anyone with psych training thinks you’re saying that it shows no emotion.

LikeLike

Thanks, it has been fixed.

LikeLike