For a Few Monads More
We’ve seen how monads can be used to take values with contexts and
apply them to functions and how using >>= or
do notation allows us to focus on the values themselves
while the context gets handled for us.
We’ve met the Maybe monad and seen how it adds a context
of possible failure to values. We’ve learned about the list monad and
saw how it lets us easily introduce non-determinism into our programs.
We’ve also learned how to work in the IO monad, even before
we knew what a monad was!
In this chapter, we’re going to learn about a few other monads. We’ll see how they can make our programs clearer by letting us treat all sorts of values as monadic ones. Exploring a few monads more will also solidify our intuition for monads.
The monads that we’ll be exploring are all part of the
mtl package. A Haskell package is a collection of modules.
The mtl package comes with the Haskell Platform, so you
probably already have it. To check if you do, type
ghc-pkg list at the command line. This will show which
Haskell packages you have installed and one of them should be
mtl, followed by a version number.
Writer? I hardly know her!
We’ve loaded our gun with the Maybe monad, the list
monad and the IO monad. Now let’s put the
Writer monad in the chamber and see what happens when we
fire it!
Whereas Maybe is for values with an added context of
failure and the list is for non-deterministic values, the
Writer monad is for values that have another value attached
that acts as a sort of log value. Writer allows us to do
computations while making sure that all the log values are combined into
one log value that then gets attached to the result.
For instance, we might want to equip our values with strings that explain what’s going on, probably for debugging purposes. Consider a function that takes a number of bandits in a gang and tells us if that’s a big gang or not. That’s a very simple function:
isBigGang :: Int -> Bool
isBigGang x = x > 9
Now, what if instead of just giving us a True or
False value, we want it to also return a log string that
says what it did? Well, we just make that string and return it alongside
our Bool:
isBigGang :: Int -> (Bool, String)
isBigGang x = (x > 9, "Compared gang size to 9.")
So now instead of just returning a Bool, we return a
tuple where the first component of the tuple is the actual value and the
second component is the string that accompanies that value. There’s some
added context to our value now. Let’s give this a go:
ghci> isBigGang 3
(False,"Compared gang size to 9.")
ghci> isBigGang 30
(True,"Compared gang size to 9.")
So far so good. isBigGang takes a normal value and
returns a value with a context. As we’ve just seen, feeding it a normal
value is not a problem. Now what if we already have a value that has a
log string attached to it, such as (3, "Smallish gang."),
and we want to feed it to isBigGang? It seems like once
again, we’re faced with this question: if we have a function that takes
a normal value and returns a value with a context, how do we take a
value with a context and feed it to the function?
When we were exploring the Maybe monad, we made a
function applyMaybe, which took a Maybe a
value and a function of type a -> Maybe b and fed that
Maybe a value into the function, even though the function
takes a normal a instead of a Maybe a. It did
this by minding the context that comes with Maybe a values,
which is that they are values with possible failure. But inside the
a -> Maybe b function, we were able to treat that value
as just a normal value, because applyMaybe (which later
became >>=) took care of checking if it was a
Nothing or a Just value.
In the same vein, let’s make a function that takes a value with an
attached log, that is, an (a,String) value and a function
of type a -> (b,String) and feeds that value into the
function. We’ll call it applyLog. But because an
(a,String) value doesn’t carry with it a context of
possible failure, but rather a context of an additional log value,
applyLog is going to make sure that the log of the original
value isn’t lost, but is joined together with the log of the value that
results from the function. Here’s the implementation of
applyLog:
applyLog :: (a,String) -> (a -> (b,String)) -> (b,String)
applyLog (x,log) f = let (y,newLog) = f x in (y,log ++ newLog)
When we have a value with a context and we want to feed it to a
function, we usually try to separate the actual value from the context
and then try to apply the function to the value and then see that the
context is taken care of. In the Maybe monad, we checked if
the value was a Just x and if it was, we took that
x and applied the function to it. In this case, it’s very
easy to find the actual value, because we’re dealing with a pair where
one component is the value and the other a log. So first we just take
the value, which is x and we apply the function
f to it. We get a pair of (y,newLog), where
y is the new result and newLog the new log.
But if we returned that as the result, the old log value wouldn’t be
included in the result, so we return a pair of
(y,log ++ newLog). We use ++ to append the new
log to the old one.
Here’s applyLog in action:
ghci> (3, "Smallish gang.") `applyLog` isBigGang
(False,"Smallish gang.Compared gang size to 9")
ghci> (30, "A freaking platoon.") `applyLog` isBigGang
(True,"A freaking platoon.Compared gang size to 9")
The results are similar to before, only now the number of people in
the gang had its accompanying log and it got included in the result log.
Here are a few more examples of using applyLog:
ghci> ("Tobin","Got outlaw name.") `applyLog` (\x -> (length x, "Applied length."))
(5,"Got outlaw name.Applied length.")
ghci> ("Bathcat","Got outlaw name.") `applyLog` (\x -> (length x, "Applied length"))
(7,"Got outlaw name.Applied length")
See how inside the lambda, x is just a normal string and
not a tuple and how applyLog takes care of appending the
logs.
Monoids to the rescue
Be sure you know what monoids are at this point! Cheers.
Right now, applyLog takes values of type
(a,String), but is there a reason that the log has to be a
String? It uses ++ to append the logs, so
wouldn’t this work on any kind of list, not just a list of characters?
Sure it would. We can go ahead and change its type to this:
applyLog :: (a,[c]) -> (a -> (b,[c])) -> (b,[c])
Now, the log is a list. The type of values contained in the list has
to be the same for the original list as well as for the list that the
function returns, otherwise we wouldn’t be able to use ++
to stick them together.
Would this work for bytestrings? There’s no reason it shouldn’t.
However, the type we have now only works for lists. It seems like we’d
have to make a separate applyLog for bytestrings. But wait!
Both lists and bytestrings are monoids. As such, they are both instances
of the Monoid type class, which means that they implement
the mappend function. And for both lists and bytestrings,
mappend is for appending. Watch:
ghci> [1,2,3] `mappend` [4,5,6]
[1,2,3,4,5,6]
ghci> B.pack [99,104,105] `mappend` B.pack [104,117,97,104,117,97]
Chunk "chi" (Chunk "huahua" Empty)
Cool! Now our applyLog can work for any monoid. We have
to change the type to reflect this, as well as the implementation,
because we have to change ++ to mappend:
applyLog :: (Monoid m) => (a,m) -> (a -> (b,m)) -> (b,m)
applyLog (x,log) f = let (y,newLog) = f x in (y,log `mappend` newLog)
Because the accompanying value can now be any monoid value, we no
longer have to think of the tuple as a value and a log, but now we can
think of it as a value with an accompanying monoid value. For instance,
we can have a tuple that has an item name and an item price as the
monoid value. We just use the Sum newtype to make sure that
the prices get added as we operate with the items. Here’s a function
that adds drink to some cowboy food:
import Data.Monoid
type Food = String
type Price = Sum Int
addDrink :: Food -> (Food,Price)
addDrink "beans" = ("milk", Sum 25)
addDrink "jerky" = ("whiskey", Sum 99)
addDrink _ = ("beer", Sum 30)
We use strings to represent foods and an Int in a
Sum newtype wrapper to keep track of how many
cents something costs. Just a reminder, doing mappend with
Sum results in the wrapped values getting added
together:
ghci> Sum 3 `mappend` Sum 9
Sum {getSum = 12}
The addDrink function is pretty simple. If we’re eating
beans, it returns "milk" along with Sum 25, so
25 cents wrapped in Sum. If we’re eating jerky we drink
whiskey and if we’re eating anything else we drink beer. Just normally
applying this function to a food wouldn’t be terribly interesting right
now, but using applyLog to feed a food that comes with a
price itself into this function is interesting:
ghci> ("beans", Sum 10) `applyLog` addDrink
("milk",Sum {getSum = 35})
ghci> ("jerky", Sum 25) `applyLog` addDrink
("whiskey",Sum {getSum = 124})
ghci> ("dogmeat", Sum 5) `applyLog` addDrink
("beer",Sum {getSum = 35})
Milk costs 25 cents, but if we eat it with beans that
cost 10 cents, we’ll end up paying 35 cents.
Now it’s clear that the attached value doesn’t always have to be a log;
it can be any monoid value, and how two such values are combined into one
depends on the monoid. When we were doing logs, they got appended, but
now the numbers are being added up.
Because the value that addDrink returns is a tuple of
type (Food,Price), we can feed that result to
addDrink again, so that it tells us what we should drink
along with our drink and how much that will cost us. Let’s give it a
shot:
ghci> ("dogmeat", Sum 5) `applyLog` addDrink `applyLog` addDrink
("beer",Sum {getSum = 65})
Adding a drink to some dog meat results in a beer and an additional
30 cents, so ("beer", Sum 35). And if we use
applyLog to feed that to addDrink, we get
another beer and the result is ("beer", Sum 65).
The Writer type
Now that we’ve seen that a value with an attached monoid acts like a
monadic value, let’s examine the Monad instance for types
of such values. The Control.Monad.Writer module exports the
Writer w a type along with its Monad instance
and some useful functions for dealing with values of this type.
First, let’s examine the type itself. To attach a monoid to a value,
we just need to put them together in a tuple. The
Writer w a type is just a newtype wrapper for
this. Its definition is very simple:
newtype Writer w a = Writer { runWriter :: (a, w) }
It’s wrapped in a newtype so that it can be made an
instance of Monad and that its type is separate from a
normal tuple. The a type parameter represents the type of
the value and the w type parameter the type of the attached
monoid value.
Its Monad instance is defined like so:
instance (Monoid w) => Monad (Writer w) where
    return x = Writer (x, mempty)
    (Writer (x,v)) >>= f = let (Writer (y, v')) = f x in Writer (y, v `mappend` v')
First off, let’s examine >>=. Its implementation
is essentially the same as applyLog, only now that our
tuple is wrapped in the Writer newtype, we
have to unwrap it when pattern matching. We take the value
x and apply the function f to it. This gives
us a Writer w b value, and we use a let
expression to pattern match on it. We present y as the new
result and use mappend to combine the old monoid value with
the new one. We pack that up with the result value in a tuple and then
wrap that with the Writer constructor so that our result is
a Writer value instead of just an unwrapped tuple.
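To convince ourselves that this really is just applyLog with some wrapping and unwrapping thrown in, we can redo one of our earlier examples directly with Writer values and >>=. (This assumes the Writer constructor is exported, as in the definition above; newer versions of mtl hide the constructor and offer a writer function instead.)
ghci> runWriter (Writer (3, "Smallish gang.") >>= \x -> Writer (x > 9, "Compared gang size to 9."))
(False,"Smallish gang.Compared gang size to 9.")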
So, what about return? It has to take a value and put it
in a default minimal context that still presents that value as the
result. So what would such a context be for Writer values?
If we want the accompanying monoid value to affect other monoid values
as little as possible, it makes sense to use mempty.
mempty is used to present identity monoid values, such as
"" and Sum 0 and empty bytestrings. Whenever
we use mappend between mempty and some other
monoid value, the result is that other monoid value. So if we use
return to make a Writer value and then use
>>= to feed that value to a function, the resulting
monoid value will be only what the function returns. Let’s use
return on the number 3 a bunch of times, only
we’ll pair it with a different monoid every time:
ghci> runWriter (return 3 :: Writer String Int)
(3,"")
ghci> runWriter (return 3 :: Writer (Sum Int) Int)
(3,Sum {getSum = 0})
ghci> runWriter (return 3 :: Writer (Product Int) Int)
(3,Product {getProduct = 1})
Because Writer doesn’t have a Show
instance, we had to use runWriter to convert our
Writer values to normal tuples that can be shown. For
String, the monoid value is the empty string. With
Sum, it’s 0, because if we add 0 to something,
that something stays the same. For Product, the identity is
1.
The Writer instance doesn’t feature an implementation
for fail, so if a pattern match fails in do
notation, error is called.
Using do notation with Writer
Now that we have a Monad instance, we’re free to use
do notation for Writer values. It’s handy for
when we have several Writer values and we want to do
stuff with them. Like with other monads, we can treat them as normal
values and the context gets taken care of for us. In this case, all the monoid
values that come attached get mappended and so are
reflected in the final result. Here’s a simple example of using
do notation with Writer to multiply two
numbers:
import Control.Monad.Writer
logNumber :: Int -> Writer [String] Int
logNumber x = Writer (x, ["Got number: " ++ show x])
multWithLog :: Writer [String] Int
multWithLog = do
    a <- logNumber 3
    b <- logNumber 5
    return (a*b)
logNumber takes a number and makes a Writer
value out of it. For the monoid, we use a list of strings and we equip
the number with a singleton list that just says that we have that
number. multWithLog is a Writer value which
multiplies 3 and 5 and makes sure that their
attached logs get included in the final log. We use return
to present a*b as the result. Because return
just takes something and puts it in a minimal context, we can be sure
that it won’t add anything to the log. Here’s what we see if we run
this:
ghci> runWriter multWithLog
(15,["Got number: 3","Got number: 5"])
Sometimes we just want some monoid value to be included at some
particular point. For this, the tell function is useful.
It’s part of the MonadWriter type class and in the case of
Writer it takes a monoid value, like
["This is going on"] and creates a Writer
value that presents the dummy value () as its result but
has our desired monoid value attached. When we have a monadic value that
has () as its result, we don’t bind it to a variable.
Here’s multWithLog but with some extra reporting
included:
multWithLog :: Writer [String] Int
multWithLog = do
    a <- logNumber 3
    b <- logNumber 5
    tell ["Gonna multiply these two"]
    return (a*b)
It’s important that return (a*b) is the last line,
because the result of the last line in a do expression is
the result of the whole do expression. Had we put
tell as the last line, () would have been the
result of this do expression. We’d lose the result of the
multiplication. However, the log would be the same. Here is this in
action:
ghci> runWriter multWithLog
(15,["Got number: 3","Got number: 5","Gonna multiply these two"])
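By the way, if you’re curious what the do notation is doing under the hood here, it’s just a chain of >>= calls with lambdas. A rough desugaring of the version above would look something like this:
multWithLog :: Writer [String] Int
multWithLog =
    logNumber 3 >>= \a ->
    logNumber 5 >>= \b ->
    tell ["Gonna multiply these two"] >>= \_ ->
    return (a*b)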
Adding logging to programs
Euclid’s algorithm is an algorithm that takes two numbers and
computes their greatest common divisor. That is, the biggest number that
still divides both of them. Haskell already features the
gcd function, which does exactly this, but let’s implement
our own and then equip it with logging capabilities. Here’s the normal
algorithm:
gcd' :: Int -> Int -> Int
gcd' a b
    | b == 0 = a
    | otherwise = gcd' b (a `mod` b)
The algorithm is very simple. First, it checks if the second number is 0. If it is, the result is the first number. If it isn’t, the result is the greatest common divisor of the second number and the remainder of dividing the first number by the second one. For instance, if we want to know the greatest common divisor of 8 and 3, we just follow the algorithm. Because 3 isn’t 0, we have to find the greatest common divisor of 3 and 2 (if we divide 8 by 3, the remainder is 2). 2 still isn’t 0, so now we have 2 and 1. The second number isn’t 0, so we run the algorithm again for 1 and 0, as dividing 2 by 1 gives us a remainder of 0. And finally, because the second number is now 0, the final result is 1. Let’s see if our code agrees:
ghci> gcd' 8 3
1
It does. Very good! Now, we want to equip our result with a context,
and the context will be a monoid value that acts as a log. Like before,
we’ll use a list of strings as our monoid. So the type of our new
gcd' function should be:
gcd' :: Int -> Int -> Writer [String] Int
All that’s left now is to equip our function with log values. Here’s the code:
import Control.Monad.Writer
gcd' :: Int -> Int -> Writer [String] Int
gcd' a b
    | b == 0 = do
        tell ["Finished with " ++ show a]
        return a
    | otherwise = do
        tell [show a ++ " mod " ++ show b ++ " = " ++ show (a `mod` b)]
        gcd' b (a `mod` b)
This function takes two normal Int values and returns a
Writer [String] Int, that is, an Int that has
a log context. In the case where b is 0,
instead of just giving a as the result, we use a
do expression to put together a Writer value
as a result. First we use tell to report that we’re
finished and then we use return to present a
as the result of the do expression. Instead of this
do expression, we could have also written this:
Writer (a, ["Finished with " ++ show a])
However, I think the do expression is easier to read.
Next, we have the case when b isn’t 0. In this
case, we log that we’re using mod to figure out the
remainder of dividing a and b. Then, the
second line of the do expression just recursively calls
gcd'. Remember, gcd' now ultimately returns a
Writer value, so it’s perfectly valid that
gcd' b (a `mod` b) is a line in a do
expression.
While it may be kind of useful to trace the execution of this new
gcd' by hand to see how the logs get appended, I think it’s
more insightful to just look at the big picture and view these as values
with a context and from that gain insight as to what the final result
will be.
Let’s try our new gcd' out. Its result is a
Writer [String] Int value and if we unwrap that from its
newtype, we get a tuple. The first part of the tuple is the
result. Let’s see if it’s okay:
ghci> fst $ runWriter (gcd' 8 3)
1
Good! Now what about the log? Because the log is a list of strings,
let’s use mapM_ putStrLn to print those strings to the
screen:
ghci> mapM_ putStrLn $ snd $ runWriter (gcd' 8 3)
8 mod 3 = 2
3 mod 2 = 1
2 mod 1 = 0
Finished with 1
I think it’s awesome how we were able to change our ordinary
algorithm to one that reports what it does as it goes along just by
changing normal values to monadic values and letting the implementation
of >>= for Writer take care of the logs
for us. We can add a logging mechanism to pretty much any function. We
just replace normal values with Writer values where we want
and change normal function application to >>= (or
do expressions if it increases readability).
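Just to drive the point home, here’s a made-up example (not from the gcd' code above, just a sketch of the same recipe applied elsewhere) of a factorial that keeps a log of its steps. The messages are arbitrary; the point is that we only had to sprinkle in tell and keep everything in the Writer monad:
factorial' :: Int -> Writer [String] Int
factorial' 0 = do
    tell ["Hit the base case"]
    return 1
factorial' n = do
    tell ["Multiplying by " ++ show n]
    result <- factorial' (n-1)
    return (n * result)
Running runWriter (factorial' 3) gives (6,["Multiplying by 3","Multiplying by 2","Multiplying by 1","Hit the base case"]).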
Inefficient list construction
When using the Writer monad, you have to be careful
which monoid to use, because using lists can sometimes turn out to be
very slow. That’s because lists use ++ for
mappend and using ++ to add something to the
end of a list is slow if that list is really long.
In our gcd' function, the logging is fast because the
list appending ends up looking like this:
a ++ (b ++ (c ++ (d ++ (e ++ f))))
Lists are a data structure that’s constructed from left to right, and
this is efficient because we first fully construct the left part of a
list and only then add a longer list on the right. But if we’re not
careful, using the Writer monad can produce list appending
that looks like this:
((((a ++ b) ++ c) ++ d) ++ e) ++ f
This associates to the left instead of to the right. This is inefficient because every time it wants to add the right part to the left part, it has to construct the left part all the way from the beginning!
The following function works like gcd', only it logs
stuff in reverse. First it produces the log for the rest of the
procedure and then adds the current step to the end of the log.
import Control.Monad.Writer
gcdReverse :: Int -> Int -> Writer [String] Int
gcdReverse a b
    | b == 0 = do
        tell ["Finished with " ++ show a]
        return a
    | otherwise = do
        result <- gcdReverse b (a `mod` b)
        tell [show a ++ " mod " ++ show b ++ " = " ++ show (a `mod` b)]
        return result
It does the recursion first, and binds its result value to
result. Then it adds the current step to the log, but the
current step goes at the end of the log that was produced by the
recursion. Finally, it presents the result of the recursion as the final
result. Here it is in action:
ghci> mapM_ putStrLn $ snd $ runWriter (gcdReverse 8 3)
Finished with 1
2 mod 1 = 0
3 mod 2 = 1
8 mod 3 = 2
It’s inefficient because it ends up associating the use of
++ to the left instead of to the right.
Difference lists
Because lists can sometimes be inefficient when repeatedly appended
in this manner, it’s best to use a data structure that always supports
efficient appending. One such data structure is the difference list. A
difference list is similar to a list, only instead of being a normal
list, it’s a function that takes a list and prepends another list to it.
The difference list equivalent of a list like [1,2,3] would
be the function \xs -> [1,2,3] ++ xs. A normal empty
list is [], whereas an empty difference list is the
function \xs -> [] ++ xs.
The cool thing about difference lists is that they support efficient
appending. When we append two normal lists with ++, it has
to walk all the way to the end of the list on the left of
++ and then stick the other one there. But what if we take
the difference list approach and represent our lists as functions? Well
then, appending two difference lists can be done like so:
f `append` g = \xs -> f (g xs)
Remember, f and g are functions that take
lists and prepend something to them. So, for instance, if f
is the function ("dog"++) (just another way of writing
\xs -> "dog" ++ xs) and g the function
("meat"++), then f `append` g makes a new
function that’s equivalent to the following:
\xs -> "dog" ++ ("meat" ++ xs)
We’ve appended two difference lists just by making a new function that first applies one difference list to some list and then the other.
Let’s make a newtype wrapper for difference lists so
that we can easily give them monoid instances:
newtype DiffList a = DiffList { getDiffList :: [a] -> [a] }
The type that we wrap is [a] -> [a] because a
difference list is just a function that takes a list and returns
another. Converting normal lists to difference lists and vice versa is
easy:
toDiffList :: [a] -> DiffList a
toDiffList xs = DiffList (xs++)
fromDiffList :: DiffList a -> [a]
fromDiffList (DiffList f) = f []
To make a normal list into a difference list we just do what we did before and make it a function that prepends it to another list. Because a difference list is a function that prepends something to another list, if we just want that something, we apply the function to an empty list!
Here’s the Monoid instance:
instance Monoid (DiffList a) where
    mempty = DiffList (\xs -> [] ++ xs)
    (DiffList f) `mappend` (DiffList g) = DiffList (\xs -> f (g xs))
Notice how for lists, mempty is just the id
function and mappend is actually just function composition.
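Just to make that explicit, the instance above is exactly equivalent to this slightly more compact version:
instance Monoid (DiffList a) where
    mempty = DiffList id
    (DiffList f) `mappend` (DiffList g) = DiffList (f . g)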
Let’s see if this works:
ghci> fromDiffList (toDiffList [1,2,3,4] `mappend` toDiffList [1,2,3])
[1,2,3,4,1,2,3]
Tip top! Now we can increase the efficiency of our
gcdReverse function by making it use difference lists
instead of normal lists:
import Control.Monad.Writer
gcdReverse :: Int -> Int -> Writer (DiffList String) Int
gcdReverse a b
    | b == 0 = do
        tell (toDiffList ["Finished with " ++ show a])
        return a
    | otherwise = do
        result <- gcdReverse b (a `mod` b)
        tell (toDiffList [show a ++ " mod " ++ show b ++ " = " ++ show (a `mod` b)])
        return result
We only had to change the type of the monoid from
[String] to DiffList String and then when
using tell, convert our normal lists into difference lists
with toDiffList. Let’s see if the log gets assembled
properly:
ghci> mapM_ putStrLn . fromDiffList . snd . runWriter $ gcdReverse 110 34
Finished with 2
8 mod 2 = 0
34 mod 8 = 2
110 mod 34 = 8
We do gcdReverse 110 34, then use runWriter
to unwrap it from the newtype, then apply snd
to that to just get the log, then apply fromDiffList to
convert it to a normal list and then finally print its entries to the
screen.
Comparing Performance
To get a feel for just how much difference lists may improve your
performance, consider this function that just counts down from some
number to zero, but produces its log in reverse, like
gcdReverse, so that the numbers in the log will actually be
counted up:
finalCountDown :: Int -> Writer (DiffList String) ()
finalCountDown 0 = do
    tell (toDiffList ["0"])
finalCountDown x = do
    finalCountDown (x-1)
    tell (toDiffList [show x])
If we give it 0, it just logs it. For any other number,
it first counts down its predecessor to 0 and then appends
that number to the log. So if we apply finalCountDown to
100, the string "100" will come last in the
log.
Anyway, if you load this function in GHCi and apply it to a big
number, like 500000, you’ll see that it quickly starts
counting from 0 onwards:
ghci> mapM_ putStrLn . fromDiffList . snd . runWriter $ finalCountDown 500000
0
1
2
...
However, if we change it to use normal lists instead of difference lists, like so:
finalCountDown :: Int -> Writer [String] ()
finalCountDown 0 = do
    tell ["0"]
finalCountDown x = do
    finalCountDown (x-1)
    tell [show x]
And then tell GHCi to start counting:
ghci> mapM_ putStrLn . snd . runWriter $ finalCountDown 500000
We’ll see that the counting is really slow.
Of course, this is not the proper and scientific way to test how fast our programs are, but we were able to see that in this case, using difference lists starts producing results right away whereas normal lists take forever.
Oh, by the way, the song Final Countdown by Europe is now stuck in your head. Enjoy!
Reader? Ugh, not this joke again.
In the chapter about
applicatives, we saw that the function type, (->) r
is an instance of Functor. Mapping a function
f over a function g will make a function that
takes the same thing as g, applies g to it and
then applies f to that result. So basically, we’re making a
new function that’s like g, only before returning its
result, f gets applied to that result as well. For
instance:
ghci> let f = (*5)
ghci> let g = (+3)
ghci> (fmap f g) 8
55
We’ve also seen that functions are applicative functors. They allow us to operate on the eventual results of functions as if we already had their results. Here’s an example:
ghci> let f = (+) <$> (*2) <*> (+10)
ghci> f 3
19
The expression (+) <$> (*2) <*> (+10) makes
a function that takes a number, gives that number to (*2)
and (+10) and then adds together the results. For instance,
if we apply this function to 3, it applies both
(*2) and (+10) to 3, giving
6 and 13. Then, it calls (+) with
6 and 13 and the result is
19.
Not only is the function type (->) r a functor and an
applicative functor, but it’s also a monad. Just like other monadic
values that we’ve met so far, a function can also be considered a value
with a context. The context for functions is that that value is not
present yet and that we have to apply that function to something in
order to get its result value.
Because we’re already acquainted with how functions work as functors
and applicative functors, let’s dive right in and see what their
Monad instance looks like. It’s located in
Control.Monad.Instances and it goes a little something like
this:
instance Monad ((->) r) where
    return x = \_ -> x
    h >>= f = \w -> f (h w) w
We’ve already seen how pure is implemented for
functions, and return is pretty much the same thing as
pure. It takes a value and puts it in a minimal context
that always has that value as its result. And the only way to make a
function that always has a certain value as its result is to make it
completely ignore its parameter.
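A quick sanity check in GHCi (with Control.Monad.Instances loaded) shows that a function made with return really does ignore whatever we apply it to:
ghci> (return 3 :: Int -> Int) 999
3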
The implementation for >>= seems a bit cryptic,
but it’s really not all that. When we use >>= to feed
a monadic value to a function, the result is always a monadic value. So
in this case, when we feed a function to another function, the result is
a function as well. That’s why the result starts off as a lambda. All of
the implementations of >>= so far always somehow
isolated the result from the monadic value and then applied the function
f to that result. The same thing happens here. To get the
result from a function, we have to apply it to something, which is why
we do (h w) here to get the result from the function and
then we apply f to that. f returns a monadic
value, which is a function in our case, so we apply it to w
as well.
If you don’t understand how >>= works at this
point, don’t worry, because with examples we’ll see how this is a really
simple monad. Here’s a do expression that utilizes this
monad:
import Control.Monad.Instances
addStuff :: Int -> Int
addStuff = do
    a <- (*2)
    b <- (+10)
    return (a+b)
This is the same thing as the applicative expression that we wrote
earlier, only now it relies on functions being monads. A do
expression always results in a monadic value and this one is no
different. The result of this monadic value is a function. What happens
here is that it takes a number and then (*2) gets applied
to that number and the result becomes a. (+10)
is applied to the same number that (*2) got applied to and
the result becomes b. return, like in other
monads, doesn’t have any other effect but to make a monadic value that
presents some result. This presents a+b as the result of
this function. If we test it out, we get the same result as before:
ghci> addStuff 3
19
Both (*2) and (+10) get applied to the
number 3 in this case. return (a+b) does as
well, but it ignores it and always presents a+b as the
result. For this reason, the function monad is also called the reader
monad. All the functions read from a common source. To illustrate this
even better, we can rewrite addStuff like so:
addStuff :: Int -> Int
addStuff x = let
    a = (*2) x
    b = (+10) x
    in a+b
We see that the reader monad allows us to treat functions as values
with a context. We can act as if we already know what the functions will
return. It does this by gluing functions together into one function and
then giving that function’s parameter to all of the functions that it
was glued from. So if we have a lot of functions that are all just
missing one parameter and they’d eventually be applied to the same
thing, we can use the reader monad to sort of extract their future
results and the >>= implementation will make sure
that it all works out.
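As a small made-up illustration of that idea (these functions aren’t from anywhere in particular, they’re just a sketch), imagine a couple of functions that all read from the same name:
import Control.Monad.Instances
greet :: String -> String
greet name = "Hello, " ++ name ++ "!"
countLetters :: String -> Int
countLetters name = length name
describe :: String -> String
describe = do
    greeting <- greet
    n <- countLetters
    return (greeting ++ " Your name has " ++ show n ++ " letters.")
Evaluating describe "Tobin" gives "Hello, Tobin! Your name has 5 letters."; both greet and countLetters were given the same parameter without us having to pass it around explicitly.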
Tasteful stateful computations
Haskell is a pure language and because of that, our programs are made of functions that can’t change any global state or variables; they can only do some computations and return results. This restriction actually makes it easier to think about our programs, as it frees us from worrying what every variable’s value is at some point in time. However, some problems are inherently stateful in that they rely on some state that changes over time. While such problems can certainly be expressed in Haskell, they can be a bit tedious to model sometimes. That’s why Haskell features a thing called the state monad, which makes dealing with stateful problems a breeze while still keeping everything nice and pure.
When we were dealing with
random numbers, we dealt with functions that took a random generator
as a parameter and returned a random number and a new random generator.
If we wanted to generate several random numbers, we always had to use
the random generator that a previous function returned along with its
result. When making a function that takes a StdGen and
tosses a coin three times based on that generator, we had to do
this:
threeCoins :: StdGen -> (Bool, Bool, Bool)
threeCoins gen =
    let (firstCoin, newGen) = random gen
        (secondCoin, newGen') = random newGen
        (thirdCoin, newGen'') = random newGen'
    in  (firstCoin, secondCoin, thirdCoin)
It took a generator gen and then random gen
returned a Bool value along with a new generator. To throw
the second coin, we used the new generator, and so on. In most other
languages, we wouldn’t have to return a new generator along with a
random number. We could just modify the existing one! But since Haskell
is pure, we can’t do that, so we had to take some state, make a result
from it and a new state and then use that new state to generate new
results.
You’d think that to avoid manually dealing with stateful computations in this way, we’d have to give up the purity of Haskell. Well, we don’t have to, since there exists a special little monad called the state monad which handles all this state business for us, without giving up any of the purity that makes Haskell programming so cool.
So, to help us understand this concept of stateful computations better, let’s go ahead and give them a type. We’ll say that a stateful computation is a function that takes some state and returns a value along with some new state. That function would have the following type:
s -> (a,s)
s is the type of the state and a the result
of the stateful computations.
Assignment in most other languages could be thought of as a stateful
computation. For instance, when we do x = 5 in an
imperative language, it will usually assign the value 5 to
the variable x and it will also have the value
5 as an expression. If you look at that functionally, you
could look at it as a function that takes a state (that is, all the
variables that have been assigned previously) and returns a result (in
this case 5) and a new state, which would be all the
previous variable mappings plus the newly assigned variable.
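To make that concrete, here’s a little sketch (using Data.Map, which isn’t used elsewhere in this chapter) of what an assignment could look like as a function of type s -> (a,s), where the state is a mapping from variable names to values:
import qualified Data.Map as Map
assign :: String -> Int -> Map.Map String Int -> (Int, Map.Map String Int)
assign name val vars = (val, Map.insert name val vars)
So assign "x" 5 Map.empty gives back the result 5 along with a new state in which x is bound to 5.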
This stateful computation, a function that takes a state and returns a result and a new state, can be thought of as a value with a context as well. The actual value is the result, whereas the context is that we have to provide some initial state to actually get that result and that apart from getting a result we also get a new state.
Stacks and stones
Say we want to model operating a stack. You have a stack of things one on top of another and you can either push stuff on top of that stack or you can take stuff off the top of the stack. When you’re putting an item on top of the stack we say that you’re pushing it to the stack and when you’re taking stuff off the top we say that you’re popping it. If you want something that’s at the bottom of the stack, you have to pop everything that’s above it.
We’ll use a list to represent our stack and the head of the list will
be the top of the stack. To help us with our task, we’ll make two
functions: pop and push. pop will
take a stack, pop one item and return that item as the result and also
return a new stack, without that item. push will take an
item and a stack and then push that item onto the stack. It will return
() as its result, along with a new stack. Here goes:
type Stack = [Int]
pop :: Stack -> (Int,Stack)
pop (x:xs) = (x,xs)
push :: Int -> Stack -> ((),Stack)
push a xs = ((),a:xs)
We used () as the result when pushing to the stack
because pushing an item onto the stack doesn’t have any important result
value; its main job is to change the stack. Notice how when we apply just the
first parameter of push, we get a stateful computation.
pop is already a stateful computation because of its
type.
Let’s write a small piece of code to simulate a stack using these
functions. We’ll take a stack, push 3 to it and then pop
two items, just for kicks. Here it is:
stackManip :: Stack -> (Int, Stack)
stackManip stack = let
    ((),newStack1) = push 3 stack
    (a ,newStack2) = pop newStack1
    in pop newStack2
We take a stack and then we do
push 3 stack, which results in a tuple. The first part of
the tuple is a () and the second is a new stack and we call
it newStack1. Then, we pop a number from
newStack1, which results in a number a (which
is the 3 that we pushed) and a new stack which we call
newStack2. Then, we pop a number off newStack2,
giving us the popped number and the final stack,
and that pair is what stackManip returns. Let’s try it out:
ghci> stackManip [5,8,2,1]
(5,[8,2,1])
Cool, the result is 5 and the new stack is
[8,2,1]. Notice how stackManip is itself a
stateful computation. We’ve taken a bunch of stateful computations and
we’ve sort of glued them together. Hmm, sounds familiar.
The above code for stackManip is kind of tedious since
we’re manually giving the state to every stateful computation and
storing it and then giving it to the next one. Wouldn’t it be cooler if,
instead of giving the stack manually to each function, we could write
something like this:
stackManip = do
    push 3
    a <- pop
    pop
Well, using the state monad will allow us to do exactly this. With it, we will be able to take stateful computations like these and use them without having to manage the state manually.
The State monad
The Control.Monad.State module provides a
newtype that wraps stateful computations. Here’s its
definition:
newtype State s a = State { runState :: s -> (a,s) }
A State s a is a stateful computation that manipulates a
state of type s and has a result of type
a.
Now that we’ve seen what stateful computations are about and how they
can even be thought of as values with contexts, let’s check out their
Monad instance:
instance Monad (State s) where
    return x = State $ \s -> (x,s)
    (State h) >>= f = State $ \s ->
        let (a, newState) = h s
            (State g) = f a
        in  g newState
Let’s take a gander at return first. Our aim with
return is to take a value and make a stateful computation
that always has that value as its result. That’s why we just make a
lambda \s -> (x,s). We always present x as
the result of the stateful computation and the state is kept unchanged,
because return has to put a value in a minimal context. So
return will make a stateful computation that presents a
certain value as the result and keeps the state unchanged.
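A quick check in GHCi confirms this (assuming Control.Monad.State is loaded):
ghci> runState (return 'X') [1,2,3]
('X',[1,2,3])
The result is 'X' and the state [1,2,3] comes out exactly as it went in.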
What about >>=? Well, the result of feeding a
stateful computation to a function with >>= has to be
a stateful computation, right? So we start off with the
State newtype wrapper and then we type out a
lambda. This lambda will be our new stateful computation. But what goes
on in it? Well, we somehow have to extract the result value from the
first stateful computation. Because we’re in a stateful computation
right now, we can give the stateful computation h our
current state s, which results in a pair of result and a
new state: (a, newState). Every time so far when we were
implementing >>=, once we had extracted the
result from the monadic value, we applied the function f to
it to get the new monadic value. In Writer, after doing
that and getting the new monadic value, we still had to make sure that
the context was taken care of by mappending the old monoid
value with the new one. Here, we do f a and we get a new
stateful computation g. Now that we have a new stateful
computation and a new state (goes by the name of newState)
we just apply that stateful computation g to the
newState. The result is a tuple of final result and final
state!
So with >>=, we kind of glue two stateful
computations together, only the second one is hidden inside a function
that takes the previous one’s result. Because pop and
push are already stateful computations, it’s easy to wrap
them into a State wrapper. Watch:
import Control.Monad.State
pop :: State Stack Int
pop = State $ \(x:xs) -> (x,xs)
push :: Int -> State Stack ()
push a = State $ \xs -> ((),a:xs)
pop is already a stateful computation and
push takes an Int and returns a stateful
computation. Now we can rewrite our previous example of pushing
3 onto the stack and then popping two numbers off like
this:
import Control.Monad.State
stackManip :: State Stack Int
stackManip = do
    push 3
    a <- pop
    pop
See how we’ve glued a push and two pops into one stateful
computation? When we unwrap it from its newtype wrapper we
get a function to which we can provide some initial state:
ghci> runState stackManip [5,8,2,1]
(5,[8,2,1])
We didn’t have to bind the second pop to a
because we didn’t use that a at all. So we could have
written it like this:
stackManip :: State Stack Int
stackManip = do
    push 3
    pop
    pop
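And because we don’t care about any of the intermediate results here, this do expression is really just a chain of >> (the version of >>= that throws away the result) underneath. A rough desugaring:
stackManip :: State Stack Int
stackManip = push 3 >> pop >> pop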
Pretty cool. But what if we want to do this: pop one number off the
stack and then if that number is 5 we just put it back onto
the stack and stop but if it isn’t 5, we push
3 and 8 back on? Well, here’s the code:
stackStuff :: State Stack ()
stackStuff = do
    a <- pop
    if a == 5
        then push 5
        else do
            push 3
            push 8
This is quite straightforward. Let’s run it with an initial stack.
ghci> runState stackStuff [9,0,2,1,0]
((),[8,3,0,2,1,0])
Remember, do expressions result in monadic values and
with the State monad, a single do expression
is also a stateful function. Because stackManip and
stackStuff are ordinary stateful computations, we can glue
them together to produce further stateful computations.
moreStack :: State Stack ()
moreStack = do
    a <- stackManip
    if a == 100
        then stackStuff
        else return ()
If the result of stackManip on the current stack is
100, we run stackStuff, otherwise we do
nothing. return () just keeps the state as it is and does
nothing.
The Control.Monad.State module provides a type class
that’s called MonadState and it features two pretty useful
functions, namely get and put. For
State, the get function is implemented like
this:
get = State $ \s -> (s,s)
So it just takes the current state and presents it as the result. The
put function takes some state and makes a stateful function
that replaces the current state with it:
put newState = State $ \s -> ((),newState)
So with these, we can see what the current stack is or we can replace it with a whole other stack. Like so:
stackyStack :: State Stack ()
stackyStack = do
    stackNow <- get
    if stackNow == [1,2,3]
        then put [8,3,1]
        else put [9,2,1]
It’s worth examining what the type of >>= would be
if it only worked for State values:
(>>=) :: State s a -> (a -> State s b) -> State s b
See how the type of the state s stays the same but the
type of the result can change from a to b?
This means that we can glue together several stateful computations whose
results are of different types but the type of the state has to stay the
same. Now why is that? Well, for instance, for Maybe,
>>= has this type:
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
It makes sense that the monad itself, Maybe, doesn’t
change. It wouldn’t make sense to use >>= between two
different monads. Well, for the state monad, the monad is actually
State s, so if that s was different, we’d be
using >>= between two different monads.
Randomness and the state monad
At the beginning of this section, we saw how generating random numbers can sometimes be awkward because every random function takes a generator and returns a random number along with a new generator, which must then be used instead of the old one if we want to generate another random number. The state monad makes dealing with this a lot easier.
The random function from System.Random has
the following type:
random :: (RandomGen g, Random a) => g -> (a, g)
Meaning it takes a random generator and produces a random number
along with a new generator. We can see that it’s a stateful computation,
so we can wrap it in the State newtype
constructor and then use it as a monadic value so that passing of the
state gets handled for us:
import System.Random
import Control.Monad.State
randomSt :: (RandomGen g, Random a) => State g a
randomSt = State random
So now if we want to throw three coins (True is tails,
False is heads) we just do the following:
import System.Random
import Control.Monad.State
threeCoins :: State StdGen (Bool,Bool,Bool)
threeCoins = do
    a <- randomSt
    b <- randomSt
    c <- randomSt
    return (a,b,c)
threeCoins is now a stateful computation and after
taking an initial random generator, it passes it to the first
randomSt, which produces a coin and a new generator,
which gets passed to the next one and so on. We use
return (a,b,c) to present (a,b,c) as the
result without changing the most recent generator. Let’s give this a
go:
ghci> runState threeCoins (mkStdGen 33)
((True,False,True),680029187 2103410263)
Nice. Doing these sort of things that require some state to be kept in between steps just became much less of a hassle!
Error error on the wall
We know by now that Maybe is used to add a context of
possible failure to values. A value can be a Just something
or a Nothing. However useful it may be, when we have a
Nothing, all we know is that there was some sort of
failure, but there’s no way to cram some more info in there telling us
what kind of failure it was or why it failed.
The Either e a type on the other hand, allows us to
incorporate a context of possible failure to our values while also being
able to attach values to the failure, so that they can describe what
went wrong or provide some other useful info regarding the failure. An
Either e a value can either be a Right value,
signifying the right answer and a success, or it can be a
Left value, signifying failure. For instance:
ghci> :t Right 4
Right 4 :: (Num t) => Either a t
ghci> :t Left "out of cheese error"
Left "out of cheese error" :: Either [Char] b
This is pretty much just an enhanced Maybe, so it makes
sense for it to be a monad, because it can also be viewed as a value
with an added context of possible failure, only now there’s a value
attached when there’s an error as well.
Its Monad instance is similar to that of
Maybe and it can be found in
Control.Monad.Error:
instance (Error e) => Monad (Either e) where
    return x = Right x
    Right x >>= f = f x
    Left err >>= f = Left err
    fail msg = Left (strMsg msg)
return, as always, takes a value and puts it in a
default minimal context. It wraps our value in the Right
constructor because we’re using Right to represent a
successful computation where a result is present. This is a lot like
return for Maybe.
The implementation of >>= examines two possible cases: a
Left and a Right. In the case of a
Right, the function f is applied to the value
inside it, similar to how in the case of a Just, the
function is just applied to its contents. In the case of an error, the
Left value is kept, along with its contents, which describe
the failure.
The Monad instance for Either e makes an
additional requirement, and that is that the type of the value contained
in a Left, the one that’s indexed by the e
type parameter, has to be an instance of the Error type
class. The Error type class is for types whose values can
act like error messages. It defines the strMsg function,
which takes an error in the form of a string and returns such a value. A
good example of an Error instance is, well, the
String type! In the case of String, the
strMsg function just returns the string that it got:
ghci> :t strMsg
strMsg :: (Error a) => String -> a
ghci> strMsg "boom!" :: String
"boom!"
But since we usually use String to describe the error
when using Either, we don’t have to worry about this too
much. When a pattern match fails in do notation, a
Left value is used to signify this failure.
Anyway, here are a few examples of usage:
ghci> Left "boom" >>= \x -> return (x+1)
Left "boom"
ghci> Right 100 >>= \x -> Left "no way!"
Left "no way!"
When we use >>= to feed a Left value
to a function, the function is ignored and an identical
Left value is returned. When we feed a Right
value to a function, the function gets applied to what’s on the inside,
but in this case that function produced a Left value
anyway!
When we try to feed a Right value to a function that
also succeeds, we’re tripped up by a peculiar type error! Hmmm.
ghci> Right 3 >>= \x -> return (x + 100)
<interactive>:1:0:
Ambiguous type variable `a' in the constraints:
`Error a' arising from a use of `it' at <interactive>:1:0-33
`Show a' arising from a use of `print' at <interactive>:1:0-33
Probable fix: add a type signature that fixes these type variable(s)
Haskell says that it doesn’t know which type to choose for the
e part of our Either e a typed value, even
though we’re just printing the Right part. This is due to
the Error e constraint on the Monad instance.
So if you get type errors like this one when using Either
as a monad, just add an explicit type signature:
ghci> Right 3 >>= \x -> return (x + 100) :: Either String Int
Right 103
Alright, now it works!
Other than this little hangup, using this monad is very similar to
using Maybe as a monad. In the previous chapter, we used
the monadic aspects of Maybe to simulate birds landing on
the balancing pole of a tightrope walker. As an exercise, you can
rewrite that with the error monad so that when the tightrope walker
slips and falls, we remember how many birds were on each side of the
pole when he fell.
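If you want a nudge to get started, here’s a sketch of what one of the landing functions might look like with Either (the Birds and Pole types are the ones from the previous chapter, and the wording of the error message is entirely up to you):
type Birds = Int
type Pole = (Birds, Birds)
landLeft :: Birds -> Pole -> Either String Pole
landLeft n (left, right)
    | abs ((left + n) - right) < 4 = Right (left + n, right)
    | otherwise = Left ("Fell with " ++ show (left + n) ++ " birds on the left and " ++ show right ++ " on the right")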
Some useful monadic functions
In this section, we’re going to explore a few functions that either
operate on monadic values or return monadic values as their results (or
both!). Such functions are usually referred to as monadic functions.
While some of them will be brand new, others will be monadic
counterparts of functions that we already know, like filter
and foldl. Let’s see what they are then!
liftM and friends
When we started our journey to the top of Monad Mountain, we first looked at functors, which are for things that can be mapped over. Then, we learned about improved functors called applicative functors, which allowed us to apply normal functions between several applicative values as well as to take a normal value and put it in some default context. Finally, we introduced monads as improved applicative functors, which added the ability for these values with context to somehow be fed into normal functions.
So every monad is an applicative functor and every applicative
functor is a functor. The Applicative type class has a
class constraint such that our type has to be an instance of
Functor before we can make it an instance of
Applicative. But even though Monad should have
the same constraint for Applicative, as every monad is an
applicative functor, it doesn’t, because the Monad type
class was introduced to Haskell way before Applicative.
But even though every monad is a functor, we don’t have to rely on it
having a Functor instance because of the liftM
function. liftM takes a function and a monadic value and
maps it over the monadic value. So it’s pretty much the same thing as
fmap! This is liftM’s type:
liftM :: (Monad m) => (a -> b) -> m a -> m b
And this is the type of fmap:
fmap :: (Functor f) => (a -> b) -> f a -> f b
If the Functor and Monad instances for a
type obey the functor and monad laws, these two amount to the same thing
(and all the monads that we’ve met so far obey both). This is kind of
like how pure and return do the same thing, only
one has an Applicative class constraint whereas the other
has a Monad one. Let’s try liftM out:
ghci> liftM (*3) (Just 8)
Just 24
ghci> fmap (*3) (Just 8)
Just 24
ghci> runWriter $ liftM not $ Writer (True, "chickpeas")
(False,"chickpeas")
ghci> runWriter $ fmap not $ Writer (True, "chickpeas")
(False,"chickpeas")
ghci> runState (liftM (+100) pop) [1,2,3,4]
(101,[2,3,4])
ghci> runState (fmap (+100) pop) [1,2,3,4]
(101,[2,3,4])
We already know quite well how fmap works with
Maybe values. And liftM does the same thing.
For Writer values, the function is mapped over the first
component of the tuple, which is the result. Doing fmap or
liftM over a stateful computation results in another
stateful computation, only its eventual result is modified by the
supplied function. Had we not mapped (+100) over
pop in this case before running it, it would have returned
(1,[2,3,4]).
This is how liftM is implemented:
liftM :: (Monad m) => (a -> b) -> m a -> m b
liftM f m = m >>= (\x -> return (f x))
Or with do notation:
liftM :: (Monad m) => (a -> b) -> m a -> m b
liftM f m = do
    x <- m
    return (f x)
We feed the monadic value m into the function and then
we apply the function f to its result before putting it
back into a default context. Because of the monad laws, this is
guaranteed not to change the context, only the result that the monadic
value presents. We see that liftM is implemented without
referencing the Functor type class at all. This means that
we can implement fmap (or liftM, whatever you
want to call it) just by using the goodies that monads offer us. Because
of this, we can conclude that monads are stronger than just regular old
functors.
The Applicative type class allows us to apply functions
between values with contexts as if they were normal values. Like
this:
ghci> (+) <$> Just 3 <*> Just 5
Just 8
ghci> (+) <$> Just 3 <*> Nothing
Nothing
Using this applicative style makes things pretty easy.
<$> is just fmap and
<*> is a function from the Applicative
type class that has the following type:
(<*>) :: (Applicative f) => f (a -> b) -> f a -> f b
So it’s kind of like fmap, only the function itself is
in a context. We have to somehow extract it from the context and map it
over the f a value and then assemble the context back
together. Because all functions are curried in Haskell by default, we
can use the combination of <$> and
<*> to apply functions that take several parameters
between applicative values.
Anyway, it turns out that just like fmap,
<*> can also be implemented by using only what the
Monad type class gives us. The ap function is
basically <*>, only it has a Monad
constraint instead of an Applicative one. Here’s its
definition:
ap :: (Monad m) => m (a -> b) -> m a -> m b
ap mf m = do
    f <- mf
    x <- m
    return (f x)
mf is a monadic value whose result is a function.
Because the function is in a context as well as the value, we get the
function from the context and call it f, then get the value
and call that x and then finally apply the function to the
value and present that as a result. Here’s a quick demonstration:
ghci> Just (+3) <*> Just 4
Just 7
ghci> Just (+3) `ap` Just 4
Just 7
ghci> [(+1),(+2),(+3)] <*> [10,11]
[11,12,12,13,13,14]
ghci> [(+1),(+2),(+3)] `ap` [10,11]
[11,12,12,13,13,14]
Now we see that monads are stronger than applicatives as well,
because we can use the functions from Monad to implement
the ones for Applicative. In fact, many times when a type
is found to be a monad, people first write up a Monad
instance and then make an Applicative instance by just
saying that pure is return and
<*> is ap. Similarly, if you already
have a Monad instance for something, you can give it a
Functor instance by just saying that fmap is
liftM.
The liftA2 function is a convenience function for
applying a function between two applicative values. It’s defined simply
like so:
liftA2 :: (Applicative f) => (a -> b -> c) -> f a -> f b -> f c
liftA2 f x y = f <$> x <*> y
The liftM2 function does the same thing, only it has a
Monad constraint. There also exist liftM3 and
liftM4 and liftM5.
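For instance:
ghci> liftA2 (+) (Just 3) (Just 5)
Just 8
ghci> liftM2 (+) (Just 3) (Just 5)
Just 8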
We saw how monads are stronger than applicatives and functors and how
even though all monads are functors and applicative functors, they don’t
necessarily have Functor and Applicative
instances, so we examined the monadic equivalents of the functions that
functors and applicative functors use.
The join function
Here’s some food for thought: if the result of one monadic value is
another monadic value i.e. if one monadic value is nested inside the
other, can you flatten them to just a single normal monadic value? Like,
if we have Just (Just 9), can we make that into
Just 9? It turns out that any nested monadic value can be
flattened and that this is actually a property unique to monads. For
this, the join function exists. Its type is this:
join :: (Monad m) => m (m a) -> m a
So it takes a monadic value within a monadic value and gives us just
a monadic value, so it sort of flattens it. Here it is with some
Maybe values:
ghci> join (Just (Just 9))
Just 9
ghci> join (Just Nothing)
Nothing
ghci> join Nothing
Nothing
The first line has a successful computation as a result of a
successful computation, so they’re both just joined into one big
successful computation. The second line features a Nothing
as a result of a Just value. Whenever we were dealing with
Maybe values before and we wanted to combine several of
them into one, be it with <*> or
>>=, they all had to be Just values for
the result to be a Just value. If there was any failure
along the way, the result was a failure and the same thing happens here.
In the third line, we try to flatten what is from the outset a failure,
so the result is a failure as well.
Flattening lists is pretty intuitive:
ghci> join [[1,2,3],[4,5,6]]
[1,2,3,4,5,6]
As you can see, for lists, join is just
concat. To flatten a Writer value whose result
is a Writer value itself, we have to mappend
the monoid values.
ghci> runWriter $ join (Writer (Writer (1,"aaa"),"bbb"))
(1,"bbbaaa")
The outer monoid value "bbb" comes first and then to it
"aaa" is appended. Intuitively speaking, when you want to
examine what the result of a Writer value is, you have to
write its monoid value to the log first and only then can you examine
what it has inside.
Flattening Either values is very similar to flattening
Maybe values:
ghci> join (Right (Right 9)) :: Either String Int
Right 9
ghci> join (Right (Left "error")) :: Either String Int
Left "error"
ghci> join (Left "error") :: Either String Int
Left "error"
If we apply join to a stateful computation whose result
is a stateful computation, the result is a stateful computation that
first runs the outer stateful computation and then the resulting one.
Watch:
ghci> runState (join (State $ \s -> (push 10,1:2:s))) [0,0,0]
((),[10,1,2,0,0,0])
The lambda here takes a state and puts 2 and
1 onto the stack and presents push 10 as its
result. So when this whole thing is flattened with join and
then run, it first puts 2 and 1 onto the stack
and then push 10 gets carried out, pushing a
10 on to the top.
The implementation for join is as follows:
join :: (Monad m) => m (m a) -> m a
join mm = do
    m <- mm
    m
Because the result of mm is a monadic value, we get that
result and then just put it on a line of its own because it’s a monadic
value. The trick here is that when we do m <- mm, the
context of the monad in which we are in gets taken care of. That’s why,
for instance, Maybe values result in Just
values only if the outer and inner values are both Just
values. Here’s what this would look like if the mm value
was set in advance to Just (Just 8):
joinedMaybes :: Maybe Int
joinedMaybes = do
    m <- Just (Just 8)
    m
Perhaps the most interesting thing about join is that
for every monad, feeding a monadic value to a function with
>>= is the same thing as just mapping that function
over the value and then using join to flatten the resulting
nested monadic value! In other words, m >>= f is
always the same thing as join (fmap f m)! It makes sense
when you think about it. With >>=, we’re always
thinking about how to feed a monadic value to a function that takes a
normal value but returns a monadic value. If we just map that function
over the monadic value, we have a monadic value inside a monadic value.
For instance, say we have Just 9 and the function
\x -> Just (x+1). If we map this function over
Just 9, we’re left with Just (Just 10).
The fact that m >>= f always equals
join (fmap f m) is very useful if we’re making our own
Monad instance for some type because it’s often easier to
figure out how we would flatten a nested monadic value than figuring out
how to implement >>=.
filterM
The filter function is pretty much the bread of Haskell
programming (map being the butter). It takes a predicate
and a list to filter out and then returns a new list where only the
elements that satisfy the predicate are kept. Its type is this:
filter :: (a -> Bool) -> [a] -> [a]
The predicate takes an element of the list and returns a
Bool value. Now, what if the Bool value that
it returned was actually a monadic value? Whoa! That is, what if it came
with a context? Could that work? For instance, what if every
True or a False value that the predicate
produced also had an accompanying monoid value, like
["Accepted the number 5"] or
["3 is too small"]? That sounds like it could work. If that
were the case, we’d expect the resulting list to also come with a log of
all the log values that were produced along the way. So if the
Bool that the predicate returned came with a context, we’d
expect the final resulting list to have some context attached as well,
otherwise the context that each Bool came with would be
lost.
The filterM function from Control.Monad
does just what we want! Its type is this:
filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]
The predicate returns a monadic value whose result is a
Bool, but because it’s a monadic value, its context can be
anything from a possible failure to non-determinism and more! To ensure
that the context is reflected in the final result, the result is also a
monadic value.
Let’s take a list and only keep those values that are smaller than 4.
To start, we’ll just use the regular filter function:
ghci> filter (\x -> x < 4) [9,1,5,2,10,3]
[1,2,3]
That’s pretty easy. Now, let’s make a predicate that, aside from
presenting a True or False result, also
provides a log of what it did. Of course, we’ll be using the
Writer monad for this:
keepSmall :: Int -> Writer [String] Bool
keepSmall x
    | x < 4 = do
        tell ["Keeping " ++ show x]
        return True
    | otherwise = do
        tell [show x ++ " is too large, throwing it away"]
        return False
Instead of just returning a Bool, this function
returns a Writer [String] Bool. It's a monadic predicate.
Sounds fancy, doesn't it? If the number is smaller than 4,
we report that we're keeping it and then return True; otherwise,
we report that it's too large and return False.
Now, let’s give it to filterM along with a list. Because
the predicate returns a Writer value, the resulting list
will also be a Writer value.
ghci> fst $ runWriter $ filterM keepSmall [9,1,5,2,10,3]
[1,2,3]
Examining the result of the resulting Writer value, we
see that everything is in order. Now, let’s print the log and see what
we got:
ghci> mapM_ putStrLn $ snd $ runWriter $ filterM keepSmall [9,1,5,2,10,3]
9 is too large, throwing it away
Keeping 1
5 is too large, throwing it away
Keeping 2
10 is too large, throwing it away
Keeping 3
Awesome. So just by providing a monadic predicate to
filterM, we were able to filter a list while taking
advantage of the monadic context that we used.
A very cool Haskell trick is using filterM to get the
powerset of a list (if we think of them as sets for now). The powerset
of some set is a set of all subsets of that set. So if we have a set
like [1,2,3], its powerset would include the following
sets:
[1,2,3]
[1,2]
[1,3]
[1]
[2,3]
[2]
[3]
[]
In other words, getting a powerset is like getting all the
combinations of keeping and throwing out elements from a set.
[2,3] is like the original set, only we excluded the number
1.
To make a function that returns a powerset of some list, we’re going
to rely on non-determinism. We take the list [1,2,3] and
then look at the first element, which is 1 and we ask
ourselves: should we keep it or drop it? Well, we’d like to do both
actually. So we are going to filter a list and we’ll use a predicate
that non-deterministically both keeps and drops every element from the
list. Here’s our powerset function:
powerset :: [a] -> [[a]]
powerset xs = filterM (\x -> [True, False]) xs
Wait, that’s it? Yup. We choose to drop and keep every element, regardless of what that element is. We have a non-deterministic predicate, so the resulting list will also be a non-deterministic value and will thus be a list of lists. Let’s give this a go:
ghci> powerset [1,2,3]
[[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]]
This takes a bit of thinking to wrap your head around, but if you just consider lists as non-deterministic values that don’t know what to be so they just decide to be everything at once, it’s a bit easier.
foldM
The monadic counterpart to foldl is foldM.
If you remember your folds from the folds section, you know
that foldl takes a binary function, a starting accumulator
and a list to fold up and then folds it from the left into a single
value by using the binary function. foldM does the same
thing, except it takes a binary function that produces a monadic value
and folds the list up with that. Unsurprisingly, the resulting value is
also monadic. The type of foldl is this:
foldl :: (a -> b -> a) -> a -> [b] -> a
Whereas foldM has the following type:
foldM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
The value that the binary function returns is monadic and so the result of the whole fold is monadic as well. Let’s sum a list of numbers with a fold:
ghci> foldl (\acc x -> acc + x) 0 [2,8,3,1]
14
The starting accumulator is 0 and then 2
gets added to the accumulator, resulting in a new accumulator that has a
value of 2. 8 gets added to this accumulator
resulting in an accumulator of 10 and so on and when we
reach the end, the final accumulator is the result.
Now what if we wanted to sum a list of numbers but with the added
condition that if any number is greater than 9 in the list,
the whole thing fails? It would make sense to use a binary function that
checks if the current number is greater than 9 and if it
is, fails, and if it isn’t, continues on its merry way. Because of this
added possibility of failure, let’s make our binary function return a
Maybe accumulator instead of a normal one. Here’s the
binary function:
binSmalls :: Int -> Int -> Maybe Int
binSmalls acc x
    | x > 9 = Nothing
    | otherwise = Just (acc + x)
Because our binary function is now a monadic function, we can’t use
it with the normal foldl, but we have to use
foldM. Here goes:
ghci> foldM binSmalls 0 [2,8,3,1]
Just 14
ghci> foldM binSmalls 0 [2,11,3,1]
Nothing
Excellent! Because one number in the list was greater than
9, the whole thing resulted in a Nothing.
Folding with a binary function that returns a Writer value
is cool as well because then you log whatever you want as your fold goes
along its way.
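As a rough sketch of that idea (logAdd is just a name we're making up
here, and we're assuming Control.Monad.Writer is imported), a binary
function that logs every addition might look something like this:
logAdd :: Int -> Int -> Writer [String] Int
logAdd acc x = do
    tell ["Added " ++ show x ++ " to " ++ show acc]
    return (acc + x)
Folding with it gives us the sum along with a log of every step:
ghci> runWriter $ foldM logAdd 0 [2,8,3,1]
(14,["Added 2 to 0","Added 8 to 2","Added 3 to 10","Added 1 to 13"])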
Making a safe RPN calculator
When we were solving the problem of implementing
a RPN calculator, we noted that it worked fine as long as the input
that it got made sense. But if something went wrong, it caused our whole
program to crash. Now that we know how to take some code that we have
and make it monadic, let’s take our RPN calculator and add error
handling to it by taking advantage of the Maybe monad.
We implemented our RPN calculator by taking a string like
"1 3 + 2 *", breaking it up into words to get something
like ["1","3","+","2","*"] and then folding over that list
by starting out with an empty stack and then using a binary folding
function that adds numbers to the stack or manipulates numbers on the
top of the stack to add them together and divide them and such.
This was the main body of our function:
import Data.List
solveRPN :: String -> Double
solveRPN = head . foldl foldingFunction [] . words
We made the expression into a list of strings, folded over it with our folding function and then when we were left with just one item in the stack, we returned that item as the answer. This was the folding function:
foldingFunction :: [Double] -> String -> [Double]
foldingFunction (x:y:ys) "*" = (x * y):ys
foldingFunction (x:y:ys) "+" = (x + y):ys
foldingFunction (x:y:ys) "-" = (y - x):ys
foldingFunction xs numberString = read numberString:xs
The accumulator of the fold was a stack, which we represented with a
list of Double values. As the folding function went over
the RPN expression, if the current item was an operator, it took two
items off the top of the stack, applied the operator between them and
then put the result back on the stack. If the current item was a string
that represented a number, it converted that string into an actual
number and returned a new stack that was like the old one, except with
that number pushed to the top.
Let’s first make our folding function capable of graceful failure. Its type is going to change from what it is now to this:
foldingFunction :: [Double] -> String -> Maybe [Double]
So it will either return Just a new stack or it will
fail with Nothing.
The reads function is like read, only it
returns a list with a single element in case of a successful read. If it
fails to read something, then it returns an empty list. Apart from
returning the value that it read, it also returns the part of the string
that it didn’t consume. We’re going to say that it always has to consume
the full input to work and make it into a readMaybe
function for convenience. Here it is:
readMaybe :: (Read a) => String -> Maybe a
readMaybe st = case reads st of
    [(x,"")] -> Just x
    _        -> Nothing
Testing it out:
ghci> readMaybe "1" :: Maybe Int
Just 1
ghci> readMaybe "GO TO HELL" :: Maybe Int
Nothing
Okay, it seems to work. So, let’s make our folding function into a monadic function that can fail:
foldingFunction :: [Double] -> String -> Maybe [Double]
foldingFunction (x:y:ys) "*" = return ((x * y):ys)
foldingFunction (x:y:ys) "+" = return ((x + y):ys)
foldingFunction (x:y:ys) "-" = return ((y - x):ys)
foldingFunction xs numberString = liftM (:xs) (readMaybe numberString)
The first three cases are like the old ones, except the new stack
gets wrapped in a Just (we used return here to
do this, but we could have written Just just as well). In
the last case, we do readMaybe numberString and then we map
(:xs) over it. So if the stack xs is
[1.0,2.0] and readMaybe numberString results
in a Just 3.0, the result is
Just [3.0,1.0,2.0]. If readMaybe numberString
results in a Nothing then the result is
Nothing. Let’s try out the folding function by itself:
ghci> foldingFunction [3,2] "*"
Just [6.0]
ghci> foldingFunction [3,2] "-"
Just [-1.0]
ghci> foldingFunction [] "*"
Nothing
ghci> foldingFunction [] "1"
Just [1.0]
ghci> foldingFunction [] "1 wawawawa"
Nothing
It looks like it’s working! And now it’s time for the new and
improved solveRPN. Here it is ladies and gents!
import Control.Monad (foldM)
import Data.List
solveRPN :: String -> Maybe Double
solveRPN st = do
    [result] <- foldM foldingFunction [] (words st)
    return result
Just like before, we take the string and make it into a list of
words. Then, we do a fold, starting with the empty stack, only instead
of doing a normal foldl, we do a foldM. The
result of that foldM should be a Maybe value
that contains a list (that’s our final stack) and that list should have
only one value. We use a do expression to get that value
and we call it result. In case the foldM
returns a Nothing, the whole thing will be a
Nothing, because that’s how Maybe works. Also
notice that we pattern match in the do expression, so if
the list has more than one value or none at all, the pattern match fails
and a Nothing is produced. In the last line we just do
return result to present the result of the RPN calculation
as the result of the final Maybe value.
Let’s give it a shot:
ghci> solveRPN "1 2 * 4 +"
Just 6.0
ghci> solveRPN "1 2 * 4 + 5 *"
Just 30.0
ghci> solveRPN "1 2 * 4"
Nothing
ghci> solveRPN "1 8 wharglbllargh"
Nothing
The first failure happens because the final stack isn’t a list with
one element in it and so the pattern matching in the do
expression fails. The second failure happens because
readMaybe returns a Nothing.
Composing monadic functions
When we were learning about the monad laws, we said that the
<=< function is just like composition, only instead
of working for normal functions like a -> b, it works
for monadic functions like a -> m b. For instance:
ghci> let f = (+1) . (*100)
ghci> f 4
401
ghci> let g = (\x -> return (x+1)) <=< (\x -> return (x*100))
ghci> Just 4 >>= g
Just 401
In this example we first composed two normal functions, applied the
resulting function to 4 and then we composed two monadic
functions and fed Just 4 to the resulting function with
>>=.
If we have a bunch of functions in a list, we can compose them
all into one big function by just using id as the starting
accumulator and the . function as the binary function.
Here’s an example:
ghci> let f = foldr (.) id [(+1),(*100),(+1)]
ghci> f 1
201
The function f takes a number and then adds
1 to it, multiplies the result by 100 and then
adds 1 to that. Anyway, we can compose monadic functions in
the same way, only instead of normal composition we use
<=< and instead of id we use
return. We don’t have to use a foldM over a
foldr or anything because the <=<
function makes sure that composition happens in a monadic fashion.
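For instance, here's a rough GHCi analogue of the earlier example, only
with monadic functions in the Maybe monad:
ghci> let g = foldr (<=<) return [\x -> return (x+1), \x -> return (x*100), \x -> return (x+1)]
ghci> Just 1 >>= g
Just 201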
When we were getting to know the list monad in the previous chapter, we
used it to figure out if a knight can go from one position on a
chessboard to another in exactly three moves. We had a function called
moveKnight which took the knight’s position on the board
and returned all the possible moves that he can make next. Then, to
generate all the possible positions that he can have after taking three
moves, we made the following function:
in3 start = return start >>= moveKnight >>= moveKnight >>= moveKnight
And to check if he can go from start to end
in three moves, we did the following:
canReachIn3 :: KnightPos -> KnightPos -> Bool
canReachIn3 start end = end `elem` in3 start
Using monadic function composition, we can make a function like
in3, only instead of generating all the positions that the
knight can have after making three moves, we can do it for an arbitrary
number of moves. If we look at in3, we see that we used
moveKnight three times and each time we used
>>= to feed it all the possible previous positions.
So now, let’s make it more general. Here’s how to do it:
import Control.Monad ((<=<))
import Data.List
inMany :: Int -> KnightPos -> [KnightPos]
inMany x start = return start >>= foldr (<=<) return (replicate x moveKnight)
First we use replicate to make a list that contains
x copies of the function moveKnight. Then, we
monadically compose all those functions into one, which gives us a
function that takes a starting position and non-deterministically moves
the knight x times. Then, we just make the starting
position into a singleton list with return and feed it to
the function.
Now, we can change our canReachIn3 function to be more
general as well:
canReachIn :: Int -> KnightPos -> KnightPos -> Bool
canReachIn x start end = end `elem` inMany x start
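Assuming the KnightPos and moveKnight definitions from the previous
chapter are loaded, this should behave just like canReachIn3 did
before:
ghci> canReachIn 3 (6,2) (6,1)
True
ghci> canReachIn 3 (6,2) (7,3)
False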
Making monads
In this section, we’re going to look at an example of how a type gets
made, identified as a monad and then given the appropriate
Monad instance. We don’t usually set out to make a monad
with the sole purpose of making a monad. Instead, we usually make a type
whose purpose is to model an aspect of some problem and then later
on if we see that the type represents a value with a context and can act
like a monad, we give it a Monad instance.
As we’ve seen, lists are used to represent non-deterministic values.
A list like [3,5,9] can be viewed as a single
non-deterministic value that just can’t decide what it’s going to be.
When we feed a list into a function with >>=, it just
makes all the possible choices of taking an element from the list and
applying the function to it and then presents those results in a list as
well.
If we look at the list [3,5,9] as the numbers
3, 5 and 9 occurring at once, we
might notice that there’s no info regarding the probability that each of
those numbers occurs. What if we wanted to model a non-deterministic
value like [3,5,9], but we wanted to express that
3 has a 50% chance of happening and 5 and
9 both have a 25% chance of happening? Let’s try and make
this happen!
Let’s say that every item in the list comes with another value, a probability of it happening. It might make sense to present this like this then:
[(3,0.5),(5,0.25),(9,0.25)]
In mathematics, probabilities aren’t usually expressed in
percentages, but rather as real numbers between 0 and 1. A 0 means
that there’s no chance in hell for something to happen and a 1 means
that it’s happening for sure. Floating point numbers can get real messy
real fast because they tend to lose precision, so Haskell offers us a
data type for rational numbers that doesn’t lose precision. That type is
called Rational and it lives in Data.Ratio. To
make a Rational, we write it as if it were a fraction. The
numerator and the denominator are separated by a %. Here
are a few examples:
ghci> 1%4
1 % 4
ghci> 1%2 + 1%2
1 % 1
ghci> 1%3 + 5%4
19 % 12
The first line is just one quarter. In the second line we add two
halves to get a whole and in the third line we add one third with five
quarters and get nineteen twelfths. So let’use throw out our floating
points and use Rational for our probabilities:
ghci> [(3,1%2),(5,1%4),(9,1%4)]
[(3,1 % 2),(5,1 % 4),(9,1 % 4)]
Okay, so 3 has a one out of two chance of happening
while 5 and 9 will happen one time out of
four. Pretty neat.
We took lists and we added some extra context to them, so this
represents values with contexts too. Before we go any further, let’s
wrap this into a newtype because something tells me we’ll
be making some instances.
import Data.Ratio
newtype Prob a = Prob { getProb :: [(a,Rational)] } deriving Show
Alright. Is this a functor? Well, the list is a functor, so this should probably be a functor as well, because we just added some stuff to the list. When we map a function over a list, we apply it to each element. Here, we’ll apply it to each element as well, only we’ll leave the probabilities as they are. Let’s make an instance:
instance Functor Prob where
    fmap f (Prob xs) = Prob $ map (\(x,p) -> (f x,p)) xs
We unwrap it from the newtype with pattern matching,
apply the function f to the values while keeping the
probabilities as they are and then wrap it back up. Let’s see if it
works:
ghci> fmap negate (Prob [(3,1%2),(5,1%4),(9,1%4)])
Prob {getProb = [(-3,1 % 2),(-5,1 % 4),(-9,1 % 4)]}
Another thing to note is that the probabilities should always add up
to 1. If those are all the things that can happen, it
doesn’t make sense for the sum of their probabilities to be anything
other than 1. A coin that lands tails 75% of the time and
heads 50% of the time seems like it could only work in some other
strange universe.
Now the big question, is this a monad? Given how the list is a monad,
this looks like it should be a monad as well. First, let’s think about
return. How does it work for lists? It takes a value and
puts it in a singleton list. What about here? Well, since it’s supposed
to be a default minimal context, it should also make a singleton list.
What about the probability? Well, return x is supposed to
make a monadic value that always presents x as its result,
so it doesn’t make sense for the probability to be 0. If it
always has to present it as its result, the probability should be
1!
What about >>=? Seems kind of tricky, so let’s
make use of the fact that m >>= f always equals
join (fmap f m) for monads and think about how we would
flatten a probability list of probability lists. As an example, let’s
consider this list where there’s a 25% chance that exactly one of
'a' or 'b' will happen. Both 'a'
and 'b' are equally likely to occur. Also, there’s a 75%
chance that exactly one of 'c' or 'd' will
happen. 'c' and 'd' are also equally likely to
happen. Here’s a picture of a probability list that models this
scenario:
What are the chances for each of these letters to occur? If we were
to draw this as just four boxes, each with a probability, what would
those probabilities be? To find out, all we have to do is multiply each
probability with all of the probabilities that it contains. 'a'
would occur one time out of eight, as would 'b', because if
we multiply one half by one quarter we get one eighth. 'c'
would happen three times out of eight because three quarters multiplied
by one half is three eighths. 'd' would also happen three
times out of eight. If we sum all the probabilities, they still add up
to one.
Here’s this situation expressed as a probability list:
thisSituation :: Prob (Prob Char)
thisSituation = Prob
    [( Prob [('a',1%2),('b',1%2)] , 1%4 )
    ,( Prob [('c',1%2),('d',1%2)] , 3%4 )
    ]
Notice that its type is Prob (Prob Char). So now that
we’ve figure out how to flatten a nested probability list, all we have
to do is write the code for this and then we can write
>>= simply as join (fmap f m) and we
have ourselves a monad! So here’s flatten, which we’ll use
because the name join is already taken:
flatten :: Prob (Prob a) -> Prob a
flatten (Prob xs) = Prob $ concat $ map multAll xs
    where multAll (Prob innerxs,p) = map (\(x,r) -> (x,p*r)) innerxs
The function multAll takes a tuple of probability list
and a probability p that comes with it and then multiplies
every inner probability with p, returning a list of pairs
of items and probabilities. We map multAll over each pair
in our nested probability list and then we just flatten the resulting
nested list.
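We can check that this matches the reasoning we did above:
ghci> getProb (flatten thisSituation)
[('a',1 % 8),('b',1 % 8),('c',3 % 8),('d',3 % 8)]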
Now that we have all that we need, we can write a Monad
instance!
instance Monad Prob where
    return x = Prob [(x,1%1)]
    m >>= f = flatten (fmap f m)
    fail _ = Prob []
Because we already did all the hard work, the instance is very
simple. We also defined the fail function, which is the
same as it is for lists, so if there’s a pattern match failure in a
do expression, a failure occurs within the context of a
probability list.
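One caveat: on newer versions of GHC, Applicative is a superclass of
Monad, so this instance won't compile until we also give Prob an
Applicative instance (and fail has since moved to the separate
MonadFail class). We can lean on the earlier trick of having
pure be return and <*> be ap; a minimal sketch:
import Control.Monad (ap)
instance Applicative Prob where
    pure x = Prob [(x,1%1)]   -- the same default minimal context as return
    (<*>) = ap
With that in place, return in the Monad instance can simply be pure.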
It’s also important to check if the monad laws hold for the monad
that we just made. The first one says that
return x >>= f should be equal to f x. A
rigorous proof would be rather tedious, but we can see that if we put a
value in a default context with return and then
fmap a function over that and flatten the resulting
probability list, every probability that results from the function would
be multiplied by the 1%1 probability that we made with
return, so it wouldn’t affect the context. The reasoning
for m >>= return being equal to just m
is similar. The third law states that
f <=< (g <=< h) should be the same as
(f <=< g) <=< h. This one holds as well,
because it holds for the list monad which forms the basis of the
probability monad and because multiplication is associative.
1%2 * (1%3 * 1%5) is equal to
(1%2 * 1%3) * 1%5.
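For instance, here's a quick GHCi sanity check of the first law with
our new monad:
ghci> getProb (return 3 >>= \x -> Prob [(x,1%2),(x*10,1%2)])
[(3,1 % 2),(30,1 % 2)]
ghci> getProb ((\x -> Prob [(x,1%2),(x*10,1%2)]) 3)
[(3,1 % 2),(30,1 % 2)]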
Now that we have a monad, what can we do with it? Well, it can help us do calculations with probabilities. We can treat probabilistic events as values with contexts and the probability monad will make sure that those probabilities get reflected in the probabilities of the final result.
Say we have two normal coins and one loaded coin that gets tails an astounding nine times out of ten and heads only one time out of ten. If we throw all the coins at once, what are the odds of all of them landing tails? First, let’s make probability values for a normal coin flip and for a loaded one:
data Coin = Heads | Tails deriving (Show, Eq)
coin :: Prob Coin
coin = Prob [(Heads,1%2),(Tails,1%2)]
loadedCoin :: Prob Coin
loadedCoin = Prob [(Heads,1%10),(Tails,9%10)]
And finally, the coin throwing action:
import Data.List (all)
flipThree :: Prob Bool
flipThree = do
    a <- coin
    b <- coin
    c <- loadedCoin
    return (all (==Tails) [a,b,c])
Giving it a go, we see that the odds of all three landing tails are not that good, despite cheating with our loaded coin:
ghci> getProb flipThree
[(False,1 % 40),(False,9 % 40),(False,1 % 40),(False,9 % 40),
(False,1 % 40),(False,9 % 40),(False,1 % 40),(True,9 % 40)]
All three of them will land tails nine times out of forty, which is
less than 25%. We see that our monad doesn’t know how to join all of the
False outcomes where all coins don’t land tails into one
outcome. That’s not a big problem, since writing a function to put all
the same outcomes into one outcome is pretty easy and is left as an
exercise to the reader (you!).
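If you get stuck, here's one possible sketch (condense is just a name
we're making up) that sorts the outcomes, groups equal ones and sums up
their probabilities:
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)
condense :: (Ord a) => Prob a -> Prob a
condense (Prob xs) = Prob (map combine grouped)
    where
        -- put equal outcomes next to each other, then group them
        grouped = groupBy ((==) `on` fst) (sortBy (comparing fst) xs)
        -- keep the outcome and add up all of its probabilities
        combine ys = (fst (head ys), sum (map snd ys))
Applied to flipThree, it should give us something like
Prob {getProb = [(False,31 % 40),(True,9 % 40)]}.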
In this section, we went from having a question (what if lists also carried information about probability?) to making a type, recognizing a monad and finally making an instance and doing something with it. I think that’s quite fetching! By now, we should have a pretty good grasp on monads and what they’re about.