I recently wrote some code using wreq
that seemed to use much more memory than I thought it should. The problem turned out not to be
with wreq but with the way that I was using mapM. An equivalent snippet of code is:
I reasoned that mapM would construct its result lazily, then head would force evaluation of just the first element of the list. This isn’t the case, as explained here. The function mapM is basically equivalent to this:
So the monadic action m is evaluated to build up the list elements.
One of the answers on the StackOverflow page says to use a step by step series to only evaluate the bits that are required:
The idea is that it represents a sequence of monadic actions. A Left is a final value of type b, while Right (a, Stream m a b) represents an intermediate value of type a along with the remaining stream.
The Monad instance is fairly straightforward. The return function turns a plain value into a final value (hence the Left), and the bind either stops with the final value or produces the new value along with the next stream.
There are also instances for Functor and Applicative but we don’t need them here.
A handy function is liftIO which turns a normal monadic action into a stream:
It just runs the io action, and pipes it to a Left and then returns it in a Stream.
To use readFileS we wrap it with runStream:
*Main> Left x <- runStream $ readFileS "MapM.lhs"
*Main> print $ B.length x
So we can produce final values, but what about intermediate ones? This is what yield does:
At this point we have no idea about the remaining stream, so we return the unit ().
For testing the code here we’ll take the definition of collect from Stream as well. It just walks
through the entire Stream and collects the values, ignoring the final unit value.
Now we can try out yield using monadic notation:
*Main> collect yield123
We can mix normal Haskell control structures like if/then/else into the monadic notation:
*Main> collect $ yieldEvens 0
We could read some files using our readFileS function and yield the results:
*Main> length <$> collect readAFewFiles
We can generalise this to apply a monadic function to a list of arguments, which is basically what mapM does:
And we can even make an infinite stream:
Take from a stream and a definition of head for a stream:
So we can efficiently take the head of the stream without evaluating the entire thing:
*Main> (fmap B.length) <$> headStream readForever
I should point out that the example of reading a file a bunch of times could be achieved without Stream just by storing a list of the monadic actions, and then evaluating the one that we want:
which can be used as follows:
*Main> B.length <$> (head $ listOfActions)
The difference is that the list is somewhat static, in that we can’t mix control structures into it as we can do with Stream.
Here’s one way to encode Stream-like behaviour using free monads. I define two actions, yield and final. The yield action stores an input value of type a, a monadic function a -> IO b, and the rest of the structure, which turns out to be conveniently represented as a function b -> k. Being a function of b lets the rest of the structure depend on the result at the current node in the structure. The final action just stores the value and monadic action, and is a terminal node in the free monad.
For convenience, Command is a simpler type signature:
As in my earlier post, we need instances for Functor and Monad. They are fairly straightforward:
Here are two helpers to make Command’s monadic usage easier:
To run a Command we walk its structure recursively and
run the IO actions as needed:
As with Stream, we can mix control structures with the creation of the free monad: