Carlo Hamalainen


MINC interfaces in Nipype

2016-01-16

About two years ago I wrote volgenmodel-nipype, a port of Andrew Janke's volgenmodel to the Nipype workflow system. Nipype provides a framework for wrapping legacy command-line tools in a simple-to-use interface, which also plugs into a workflow engine that can run jobs on a multicore PC, IPython parallel, SGE/PBS, etc.

Using a workflow that takes care of naming and tracking input/output files is very convenient. To blur an image (using mincblur) one can create a Node with the Blur interface, and then use .connect to send the output of some other node into this node:

blur = pe.Node(interface=Blur(fwhm=step_x*15),
                              name='blur_' + snum_txt)

workflow.connect(norm, 'output_threshold_mask', blur, 'input_file')

When I first developed volgenmodel-nipype I wrote my own Nipype interfaces for quite a few MINC tools. Over the 2015 Xmas holidays I got those interfaces merged into the master branch of Nipype.

I took this opportunity to tidy up volgenmodel-nipype. There are no longer any locally defined MINC interfaces. I added a public dataset, available in a separate repository: https://github.com/carlohamalainen/volgenmodel-fast-example. Previously this data wasn't publicly available. I also added some Docker scripts to run the whole workflow and compare the result against a known good model output, which I run in a weekly cronjob as a poor-person's continuous integration test suite.

The mouse brain sample data produces a model that looks like this:

Note to self: profunctors

2015-10-21

Note to self about deriving the Profunctor typeclass. The source is in the playground repository cloned below.

This is a literate Haskell file, and it can be built using Stack:

git clone https://github.com/carlohamalainen/playground.git
cd playground/haskell/profunctors
stack build

Then use stack ghci instead of cabal repl. The main executable is in a path like ./.stack-work/install/x86_64-linux/lts-3.6/7.10.2/bin/profunctors-exe.

This blog post follows some of the examples from I love profunctors.

First, some extensions and imports:

> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE FlexibleInstances     #-}
> {-# LANGUAGE InstanceSigs          #-}
> {-# LANGUAGE RankNTypes            #-}
> {-# LANGUAGE ScopedTypeVariables   #-}
> 
> module Profunctors where
> 
> import Control.Applicative
> import Data.Char
> import Data.Functor.Constant
> import Data.Functor.Identity
> import Data.Tuple (swap)
> import qualified Data.Map as M
> 
> main = print "boo"

Motivating example

The basic problem here is to write a function that capitalizes each word in a string. First, write a function that capitalizes a single word:

> capWord :: String -> String
> capWord [] = []
> capWord (h:t) = (toUpper h):(map toLower t)

The straightforward solution (ignoring the loss of extra spaces between words since unwords . words is not an isomorphism) is to use this composition:

> capitalize :: String -> String
> capitalize = unwords . (map capWord) . words

Example output:

*Profunctors> capitalize "hey yo WHAT DID THIS          DO?"
"Hey Yo What Did This Do?"

Why stop here? Let’s generalise the capitalize function by factoring out the words and unwords functions. Call them w and u and make them arguments:

> capitalize1 :: (String -> [String]) -> ([String] -> String) -> String -> String
> capitalize1 w u = u . (map capWord) . w

Now, capitalize ≡ capitalize1 words unwords.

We may as well factor out map capWord as well:

> capitalize2 :: (String -> [String])
>              -> ([String] -> String)
>              -> ([String] -> [String])
>              -> String -> String
> capitalize2 w u f = u . f . w

We have: capitalize ≡ capitalize2 words unwords (map capWord).

Now look at the types - there is no reason to be restricted to String and [String] so use the most general types that make the composition u . f . w work:

     w          f          u
c -------> a -------> b -------> d

so w :: c -> a and similar for f and u. This lets us write

> capitalize3 :: (c -> a)
>             -> (b -> d)
>             -> (a -> b)
>             -> (c -> d)
> capitalize3 w u f = u . f . w
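
To see that nothing in capitalize3 is tied to strings any more, here is a small sketch that reuses it at other types. The names increment and wordCount are made up for this illustration and are not part of the original source file.

```haskell
-- Same definition as in the post.
capitalize3 :: (c -> a) -> (b -> d) -> (a -> b) -> (c -> d)
capitalize3 w u f = u . f . w

-- Hypothetical example: c = String, a = Int, b = Int, d = String.
-- Parse a number, add one, and render it again.
increment :: String -> String
increment = capitalize3 read show (+ (1 :: Int))

-- Hypothetical example: c = String, a = [String], b = Int, d = Int.
-- Split into words and count them; the "unwrapping" step is just id.
wordCount :: String -> Int
wordCount = capitalize3 words id length
```

The only requirement is that the three functions line up as c -> a, a -> b, and b -> d.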

Next, we can generalize the type of f. To help with this step, recall that (->) r is a functor (there is an instance Functor ((->) r)), so write the last two types in the signature with prefix notation:

> capitalize3' :: (c -> a)
>              -> (b -> d)
>              -> (->) a b
>              -> (->) c d
> capitalize3' w u f = u . f . w

Now we can use a general functor h instead of ->:

> capitalize4 :: (c -> a)
>             -> (b -> d)
>             -> h a b -- was (->) a b
>             -> h c d -- was (->) c d
> capitalize4 w u f = u . f . w

Naturally this won’t work because the type signature has the functor h but the body of capitalize4 is using function composition (the .) as the type error shows:

Couldn't match type ‘h’ with ‘(->)’
  ‘h’ is a rigid type variable bound by
      the type signature for
        capitalize4 :: (c -> a) -> (b -> d) -> h a b -> h c d
Expected type: h c d
  Actual type: c -> d

Fortunately for us, we can make a typeclass that captures the behaviour that we want. We have actually arrived at the definition of a profunctor.

> class Profunctor f where
>   dimap :: (c -> a) -> (b -> d) -> f a b -> f c d
> 
> instance Profunctor (->) where
>   dimap :: (c -> a) -> (b -> d) -> (a -> b) -> c -> d
>   dimap h g k = g . k . h

Now we can write the capitalize function using the Profunctor instance for (->), which lets us use dimap instead of explicit function composition:

> capitalize5 :: String -> String
> capitalize5 s = dimap words unwords (map capWord) s

This is overkill for the capitalization problem, but it shows how structure can come out of simple problems if you keep hacking away.
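
The payoff of the typeclass is that dimap works for things other than plain functions. As a rough sketch, here is a second instance for a made-up wrapper called Effect (a function whose result lives in some functor m). Effect, capWords, and capitalizeMaybe are names invented for this example; they are not in the original source, and this is only one of many possible instances.

```haskell
import Data.Char (toLower, toUpper)

-- Same class as in the post.
class Profunctor f where
  dimap :: (c -> a) -> (b -> d) -> f a b -> f c d

-- Hypothetical wrapper: a function a -> m b for some functor m.
newtype Effect m a b = Effect { runEffect :: a -> m b }

-- Pre-compose on the input side, fmap on the output side.
instance Functor m => Profunctor (Effect m) where
  dimap h g (Effect k) = Effect (fmap g . k . h)

capWord :: String -> String
capWord []    = []
capWord (h:t) = toUpper h : map toLower t

-- Capitalize every word, but fail if there are no words at all.
capWords :: Effect Maybe [String] [String]
capWords = Effect $ \ws ->
  if null ws then Nothing else Just (map capWord ws)

-- The same "words, transform, unwords" shape as capitalize5,
-- now with a Maybe in the result.
capitalizeMaybe :: String -> Maybe String
capitalizeMaybe = runEffect (dimap words unwords capWords)
```

The body of capitalizeMaybe has the same structure as capitalize5; only the profunctor changed.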

MVar note

2015-09-02

A short note about using MVars in Haskell. Source is here: https://github.com/carlohamalainen/playground/tree/master/haskell/mvar.

Unlike earlier blog posts, this one should be built using Stack. Something like:

git clone https://github.com/carlohamalainen/playground.git
cd playground/haskell/mvar
stack build

Then use stack ghci instead of cabal repl. The main executable is .stack-work/dist/x86_64-linux/Cabal-1.22.4.0/build/mvar-exe/mvar-exe.

> {-# LANGUAGE ScopedTypeVariables #-}
> 
> module Main where
> 
> import Control.Concurrent
> import Control.Monad
> import Control.Concurrent.ParallelIO.Local
> import System.IO

Here is the situation: we have a function that makes a call to some restricted resource, say a network API, and we would like calls to this API from our application to be serialized across multiple threads. For the purposes of this blog post, here is a dummy function that sleeps a bit and returns x + 1. Pretend that it’s calling a magical API on the network somewhere.

> getExpensiveThing :: Int -> IO Int
> getExpensiveThing x = do
>   threadDelay $ 1 * 10^6
>   return $ x + 1

We have a general task that makes use of the expensive resource:

> doThings :: Int -> IO ()
> doThings tid = do
>   x <- getExpensiveThing tid
>   putStrLn $ "doThings: thread " ++ show tid ++ " got " ++ show x

At the top level we need to run a few doThings in parallel:

> main0 :: IO ()
> main0 = do
>   hSetBuffering stdout LineBuffering -- Otherwise output is garbled.
> 
>   let tasks = map doThings [1..5]
> 
>   withPool 4 $ \pool -> parallel_ pool tasks

The problem with main0 is that the calls to getExpensiveThing can happen simultaneously, so we need to use some kind of thread synchronisation primitive. I initially thought that I’d have to use a semaphore, a queue, or something fancy, but an MVar can do the trick.

We only need three operations on MVar:

Use newEmptyMVar to create a new MVar which is initially empty:

> newEmptyMVar :: IO (MVar a)

Use takeMVar to get the contents of the MVar. If the MVar is empty, takeMVar will wait until it is full.

> takeMVar :: MVar a -> IO a

Finally, use putMVar to put a value into an MVar. If the MVar is full, then putMVar will wait until the MVar is empty. If multiple threads are blocked on an MVar, they are woken up in FIFO order.

> putMVar :: MVar a -> a -> IO ()
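
As a minimal sketch of these three operations working together, here is a hand-off of a single value from a forked thread back to its caller. The function name handOff is made up for this illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Compute x + 1 in a forked thread and hand the result back.
handOff :: Int -> IO Int
handOff x = do
  box <- newEmptyMVar                 -- starts out empty
  _ <- forkIO $ putMVar box (x + 1)   -- the forked thread fills it
  takeMVar box                        -- the caller blocks until it is full
```

The caller does not need to know when the forked thread runs; takeMVar simply waits until the value is there.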

So what we can do is have getExpensiveThing use takeMVar to block until a worker requires a value. The return value can be passed back via another MVar, which the worker is itself waiting on. The data type MVar is polymorphic in its type parameter, so there is no trouble in having an MVar of an MVar, or an MVar of a tuple containing another MVar, and so on. This is what we’ll use to represent a blocking action with input value of type a and output value of type b:

> data InOut a b = InOut (MVar (a, MVar b))

The outer MVar wraps a tuple, where the first component is the raw input value of type a, and the second component is the MVar in which the result value will be passed back. Here is the new getExpensiveThing:

> getExpensiveThing' :: InOut Int Int -> IO ()
> getExpensiveThing' (InOut io) = forever $ do
>   ((input :: Int), (output :: MVar Int)) <- takeMVar io
>   threadDelay $ 1 * 10^6
>   putMVar output (input + 1)

The output MVar is contained inside the top level MVar. This way, getExpensiveThing' has a unique channel back to the calling function. I used ScopedTypeVariables to be able to write the types of input and output inline, but this is just for clarity in this blog post. Also note that getExpensiveThing' runs forever using forever from Control.Monad.

Here is the updated doThings' that uses the MVar to communicate with getExpensiveThing':

> doThings' :: InOut Int Int -> Int -> IO ()
> doThings' (InOut io) tid = do
>   result <- newEmptyMVar      -- For our result.
>   putMVar io (tid, result)    -- Send our input (tid) and the result MVar.
>   x <- takeMVar result        -- Get the value from the result MVar.
> 
>   putStrLn $ "doThings': thread " ++ show tid ++ " got " ++ show x

Finally, main needs a top-level MVar which is the first parameter to doThings' and a forked thread to run getExpensiveThing':

> main :: IO ()
> main = do
>   hSetBuffering stdout LineBuffering -- Otherwise output is garbled.
> 
>   topMVar <- newEmptyMVar
> 
>   _ <- forkIO $ getExpensiveThing' (InOut topMVar)
> 
>   let tasks = map (doThings' (InOut topMVar)) [1..5]
> 
>   withPool 4 $ \pool -> parallel_ pool tasks

Now each evaluation of threadDelay (the sensitive bit of code that represents a call to a resource) happens sequentially, although the order is nondeterministic.

$ stack build && .stack-work/dist/x86_64-linux/Cabal-1.22.4.0/build/mvar-exe/mvar-exe
doThings': thread 1 got 2
doThings': thread 5 got 6
doThings': thread 2 got 3
doThings': thread 4 got 5
doThings': thread 3 got 4

Just for fun, let’s make some helper functions to make calling a special worker via an MVar a bit cleaner. In general, calling a worker requires creating a results MVar, pushing the input and results MVar to the InOut MVar, and finally taking the result.

> callWorker :: InOut a b -> a -> IO b
> callWorker (InOut m) a = do
>     result <- newEmptyMVar
>     putMVar m (a, result)
>     takeMVar result

To save ourselves having to fork a worker, we can write a combinator that takes a worker and an action and runs the action with access to the newly created MVar:

> withWorker :: (InOut a b -> IO ()) -> (InOut a b -> IO c) -> IO c
> withWorker worker action = do
>   m <- newEmptyMVar
>   let io = InOut m
>   _ <- forkIO $ worker io
>   action io

Now doThings'' is a bit shorter, at the expense of not knowing (at a glance) what the io thing is going to do.

> doThings'' :: InOut Int Int -> Int -> IO ()
> doThings'' io tid = do
>   x <- callWorker io tid
> 
>   putStrLn $ "doThings'': thread " ++ show tid ++ " got " ++ show x

Finally, main' is largely unchanged except for withWorker at the top level.

> main' :: IO ()
> main' = withWorker getExpensiveThing' $ \io -> do
>   hSetBuffering stdout LineBuffering -- Otherwise output is garbled.
> 
>   let tasks = map (doThings'' io) [1..5]
> 
>   withPool 4 $ \pool -> parallel_ pool tasks

Running main':

*Main> main'
doThings'': thread 2 got 3
doThings'': thread 3 got 4
doThings'': thread 4 got 5
doThings'': thread 1 got 2
doThings'': thread 5 got 6
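
For anyone who wants to try the pattern without installing parallel-io, here is a rough sketch that uses only base. It is not the code from the repository: serve, request, and runClients are names invented for this example, and plain forkIO with one "done" MVar per client stands in for withPool and parallel_.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forever)

newtype InOut a b = InOut (MVar (a, MVar b))

-- Serve requests one at a time, forever.
serve :: (a -> IO b) -> InOut a b -> IO ()
serve f (InOut m) = forever $ do
  (input, output) <- takeMVar m
  f input >>= putMVar output

-- Send one request and wait for its reply.
request :: InOut a b -> a -> IO b
request (InOut m) a = do
  reply <- newEmptyMVar
  putMVar m (a, reply)
  takeMVar reply

-- Fork one client thread per input, then collect replies in input order.
runClients :: (a -> IO b) -> [a] -> IO [b]
runClients f xs = do
  m <- newEmptyMVar
  let io = InOut m
  _ <- forkIO (serve f io)
  dones <- forM xs $ \x -> do
    done <- newEmptyMVar
    _ <- forkIO (request io x >>= putMVar done)
    pure done
  mapM takeMVar dones
```

The clients still reach the worker in a nondeterministic order, but because each client has its own done MVar, the list returned by runClients follows the order of the inputs.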

Yesod 1.1 to 1.4 notes

2015-08-16

This blog runs on a barebones blogging framework that I knocked together using Yesod 1.1 back in 2013. I recently ported it over to Yesod 1.4. Apart from the few changes that I have detailed below, everything worked straight away. Refactoring code in Haskell is a very different experience compared to fully dynamic languages.

Here are some notes on the changes that I encountered between Yesod 1.1 and 1.4. Perhaps these will be useful for someone.

aformM

Previously I used aformM to get the current time in a form:

commentForm :: EntryId -> Form Comment
commentForm entryId = renderDivs $ Comment
    <$> pure entryId
    <*> aformM (liftIO getCurrentTime)
    <*> areq textField (fieldSettingsLabel MsgCommentName) Nothing
    <*> aopt emailField (fieldSettingsLabel MsgCommentEmail) Nothing
    <*> aopt urlField (fieldSettingsLabel MsgCommentUrl) Nothing
    <*> areq htmlField (fieldSettingsLabel MsgCommentText) Nothing
    <*> pure False <* recaptchaAForm

Now, use lift (liftIO getCurrentTime):

commentForm :: EntryId -> Form Comment
commentForm entryId = renderDivs $ Comment
    <$> pure entryId
    -- <*> aformM (liftIO getCurrentTime)
    <*> lift (liftIO getCurrentTime)
    <*> areq textField (fieldSettingsLabel MsgCommentName) Nothing
    <*> aopt emailField (fieldSettingsLabel MsgCommentEmail) Nothing
    <*> aopt urlField (fieldSettingsLabel MsgCommentUrl) Nothing
    <*> areq htmlField (fieldSettingsLabel MsgCommentText) Nothing
    <*> pure False <* recaptchaAForm

MinLen

Some new names clash with the Prelude, e.g. maximum is not the usual function from the Prelude, but rather something from Data.MinLen that encodes type-level natural numbers.

*Main> :t maximum
maximum :: MonoFoldableOrd mono => MinLen (Succ nat) mono -> Element mono

*Main> :t P.maximum 
P.maximum :: Ord a => [a] -> a

No unKey or PersistInt64

Persistent values in the old system looked like this:

Entity {entityKey = Key {unKey = PersistInt64 1},
        entityVal = title: "first post" mashed title: "first-post" year: 2015 month: 8 day: 14 content: "Hi there!" visible: False}

Entity {entityKey = Key {unKey = PersistInt64 2},
        entityVal = title: "second post" mashed title: "second-post" year: 2015 month: 8 day: 14 content: "Hi there! Do de dah!" visible: False}

and we could use PersistInt64 to construct the value, or unKey to deconstruct it.

*Main> :t PersistInt64
PersistInt64 :: GHC.Int.Int64 -> PersistValue

*Main> :t unKey
unKey :: KeyBackend backend entity -> PersistValue

Now values look like:

Entity {entityKey = EntryKey {unEntryKey = SqlBackendKey {unSqlBackendKey = 1}}, entityVal = "first post"}
Entity {entityKey = EntryKey {unEntryKey = SqlBackendKey {unSqlBackendKey = 2}}, entityVal = "second post"}

Old code like

foo :: PersistValue -> GHC.Int.Int64
foo (PersistInt64 i) = i

niceEntryId :: KeyBackend backend entity -> String
niceEntryId x = show $ foo $ unKey x

becomes

niceEntryId :: Key Entry -> Text
niceEntryId = DT.pack . show . unSqlBackendKey . unEntryKey

We could also use toPathPiece:

*Main> :t toPathPiece :: Key Entry -> Text
toPathPiece :: Key Entry -> Text :: Key Entry -> Text

If you're wondering how to find such a thing, look at the output of :info Key in ghci which includes these lines:

instance PathPiece (Key User) -- Defined at Model.hs:10:1
instance PathPiece (Key Entry) -- Defined at Model.hs:10:1
instance PathPiece (Key Comment) -- Defined at Model.hs:10:1

I believe that the more general option is fromSqlKey:

unKey' :: ToBackendKey SqlBackend record => Key record -> Text
unKey' = DT.pack . show . fromSqlKey

This should work over any SQL backend, unlike the older code that was tied to the particular implementation (e.g. 64-bit ints).

Similarly, old code constructed a Key from an integer:

let entryId = Key $ PersistInt64 (fromIntegral i)

New code uses toSqlKey since the PersistInt64 constructor isn't available:

let entryId = toSqlKey i :: Key Entry

Links

Original blog framework, written with Yesod 1.1.9.4: https://github.com/carlohamalainen/cli-yesod-blog.

New blog framework, compiles against Yesod 1.4: https://github.com/carlohamalainen/cli-yesod-blog-1.4.

Yesod 1.4 cabal sandbox

2015-08-06

Bokeh slider for "Phenology of two interdependent traits in migratory birds in response to climate change"

2015-07-31

A while ago I made a slider using ipywidgets that could be embedded in an HTML page (handy for blog posts). This week I decided to see where things were at with IPython or Jupyter.

As of July 2015 the ipywidgets package is unsupported. The author recommends using IPython's built-in interactive tools. However, IPython doesn't have static widgets yet, according to this issue. A StackOverflow answer mentioned Bokeh so I decided to give that a go.

Bokeh slider

Here is a slider that replicates my earlier ipywidgets effort:

This is pretty nice. It's an interactive slider, works on desktop and mobile, and doesn't have any of the notebook stuff around it. Just the graph with the interactive widget. Bokeh also provides tools for zooming and panning around. It's also worth mentioning that Bokeh provides a GUI library (things like hboxes, vboxes, layouts, etc) and my impression is that you could have multiple plots changing based on one slider, two plots tied together on some parameter, or whatever else you dreamt up.

The slider is implemented in bokehslider.py and is run using bokeh-server --ip 0.0.0.0 --script bokehslider.py. One strange thing that I ran into was that the slider wasn't interactive unless I opened up port 5006 on my server, even though Nginx is doing the proxy_pass stuff. I suspect that some of the Bokeh-generated JavaScript expects to be able to connect to the host on 5006.

Here are the relevant Nginx config settings:

server {

    # listen, root, other top level config...

    # Reverse proxy for Bokeh server running on port 5006:

    location /bokeh {
        proxy_pass http://104.200.25.78:5006/bokeh;
    }

    location /static {
        proxy_pass http://104.200.25.78:5006;
    }

    location /bokehjs {
        proxy_pass http://104.200.25.78:5006;
    }

    # rest of the config...

}

In terms of coding, the Bokeh model is a bit different to the usual plotting procedure in that you set up data sources, e.g.

obj.line_source  = ColumnDataSource(data=dict(
                                            x_cV=[],
                                            arrival_date=[],
                                            laying_date=[],
                                            hatching_date=[],))

and then plot commands use that data source. You don't pass NumPy arrays in directly:

plot = figure(plot_height=400, plot_width=400,
              tools=toolset, x_range=[130, 180], y_range=[110, 180])

plot.line('x_cV', 'x_cV',          source=obj.line_source, line_width=4, color='black')
plot.line('x_cV', 'arrival_date',  source=obj.line_source, line_width=4, color='purple', legend='Arrival time')
plot.line('x_cV', 'laying_date',   source=obj.line_source, line_width=4, color='red',    legend='Laying time')
plot.line('x_cV', 'hatching_date', source=obj.line_source, line_width=4, color='green',  legend='Hatching date')

Then, the input_change method calls my update_data method which actually updates the data sources. It doesn't have to explicitly make a call to redraw the plot.

def update_data(self):
    u_q = self.u_q_slider.value

    self.line_source.data  = get_line_data_for_bokeh(float(u_q))

Links

https://github.com/carlohamalainen/phenology-two-trait-migratory-bird/tree/bokeh-slider

http://bokeh.pydata.org/en/latest/docs/server_gallery/sliders_server.html

https://www.reddit.com/r/IPython/comments/3bgg7t/ipython_widgets_in_a_static_html_file

https://github.com/ipython/ipywidgets/issues/16

http://stackoverflow.com/questions/22739592/how-to-embed-an-interactive-matplotlib-plot-in-a-webpage

https://jakevdp.github.io/blog/2013/12/05/static-interactive-widgets

Classy mtl

2015-07-20

This post has a minimal stand-alone example of the classy lenses and prisms from George Wilson’s talk about mtl. The source code for George’s talk is here: https://github.com/gwils/next-level-mtl-with-classy-optics.

Literate Haskell source for this post is here: https://github.com/carlohamalainen/playground/tree/master/haskell/classy-mtl.

First, some imports:

> {-# LANGUAGE OverloadedStrings    #-}
> {-# LANGUAGE TemplateHaskell      #-}
> 
> module Classy where
> 
> import Control.Lens
> import Control.Monad.Except
> import Control.Monad.Reader
> import Data.Text

Toy program - uses the network and a database

The case study in George’s talk was a program that has to interact with a database and the network. We have a type for the database connection info:

> type DbConnection = Text
> type DbSchema     = Text
> 
> data DbConfig = DbConfig
>     { _dbConn :: DbConnection
>     , _schema :: DbSchema
>     }

For the network we have a port and some kind of SSL setting:

> type Port = Integer
> type Ssl  = Text
> 
> data NetworkConfig = NetworkConfig
>     { _port     :: Port
>     , _ssl      :: Ssl
>     }

At the top level, our application has a database and a network configuration:

> data AppConfig = AppConfig
>     { _appDbConfig   :: DbConfig
>     , _appNetConfig  :: NetworkConfig
>     }

Types for errors that we see when dealing with the database and the network:

> data DbError = QueryError Text | InvalidConnection
> 
> data NetworkError = Timeout Int | ServerOnFire
> 
> data AppError = AppDbError  { dbError  :: DbError      }
>               | AppNetError { netError :: NetworkError }

Classy lenses and prisms

Use Template Haskell to make all of the classy lenses and prisms. Documentation for makeClassy and makeClassyPrisms is in Control.Lens.TH.

> makeClassy ''DbConfig
> makeClassy ''NetworkConfig
> makeClassy ''AppConfig
> 
> makeClassyPrisms ''DbError
> makeClassyPrisms ''NetworkError
> makeClassyPrisms ''AppError

We get the following typeclasses:

  • HasDbConfig
  • HasNetworkConfig
  • HasAppConfig
  • AsNetworkError
  • AsDbError
  • AsAppError

For example, here is the generated class HasDbConfig:

*Classy> :i HasDbConfig
class HasDbConfig c_a6IY where
  dbConfig :: Functor f => (DbConfig -> f DbConfig) -> c0 -> f c0
  dbConn   :: Functor f => (DbConnection -> f DbConnection) -> c0 -> f c0
  schema   :: Functor f => (DbSchema -> f DbSchema) -> c0 -> f c0
instance HasDbConfig DbConfig -- Defined at Classy.lhs:58:3

If we write HasDbConfig r in the class constraints of a type signature then we can use the lenses dbConfig, dbConn, and schema to get the entire config, connection string, and schema, from something of type r.

In contrast, the constraint AsNetworkError r means that we can use the prisms _NetworkError, _Timeout, and _ServerOnFire on a value of type r to get at the network error details.

*Classy> :i AsNetworkError
class AsNetworkError r_a759 where
  _NetworkError ::
    (Choice p, Control.Applicative.Applicative f) =>
    p NetworkError (f NetworkError) -> p r0 (f r0)

  _Timeout ::
    (Choice p, Control.Applicative.Applicative f) =>
    p Int (f Int) -> p r0 (f r0)

  _ServerOnFire ::
    (Choice p, Control.Applicative.Applicative f) =>
    p () (f ()) -> p r0 (f r0)
    -- Defined at Classy.lhs:63:3

instance AsNetworkError NetworkError -- Defined at Classy.lhs:63:3

Using the class constraints

The first function is loadFromDb, which uses a reader environment for database configuration, can throw a database error, and can do IO actions.

> loadFromDb :: ( MonadError e m,
>                 MonadReader r m,
>                 AsDbError e,
>                 HasDbConfig r,
>                 MonadIO m) => m Text
> loadFromDb = do
> 
>   -- Due to "MonadReader r m" and "HasDbConfig r"
>   -- we can ask for the database config:
>   rdr <- ask
>   let dbconf  = rdr ^. dbConfig :: DbConfig
> 
>   -- We can ask for the connection string directly:
>   let connstr  = rdr ^. dbConn :: DbConnection
> 
>   -- We have "AsDbError e", so we can throw a DB error:
>   throwError $ (_InvalidConnection #) ()
>   throwError $ (_QueryError #) "Bad SQL!"
> 
>   return "foo"

Another function, sendOverNet, uses a reader environment with a network config, throws network errors, and does IO actions.

> sendOverNet :: ( MonadError e m,
>                  MonadReader r m,
>                  AsNetworkError e,
>                  AsAppError e,
>                  HasNetworkConfig r,
>                  MonadIO m) => Text -> m ()
> sendOverNet mydata = do
> 
>   -- We have "MonadReader r m" and "HasNetworkConfig r"
>   -- so we can ask about the network config:
>   rdr <- ask
>   let netconf = rdr ^. networkConfig  :: NetworkConfig
>       p       = rdr ^. port           :: Port
>       s       = rdr ^. ssl            :: Ssl
> 
>   liftIO $ putStrLn $ "Pretending to connect to the network..."
> 
>   -- We have "AsNetworkError e" so we can throw a network error:
>   throwError $ (_NetworkError #) (Timeout 100)
> 
>   -- We have "AsAppError e" so we can throw an application-level error:
>   throwError $ (_AppNetError #) (Timeout 100)
> 
>   return ()

If we load from the database and also send over the network then we get extra class constraints:

> loadAndSend :: ( AsAppError e,
>                  AsNetworkError e,
>                  AsDbError e,
>                  HasNetworkConfig r,
>                  HasDbConfig r,
>                  MonadReader r m,
>                  MonadError e m,
>                  MonadIO m) => m ()
> loadAndSend = do
>   liftIO $ putStrLn "Loading from the database..."
>   t <- loadFromDb
> 
>   liftIO $ putStrLn "Sending to the network..."
>   sendOverNet t

Things that won’t compile

We can’t throw the database error InvalidConnection without the right class constraint:

> nope1 :: (MonadError e m, AsNetworkError e) => m ()
> nope1 = throwError $ (_InvalidConnection #) ()

Could not deduce (AsDbError e)
arising from a use of ‘_InvalidConnection’

We can’t throw an application error if we are only allowed to throw network errors, even though this specific application error is a network error:

> nope2 :: (MonadError e m, AsNetworkError e) => m ()
> nope2 = throwError $ (_AppNetError #) (Timeout 100)

Could not deduce (AsAppError e)
arising from a use of ‘_AppNetError’

We can’t get the network config from a value of type r if we only have the constraint about having the database config:

> nope3 :: (MonadReader r m, HasDbConfig r) => m ()
> nope3 = do
>   rdr <- ask
>   let netconf = rdr ^. networkConfig
> 
>   return ()

Could not deduce (HasNetworkConfig r)
arising from a use of ‘networkConfig’

What is the #?

The # is an infix alias for review. More details are in Control.Lens.Review.

*Classy> :t review _InvalidConnection ()
review _InvalidConnection () :: AsDbError e => e

*Classy> :t throwError $ review _InvalidConnection ()
throwError $ review _InvalidConnection () :: (AsDbError e, MonadError e m) => m a
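
To get a feel for what review buys us without pulling in lens or mtl, here is a stripped-down sketch using only base. It models just the constructing half of a classy prism with a made-up method injectDbError, and uses Either e in place of a MonadError stack. None of these names are what makeClassyPrisms generates; they are assumptions for illustration only.

```haskell
data DbError      = QueryError String | InvalidConnection         deriving (Eq, Show)
data NetworkError = Timeout Int | ServerOnFire                    deriving (Eq, Show)
data AppError     = AppDbError DbError | AppNetError NetworkError deriving (Eq, Show)

-- Hypothetical stand-in for the "review" direction of a classy prism:
-- any error type e that can hold a DbError.
class AsDbError e where
  injectDbError :: DbError -> e

-- A DbError is trivially itself.
instance AsDbError DbError where
  injectDbError = id

-- An AppError can wrap a DbError.
instance AsDbError AppError where
  injectDbError = AppDbError

-- Plays the role of: throwError $ (_InvalidConnection #) ()
invalidConnection :: AsDbError e => Either e a
invalidConnection = Left (injectDbError InvalidConnection)
```

The same invalidConnection value can be used at error type DbError or AppError, which is the point of the As constraints: the function that fails does not pick the final error type.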

What is the monad transformer stack?

We didn’t specify it! The functions loadFromDb and sendOverNet have the general monad m in their type signatures, not a specific transformer stack like ReaderT AppConfig (ExceptT AppError IO) a.

What else?

Ben Kolera did a talk at BFPG about stacking monad transformers. He later modified the code from his talk to use the classy lens/prism approach. You can see the code before and after, and also see a diff. As far as I could see there is one spot in the code where an error is thrown, which motivated me to create the stand-alone example in this post with the body for loadFromDb and sendOverNet sketched out.

Lens Has/As for API changes

2015-06-30

Tinkering with lenses to deal with API changes.

Literate Haskell source for this post: https://github.com/carlohamalainen/playground/tree/master/haskell/lens-has.

First, some extensions and imports.

> {-# LANGUAGE GADTs                        #-}
> {-# LANGUAGE FlexibleInstances            #-}
> {-# LANGUAGE MultiParamTypeClasses        #-}
> {-# LANGUAGE TemplateHaskell              #-}
> module LensHas where
> import Control.Applicative
> import Control.Lens
> import Numeric.Natural

Introduction

Suppose we are working with a database service that stores files. Perhaps we communicate with it via a REST API. A file stored in the system has a location, which is a FilePath:

> type Location = FilePath

We need to keep track of a few other things like the parent (referring to a collection of files) and a hash of the file. For simplicity I’ll make those two fields Strings since the details aren’t important to us here.

> data DataFile = DataFile {
>     _dataFileLocation :: Location
>   , _dataFileParent   :: String
>   , _dataFileHash     :: String
> } deriving Show

(Ignore the underscores if you haven’t used lenses before.)

After some time the API changes and we need to keep track of some different fields, so our data type changes to:

> data DataFile2 = DataFile2 {
>     _dataFile2Location   :: Location
>   , _dataFile2Parent     :: String
>   , _dataFile2OtherField :: Float -- new field
>                                   -- hash is not here anymore
> } deriving Show

For compatibility we’d like to keep both definitions around, perhaps allowing the user to choose the v1 or v2 API with a configuration option. So how do we deal with our code that has to use DataFile or DataFile2? One option is to use a sum type:

> data DataFileSum = DFS1 DataFile | DFS2 DataFile2

Any function that uses a DataFile must instead use DataFileSum and do case analysis on whether it is a v1 or v2.

In my particular situation I had a number of functions that used just the Location part of the type. Is there a way to avoid the sum type?

Setter/Getter typeclasses

Use typeclasses to represent setting or getting the location value:

> class SetLocation a where
>   setLocation :: a -> Location -> a
> class GetLocation a where
>   getLocation :: a -> Location

Write the instance definitions for each case:

> instance SetLocation DataFile where
>   setLocation d newLocation = d { _dataFileLocation = newLocation }
> 
> instance GetLocation DataFile where
>   getLocation = _dataFileLocation
> instance SetLocation DataFile2 where
>   setLocation d newLocation = d { _dataFile2Location = newLocation }
> 
> instance GetLocation DataFile2 where
>   getLocation = _dataFile2Location

Now we use the general getLocation and setLocation functions instead of the specific data constructors of DataFile and DataFile2:

> main1 = do
>   let df = DataFile "/foo/bar.txt" "something" "700321159acb26a5fd6d5ce0116a6215"
> 
>   putStrLn $ "Original data file: " ++ show df
>   putStrLn $ "Location in original: " ++ getLocation df
> 
>   let df' = setLocation df "/blah/bar.txt"
> 
>   putStrLn $ "Updated data file:    " ++ getLocation df'

A function that uses a datafile can now be agnostic about which one it is, as long as the typeclass constraint is satisfied so that it has the appropriate getter/setter:

> doSomething :: GetLocation a => a -> IO ()
> doSomething d = print $ getLocation d

Using doSomething:

*LensHas> doSomething $ DataFile "/foo/bar.txt" "parent" "12345"
"/foo/bar.txt"

*LensHas> doSomething $ DataFile2 "/foo/bar.txt" "parent" 42.2
"/foo/bar.txt"

Lenses

Lenses already deal with the concept of getters and setters, so let’s try to replicate the previous code in that framework.

First, make lenses for the two data types (this uses Template Haskell):

> makeLenses ''DataFile
> makeLenses ''DataFile2

Instead of type classes for setting and getting, make a single type class that represents the fact that a thing has a location.

> class HasLocation a where
>     location :: Lens' a Location

For the instance definitions we can use the lenses that were automatically made for us by the earlier makeLenses lines:

> instance HasLocation DataFile where
>     location = dataFileLocation :: Lens' DataFile Location
> 
> instance HasLocation DataFile2 where
>     location = dataFile2Location :: Lens' DataFile2 Location

Here is main1 rewritten to use the location lens:

> main2 = do
>   let df = DataFile "/foo/bar.txt" "something" "700321159acb26a5fd6d5ce0116a6215"
> 
>   putStrLn $ "Original data file: " ++ show df
>   putStrLn $ "Location in original: " ++ df^.location
> 
>   let df' = df & location .~ "/blah/bar.txt"
> 
>   putStrLn $ "Updated data file:    " ++ df'^.location

If you haven’t used lenses before, operators like ^. might look insane, but there is a pattern to them. Check out http://intolerable.me/lens-operators-intro for an excellent guide with examples.

One benefit of the lens approach is that we don’t have to manually write the setters and getters, as they come for free from the lenses for the original two data types. Another benefit is that lenses compose, so if the Location type were more than just a string, we wouldn’t have to manually deal with the composition of getLocation with getSubPartOfLocation and so on.
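
To make the composition point concrete, here is a minimal sketch that is separate from the literate module above. It assumes a hypothetical structured Location with a directory and a file name (in this post Location is just a string), and it writes the lenses by hand using only base, so that it does not depend on the lens package. The names locDir, locName and dataFileDir are made up for the illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A hand-written van Laarhoven lens; same shape as Lens' in Control.Lens.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Read the focused value.
view :: Lens' s a -> s -> a
view l = getConst . l Const

-- Replace the focused value.
set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

-- ASSUMPTION: a structured location, unlike the plain String in the post.
data Location = Location
  { _locDir  :: FilePath
  , _locName :: FilePath
  } deriving (Eq, Show)

data DataFile = DataFile
  { _dataFileLocation :: Location
  , _dataFileParent   :: String
  } deriving (Eq, Show)

-- Lenses that makeLenses would normally generate for us.
dataFileLocation :: Lens' DataFile Location
dataFileLocation f d = (\l -> d { _dataFileLocation = l }) <$> f (_dataFileLocation d)

locDir :: Lens' Location FilePath
locDir f l = (\x -> l { _locDir = x }) <$> f (_locDir l)

-- Composition is ordinary function composition; no getSubPartOfLocation needed.
dataFileDir :: Lens' DataFile FilePath
dataFileDir = dataFileLocation . locDir

main :: IO ()
main = do
  let df = DataFile (Location "/foo" "bar.txt") "something"
  putStrLn $ "Directory in original: " ++ view dataFileDir df
  print $ set dataFileDir "/blah" df
```

The composed lens works as both getter and setter for the nested field, which is the part we would otherwise have to write by hand for every level of nesting.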

The doSomething function can be rewritten using the HasLocation typeclass:

> doSomething' :: HasLocation a => a -> IO ()
> doSomething' d = print $ d^.location

Generalising HasLocation

Let’s generalise the HasLocation typeclass. Consider natural numbers (the Natural type).

First case: here’s a typeclass to represent the fact that a Foo can always be thought of as a Natural:

> class AsNatural1 a where
>     nat1 :: Lens' a Natural
> data Foo = Foo {
>   _fooName :: String
> , _fooNat  :: Natural
> } deriving Show
> 
> makeLenses ''Foo
> instance AsNatural1 Foo where
>   nat1 = fooNat :: Lens' Foo Natural

Second case: a natural is a natural by definition.

> instance AsNatural1 Natural where
>   nat1 = id

Third case: an Integer might be a Natural. The previous typeclasses used a Lens' but here we need a Prism':

> class AsNatural2 a where
>     nat2 :: Prism' a Natural
> instance AsNatural2 Integer where
>   nat2 = prism' toInteger (\n -> if n >= 0 then (Just . fromInteger) n else Nothing)

We are doing much the same thing, and if we compare the two typeclasses the difference is in the type of “optical” thing being used (a lens or a prism):

> class AsNatural1 a where
>     nat1 :: Lens' a Natural
> 
> class AsNatural2 a where
>     nat2 :: Prism' a Natural

It turns out that the type to use is Optic':

> class AsNatural p f s where
>   natural :: Optic' p f s Natural

(We get the extra parameters p and f, which seem to be unavoidable because Optic' is itself parameterised over them.)
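
To see where p and f come from, here is a simplified sketch, kept separate from the literate module above, that rebuilds the relevant types using only base. It is not the lens library's actual code: the Profunctor and Choice classes are cut-down stand-ins for the ones in the profunctors package, and view, preview, prism', fooNat and natPrism are local definitions written for this illustration. The Optic' synonym has the same shape as the one in lens.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Monoid (First (..))
import Numeric.Natural (Natural)

-- p is the "arrow" being transformed, f is the effect wrapped around results.
type Optic' p f s a = p a (f a) -> p s (f s)

-- ASSUMPTION: minimal stand-ins for the classes in the profunctors package.
class Profunctor p where
  dimap :: (a -> b) -> (c -> d) -> p b c -> p a d

class Profunctor p => Choice p where
  right' :: p a b -> p (Either c a) (Either c b)

instance Profunctor (->) where
  dimap f g h = g . h . f

instance Choice (->) where
  right' = fmap

-- A lens fixes p to (->) and only needs a Functor.
type Lens' s a = forall f. Functor f => Optic' (->) f s a

-- A prism leaves p open (any Choice) and needs an Applicative.
type Prism' s a = forall p f. (Choice p, Applicative f) => Optic' p f s a

prism' :: (a -> s) -> (s -> Maybe a) -> Prism' s a
prism' build match =
  dimap (\s -> maybe (Left s) Right (match s)) (either pure (fmap build)) . right'

-- Getter for optics that always have a target.
view :: Optic' (->) (Const a) s a -> s -> a
view l = getConst . l Const

-- Getter for optics whose target may be missing.
preview :: Optic' (->) (Const (First a)) s a -> s -> Maybe a
preview l = getFirst . getConst . l (Const . First . Just)

data Foo = Foo { _fooName :: String, _fooNat :: Natural }

fooNat :: Lens' Foo Natural
fooNat f foo = (\n -> foo { _fooNat = n }) <$> f (_fooNat foo)

natPrism :: Prism' Integer Natural
natPrism = prism' toInteger (\n -> if n >= 0 then Just (fromInteger n) else Nothing)

main :: IO ()
main = do
  print $ view fooNat (Foo "name" 34)
  print $ preview natPrism 50
  print $ preview natPrism (-99)
```

Both Lens' and Prism' are instances of the one Optic' shape; they differ only in the constraints placed on p and f. That is why a typeclass covering both has to mention p and f, and why each instance then picks the constraints it needs.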

Now we can do all of the previous definitions using the single typeclass:

> -- Lens into Foo:
> 
> instance (p ~ (->), Functor f) => AsNatural p f Foo where
>   natural = fooNat :: Lens' Foo Natural
> 
> -- Natural is a Natural:
> 
> instance AsNatural p f Natural where
>   natural = id
> 
> -- An Integer might be a natural:
> 
> instance (Choice p, Applicative f) => AsNatural p f Integer where
>   natural = prism' toInteger (\n -> if n >= 0 then (Just . fromInteger) n else Nothing)

Now we can work with a Foo, a Natural, or an Integer as a Natural by using the single optic natural:

> main3 :: IO ()
> main3 = do
>   -- Underlying thing is a Lens:
>   print $ (Foo "name" 34) ^. natural
>   print $ (Foo "name" 34) ^. natural + 1
>   print $ (42 :: Natural) ^. natural + 1
> 
>   -- Underlying thing is a Prism (hence ^? and the fmap over the Maybe result):
>   print $ (+1) <$> ((50 :: Integer)  ^? natural)
>   print $ (+1) <$> ((-99 :: Integer) ^? natural)

Output:

*LensHas> main3
34
35
43
Just 51
Nothing

Credit

The AsNatural typeclass is a simplified version of the “As…” typeclasses in the coordinate package, e.g. AsMinutes. Thanks to Tony Morris on #haskell.au for helping with my changing-API question and pointing out the “As…” typeclasses. Also see the IRC logs in coordinate/etc where Ed Kmett explains some things about Optic.

