Lens Has/As for API changes

Tinkering with lenses to deal with API changes.

Literate Haskell source for this post: https://github.com/carlohamalainen/playground/tree/master/haskell/lens-has.

First, some extensions and imports.

> {-# LANGUAGE GADTs                        #-}
> {-# LANGUAGE FlexibleInstances            #-}
> {-# LANGUAGE MultiParamTypeClasses        #-}
> {-# LANGUAGE TemplateHaskell              #-}
> module LensHas where
> import Control.Applicative
> import Control.Lens
> import Numeric.Natural

Introduction

Suppose we are working with a database service that stores files. Perhaps we communicate with it via a REST API. A file stored in the system has a location, which is a FilePath:

> type Location = FilePath

We need to keep track of a few other things like the parent (referring to a collection of files) and a hash of the file. For simplicity I’ll make those two fields Strings since the details aren’t important to us here.

> data DataFile = DataFile {
>     _dataFileLocation :: Location
>   , _dataFileParent   :: String
>   , _dataFileHash     :: String
> } deriving Show

(Ignore the underscores if you haven’t used lenses before.)

After some time the API changes and we need to keep track of some different fields, so our data type changes to:

> data DataFile2 = DataFile2 {
>     _dataFile2Location   :: Location
>   , _dataFile2Parent     :: String
>   , _dataFile2OtherField :: Float -- new field
>                                   -- hash is not here anymore
> } deriving Show

For compatibility we’d like to keep both definitions around, perhaps allowing the user to choose the v1 or v2 API with a configuration option. So how do we deal with our code that has to use DataFile or DataFile2? One option is to use a sum type:

> data DataFileSum = DFS1 DataFile | DFS2 DataFile2

Any function that uses a DataFile must instead use DataFileSum and do case analysis on whether it is a v1 or v2.

In my particular situation I had a number of functions that used just the Location part of the type. Is there a way to avoid the sum type?

Setter/Getter typeclasses

Use typeclasses to represent setting or getting the location value:

> class SetLocation a where
>   setLocation :: a -> Location -> a
> class GetLocation a where
>   getLocation :: a -> Location

Write the instance definitions for each case:

> instance SetLocation DataFile where
>   setLocation d newLocation = d { _dataFileLocation = newLocation }
>
> instance GetLocation DataFile where
>   getLocation = _dataFileLocation
> instance SetLocation DataFile2 where
>   setLocation d newLocation = d { _dataFile2Location = newLocation }
>
> instance GetLocation DataFile2 where
>   getLocation = _dataFile2Location

Now we use the general getLocation and setLocation functions instead of the specific data constructors of DataFile and DataFile2:

> main1 = do
>   let df = DataFile "/foo/bar.txt" "something" "700321159acb26a5fd6d5ce0116a6215"
>
>   putStrLn $ "Original data file: " ++ show df
>   putStrLn $ "Location in original: " ++ getLocation df
>
>   let df' = setLocation df "/blah/bar.txt"
>
>   putStrLn $ "Updated data file:    " ++ getLocation df'

A function that uses a datafile can now be agnostic about which one it is, as long as the typeclass constraint is satisfied so that it has the appropriate getter/setter:

> doSomething :: GetLocation a => a -> IO ()
> doSomething d = print $ getLocation d

Using doSomething:

*LensHas> doSomething $ DataFile "/foo/bar.txt" "parent" "12345"
"/foo/bar.txt"

*LensHas> doSomething $ DataFile2 "/foo/bar.txt" "parent" 42.2
"/foo/bar.txt"

Lenses

Lenses already deal with the concept of getters and setters, so let’s try to replicate the previous code in that framework.

First, make lenses for the two data types (this uses Template Haskell):

> makeLenses ''DataFile
> makeLenses ''DataFile2

Instead of type classes for setting and getting, make a single type class that represents the fact that a thing has a location.

> class HasLocation a where
>     location :: Lens' a Location

For the instance definitions we can use the lenses that were automatically made for us by the earlier makeLenses lines:

> instance HasLocation DataFile where
>     location = dataFileLocation :: Lens' DataFile Location
>
> instance HasLocation DataFile2 where
>     location = dataFile2Location :: Lens' DataFile2 Location

Here is main1 rewritten to use the location lens:

> main2 = do
>   let df = DataFile "/foo/bar.txt" "something" "700321159acb26a5fd6d5ce0116a6215"
>
>   putStrLn $ "Original data file: " ++ show df
>   putStrLn $ "Location in original: " ++ df^.location
>
>   let df' = df & location .~ "/blah/bar.txt"
>
>   putStrLn $ "Updated data file:    " ++ getLocation df'

If you haven’t used lenses before the operators like ^. might look insane, but there is a pattern to them. Check out http://intolerable.me/lens-operators-intro for an excellent guide with examples.

One benefit of the lens approach is that we don’t have to manually write the setters and getters, as they come for free from the lenses for the original two data types. Another benefit is that lenses compose, so if the Location type was more than just a string, we wouldn’t have to manually deal with the composition of getLocation with getSubPartOfLocation and so on.

The doSomething function can be rewritten using the HasLocation typeclass:

> doSomething' :: HasLocation a => a -> IO ()
> doSomething' d = print $ d^.location

Generalising HasLocation

Let’s generalise the HasLocation typeclass. Consider natural numbers (the Natural type).

First case: here’s a typeclass to represent the fact that a Foo can always be thought of as a Natural:

> class AsNatural1 a where
>     nat1 :: Lens' a Natural
> data Foo = Foo {
>   _fooName :: String
> , _fooNat  :: Natural
> } deriving Show
>
> makeLenses ''Foo
> instance AsNatural1 Foo where
>   nat1 = fooNat :: Lens' Foo Natural

Second case: a natural is a natural by definition.

> instance AsNatural1 Natural where
>   nat1 = id

Third case: an Integer might be a Natural. The previous typeclasses used a Lens’ but here we need a Prism’:

> class AsNatural2 a where
>     nat2 :: Prism' a Natural
> instance AsNatural2 Integer where
>   nat2 = prism' toInteger (n -> if n >= 0 then (Just . fromInteger) n else Nothing)

We are doing much the same thing, and if we compare the two typeclasses the difference is in the type of “optical” thing being used (a lens or a prism):

> class AsNatural1 a where
>     nat1 :: Lens' a Natural
>
> class AsNatural2 a where
>     nat2 :: Prism' a Natural

It turns out that the type to use is Optic’:

> class AsNatural p f s where
>   natural :: Optic' p f s Natural

(We get the extra parameters p and f which seem to be unavoidable.)

Now we can do all of the previous definitions using the single typeclass:

> -- Lens into Foo:
>
> instance (p ~ (->), Functor f) => AsNatural p f Foo where
>   natural = fooNat :: Lens' Foo Natural
>
> -- Natural is a Natural:
>
> instance AsNatural p f Natural where
>   natural = id
>
> -- An Integer might be a natural:
>
> instance (Choice p, Applicative f) => AsNatural p f Integer where
>   natural = prism' toInteger (n -> if n >= 0 then (Just . fromInteger) n else Nothing)

Now we can work with a Foo, a Natural, or an Integer as a Natural by using the single optical natural:

> main3 :: IO ()
> main3 = do
>   -- Underlying thing is a Lens:
>   print $ (Foo "name" 34) ^. natural
>   print $ (Foo "name" 34) ^. natural + 1
>   print $ (42 :: Natural) ^. natural + 1
>
>   -- Underlying thing is a Prism (hence the applicative form):
>   print $ (+1)  ((50 :: Integer)  ^? natural)
>   print $ (+1)  ((-99 :: Integer) ^? natural)

Output:

*LensHas> main3
34
35
43
Just 51
Nothing

Credit

The AsNatural type is a simplified version of the “As…” typeclasses in the coordinate package, e.g. AsMinutes. Thanks to Tony Morris on #haskell.au for helping with my changing-API question and pointing out the “As…” typeclasses. Also see the IRC logs in coordinate/etc where Ed Kmett explains some things about Optic.

Update Github fork

Situation:

Repo on Github that I want to contribute to: https://github.com/whoever/whatever.git

My fork of that repo on github: git@github.com:myself/whatever.git

Over time the upstream repo is updated but my fork on github does not automatically get those updates. To push the latest changes from https://github.com/whoever/whatever.git to git@github.com:myself/whatever.git follows these steps.

Clone our repository:

git clone git@github.com:myself/whatever.git
cd whatever

Add the upstream repo as a remote, and call it upstream:

git remote add upstream https://github.com/whoever/whatever.git

Get all the branches of the upstream repo into remote-tracking branches. These branches will be named upstream/master, upstream/some-feature, etc.

git fetch upstream

Now replay all of the upstream’s commits from the master branch against ours:

git checkout master
git rebase upstream/master
git push

Credit: http://stackoverflow.com/a/7244456

DCMTK: No presentation context for: (unknown SOP class) 1.3.12.2.1107.5.9.1

I had a DCMTK DICOM server running with the command

storescp -p 7000 -v --fork -fe '.IMA' --sort-on-study-uid 'my_prefix'

One of our Siemens instruments couldn’t push some files to the DCMTK server, so I tried to send the files manually using storescu:

parallel -j 30 storescu localhost 7000 -- `find siemens_data/ -type f`

While the Siemens instrument gave nothing of note, I found that storescu failed with:

E: No presentation context for: (unknown SOP class) 1.3.12.2.1107.5.9.1
E: Store SCU Failed: 0006:0208 DIMSE No valid Presentation Context ID

Apparently 1.3.12.2.1107.5.9.1 is a “private” Siemens SOP class. To fix things in my situation I tried adding the -pm flag so that storescp would accept unknown SOP classes:

storescp -p 7000 -v --fork -pm -fe '.IMA' --sort-on-study-uid 'my_prefix'

This fixed things so that the Siemens instrument could send data, but storescu still failed. It turned out, after reading this forum post, that storescu needs to know about the new SOP class.

I copied /etc/dcmtk/storescu.cfg and edited out the 125th presentation context definition and added the Siemens UID:

# PresentationContext125 = XRayRadiationDoseSRStorageUncompressed
PresentationContext125 = 1.3.12.2.1107.5.9.1Uncompressed

There are already 128 presentation contexts in the file and you can’t have more than that (some limitation in DICOM?). Now
pushing files using storescu works, if we refer to the config file and the Default profile entry:

parallel -j 30 storescu -xf storescu.cfg Default localhost 7000 -- `find extract/ -type f`

The forum post that I linked to earlier has an example config file with a [PrivateSiemens] section in the [[Profiles]]. I tried this but (as far as I understand) you have to copy all of the SOP classes that you might see, and run storescu referring to the PrivateSiemens profile name. So you may as well edit out an unused presentation context in the Default configuration.

Archived Comments

Date: 2015-07-23 18:29:35.372382 UTC

Author: J. Riesmeier

You could also give the DCMTK command line tool “dcmsend” a try: http://blog.jriesmeier.com/2011/10/sending-dicom-files-more-easily/.

Date: 2016-11-01 00:30:26.567742 UTC

Author: Tom C

Thanks Carlo, this solved my problem!

Date: 2017-03-01 04:58:52.305799 UTC

Author: Ashok B

Thanks Carlo.

It worked for me 🙂