Type-safe options in Idris

Inspiration for this blog post came from my work on volgenmodel-nipype, a neuroimaging workflow written using Nipype. The Nipype package allows one to wrap legacy command line applications using traits to model the various command line options. Here’s a snippet for mincextract taken from nipypeminc.py:

class ExtractInputSpec(StdOutCommandLineInputSpec):
    input_file = File(
                    desc='input file',
                    exists=True,
                    mandatory=True,
                    argstr='%s',
                    position=-2,)

    output_file = File(
                    desc='output file',
                    position=-1)

    _xor_write = ('write_ascii', 'write_byte',
                  'write_short', 'write_int', 'write_long',
                  'write_float', 'write_double', 'write_signed',
                  'write_unsigned',)

    write_ascii = traits.Bool(
                desc='Write out data as ascii strings (default).',
                argstr='-ascii',
                xor=_xor_write)

    write_byte = traits.Bool(
                desc='Write out data as bytes.',
                argstr='-byte',
                xor=_xor_write)

    # snipped...

Note the xor condition, which means that the user cannot set write_ascii and write_byte at the same time. So this would be ok:

prog = Extract(input_file='/tmp/foo.mnc', write_ascii=True)

but this would be rejected:

prog = Extract(input_file='/tmp/foo.mnc', write_ascii=True, write_byte=True)

A few times I made mistakes with these xor specifications which resulted in run time errors. Recently I learned that Haskell’s type system can be used to make certain errors compile time errors instead of run-time errors, and I naturally wondered if an xor-type of condition could be encoded in Haskell’s type system, and if not that, perhaps in Idris.

A basic implementation in Haskell of a command line wrapper might look like this:

> module OptionsInHaskell where
> import Control.Monad (when)
> import Data.List (intersect)

A command line option has a description, argument string, and a value. For simplicity we will only consider integer values.

> data Option = MkOption { optDesc :: String
>                        , optArgStr :: String
>                        , optValue  :: Int
>                        } deriving Show

Again, for simplicity, two options are considered equal if their arg strings match:

> instance Eq Option where
>   (MkOption _ a _) == (MkOption _ a' _) = a == a'

Here are some example options:

> opt1 :: Option
> opt1 = MkOption "Value of Foo."   "-foo"  34
>
> opt2 :: Option
> opt2 = MkOption "Do bar."         "-bar"  99
>
> opt3 :: Option
> opt3 = MkOption "Blah."           "-blah" 0

A program consists of a path to a binary, a list of options, and a list of xor conditions, which are lists of options that cannot be set simultaneously.

> data Program = Program { progPath :: FilePath
>                        , progOptions :: [Option]
>                        , progXorOptions :: [[Option]]
>                        } deriving Show

A list of options has a clash if it intersects with any of the xor-lists in more than two elements:

> clash :: [Option] -> [[Option]] -> Bool
> clash opts xors = any (x -> length (intersect opts x) >= 2) xors

We won’t bother with the full details of spawning a process, tidying up output files, capturing stdout and stderr, and so on, so this runProgram function just prints some details and checks that the options list is acceptable:

> runProgram :: Program -> IO ()
> runProgram (Program path opts xorOpts) = do
>   putStrLn $ "Pretending to run: " ++ path
>   putStrLn $ "with options: " ++ show opts
>   putStrLn $ "and xor lists: " ++ show xorOpts
>   when (clash opts xorOpts) $ error "eek, options clash :("

Here’s a program with no xor conditions; it runs ok:

> prog1 :: Program
> prog1 = Program "/usr/local/bin/foo" [opt1, opt2, opt3] []
*OptionsInHaskell> runProgram prog1
Pretending to run: /usr/local/bin/foo
with options: [ MkOption {optDesc = "Value of Foo.", optArgStr = "-foo", optValue = 34}
              , MkOption {optDesc = "Do bar.", optArgStr = "-bar", optValue = 99}
              , MkOption {optDesc = "Blah.", optArgStr = "-blah", optValue = 0}
              ]
and xor lists: []

On the other hand, this program is not valid since options 1, 2, and 3 are set, but the xor list specifies that options 1 and 2 cannot be set at the same time:

> prog2 :: Program
> prog2 = Program "/usr/local/bin/foo" [opt1, opt2, opt3] [[opt1, opt2]]
*OptionsInHaskell> runProgram prog2
Pretending to run: /usr/local/bin/foo
with options: [ MkOption {optDesc = "Value of Foo.", optArgStr = "-foo", optValue = 34}
              , MkOption {optDesc = "Do bar.", optArgStr = "-bar", optValue = 99}
              , MkOption {optDesc = "Blah.", optArgStr = "-blah", optValue = 0}
              ]
and xor lists: [ [ MkOption {optDesc = "Value of Foo.", optArgStr = "-foo", optValue = 34}
                 , MkOption {optDesc = "Do bar.", optArgStr = "-bar", optValue = 99}
                 ]
               ]
*** Exception: eek, options clash :(

I’m not sure if we can make this a compile time error in Haskell, so I’ll turn instead to Idris, where can exploit dependent types and other nice things.

First define an option data type, the equality instance, and a few examples:

module OptionsInIdris

%default total

data Option = MkOption String String Int

instance Show Option where
  show (MkOption x y z) = "Option " ++ show x ++ " " ++ show y ++ " " ++ show z

instance Eq Option where
  (MkOption _ a _) == (MkOption _ a' _) = a == a'

opt1 : Option
opt1 = MkOption "Value of Foo." "-foo" 34

opt2 : Option
opt2 = MkOption "Do bar." "-bar" 99

opt3 : Option
opt3 = MkOption "Blah." "-blah" 0

Next we need to encode in the type system the result of an option list clashing or not:

data ClashValue = Clashing | NotClashing

Checking if an option list has a clash is basically the same as in Haskell except that we return a ClashValue instead of a Bool:

notclash : List Option -> List (List Option) -> ClashValue
notclash opts xors = if (any (x => length (intersect opts x) >= 2) xors)
                           then Clashing
                           else NotClashing
   where intersect : Eq a => List a -> List a -> List a
         intersect [] _ = []
         intersect (x :: xs) ys = if x `elem` ys then x :: intersect xs ys
                                                 else intersect xs ys

Next, the tricky bit. We create a data type IsNotClashing which has only one constructor, called Ok, that produces a value IsNotClashing NotClashing. There is no way to produce the value IsNotClashing Clashing.

data IsNotClashing : ClashValue -> Type where
  Ok : IsNotClashing NotClashing

Example values, used later:

opts123 : List Option
opts123 = [opt1, opt2, opt3]

opts12 : List Option
opts12 = [opt1, opt2]

opts13 : List Option
opts13 = [opt1, opt3]

myOptions12 : List Option
myOptions12 = [opt1, opt2]

myOptions23 : List Option
myOptions23 = [opt2, opt3]

myXors23 : List (List Option)
myXors23 = [[opt2, opt3]]

The heart of the solution is the ValidOptionList data type. We take an option list and an xor list and, if a proof can be constructed for the value Ok using the expression IsNotClashing (notclash opts xors), then we produce the actual value ValidOptionList opts. Due to the definition of Ok, this condition means that notclash opts xors must evaluate to NotClashing. Hopefully this makes it clear why the data types ClashValue and IsNotClashing were needed.

data ValidOptionList : List Option -> Type where
  MkValidOptionList : (opts : List Option) -- WrappedOptionList opts
                   -> (xors : List (List Option))
                   -> {default Ok prf : IsNotClashing (notclash opts xors)}
                   -> ValidOptionList opts

Finally, the runProgram function takes a path to an executable and a valid list of options. The fact that the list of options is valid is encoded in the type system.

runProgram : String -> ValidOptionList opts -> String
runProgram binary (MkValidOptionList opts xorsHere) = "pretended to run the program with options: " ++ show opts

This program has options 1 and 2 set with the xor condition saying that options 2 and 3 cannot be set at the same time, so it type checks:

okProgram : String
okProgram = runProgram "/usr/local/prog" (MkValidOptionList myOptions12 myXors23)

On the other hand, this program with options 2 and 3 set does not type check, as expected:

notOkProgram : String
notOkProgram = runProgram "/usr/local/prog" (MkValidOptionList myOptions23 myXors23)

The first part of the error is a bit scary:

 Can't unify
         IsNotClashing NotClashing
 with
         IsNotClashing (boolElim (foldrImpl (flip (.) . flip (x => y => x || Delay (Prelude.Classes.Nat instance of Prelude.Classes.Ord, method > (Nat instance of Prelude.Classes.Ord, method compare (length (OptionsInIdris.notclash, intersect myOptions23 myXors23 myOptions23 y)) 2) (length (OptionsInIdris.notclash, intersect myOptions23 myXors23 myOptions23 y)) 2 || Delay (Nat instance of Prelude.Classes.Eq, method == (length (OptionsInIdris.notclash, intersect myOptions23 myXors23 myOptions23 y)) 2)))) id id myXors23 False) (Delay Clashing) (Delay NotClashing))

but the second part has the goods:

 Specifically:
         Can't unify
                 NotClashing
         with
                 Clashing

So there we have it. Compile-time error checking in Idris of a disjointness condition in the options for wrapping a legacy command line program.

Further reading:

Literate source for this post: OptionsInHaskell.lhs.

Idris source: OptionsInIdris.idr.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s