Defunctionalization

Why are programs written in Java so verbose? The answer seems to be that Java does not support function pointers or functions as first class values.

For example, solving an ODE in Python proceeds as follows. You define a function for the derivative, and then pass that to the solver along with some initial conditions and other parameters. Easy enough.

def yDot(y, c=[1.0, 1.0], omega=0.1):
    return [omega * (c[1] - y[1]), omega * (y[0] - c[0])]

finalValue = ode_solver(yDot, y0=[0.0, 1.0], t_final=16.0)

The equivalent in Java (example taken from the Apache commons maths library) requires an entire class to be written, implementing the oddly named interface FirstOrderDifferentialEquations.

private static class CircleODE implements FirstOrderDifferentialEquations {

    private double[] c;
    private double omega;

    public CircleODE(double[] c, double omega) {
        this.c     = c;
        this.omega = omega;
    }

    public int getDimension() {
        return 2;
    }

    public void computeDerivatives(double t, double[] y, double[] yDot) {
        yDot[0] = omega * (c[1] - y[1]);
        yDot[1] = omega * (y[0] - c[0]);
    }

}

// then in some other class...

FirstOrderIntegrator dp853 = new DormandPrince853Integrator(1.0e-8, 100.0, 1.0e-10, 1.0e-10);
FirstOrderDifferentialEquations ode = new CircleODE(new double[] { 1.0, 1.0 }, 0.1);
double[] y = new double[] { 0.0, 1.0 }; // initial state
dp853.integrate(ode, 0.0, y, 16.0, y); // now y contains final state at time t=16.0

This is a lot of work and obfuscates the problem at hand, the definition of a derivative and a call to a solver. Java seems to be unique in its pedantic style, at least among common languages in use today. In C one can use function pointers, Fortran can pass functions, Python has first-class functions, Haskell has higher-order functions, and so on.

It turns out that Java programmers are forced to do defunctionalization since Java does not support higher order functions. Here’s a quote from a blog post on plover.net:

Defunctionalization is a program transformation that removes the higher-order functions from a program. The idea is that you replace something like λx.x+y with a data structure that encapsulates a value of y somewhere, say (HOLD y). And instead of using the language’s built-in function application to apply this object directly to an argument x, you write a synthetic applicator that takes (HOLD y) and x and returns x + y. And anyone who wanted to apply λx.x+y to some argument x in some context in which y was bound should first construct (HOLD y), then use the synthetic applicator on (HOLD y) and x.

In Haskell we might implement yDot as follows:

> module Defun where
>
> yDot :: [Double] -> Double -> [Double] -> [Double]
> yDot c omega y = [omega * (c !! 1 - y !! 1), omega * (y !! 0 - c !! 0)]

The parameters c and omega are the slowest varying, so we put them before y. Since all functions in Haskell are curried, we can conveniently produce the function that we need by partially applying the c and omega values:

*Defun> :t yDot [1.0, 1.0] 0.1
yDot [1.0, 1.0] 0.1 :: [Double] -> [Double]

In this way yDot is a higher order function. To make it first-order we have to defunctionalize it. Following the example on plover.net we define a data structure to hole the c and omega values:

> data Hold = MkHold [Double] Double

And we need a function to “apply” this value to get the actual yDot value.

> fakeApply :: Hold -> [Double] -> [Double]
> fakeApply (MkHold c omega) y = [omega * (c !! 1 - y !! 1), omega * (y !! 0 - c !! 0)]

Basically Hold and fakeApply are equivalent to the CircleODE class above.

Example:

> hold :: Hold
> hold = MkHold [1.0, 1.0] 0.1
>
> result :: [Double]
> result = fakeApply hold [1.0, 1.0]

Defunctionalization appears to be the cause of the excessive use of nouns in Java code, resulting in things like the Abstract Singleton Proxy Factory Bean, or the Abstract Factory design pattern.

Further reading:

Literate Haskell source for this post is here: https://github.com/carlohamalainen/playground/tree/master/haskell/java-defun.

FunctionalDependencies

Suppose we have two datatypes, OptBool and OptFile for storing boolean and file path options. Perhaps this might be for a program that provides an interface to legacy command line applications.

> {-# LANGUAGE MultiParamTypeClasses  #-}
> {-# LANGUAGE TypeSynonymInstances   #-}
> {-# LANGUAGE FlexibleInstances      #-}
> {-# LANGUAGE FunctionalDependencies #-}
>
> module Fundep where
> data OptBool = OptBool { optBoolDesc   :: String
>                        , optBoolValue  :: Bool
>                        } deriving Show
> data OptFile = OptFile { optFileDesc :: String
>                        , optFileValue :: FilePath
>                        } deriving Show

We’d like to be able to set the value of an option without having to specify the record name, so instead of

> opt { optBoolValue = True }

we want to write

> setValue opt True

As a first attempt we make a type class Option:, where we have enabled MultiParamTypeClasses because the type signature for setValue has to refer to the option, of type a, and the value of type b. We also enable TypeSynonymInstances and FlexibleInstances since FilePath is a type synonym.

> class Option a b where
>     setDesc   :: a -> String -> a
>     setValue  :: a -> b -> a

Instance declarations:

> instance Option OptBool Bool where
>     setDesc opt d  = opt { optBoolDesc  = d }
>     setValue opt b = opt { optBoolValue = b }
>
> instance Option OptFile FilePath where
>     setDesc opt d  = opt { optFileDesc  = d }
>     setValue opt f = opt { optFileValue = f }

All seems well but the following code doesn’t compile:

> opt1' = setDesc (OptBool "bool" True) "boolean option"

with the error message

    No instance for (Option OptBool b1) arising from a use of `setDesc'
    The type variable `b1' is ambiguous
    Possible fix: add a type signature that fixes these type variable(s)
    Note: there is a potential instance available:
      instance Option OptBool Bool -- Defined at Fundeps.lhs:40:12
    Possible fix: add an instance declaration for (Option OptBool b1)
    In the expression: setDesc (OptBool "bool" True) "boolean option"
    In an equation for opt1':
        opt1' = setDesc (OptBool "bool" True) "boolean option"

The problem is that both a and b in the class declaration are free variables, but really this is not the case. The trick is to enable the FunctionalDependencies language extension, and then specify that the type a in the class declaration for Option implies the type b. This makes sense if you think about the type of setValue. Once we know the type of the first parameter, we then know the type of the value field (assuming that the instance declaraion uses OptBoolValue or optFileValue or whatever).

> class Option a b | a -> b where
>     setDesc   :: a -> String -> a
>     setValue  :: a -> b -> a

Now this is ok:

> opt1' :: OptBool
> opt1' = setDesc (OptBool "bool" True) "boolean option"

As a final note, writing the implication b -> a as below

> class Option a b | b -> a where
>     setDesc   :: a -> String -> a
>     setValue  :: a -> b -> a

restricts us unnecessarily. If we had another type with a boolean value field,

> data OptBool' = OptBool' { optBoolDesc'  :: String
>                          , optBoolValue' :: Bool
>                          } deriving Show
> instance Option OptBool' Bool where
>     setDesc opt d  = opt { optBoolDesc'  = d }
>     setValue opt b = opt { optBoolValue' = b }

then this code would not compile

> opt1'' :: OptBool'
> opt1'' = setDesc (OptBool' "bool" True) "boolean option"

due to

    Functional dependencies conflict between instance declarations:
      instance Option OptBool Bool -- Defined at Fundeps.lhs:41:12
      instance Option OptBool' Bool -- Defined at Fundeps.lhs:91:12

In contrast the implication a -> b means that, for example, the type OptBool implies the type Bool.

Literate Haskell source for this blog post is available here: https://github.com/carlohamalainen/playground/tree/master/haskell/fundeps.