Using Hypothesis to solve the farmer-chicken-fox puzzle

The farmer-chicken-fox puzzle goes something like this: a farmer is at a shop, having bought a chicken, fox, and a bag of corn. The farmer would like to get to her house on the other side of the river using a small boat. For some reason she can take at most one item at a time. If the chicken is left alone with the corn, it will eat the corn. If the fox is left alone with the chicken, it will eat the chicken. How does the farmer get across the river without losing any item?

A straightforward way to solve this is to use a breadth-first search, enumerating all valid moves from a set of states. The initial state would be

shop: {farmer, chicken, fox, corn}
house: {}

One valid first move is to go to the other side with the chicken, giving the new state:

shop: {fox, corn}
house: {farmer, chicken}

Another way is to use the Hypothesis library to simulate a finite state machine, and assert that no sequence of rules leads to the state where everyone is on the house side of the river. This is an adaptation of the Die Hard water jugs problem.

We model the state using two sets of strings:

class FarmerChickenFox(RuleBasedStateMachine):

    def __init__(self):
        self.shop  = set(['farmer', 'chicken', 'fox', 'corn'])
        self.house = set([])
        super().__init__()

Here’s the rule to take the chicken. We work out which side the farmer is in, and then move the farmer and the chicken to the other side. If this transition results in an invalid state (e.g. the fox left alone with the chicken) we undo the state-change.

    @rule()
    def take_chicken(self):
        self.save_state()

        if 'farmer' in self.shop and 'chicken' in self.shop:
            self.shop.remove('farmer')
            self.shop.remove('chicken')
            self.house.add('farmer')
            self.house.add('chicken')
            if not self.state_ok(): self.undo_state()
        elif 'farmer' in self.house and 'chicken' in self.house:
            self.house.remove('farmer')
            self.house.remove('chicken')
            self.shop.add('farmer')
            self.shop.add('chicken')
            if not self.state_ok(): self.undo_state()

We have two invariants to ensure that the state of the system is consistent:

    @invariant()
    def fox_not_with_chicken(self):
        return not self.is_fox_alone_with_chicken()

    @invariant()
    def chicken_not_with_corn(self):
        return not self.is_chicken_alone_with_corn()

The trick is to create an invariant to check if everyone is on the house side:

    @invariant()
    def not_solved(self):
        note("::: shop: {s}, house: {h}".format(s=self.shop, h=self.house))
        assert len(self.house) != 4

Running using pytest finds the solution, including the steps used:

$ pytest farmerchickenfox.py

(snipped)

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = FarmerChickenFox({})

    @invariant()
    def not_solved(self):
        note("::: shop: {s}, house: {h}".format(s=self.shop, h=self.house))
>       assert len(self.house) != 4
E       AssertionError: assert 4 != 4
E        +  where 4 = len({'chicken', 'corn', 'farmer', 'fox'})
E        +    where {'chicken', 'corn', 'farmer', 'fox'} = FarmerChickenFox({}).house

farmerchickenfox.py:117: AssertionError
-------------------------------------------------------------------------------------------- Hypothesis --------------------------------------------------------------------------------------------
Falsifying example: run_state_machine(factory=FarmerChickenFox, data=data(...))
state = FarmerChickenFox()
::: shop: {'fox', 'chicken', 'farmer', 'corn'}, house: set()
state.take_chicken()
::: shop: {'fox', 'corn'}, house: {'chicken', 'farmer'}
state.go_alone()
::: shop: {'fox', 'farmer', 'corn'}, house: {'chicken'}
state.take_corn()
::: shop: {'fox'}, house: {'corn', 'chicken', 'farmer'}
state.take_chicken()
::: shop: {'fox', 'chicken', 'farmer'}, house: {'corn'}
state.take_fox()
::: shop: {'chicken'}, house: {'corn', 'fox', 'farmer'}
state.go_alone()
::: shop: {'chicken', 'farmer'}, house: {'corn', 'fox'}
state.take_chicken()
::: shop: set(), house: {'corn', 'fox', 'farmer', 'chicken'}
state.teardown()

You can reproduce this example by temporarily adding @reproduce_failure('4.1.0', b'AAEBAQABAgEBAQMBAAEB') as a decorator on your test case
===================================================================================== 1 failed in 0.46 seconds =====================================================================================

Full source: farmerchickenfox.py.

An expression evaluator in CSV

Many business problems boil down to reading data from somewhere, transforming it, and writing it somewhere else.

We could implement the transformations in code, but non-technical users might like to see and edit the rules without having to deploy a new build.

Here’s an example rule:

  1. Read the value at http://example.com/foo/bar/x, refer to it as val.
  2. Return 10*(1.0/val).

Or, a more complicated rule:

  1. Read the value at http://example.com/foo/bar/x, refer to it as val_x.
  2. Read the value at http://example.com/foo/bar/y, refer to it as val_y.
  3. Return the average of 1.0/val_x and 1.0/val_y.

The question is how to flatten this out into a format suitable for a CSV file, to allow users to easily update the rules using Excel.

One approach would be to implement an expression DSL, but this gets a bit painful when the input space is cells in a spreadsheet or CSV file. There are also questions about encoding the order of evaluation.

Reverse Polish notation is a simple way to encode arbitrary mathematical formulas in a flat sequence. Here’s how to write the first example:

CONSTANT 1
GET_DATA /foo/bar/x
DIV
CONSTANT 10
MUL

This sequence of operations will do the following:

  1. Push 1 onto the stack.
  2. Get data from the URL, push onto the stack.
  3. Perform the divide operation (1.0 divided by the value we got from the URL), and push that onto the stack.
  4. Push 10 onto the stack.
  5. Multiply the result from step 3 by 10.

We might use the following format in a CSV file. The ORDER column ensures the correct sequencing of operations.

IDOUTPUT_IDORDEROPCONSTDATA_SOURCE
KEY0001RULE0010CONSTANT1 
KEY0002RULE0011GET_DATA /foo/bar/x
KEY0003RULE0012DIV  
KEY0004RULE0013CONSTANT10 
KEY0005RULE0014MUL  

And here is how to encode the second example:

IDOUTPUT_IDORDEROPCONSTDATA_SOURCE
KEY0200RULE0020CONSTANT1 
KEY0201RULE0021GET_DATA /foo/bar/x
KEY0202RULE0022DIV  
KEY0203RULE0023CONSTANT1 
KEY0204RULE0024GET_DATA /foo/bar/y
KEY0205RULE0025DIV  
KEY0206RULE0026PLUS  
KEY0207RULE0027CONSTANT2 
KEY0208RULE0028DIV  

Evaluating a sequence of operations is straightforward. Start with an empty stack (in Python, this is just a normal list). If the next operation is CONSTANT or GET_DATA, push the value onto the stack. Otherwise, an operation like PLUS will need two operands, so pop two things off the stack and then do that actual operation. As a bonus, we can render a normal mathematical expression as we go: instead of putting a floating point number onto the stack, put a string onto the stack.

Here is the entire evaluator:

def eval_rule(rule):
    s = []

    expr = []

    for (_, x) in rule.sort_values('ORDER').iterrows():
        op = x['OP']

        if op == 'GET_DATA':
            s.append(x['GET_DATA'])
            expr.append('(GET_DATA: ' + str(x['DATA_SOURCE']) + ')')

        elif op == 'CONSTANT':
            s.append(x['CONST'])
            expr.append(str(x['CONST']))

        elif op == 'MUL':
            b = s.pop()
            a = s.pop()
            s.append(a*b)

            b2 = expr.pop()
            a2 = expr.pop()
            expr.append('(' + a2 + '*' + b2 + ')')

        elif op == 'PLUS':
            b = s.pop()
            a = s.pop()
            s.append(a+b)

            b2 = expr.pop()
            a2 = expr.pop()
            expr.append('(' + a2 + '+' + b2 + ')')

        elif op == 'MINUS':
            b = s.pop()
            a = s.pop()
            s.append(a-b)

            b2 = expr.pop()
            a2 = expr.pop()
            expr.append('(' + a2 + '-' + b2 + ')')

        elif op == 'DIV':
            denominator = s.pop()
            numerator   = s.pop()
            s.append(numerator/denominator)

            denominator2 = expr.pop()
            numerator2   = expr.pop()
            expr.append('(' + numerator2 + '/' + denominator2 + ')')
        else:
            raise ValueError('Unknown operator: ' + op)

    if len(s) != 1:
        raise ValueError('Expected one item on the evaluation stack, but found: ' + str(s))

    if len(expr) != 1:
        raise ValueError('Expected one item on the expression stack, but found: ' + str(expr))

    return s[0], expr[0]

Just for fun, we will also evaluate the example from the Wikipedia page on Reverse Polish notation:

IDOUTPUT_IDORDEROPCONSTDATA_SOURCE
KEY0101RULE0030CONSTANT15 
KEY0102RULE0031CONSTANT7 
KEY0103RULE0032CONSTANT1 
KEY0104RULE0033CONSTANT1 
KEY0105RULE0034PLUS  
KEY0106RULE0035MINUS  
KEY0107RULE0036DIV  
KEY0108RULE0037CONSTANT3 
KEY0109RULE0038MUL  
KEY0110RULE0039CONSTANT2 
KEY0111RULE00310CONSTANT1 
KEY0112RULE00311CONSTANT1 
KEY0113RULE00312PLUS  
KEY0114RULE00313PLUS  
KEY0153RULE00314MINUS  

Here’s the output:

$ python3 rpn.py
RULE001
((1.0/(GET_DATA: /foo/bar/x))*10.0)
0.23809523809523808

RULE002
(((1.0/(GET_DATA: /foo/bar/x))+(1.0/(GET_DATA: /foo/bar/y)))/2.0)
0.02857142857142857

RULE003
(((15.0/(7.0-(1.0+1.0)))*3.0)-(2.0+(1.0+1.0)))
5.0

 

Reverse Polish Notation gives us a compact way to represent a sequence of operators with no ambiguity about the order of evaluation.