Using Hypothesis to solve the farmer-chicken-fox puzzle

The farmer-chicken-fox puzzle goes something like this: a farmer is at a shop, having bought a chicken, fox, and a bag of corn. The farmer would like to get to her house on the other side of the river using a small boat. For some reason she can take at most one item at a time. If the chicken is left alone with the corn, it will eat the corn. If the fox is left alone with the chicken, it will eat the chicken. How does the farmer get across the river without losing any item?

A straightforward way to solve this is to use a breadth-first search, enumerating all valid moves from a set of states. The initial state would be

shop: {farmer, chicken, fox, corn}
house: {}

One valid first move is to go to the other side with the chicken, giving the new state:

shop: {fox, corn}
house: {farmer, chicken}

Another way is to use the Hypothesis library to simulate a finite state machine, and assert that no sequence of rules leads to the state where everyone is on the house side of the river. This is an adaptation of the Die Hard water jugs problem.

We model the state using two sets of strings:

class FarmerChickenFox(RuleBasedStateMachine):

    def __init__(self):
        self.shop  = set(['farmer', 'chicken', 'fox', 'corn'])
        self.house = set([])
        super().__init__()

Here’s the rule to take the chicken. We work out which side the farmer is in, and then move the farmer and the chicken to the other side. If this transition results in an invalid state (e.g. the fox left alone with the chicken) we undo the state-change.

    @rule()
    def take_chicken(self):
        self.save_state()

        if 'farmer' in self.shop and 'chicken' in self.shop:
            self.shop.remove('farmer')
            self.shop.remove('chicken')
            self.house.add('farmer')
            self.house.add('chicken')
            if not self.state_ok(): self.undo_state()
        elif 'farmer' in self.house and 'chicken' in self.house:
            self.house.remove('farmer')
            self.house.remove('chicken')
            self.shop.add('farmer')
            self.shop.add('chicken')
            if not self.state_ok(): self.undo_state()

We have two invariants to ensure that the state of the system is consistent:

    @invariant()
    def fox_not_with_chicken(self):
        return not self.is_fox_alone_with_chicken()

    @invariant()
    def chicken_not_with_corn(self):
        return not self.is_chicken_alone_with_corn()

The trick is to create an invariant to check if everyone is on the house side:

    @invariant()
    def not_solved(self):
        note("::: shop: {s}, house: {h}".format(s=self.shop, h=self.house))
        assert len(self.house) != 4

Running using pytest finds the solution, including the steps used:

$ pytest farmerchickenfox.py

(snipped)

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = FarmerChickenFox({})

    @invariant()
    def not_solved(self):
        note("::: shop: {s}, house: {h}".format(s=self.shop, h=self.house))
>       assert len(self.house) != 4
E       AssertionError: assert 4 != 4
E        +  where 4 = len({'chicken', 'corn', 'farmer', 'fox'})
E        +    where {'chicken', 'corn', 'farmer', 'fox'} = FarmerChickenFox({}).house

farmerchickenfox.py:117: AssertionError
-------------------------------------------------------------------------------------------- Hypothesis --------------------------------------------------------------------------------------------
Falsifying example: run_state_machine(factory=FarmerChickenFox, data=data(...))
state = FarmerChickenFox()
::: shop: {'fox', 'chicken', 'farmer', 'corn'}, house: set()
state.take_chicken()
::: shop: {'fox', 'corn'}, house: {'chicken', 'farmer'}
state.go_alone()
::: shop: {'fox', 'farmer', 'corn'}, house: {'chicken'}
state.take_corn()
::: shop: {'fox'}, house: {'corn', 'chicken', 'farmer'}
state.take_chicken()
::: shop: {'fox', 'chicken', 'farmer'}, house: {'corn'}
state.take_fox()
::: shop: {'chicken'}, house: {'corn', 'fox', 'farmer'}
state.go_alone()
::: shop: {'chicken', 'farmer'}, house: {'corn', 'fox'}
state.take_chicken()
::: shop: set(), house: {'corn', 'fox', 'farmer', 'chicken'}
state.teardown()

You can reproduce this example by temporarily adding @reproduce_failure('4.1.0', b'AAEBAQABAgEBAQMBAAEB') as a decorator on your test case
===================================================================================== 1 failed in 0.46 seconds =====================================================================================

Full source: farmerchickenfox.py.

An expression evaluator in CSV

Many business problems boil down to reading data from somewhere, transforming it, and writing it somewhere else.

We could implement the transformations in code, but non-technical users might like to see and edit the rules without having to deploy a new build.

Here’s an example rule:

  1. Read the value at http://example.com/foo/bar/x, refer to it as val.
  2. Return 10*(1.0/val).

Or, a more complicated rule:

  1. Read the value at http://example.com/foo/bar/x, refer to it as val_x.
  2. Read the value at http://example.com/foo/bar/y, refer to it as val_y.
  3. Return the average of 1.0/val_x and 1.0/val_y.

The question is how to flatten this out into a format suitable for a CSV file, to allow users to easily update the rules using Excel.

One approach would be to implement an expression DSL, but this gets a bit painful when the input space is cells in a spreadsheet or CSV file. There are also questions about encoding the order of evaluation.

Reverse Polish notation is a simple way to encode arbitrary mathematical formulas in a flat sequence. Here’s how to write the first example:

CONSTANT 1
GET_DATA /foo/bar/x
DIV
CONSTANT 10
MUL

This sequence of operations will do the following:

  1. Push 1 onto the stack.
  2. Get data from the URL, push onto the stack.
  3. Perform the divide operation (1.0 divided by the value we got from the URL), and push that onto the stack.
  4. Push 10 onto the stack.
  5. Multiply the result from step 3 by 10.

We might use the following format in a CSV file. The ORDER column ensures the correct sequencing of operations.

 

 

 

 

 

ID OUTPUT_ID ORDER OP CONST DATA_SOURCE
KEY0001 RULE001 0 CONSTANT 1  
KEY0002 RULE001 1 GET_DATA   /foo/bar/x
KEY0003 RULE001 2 DIV    
KEY0004 RULE001 3 CONSTANT 10  
KEY0005 RULE001 4 MUL    

And here is how to encode the second example:

 

 

 

 

 

ID OUTPUT_ID ORDER OP CONST DATA_SOURCE
KEY0200 RULE002 0 CONSTANT 1  
KEY0201 RULE002 1 GET_DATA   /foo/bar/x
KEY0202 RULE002 2 DIV    
KEY0203 RULE002 3 CONSTANT 1  
KEY0204 RULE002 4 GET_DATA   /foo/bar/y
KEY0205 RULE002 5 DIV    
KEY0206 RULE002 6 PLUS    
KEY0207 RULE002 7 CONSTANT 2  
KEY0208 RULE002 8 DIV    

Evaluating a sequence of operations is straightforward. Start with an empty stack (in Python, this is just a normal list). If the next operation is CONSTANT or GET_DATA, push the value onto the stack. Otherwise, an operation like PLUS will need two operands, so pop two things off the stack and then do that actual operation. As a bonus, we can render a normal mathematical expression as we go: instead of putting a floating point number onto the stack, put a string onto the stack.

Here is the entire evaluator:

def eval_rule(rule):
    s = []

    expr = []

    for (_, x) in rule.sort_values('ORDER').iterrows():
        op = x['OP']

        if op == 'GET_DATA':
            s.append(x['GET_DATA'])
            expr.append('(GET_DATA: ' + str(x['DATA_SOURCE']) + ')')

        elif op == 'CONSTANT':
            s.append(x['CONST'])
            expr.append(str(x['CONST']))

        elif op == 'MUL':
            b = s.pop()
            a = s.pop()
            s.append(a*b)

            b2 = expr.pop()
            a2 = expr.pop()
            expr.append('(' + a2 + '*' + b2 + ')')

        elif op == 'PLUS':
            b = s.pop()
            a = s.pop()
            s.append(a+b)

            b2 = expr.pop()
            a2 = expr.pop()
            expr.append('(' + a2 + '+' + b2 + ')')

        elif op == 'MINUS':
            b = s.pop()
            a = s.pop()
            s.append(a-b)

            b2 = expr.pop()
            a2 = expr.pop()
            expr.append('(' + a2 + '-' + b2 + ')')

        elif op == 'DIV':
            denominator = s.pop()
            numerator   = s.pop()
            s.append(numerator/denominator)

            denominator2 = expr.pop()
            numerator2   = expr.pop()
            expr.append('(' + numerator2 + '/' + denominator2 + ')')
        else:
            raise ValueError('Unknown operator: ' + op)

    if len(s) != 1:
        raise ValueError('Expected one item on the evaluation stack, but found: ' + str(s))

    if len(expr) != 1:
        raise ValueError('Expected one item on the expression stack, but found: ' + str(expr))

    return s[0], expr[0]

Just for fun, we will also evaluate the example from the Wikipedia page on Reverse Polish notation:

 

 

 

 

 

ID OUTPUT_ID ORDER OP CONST DATA_SOURCE
KEY0101 RULE003 0 CONSTANT 15  
KEY0102 RULE003 1 CONSTANT 7  
KEY0103 RULE003 2 CONSTANT 1  
KEY0104 RULE003 3 CONSTANT 1  
KEY0105 RULE003 4 PLUS    
KEY0106 RULE003 5 MINUS    
KEY0107 RULE003 6 DIV    
KEY0108 RULE003 7 CONSTANT 3  
KEY0109 RULE003 8 MUL    
KEY0110 RULE003 9 CONSTANT 2  
KEY0111 RULE003 10 CONSTANT 1  
KEY0112 RULE003 11 CONSTANT 1  
KEY0113 RULE003 12 PLUS    
KEY0114 RULE003 13 PLUS    
KEY0153 RULE003 14 MINUS    

Here’s the output:

$ python3 rpn.py
RULE001
((1.0/(GET_DATA: /foo/bar/x))*10.0)
0.23809523809523808

RULE002
(((1.0/(GET_DATA: /foo/bar/x))+(1.0/(GET_DATA: /foo/bar/y)))/2.0)
0.02857142857142857

RULE003
(((15.0/(7.0-(1.0+1.0)))*3.0)-(2.0+(1.0+1.0)))
5.0

 

Reverse Polish Notation gives us a compact way to represent a sequence of operators with no ambiguity about the order of evaluation.