 # Free Probabilities

As the Monty Python crew would say: "Now for something completely different!"

TL;DR: I'm going to turn a `Monad` of probabilities into a `Free Monad` of probabilities and this is not nearly so scary as it sounds. Also, it's actually useful!

I've been using a bit more Haskell recently, and after watching a demo from a colleague I wanted to solve a problem that's been bouncing around my brain for a while.

I do a fair bit of both game playing and game design (in the board game and tabletop roleplaying sense of 'game'), and I'm often interested in either generating random values or investigating how likely the result of a random process is.

Let's give an example; if I model a dice roll with the "spread" of possible outcomes, it might look like this:

This basically just means that if I roll that dice, I have a 1 in 6 chance of rolling any of the 6 numbers. Now, there are rules for combining conditional probabilities: let's start adding these in.

Haskell has basically given us one for free; if something always happens, we can adjust all of the potential outcomes to incorporate the "something". This can be modeled by being able to `map` over the values in our `Spread`.

Let's add 10 to our dice roll, regardless of what's rolled:

So far, so good. Let's see if we can take this a bit further; turn `Spread` into an `Applicative`.

Here we set up two things; first, `pure` enables us to take any individual value and turn into a `Spread` of that value. From a logical point of view, the total probability of all of the items in a spread must add up to "1" (we'll enforce that with smart constructors later) so there's only one choice here. `pure` returns a `Spread` with a single item in it - probability of that outcome? Certain.

`<*>` is the operator that allows us to take a `Spread` of functions `a -> b` and a `Spread` of inputs `a` and return a `Spread b`. Hmm. How should that work?

Well, to work out the probability of an event A which is conditional on event B, you just multiply the two probabilities together. So `<*>` turns out to be reasonably straight forward: you take all possible combinations of functions and inputs, and return each output with a probability of (probability of function * probability of input).

So that's done, but doesn't look immediately useful. It does, however, allow us to escalate one more level: `Monad`.

So we're back on conditional probabilities again. We take each of our values from the input spread and apply our function to it. Then we multiply the probability of the outcome in the "child" spread with the probability of the "parent" input - and finally we concatinate the whole lot back together into a single list and wrap it up in `Spread` again.

The way the laws of probability (and, well, fraction multiplication) work, if the total probability in each of our `Spread`s is 1, the total probability across all of our outcomes after a bind will also be 1. Neat! Now we have something we can use.

Let's add 10 to every dice that rolls more than 3!

This is starting to look good.

We do have a problem though: this is a very useful representation for when we want to know every possible outcome and it's probability of happening - but that's not always desirable, or possible.

Let's take a famous example; a game where you flip a coin. If it comes up tails it pays out £1.00 - if you get heads you pay again with double the pay out. How much do you want to pay to take part in this game?

Which promptly creates an infinite last of possible outcome states. In an other use case, it can be nice to just pick an outcome from the sample space. For example, if I want to model how much damage a warrior in my Pathfinder game does with a long sword I may want to look at the probability spread… or I might just want to pick a result at random.

So I want to take this existing monadic data structure, but execute it with different execution strategies. Which meant when someone at work mentioned `Free` monads in a demo and mentioned that they capture the shape of a monad without executing it, my ears pricked up.

The theory says that we can take any `Functor` and turn it into a `Free Monad`; so turning a `Monad` into a `Free Monad` must be even easier, no?

Well, the first step is pretty straight forward. Now we can create `Prob` representations of dependent probabilities just like before!

Let's have a few functions to create `Spread`s which are both guaranteed to be "meaningful" and lifted into our new `Prob` type.

Now we can rewrite our previous example:

Success! Kind of. This looks great, and type checks, but I can't actually evaluate the result any more.

Let's see if we can deal with that. The `Control.Monad.Free` library provides a hopefully named function called `iterM`.

The full type annotation looks like this:

Ouch. Well, we have a `Monad` we want to turn things into (`Spread`). And we have the `Functor` which our `Free Spread` is created from which is… `Spread`. So let's start plugging in names:

Looking carefully, it looks like all I actually need to supply is a function `(Spread (Spread a)) -> Spread a`. Let's see if we can find an easy way to do that in Hoogle, a search engine that allows us to search for function signatures.

Searching for Monad m => (m (m a)) -> m a turns up `join` which is already part of the `Monad` type class. Abstraction for the win!

Good stuff. Now life gets really interesting. Let's add in ways of picking a single sample out of a `Prob` without evaluating the entire outcome space.

First, we need a way of picking a single outcome from a `Spread`. We'll break it down into two functions; `pickFromSpread` starts from the knowledge that the probabilities in a `Spread` always add up to 1:

It has a return type of `IO a` because picking a random value means the function is not referentially transparent.

Our second function decides whether to take pick the first sample from a list of `[(a, Probability)]` based on the Probability of the first item compared to the total of all of the Probabilities in the list. We pass in the total of the remaining probabilities as an argument each time, as the list may be infinite so we don't want to `sum` across it.

We can now go from `Spread a` to `IO a`. Let's plug that into `iterM`'s type signature again and see what we get:

This looks pretty similar to before, where we used `join` to unwrap nested `Spread` structures. If we could turn that `Spread (IO a)` into a `IO (IO a)` than we could call `join` with it - which we can, because that's exactly what `pickFromSpread` does!

Now we can call start pulling random samples out of a `Prob`:

This is very fast, and even works well on infinitely recursive definitions like our coin flip above:

This technique begins to look interesting when you realize that this technique allows you to take anything implemented as a `Functor` and supply an alternative execution method. Want to supply test values in your test instead of values from `IO`? This might just let you do that.

That's basically all for now, but I will leave you with a final example.

First, a function that makes all of this usable in practice; the `normalize` function takes a `Spread` and groups together repeats of the same outcome into a single value:

Then a model of an attack in Pathfinder (1st edition), modeling a strike with a weapon against a foe and building in things like critical hits:

With some results:

I hope you've enjoyed this brief visit to useful abstractions in Haskell; it's definitely a language where as you learn it you realise that you have great power, and great responsibility to the next maintainer!