# A Classic from David MacKay

A classic textbook in information theory is *Information Theory, Inference, and Learning Algorithms* by David MacKay. When I was first starting out, I came across this book (it is, after all, free - graduate students rejoice!) and I really enjoyed his exposition on p. 48 about "a first inference problem." If you haven't read it, I recommend it; it's great.

Anyway, I've reproduced the problem here and I'll share a Turing solution below.

The trickiness comes from the window that truncates the data. Handling this turns out to be easy in Stan, Turing, and similar tools: basically a single-line change to the most natural model.

```
using Turing
using Distributions   # Exponential, truncated, etc. (also re-exported by Turing)
using Bijectors
using Gadfly
using DataFrames, DataFramesMeta
#using Zygote
#Turing.setadbackend(:zygote)
Gadfly.set_default_plot_size(900px, 300px)
```

Here's how things look in the simpler case where there is no truncation.

$$\begin{aligned} \lambda &\sim \mathrm{Uniform}(0, N)\\ d_i &\sim \mathrm{Exponential}(\lambda)\\ \end{aligned}$$

And some data to match:

```
λ_true = 12.0
data = rand(Exponential(λ_true), 500)
plot(DataFrame(x=data),x=:x, xintercept=[20],
Geom.histogram, Geom.vline(color="red"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
```

You can see that even though $\lambda_{true} = 12 < 20$ the window (red) would still truncate quite a bit of the data in the original problem description. Here's the model written out in Turing:

```
@model function decay_estimator(decays::Array{Float64})
    λ ~ Uniform(0, 30)
    decays ~ filldist(Exponential(λ), length(decays))
end;
```

**Note:** The `filldist` function is a way to avoid looping over the `decays` array. It populates a matching array of random variables, so we have something like `Array{Float64} ~ Array{Distribution}`. In other words, it allows element-wise sampling for iid variables.
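Since `filldist` and an explicit loop define the same joint density, we can check the equivalence directly outside of any model (a minimal sketch; `filldist` is re-exported by Turing from DistributionsAD):

```julia
using Turing   # re-exports Distributions and filldist

# A product of three iid Exponential(2.0) components...
d = filldist(Exponential(2.0), 3)
x = [1.0, 2.0, 3.0]

# ...has the same log-density as summing the components one by one.
lp_fill = logpdf(d, x)
lp_loop = sum(logpdf(Exponential(2.0), xi) for xi in x)
```

The two quantities agree to floating-point precision.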
```
iterations = 2000
ϵ = 0.02
τ = 10;

λ_chain = sample(
    decay_estimator(data),
    HMC(ϵ, τ), iterations,
    progress=true, drop_warmup=true);
```

```
plot(DataFrame(λ_chain), x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
Geom.density, Geom.vline(color="green"),
Coord.cartesian(ymin=0.0), Guide.yticks(label = false))
```

Well, this model pretty much nails it. The original value of $\lambda_{true}$ was 12, and that peak is right on the money. Now let's try the same model on truncated data!

```
data_trunc = [d for d in data if d <= 20.0]
plot(DataFrame(x=data_trunc),x=:x, xintercept=[20], Theme(alphas=[0.6]),
Geom.histogram, Geom.vline(color="red"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
```
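How much of the sample does the window actually discard? For $\lambda_{true} = 12$ the expected fraction beyond the window is $e^{-20/12} \approx 19\%$, which we can check against the survival function (a quick sketch, not from the original problem):

```julia
using Distributions

λ_true = 12.0
window = 20.0

# Probability that a decay length exceeds the observation window;
# ccdf is the survival function, 1 - cdf.
p_lost = ccdf(Exponential(λ_true), window)   # exp(-20/12) ≈ 0.19
```

So roughly a fifth of the draws get cut, which matches the histogram above.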

```
λ_trunc_chain = sample(
    decay_estimator(data_trunc),
    HMC(ϵ, τ), iterations,
    progress=false, drop_warmup=true);
```

```
trunc_df = @where(DataFrame(λ_trunc_chain), :λ .< 50)
plot(trunc_df, x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
Geom.density, Geom.vline(color="green"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
```

That is most definitely not correct. So, we'll need a new model.

$$\begin{aligned} \lambda &\sim \mathrm{Uniform}(0, N)\\ d_i &\sim \mathrm{Exponential}\vert_{0}^{20}(\lambda)\\ \end{aligned}$$

**Note:** I'm not aware of any standard notation for truncated distributions (and I spent like 3 whole minutes looking), so $\mathcal{D}\vert_{a}^{b}(x)$ seemed fine.
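Spelled out, truncation just renormalizes the density by the probability mass inside the window, and that normalizer depends on $\lambda$, which is exactly the piece the untruncated model was missing (using the scale parameterization that Distributions.jl's `Exponential(λ)` uses):

$$p(d \mid \lambda) = \frac{\frac{1}{\lambda}\, e^{-d/\lambda}}{1 - e^{-20/\lambda}}, \qquad 0 \le d \le 20.$$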

```
@model function decay_estimator_trunc(decays::Array{Float64})
    λ ~ Uniform(0, 50)
    T = typeof(λ)
    for i in 1:length(decays)
        # cast the bounds to λ's type so AD's Dual numbers don't clash (see note below)
        decays[i] ~ truncated(Exponential(λ), T(0.0), T(20.0))
    end
end;
```

You might be wondering what's the story with `T`. Good question! My understanding is that there's an issue in Distributions.jl where the `truncated` function ends up mixing plain `Number` types (`0.0` and `20.0` in this case) with `Dual` types (the types that track derivative information for auto-differentiation). There's a discussion on this open issue. Anyway, by explicitly casting the `Number` values to `T` (which is some `Dual` type we peeked from $\lambda$ here), we can make it work!
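As a sanity check that `truncated` implements the renormalized density above, its log-density is just the base log-density minus the log of the probability mass inside the window (a minimal sketch with plain `Float64`s, no AD involved):

```julia
using Distributions

λ = 12.0
base = Exponential(λ)
td = truncated(base, 0.0, 20.0)

# logpdf of the truncated distribution = base logpdf minus
# the log of the probability mass inside [0, 20].
lhs = logpdf(td, 5.0)
rhs = logpdf(base, 5.0) - log(cdf(base, 20.0) - cdf(base, 0.0))
```

`lhs` and `rhs` agree to floating-point precision, confirming the normalizer $1 - e^{-20/\lambda}$ is being applied.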

```
λ_trunc_correct_chain = sample(
    decay_estimator_trunc(data_trunc),
    HMC(ϵ, τ), iterations,
    progress=true, drop_warmup=true);
plot(DataFrame(λ_trunc_correct_chain), x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
Geom.density, Geom.vline(color="green"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
```

Ahhhh! Much better!