A Classic from David MacKay
A classic textbook in information theory is Information Theory, Inference, and Learning Algorithms by David MacKay. When I was first starting out, I came across this book (it is, after all, free; graduate students rejoice!) and I really enjoyed his exposition on p. 48 about "a first inference problem." I recommend reading it if you haven't; it's great.
Anyway, I've reproduced the problem here and I'll share a Turing solution below.
The trickiness comes from the window, which truncates the data. Handling that truncation is actually quite easy in Stan, Turing, and similar tools, requiring essentially one-line changes to the most natural model.
using Turing
using Bijectors
using Gadfly
using DataFrames, DataFramesMeta
#using Zygote
#Turing.setadbackend(:zygote)
Gadfly.set_default_plot_size(900px, 300px)
Here's how things look in the simpler case where there is no truncation.
$$\begin{aligned} \lambda &\sim \mathrm{Uniform}(0, N)\\ d_i &\sim \mathrm{Exponential}(\lambda)\\ \end{aligned}$$

And some data to match:
λ_true = 12.0
data = rand(Exponential(λ_true), 500)
plot(DataFrame(x=data),x=:x, xintercept=[20],
Geom.histogram, Geom.vline(color="red"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
You can see that even though $\lambda_{true} = 12 < 20$, the window (red) would still truncate quite a bit of the data in the original problem's setup. Here's the model written out in Turing:
@model function decay_estimator(decays::Array{Float64})
    λ ~ Uniform(0, 30)
    decays ~ filldist(Exponential(λ), length(decays))
end;
The `filldist` function is a way to avoid looping over the `decays` array. It populates a matching array of random variables, so we have something like `Array{Float64} ~ Array{Distribution}`. In other words, it allows something like element-wise sampling for iid variables.
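To see what that buys us, here's a small check outside the model (my own snippet, not from MacKay or the original post): the log-density of the `filldist` product matches the sum of the per-element log-densities.

d = filldist(Exponential(12.0), 3)  # three iid Exponentials bundled as one distribution
xs = rand(d)                        # draws a length-3 vector
logpdf(d, xs) ≈ sum(logpdf.(Exponential(12.0), xs))  # true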
iterations = 2000
ϵ = 0.02  # leapfrog step size for HMC
τ = 10;   # number of leapfrog steps per iteration
λ_chain = sample(
decay_estimator(data),
HMC(ϵ, τ), iterations,
progress=true, drop_warmup=true);
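Before plotting, a quick numeric sanity check (my addition, reusing the same `DataFrame` conversion as the plots):

mean(DataFrame(λ_chain).λ)  # should land near λ_true = 12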
plot(DataFrame(λ_chain), x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
Geom.density, Geom.vline(color="green"),
Coord.cartesian(ymin=0.0), Guide.yticks(label = false))
Well, this model pretty much nails it. The original value of $\lambda_{true}$ was 12, and that peak is right on the money. Now let's try the same model with truncated data...
data_trunc = [d for d in data if d <= 20.0]
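Since $P(d > 20) = e^{-20/\lambda_{true}} = e^{-20/12} \approx 0.19$ for an exponential with scale 12, the filter should keep roughly 81% of the draws. A quick check (my addition):

length(data_trunc) / length(data)  # ≈ 1 - exp(-20/12) ≈ 0.81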
plot(DataFrame(x=data_trunc),x=:x, xintercept=[20], Theme(alphas=[0.6]),
Geom.histogram, Geom.vline(color="red"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
λ_trunc_chain = sample(
decay_estimator(data_trunc),
HMC(ϵ, τ), iterations,
progress=false, drop_warmup=true);
trunc_df = @where(DataFrame(λ_trunc_chain), :λ .< 50)
plot(trunc_df, x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
Geom.density, Geom.vline(color="green"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
That is most definitely not correct. So, we'll need a new model.
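Why was the fit biased? Every retained observation was guaranteed to land inside the window, so the likelihood of each decay has to be renormalized by the probability of landing there. For Julia's scale-parameterized $\mathrm{Exponential}(\lambda)$ that probability is $1 - e^{-20/\lambda}$, giving the truncated density

$$p(d \mid \lambda,\, 0 \le d \le 20) = \frac{\frac{1}{\lambda} e^{-d/\lambda}}{1 - e^{-20/\lambda}}, \qquad 0 \le d \le 20.$$

This is exactly the correction that `truncated` applies for us. In model notation: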
$$\begin{aligned} \lambda &\sim \mathrm{Uniform}(0, N)\\ d_i &\sim \mathrm{Exponential}\vert_{0}^{20}(\lambda)\\ \end{aligned}$$

@model function decay_estimator_trunc(decays::Array{Float64})
    λ ~ Uniform(0, 50)
    T = typeof(λ)
    for i in 1:length(decays)
        decays[i] ~ truncated(Exponential(λ), T(0.0), T(20.0)) #wtf
    end
end;
You might be wondering what's the story with `T`. Good question! My understanding is that there's an issue in Distributions.jl where the `truncated` function ends up mixing plain `Number` types (`0.0` and `20.0` in this case) with `Dual` types (the types that track derivative information for automatic differentiation). There's a discussion on this open issue. Anyway, by explicitly casting the `Number` values to `T` (which is some `Dual` type we peeked from $\lambda$ here), we can make it work!
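You can also verify that `truncated` performs exactly the renormalization from the density above; here's a check I added, run outside the model with a plain `Float64` value for $\lambda$:

d_plain = Exponential(12.0)
d_trunc = truncated(d_plain, 0.0, 20.0)
# the truncated logpdf is the plain logpdf minus log P(d ≤ 20)
logpdf(d_trunc, 5.0) ≈ logpdf(d_plain, 5.0) - log(cdf(d_plain, 20.0))  # true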
λ_trunc_correct_chain = sample(
decay_estimator_trunc(data_trunc),
HMC(ϵ, τ), iterations,
progress=true, drop_warmup=true);
plot(DataFrame(λ_trunc_correct_chain), x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
Geom.density, Geom.vline(color="green"),
Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
Ahhhh! Much better!