A classic textbook in information theory is Information Theory, by David MacKay. When I was first starting out, I came across this book (it is, after all, free - graduate students rejoice!) and I really enjoyed his exposition on p. 48 about "a first inference problem." I recommend reading it if you haven't, it's great.

Anyway, I've reproduced the problem here and I'll share a Turing solution below.

image.png

The trickiness comes from the window which truncates the data. This is actually super easy to accomplish in Stan, Turing, etc with basically single line changes to the most natural model.

using Turing
using Bijectors
using Gadfly
using DataFrames, DataFramesMeta
#using Zygote
#Turing.setadbackend(:zygote)
Gadfly.set_default_plot_size(900px, 300px)

Here's how things look in the simpler case where there is no truncation.

$$\begin{aligned} \lambda &\sim \mathrm{Uniform}(0, N)\\ d_i &\sim \mathrm{Exponential}(\lambda)\\ \end{aligned}$$

And some data to match:

λ_true = 12.0
data = rand(Exponential(λ_true), 500)
plot(DataFrame(x=data),x=:x, xintercept=[20],
    Geom.histogram, Geom.vline(color="red"),
    Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
x -70 -60 -50 -40 -30 -20 -10 0 10 20 30 40 50 60 70 80 90 100 110 120 130 -58 -56 -54 -52 -50 -48 -46 -44 -42 -40 -38 -36 -34 -32 -30 -28 -26 -24 -22 -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70 72 74 76 78 80 82 84 86 88 90 92 94 96 98 100 102 104 106 108 110 112 114 -100 0 100 200 -60 -55 -50 -45 -40 -35 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 h,j,k,l,arrows,drag to pan i,o,+,-,scroll,shift-drag to zoom r,dbl-click to reset c for coordinates ? for help ?

You can see that even though $\lambda_{true} = 12 < 20$ the window (red) would still truncate quite a bit of the data in the original problem description. Here's the model written out in Turing:

@model function decay_estimator(decays::Array{Float64})
    λ ~ Uniform(0, 30)
    decays ~ filldist(Exponential(λ), length(decays))
end;

Note: The filldist function is a way to avoid looping over the decays array. It populates a matching array of random variables so we have something like Array{Float64} ~ Array{Distribution}. In other words, it allows something like element-wise sampling for iid variables.
iterations = 2000
ϵ = 0.02
τ = 10;

λ_chain = sample(
    decay_estimator(data),
    HMC(ϵ, τ), iterations, 
    progress=true, drop_warmup=true);
Sampling: 100%|█████████████████████████████████████████| Time: 0:00:01

plot(DataFrame(λ_chain), x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
    Geom.density, Geom.vline(color="green"),
    Coord.cartesian(ymin=0.0), Guide.yticks(label = false))
λ 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 20.5 21.0 0 10 20 30 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 20.5 21.0 h,j,k,l,arrows,drag to pan i,o,+,-,scroll,shift-drag to zoom r,dbl-click to reset c for coordinates ? for help ?

Well this model pretty much nails it. The original value of $\lambda_{true}$ was 12 and that peak is right on the money. Let's try the same model with truncated data... :grimacing:!

data_trunc = [d for d in data if d <= 20.0]
plot(DataFrame(x=data_trunc),x=:x, xintercept=[20], Theme(alphas=[0.6]),
    Geom.histogram, Geom.vline(color="red"),
    Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
x -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 40 45 -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 -20 0 20 40 -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 h,j,k,l,arrows,drag to pan i,o,+,-,scroll,shift-drag to zoom r,dbl-click to reset c for coordinates ? for help ?
λ_trunc_chain = sample(
    decay_estimator(data_trunc), 
    HMC(ϵ, τ), iterations, 
    progress=false, drop_warmup=true);

trunc_df = @where(DataFrame(λ_trunc_chain), :λ .< 50)
plot(trunc_df, x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
    Geom.density, Geom.vline(color="green"),
    Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
λ -14 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 26 -12.0 -11.5 -11.0 -10.5 -10.0 -9.5 -9.0 -8.5 -8.0 -7.5 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 20.5 21.0 21.5 22.0 22.5 23.0 23.5 24.0 -20 0 20 40 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 h,j,k,l,arrows,drag to pan i,o,+,-,scroll,shift-drag to zoom r,dbl-click to reset c for coordinates ? for help ?

That is most definitely not correct. So, we'll need a new model.

$$\begin{aligned} \lambda &\sim \mathrm{Uniform}(0, N)\\ d_i &\sim \mathrm{Exponential}\vert_{0}^{20}(\lambda)\\ \end{aligned}$$

Note: I’m not aware of any specific notation for truncated distributions (and I spent like 3 whole minutes looking) so $\mathcal{D}|_{a}^{b}(x)$ seemed fine.
@model function decay_estimator_trunc(decays::Array{Float64})
    λ ~ Uniform(0, 50)
    T = typeof(λ)
    for i in 1:length(decays)
        decays[i] ~ truncated(Exponential(λ), T(0.0), T(20.0)) #wtf
    end
end;

You might be wondering what's the story with T. Good question! My understanding is that there's an issue in Distributions.jl where the truncated function ends up mixing up simple Number types (0.0 and 20.0 in this case) with Dual types (the types that track derivative information for auto-differentiation). There's a discussion on this open issue. Anyway, by explicitly casting the Number values as T (which is some Dual type we peeked from $\lambda$ here) we can make it work!

λ_trunc_correct_chain = sample(
    decay_estimator_trunc(data_trunc), 
    HMC(ϵ, τ), iterations, 
    progress=true, drop_warmup=true);

plot(DataFrame(λ_trunc_correct_chain), x=:λ, xintercept=[λ_true], Theme(alphas=[0.6]),
    Geom.density, Geom.vline(color="green"),
    Coord.cartesian(xmin=0.0, ymin=0.0), Guide.yticks(label = false))
Sampling: 100%|█████████████████████████████████████████| Time: 0:00:17
λ -40 -30 -20 -10 0 10 20 30 40 50 60 70 -28 -27 -26 -25 -24 -23 -22 -21 -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 -30 0 30 60 -28 -26 -24 -22 -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 h,j,k,l,arrows,drag to pan i,o,+,-,scroll,shift-drag to zoom r,dbl-click to reset c for coordinates ? for help ?

Ahhhh! Much better :tada:!