using Turing
using Bijectors
using Gadfly
using DataFrames, DataFramesMeta
using StatsPlots
Gadfly.set_default_plot_size(900px, 300px)

Changepoint models are those in which an underlying process exhibits a discrete change in its behavior partway through your data. In other words, you have a specific point at which the behavior before and after differ in some material way. As an example, consider the following data taken from this post on the Julia discourse:

τ_true, μ₁, μ₂, σ_true = 500, 45, 30, 4
c = vcat(rand(Normal(μ₁,σ_true), τ_true), 
         rand(Normal(μ₂,σ_true), 1000-τ_true));

Gadfly.plot(DataFrame(y=c, x=1:1000), x=:x, y=:y, Geom.line)
(Plot of the simulated series: a noisy line whose level drops from roughly 45 to roughly 30 around index 500.)

Creating a model for this can be tricky, not least because MCMC, and especially HMC, :heart: differentiable (i.e. continuous!) variables, while the changepoint is discrete in the simplest formulation:

$$ \begin{aligned} \tau &\sim \mathrm{DiscreteUniform}(1, \mathrm{length}(\mathcal{D}))\\ \mu_1, \mu_2 &\sim \mathrm{Normal}(\mu_0, \sigma_0)\\ d_i &\sim \begin{cases} \mathrm{Normal}(\mu_1, \sigma) & i \leq \tau \\ \mathrm{Normal}(\mu_2, \sigma) & i > \tau \\ \end{cases} \end{aligned} $$
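For reference, a direct Turing transcription of that discrete formulation might look like the sketch below (hypothetical and not used later; the model name and the choice to pass μ₀, σ₀, and σ in as arguments are mine). Because τ is discrete, plain HMC can't sample it, so you'd have to combine samplers, e.g. `Gibbs(PG(20, :τ), HMC(0.01, 10, :μ_1, :μ_2))`.

# Hypothetical, direct transcription of the discrete model (for illustration only).
@model function discrete_changepoint(data, μ₀, σ₀, σ)
    τ ~ DiscreteUniform(1, length(data))
    μ_1 ~ Normal(μ₀, σ₀)
    μ_2 ~ Normal(μ₀, σ₀)
    for i in 1:length(data)
        data[i] ~ Normal(i <= τ ? μ_1 : μ_2, σ)
    end
end;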

From the title of the post, you might have inferred that we don't have to keep the changepoint discrete. Here's the core idea, laid out on Gelman's blog:

The other thing I think you can do is a basis expansion in a basis that includes functions like -C/(1+exp(-(x-k)/a)) (ie. a shifted rescaled sigmoid), put priors on the location for the shift and the degree of the shift (C) and the quickness of the shift (a)… There’s something to be said for approximating things as step functions, and there’s also something to be said for approximating things as infinitely smooth approximations to step functions :-) Getting outside the rigidity of your model and thinking about approximations to your model can be helpful.

To create a smooth approximation, we look to a scaled, shifted version of the sigmoid function:

$$ \begin{aligned} \mathrm{sigmoid}(x) &= \frac{e^x}{1+e^x} \\ l(a, b, x) &= a(x-b)\\ \mathrm{sigm}(a, b, x) &= \mathrm{sigmoid}(l(a,b,x)) \end{aligned} $$

I think of $a$ as the "specificity" of the step change, and call it "spec" below. Perhaps naming isn't my strong suit. In any case...

logit = bijector(Beta())   # bijection:  (0, 1)  →  ℝ
inv_logit = inv(logit)     # bijection:       ℝ  →  (0, 1)

# Smooth step located at b with sharpness a: sigm(a, b, x) = sigmoid(a * (x - b))
function sigm(a, b, x)
    return inv_logit(a * (x - b))
end;

x = -10:0.1:20
df = DataFrame(y=sigm.(0.5, 10, x), x=x)
Gadfly.plot(df, x=:x, y=:y, Geom.line)
(Plot of sigm(0.5, 10, x) over x ∈ [-10, 20]: a smooth step rising from 0 to 1, centered at x = 10.)

See? It's a smooth step function! That means we'll get gradients galore! We use that to interpolate between the two likelihoods we want to step between:

$$\begin{aligned} \tau &\sim \mathrm{Uniform}(1, \mathrm{length}(\mathcal{D}))\\ \mu_1, \mu_2 &\sim \mathrm{Normal}(\mu_0, \sigma_0)\\ \sigma &\sim \mathrm{HalfNormal}(0, 10)\\ \zeta_i &= \mathrm{sigm}(a, \tau, i)\\ d_i &\sim \mathrm{Normal}((1 - \zeta_i) \mu_1 + \zeta_i \mu_2, \sigma) \end{aligned} $$

In Turing:

@model function changepoint(data)
    # priors
    spec = 0.01                     # fixed sharpness of the smooth step (not sampled)
    μ_1 ~ Normal(0, 2)
    μ_2 ~ Normal(0, 2)
    τ ~ Uniform(1, length(data))    # changepoint location
    σ ~ truncated(Normal(1, 2), 0, 20)
    
    # likelihood: interpolate between the two means with the smooth step
    for i in 1:length(data)
        switch = sigm(spec, τ, i)
        z = (1 - switch) * μ_1 + switch * μ_2
        data[i] ~ Normal(z, σ)
    end
end;

Now we're ready to roll. I had to fuss with both $\epsilon$ and spec before realizing that standardizing the data was the most important first step, so you may well find better hyperparameter values.

std_c = (c .- mean(c)) ./ std(c);

# Settings of the Hamiltonian Monte Carlo (HMC) sampler.
iterations = 500
ϵ = 0.005   # leapfrog step size
τ = 10;     # number of leapfrog steps (not the changepoint τ!)

cp_chain2 = sample(
    changepoint(std_c), 
    HMC(ϵ, τ), iterations, 
    progress=true, drop_warmup=false);
Sampling: 100%|█████████████████████████████████████████| Time: 0:00:10

StatsPlots.plot(cp_chain2, size=(3500,2200), titlefont = (34), tickfont=(18), guide="")

:tada: Voilà! :tada:
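If you want numbers rather than trace plots, you can pull the draws straight out of the chain object. Here is a minimal sketch, assuming the `cp_chain2` and `c` defined above (parameter names as in the model); the last two lines just undo the standardization.

# Posterior summaries (a sketch; assumes cp_chain2 and c from above).
τ_draws = vec(cp_chain2[:τ])
println("τ: mean = ", mean(τ_draws),
        ", 95% interval = ", quantile(τ_draws, [0.025, 0.975]))

# Undo the standardization to get the two means back on the original scale.
println("μ₁ ≈ ", mean(cp_chain2[:μ_1]) * std(c) + mean(c))
println("μ₂ ≈ ", mean(cp_chain2[:μ_2]) * std(c) + mean(c))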