Quadratic State Space Examples

The second-order state-space models here use pruning, as in Andreasen, Fernández-Villaverde, and Rubio-Ramírez (2017).

At this point, the package only supports time-invariant models without a separate parameter vector p. The canonical form is

\[u_{n+1} = A_0 + A_1 u_n + u_n^{\top} A_2 u_n + B w_{n+1}\]

with

\[z_n = C_0 + C_1 u_n + u_n^{\top} C_2 u_n + v_n\]

with, optionally, $v_n \sim N(0, D)$ and $w_{n+1} \sim N(0, I)$. If you pass the noise into the solver yourself, it need not be Gaussian.
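To make the quadratic term concrete: with $A_2$ stored as a 3-dimensional array (as in the `cat(...; dims = 3)` construction below), the $i$-th element of $u_n^{\top} A_2 u_n$ is the scalar quadratic form $u_n^{\top} A_2[:,:,i]\, u_n$. A minimal Python sketch of one step of the state recursion, for illustration only (not part of the package, and assuming this slice convention):

```python
# Sketch of one step of  u' = A_0 + A_1 u + (u' A_2[:,:,i] u)_i + B w,
# with A_2 represented as a list of n x n slices, one per state equation.

def quad_form(A2_slices, u):
    # i-th element is the quadratic form u' A2_slices[i] u
    n = len(u)
    return [sum(u[j] * A2_slices[i][j][k] * u[k]
                for j in range(n) for k in range(n))
            for i in range(len(A2_slices))]

def quad_step(A0, A1, A2_slices, B, u, w):
    n = len(A0)
    lin = [sum(A1[i][j] * u[j] for j in range(len(u))) for i in range(n)]
    q = quad_form(A2_slices, u)
    shock = [sum(B[i][j] * w[j] for j in range(len(w))) for i in range(n)]
    return [A0[i] + lin[i] + q[i] + shock[i] for i in range(n)]

# With the example's A_0, A_1, A_2, B and u0 = 0, the quadratic and linear
# terms vanish, so the first step reduces to A_0 + B w_1 -- matching the
# first simulated state below.
A0 = [-7.824904812740593e-5, 0.0]
A1 = [[0.95, 6.2], [0.0, 0.2]]
A2 = [[[-0.0002, 0.0334], [0.0, 0.0]], [[0.034, 3.129], [0.0, 0.0]]]
B = [[0.0], [0.01]]
u1 = quad_step(A0, A1, A2, B, [0.0, 0.0], [1.6093725270737328])
print(u1)  # approximately [-7.824904812740593e-5, 0.016093725270737328]
```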

Note

Quadratic state-space models do not have the same feature coverage as the linear models. In particular, the auto-differentiation rules are currently implemented only for the logpdf required for estimation, and the simulation has little flexibility in which model elements can be omitted.

Simulating a Quadratic (and Time-Invariant) State Space Model

Creating a QuadraticStateSpaceModel is similar to the linear version described previously.

using DifferenceEquations, LinearAlgebra, Distributions, Random, Plots, DataFrames, Zygote,
      DiffEqBase
A_0 = [-7.824904812740593e-5, 0.0]
A_1 = [0.95 6.2;
       0.0 0.2]
A_2 = cat([-0.0002 0.0334; 0.0 0.0],
    [0.034 3.129; 0.0 0.0]; dims = 3)
B = [0.0; 0.01;;] # matrix
C_0 = [7.8e-5, 0.0]
C_1 = [0.09 0.67;
       1.00 0.00]
C_2 = cat([-0.00019 0.0026; 0.0 0.0],
    [0.0026 0.313; 0.0 0.0]; dims = 3)
D = [0.01, 0.01] # diagonal observation noise
u0 = zeros(2)
T = 30

prob = QuadraticStateSpaceProblem(
    A_0, A_1, A_2, B, u0, (0, T); C_0, C_1, C_2, observables_noise = D, syms = [:a, :b])
sol = solve(prob)
retcode: Success
Interpolation: Piecewise constant interpolation
t: 0:30
u: 31-element Vector{Vector{Float64}}:
 [0.0, 0.0]
 [-7.824904812740593e-5, 0.016093725270737328]
 [0.1004389470451015, -0.0025825260883207313]
 [0.07932859820152999, 0.0017001041854039305]
 [0.08584139505074852, -0.001004063955800017]
 [0.07524179805629824, -0.008339664899995361]
 [0.01986994318617173, -0.005381222665202106]
 [-0.014481900748124528, 0.0008175998920747815]
 [-0.008765709152344453, 0.0030168818301629994]
 [0.010325609805172734, 0.01313915675745908]
 ⋮
 [0.2959252732123367, 0.009654904030831527]
 [0.34137660459276653, 0.00547791069702207]
 [0.3583885694460185, 0.011338018513010278]
 [0.41133536435286866, -0.004581036390738136]
 [0.3621942138828638, -0.002526477696202869]
 [0.3282750284787751, 0.010018879114998856]
 [0.37441292152249095, -0.012212377512334066]
 [0.28003046286539884, -0.007722589373452348]
 [0.21809752723165532, -0.005708302207914771]

As in the linear case, this model can be simulated and plotted

plot(sol)
(plot of the simulated solution)

And the observables and noise can be stored

observables = hcat(sol.z...)  # Observables required to be matrix.  Issue #55
observables = observables[:, 2:end] # see note above on likelihood and timing
noise = sol.W
1×30 Matrix{Float64}:
 1.60937  -0.580127  0.221661  -0.134408  …  -1.42162  -0.528011  -0.416378

Ensembles work as well,

trajectories = 50
u0_dist = MvNormal([1.0 0.1; 0.1 1.0])  # mean zero initial conditions
prob = QuadraticStateSpaceProblem(A_0, A_1, A_2, B, u0_dist, (0, T); C_0, C_1,
    C_2, observables_noise = D, syms = [:a, :b])
ens_sol = solve(EnsembleProblem(prob), DirectIteration(), EnsembleThreads(); trajectories)
summ = EnsembleSummary(ens_sol)  # calculate summary statistics such as quantiles
plot(summ)
(ensemble summary plot with quantile bands)

Joint Likelihood with Noise

To calculate the likelihood, the Kalman filter is no longer applicable since the state evolution is nonlinear. However, we can still calculate the joint likelihood of the observables and the noise, as we did in the linear examples. Using the simulated observables and noise,

function joint_likelihood_quad(A_0, A_1, A_2, B, C_0, C_1, C_2, D, u0, noise, observables)
    prob = QuadraticStateSpaceProblem(A_0, A_1, A_2, B, u0, (0, size(observables, 2)); C_0,
        C_1, C_2, observables, observables_noise = D, noise)
    return solve(prob).logpdf
end
u0 = [0.0, 0.0]
joint_likelihood_quad(A_0, A_1, A_2, B, C_0, C_1, C_2, D, u0, noise, observables)
52.00109030104491
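Conceptually, with the noise held fixed, the joint likelihood iterates the quadratic law of motion forward with the supplied shocks, forms the implied observables, and accumulates the Gaussian logpdf of the measurement errors $v_n = z_n - (C_0 + C_1 u_n + u_n^{\top} C_2 u_n)$. A hedged scalar-model sketch of this computation in Python, for intuition only (not the package's implementation, and ignoring the multivariate bookkeeping):

```python
import math

def joint_loglik_scalar(a0, a1, a2, b, c0, c1, c2, d_var, u0, noise, observables):
    # Iterate u_{t+1} = a0 + a1*u_t + a2*u_t^2 + b*w_{t+1}, then score the
    # residual v_t = z_t - (c0 + c1*u_t + c2*u_t^2) under N(0, d_var).
    u = u0
    ll = 0.0
    for w, z in zip(noise, observables):
        u = a0 + a1 * u + a2 * u**2 + b * w
        v = z - (c0 + c1 * u + c2 * u**2)
        ll += -0.5 * (math.log(2 * math.pi * d_var) + v**2 / d_var)
    return ll

# Toy usage on made-up scalar data
ll = joint_loglik_scalar(0.0, 0.9, 0.1, 1.0, 0.0, 1.0, 0.05, 0.01,
                         0.0, [0.5, -0.3], [0.52, 0.15])
print(ll)
```

Note the timing: each shock $w_{t+1}$ is applied before scoring the corresponding observable, consistent with dropping the initial condition's column from the observables above.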

This, in turn, can itself be differentiated.

gradient(
    (A_0, A_1, A_2, B, C_0, C_1, C_2, noise) -> joint_likelihood_quad(
        A_0, A_1, A_2, B, C_0, C_1, C_2, D, u0, noise, observables),
    A_0,
    A_1,
    A_2,
    B,
    C_0,
    C_1,
    C_2,
    noise)
([404.2341219369865, 3029.6647689771376], [54.89663985905509 2.518655849104415; 351.0828960199931 14.672941590658485], [14.518030807518624 0.3357883489784851; 88.47088914971908 1.7518356267007371;;; 0.3357883489784851 0.05050744980936954; 1.7518356267007371 0.26667374665834104], [213.45262721684978; 1612.3503711749734;;], [53.371488861705885, 29.85911590355791], [17.150393011066527 0.24084869641905948; 14.202918337349333 0.22054314650692589], [4.517270336035106 0.046076958188089406; 4.800932328038424 0.02999199705302016;;; 0.046076958188089406 0.012015119633900175; 0.02999199705302016 -0.0008885019324152079], [1.216966497072188 1.9530406907678066 … 0.10086428431718612 0.02681175898840818])

Note that this is calculating the gradient of the likelihood not only with respect to the matrices of the canonical quadratic state-space representation, but also with respect to the entire noise vector.

As in the linear case, this likelihood calculation can be nested such that a separate differentiable function could generate the quadratic state space model, and the gradients could be over a smaller set of structural parameters.
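The nesting pattern can be sketched generically: a differentiable function maps a small structural parameter vector into the canonical matrices, the likelihood is evaluated at those matrices, and differentiation composes through the whole pipeline. A toy Python illustration, where `canonical_from_theta` is a hypothetical stand-in for a model-solution step (e.g. a perturbation solver) and a finite-difference gradient stands in for Zygote:

```python
def canonical_from_theta(theta):
    # Hypothetical mapping from one structural parameter to scalar
    # canonical values (a1, c1); in practice this would be the output
    # of solving the model to second order.
    return 0.9 * theta, 1.0 + theta

def loglik_theta(theta, data):
    a1, c1 = canonical_from_theta(theta)
    # Evaluate a (toy, noise-free) likelihood at the canonical values
    u, ll = 0.1, 0.0
    for z in data:
        u = a1 * u
        ll += -0.5 * (z - c1 * u) ** 2
    return ll

def fd_grad(f, x, h=1e-6):
    # Central finite difference, standing in for reverse-mode AD
    return (f(x + h) - f(x - h)) / (2 * h)

g = fd_grad(lambda th: loglik_theta(th, [0.2, 0.1]), 0.5)
print(g)  # gradient with respect to the single structural parameter
```

The payoff is that the gradient is taken over a handful of structural parameters rather than every entry of $A_0, A_1, A_2, B, C_0, C_1, C_2$.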