Neural ODEs with DiffEqFlux
Let's see how easy it is to make dense neural ODE layers from scratch. Flux is used here just for the glorot_uniform function and the ADAM optimizer.
This example is taken from the DiffEqFlux documentation.
using ComponentArrays
using OrdinaryDiffEq
using Plots
using UnPack
using DiffEqFlux: sciml_train
using Flux: glorot_uniform, ADAM
using Optim: LBFGS

First, let's set up the problem and create the ground-truth data.
u0 = Float32[2.0; 0.0]
datasize = 30
tspan = (0.0f0, 1.5f0)
function trueODEfunc(du, u, p, t)
    true_A = [-0.1 2.0; -2.0 -0.1]
    du .= ((u .^ 3)'true_A)'    # equivalent to du .= true_A' * (u .^ 3)
end
t = range(tspan[1], tspan[2], length = datasize)
prob = ODEProblem(trueODEfunc, u0, tspan)
ode_data = Array(solve(prob, Tsit5(), saveat = t))

Next we'll make a function that creates dense neural layer components. It is similar to Flux.Dense, except it doesn't handle the activation function. We'll do that separately.
dense_layer(in, out) = ComponentArray{Float32}(W = glorot_uniform(out, in), b = zeros(out))

Our parameter vector will be a ComponentArray that holds the ODE initial conditions and the dense neural layers.
layers = (L1 = dense_layer(2, 50), L2 = dense_layer(50, 2))
θ = ComponentArray(u = u0, p = layers)

We now have convenient struct-like access to the weights and biases of the layers for our neural ODE function while giving our optimizer something that acts like a flat array.
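To see both views at once, here's a quick sketch (the weight values depend on the random initialization; only the shapes and lengths are fixed by the construction above):

θ.u               # Float32[2.0, 0.0] -- the initial condition, by name
size(θ.p.L1.W)    # (50, 2) -- first-layer weights, also by name
length(θ)         # 254 -- 2 + (100 + 50) + (100 + 2) entries in one flat vector
θ[1:2]            # the same numbers as θ.u, read through the flat-array view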
function dudt(u, p, t)
    @unpack L1, L2 = p
    return L2.W * tanh.(L1.W * u .^ 3 .+ L1.b) .+ L2.b
end
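As a quick sanity check, the parameter component can be passed straight into the right-hand side (the exact output values depend on the random initialization):

dudt(u0, θ.p, 0.0f0)    # 2-element Vector{Float32}: the network's derivative estimate at u0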
prob = ODEProblem(dudt, u0, tspan)

predict_n_ode(θ) = Array(solve(prob, Tsit5(), u0 = θ.u, p = θ.p, saveat = t))
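The untrained prediction already has the right shape, even if its values are far from the data:

pred = predict_n_ode(θ)
size(pred)    # (2, 30): two states saved at the datasize = 30 time points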
function loss_n_ode(θ)
    pred = predict_n_ode(θ)
    loss = sum(abs2, ode_data .- pred)
    return loss, pred
end
loss_n_ode(θ)

Let's set up a training observation callback and train!
cb = function (θ, loss, pred; doplot = false)
    display(loss)
    # plot the current prediction against the data
    pl = scatter(t, ode_data[1, :], label = "data")
    scatter!(pl, t, pred[1, :], label = "prediction")
    doplot && display(plot(pl))
    return false    # false means "keep training"
end
cb(θ, loss_n_ode(θ)...)
data = Iterators.repeated((), 1000)    # unused below: sciml_train takes maxiters rather than a data iterator
res1 = sciml_train(loss_n_ode, θ, ADAM(0.05); cb = cb, maxiters = 100)
cb(res1.minimizer, loss_n_ode(res1.minimizer)...; doplot = true)
res2 = sciml_train(loss_n_ode, res1.minimizer, LBFGS(); cb = cb)
cb(res2.minimizer, loss_n_ode(res2.minimizer)...; doplot = true)
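Since sciml_train optimizes the flat vector but hands back the same ComponentArray layout, the trained parameters keep their struct-like access. For example (names follow the layout we built above):

θ_trained = res2.minimizer
θ_trained.u         # the fitted initial condition
θ_trained.p.L1.W    # trained first-layer weights, still addressable by name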