Calibrating Models to Data

In this tutorial, we will showcase the tooling for fitting models to data. Let's take our favorite 2nd order Lorenz equation form as our model:

using EasyModelAnalysis

@parameters t σ ρ β
@variables x(t) y(t) z(t)
D = Differential(t)

eqs = [D(D(x)) ~ σ * (y - x),
    D(y) ~ x * (ρ - z) - y,
    D(z) ~ x * y - β * z]

@named sys = ODESystem(eqs)
sys = structural_simplify(sys)

u0 = [D(x) => 2.0,
    x => 1.0,
    y => 0.0,
    z => 0.0]

p = [σ => 28.0,
    ρ => 10.0,
    β => 8 / 3]

tspan = (0.0, 100.0)
prob = ODEProblem(sys, u0, tspan, p, jac = true)
sol = solve(prob)

retcode: Success
Interpolation: specialized 4th order "free" interpolation, specialized 2nd order "free" stiffness-aware interpolation
t: 1495-element Vector{Float64}:
   0.0
   0.00014131524176735154
   0.0015544676594408668
   0.012956381792766277
   0.038266700044138006
   0.07852291796751046
   0.12210864065288088
   0.17155375443120538
   0.2311290630460472
   0.30252383872495503
   ⋮
  99.50500396864174
  99.57665667367455
  99.64283640514518
  99.72166966002771
  99.79967562312255
  99.86801143860899
  99.93304884850775
  99.99013919862291
 100.0
u: 1495-element Vector{Vector{Float64}}:
 [2.0, 0.0, 0.0, 1.0]
 [1.9960454103704428, 0.0014132521265988688, 9.986094350108765e-8, 1.0002823510089436]
 [1.956746201469907, 0.01555657123298723, 1.2096178746316906e-5, 1.0030752466310615]
 [1.6563910957343149, 0.13029732534611171, 0.0008464888325952064, 1.023645017393648]
 [1.1011942420041494, 0.3874130558030291, 0.007445311135321615, 1.0582098436702096]
 [0.5557298463522196, 0.7959541011040255, 0.031202921104640697, 1.0901348104540605]
 [0.44950102412404563, 1.229007863529698, 0.07390400544865926, 1.1102092717261387]
 [0.9266726317320947, 1.7062381668292075, 0.14159577334379553, 1.1416871365805596]
 [2.2781005968110417, 2.2773793271385414, 0.25114094267068454, 1.233177108242911]
 [4.867419855792264, 3.014693593518013, 0.44047994481421116, 1.4824570579833207]
 ⋮
 [80.19625687470715, 0.22319498068561278, 6.377998459496999, -10.92101238125288]
 [93.91126047193579, -1.9435112337378833, 5.7144265707275865, -4.580907294954086]
 [92.47053186268388, -2.2128845219633724, 4.966589670343976, 1.653117805222435]
 [78.34378276619968, 0.2736929875090918, 3.6720087645277166, 8.448758815091098]
 [59.61988722700268, 5.207887350906977, 5.364596386122552, 13.834326658988568]
 [42.06608771509131, 6.848721336825014, 10.923029356998175, 17.327760613533872]
 [17.746361864729497, 2.561462421673927, 14.710142641007621, 19.336065831253446]
 [-13.425204706776736, -1.9987153748274389, 12.635789734327103, 19.49627693453924]
 [-19.39944052157736, -2.411052627709995, 11.88863957641005, 19.334495160672596]

Let's create a dataset with some set of parameters, and show how the datafit function can be used to discover the parameters that generated the data. To start, let's show the data format by generating a dataset. A dataset contains an array t of the time points for the data, and maps [observable => timeseries] where the timeseries is an array of values for the observable at the time points of t. We can use the get_timeseries function to generate a dataset like:

tsave = [1.0, 2.0, 3.0]
data = [x => get_timeseries(prob, x, tsave), z => get_timeseries(prob, z, tsave)]

2-element Vector{Pair{Num, Vector{Float64}}}:
 x(t) => [11.352378507900013, 11.818374125301172, -10.730122690040698]
 z(t) => [8.983834702143962, 4.623642052356029, 6.172956575154105]

Now let's do a datafit. We need to choose initial parameters for the fitting process and call the datafit:

psub_ini = [σ => 27.0, β => 3.0]
fit = datafit(prob, psub_ini, tsave, data)

2-element Vector{Pair{Num, Float64}}:
 σ => 28.000000012195933
 β => 2.66666667211938

Recall that our starting parameters, the parameters the dataset was generated from, was [σ => 28.0, ρ => 10.0, β => 8 / 3]. Looks like this did a good job at recovering them!