Benchmarking a neural Lyapunov method

In this demonstration, we'll benchmark the neural Lyapunov method used in the policy search demo. In that demonstration, we searched for a neural network policy to stabilize the upright equilibrium of the inverted pendulum. Here, we will use the benchmark function to run approximately the same training, then check the performance the of the resulting controller and neural Lyapunov function by simulating the closed loop system to see (1) how well the controller drives the pendulum to the upright equilibrium, and (2) how well the neural Lyapunov function performs as a classifier of whether a state is in the region of attraction or not. These results will be represented by a confusion matrix using the simulation results as ground truth. (Keep in mind that training does no simulation.)

Copy-Pastable Code

using NeuralPDE, NeuralLyapunov, Lux
using Boltz.Layers: PeriodicEmbedding
import OptimizationOptimisers, OptimizationOptimJL
using StableRNGs, Random

rng = StableRNG(0)
Random.seed!(200)

# Define dynamics and domain
function open_loop_pendulum_dynamics(x, u, p, t)
    θ, ω = x
    ζ, ω_0 = p
    τ = u[]
    return [ω, -2ζ * ω_0 * ω - ω_0^2 * sin(θ) + τ]
end

lb = Float32[0.0, -2.0];
ub = Float32[2π, 2.0];
upright_equilibrium = Float32[π, 0.0]
p = Float32[0.5, 1.0]
state_syms = [:θ, :ω]
parameter_syms = [:ζ, :ω_0]

# Define neural network discretization
# We use an input layer that is periodic with period 2π with respect to θ
dim_state = length(lb)
dim_hidden = 25
dim_phi = 3
dim_u = 1
dim_output = dim_phi + dim_u
chain = [Chain(
             PeriodicEmbedding([1], Float32[2π]),
             Dense(dim_state + 1, dim_hidden, tanh),
             Dense(dim_hidden, dim_hidden, tanh),
             Dense(dim_hidden, 1)
         ) for _ in 1:dim_output]
ps, st = Lux.setup(rng, chain)

# Define neural network discretization
strategy = QuasiRandomTraining(10000)

# Define neural Lyapunov structure
periodic_pos_def = function (state, fixed_point)
    θ, ω = state
    θ_eq, ω_eq = fixed_point
    return (sin(θ) - sin(θ_eq))^2 + (cos(θ) - cos(θ_eq))^2 + (ω - ω_eq)^2 / 10
end

structure = PositiveSemiDefiniteStructure(
    dim_phi;
    pos_def = (x, x0) -> log(1 + periodic_pos_def(x, x0))
)
structure = add_policy_search(structure, dim_u)

minimization_condition = DontCheckNonnegativity(check_fixed_point = false)

# Define Lyapunov decrease condition
decrease_condition = AsymptoticStability(strength = periodic_pos_def)

# Construct neural Lyapunov specification
spec = NeuralLyapunovSpecification(structure, minimization_condition, decrease_condition)

# Define optimization parameters
opt = [OptimizationOptimisers.Adam(0.05f0), OptimizationOptimJL.BFGS()]
optimization_args = [[:maxiters => 300], [:maxiters => 300]]

# Run benchmark
endpoint_check = (x) -> ≈([sin(x[1]), cos(x[1]), x[2]], [0, -1, 0], atol = 5e-3)
benchmarking_results = benchmark(
    open_loop_pendulum_dynamics,
    lb,
    ub,
    spec,
    chain,
    strategy,
    opt;
    simulation_time = 200,
    n = 1000,
    fixed_point = upright_equilibrium,
    p,
    optimization_args,
    state_syms,
    parameter_syms,
    policy_search = true,
    endpoint_check,
    init_params = ps, 
    init_states = st
)

Detailed Description

Much of the set up is the same as in the policy search demo, so see that page for details.

using NeuralPDE, NeuralLyapunov, Lux
import Boltz.Layers: PeriodicEmbedding
using Random, StableRNGs

Random.seed!(200)

# Define dynamics and domain
function open_loop_pendulum_dynamics(x, u, p, t)
    θ, ω = x
    ζ, ω_0 = p
    τ = u[]
    return [ω, -2ζ * ω_0 * ω - ω_0^2 * sin(θ) + τ]
end

lb = Float32[0.0, -2.0];
ub = Float32[2π, 2.0];
upright_equilibrium = Float32[π, 0.0]
p = Float32[0.5, 1.0]
state_syms = [:θ, :ω]
parameter_syms = [:ζ, :ω_0]

# Define neural network discretization
# We use an input layer that is periodic with period 2π with respect to θ
dim_state = length(lb)
dim_hidden = 25
dim_phi = 3
dim_u = 1
dim_output = dim_phi + dim_u
chain = [Chain(
             PeriodicEmbedding([1], Float32[2π]),
             Dense(3, dim_hidden, tanh),
             Dense(dim_hidden, dim_hidden, tanh),
             Dense(dim_hidden, 1)
         ) for _ in 1:dim_output]
ps, st = Lux.setup(StableRNG(0), chain)

# Define neural network discretization
strategy = QuasiRandomTraining(10000)

# Define neural Lyapunov structure
periodic_pos_def = function (state, fixed_point)
    θ, ω = state
    θ_eq, ω_eq = fixed_point
    return (sin(θ) - sin(θ_eq))^2 + (cos(θ) - cos(θ_eq))^2 + (ω - ω_eq)^2 / 10
end

structure = PositiveSemiDefiniteStructure(
    dim_phi;
    pos_def = (x, x0) -> log(1 + periodic_pos_def(x, x0))
)
structure = add_policy_search(structure, dim_u)

minimization_condition = DontCheckNonnegativity(check_fixed_point = false)

# Define Lyapunov decrease condition
decrease_condition = AsymptoticStability(strength = periodic_pos_def)

# Construct neural Lyapunov specification
spec = NeuralLyapunovSpecification(structure, minimization_condition, decrease_condition)

At this point of the policy search demo, we constructed the PDESystem, discretized it using NeuralPDE.jl, and solved the resulting OptimizationProblem using Optimization.jl. All of that occurs in the benchmark function, so we instead provide that function with the optimizer and optimization arguments to use.

import OptimizationOptimisers, OptimizationOptimJL

# Define optimization parameters
opt = [OptimizationOptimisers.Adam(0.05f0), OptimizationOptimJL.BFGS()]
optimization_args = [[:maxiters => 300], [:maxiters => 300]]

Since the pendulum is periodic in $0$, we'll use a custom endpoint check that reflects that property.

endpoint_check = (x) -> ≈([sin(x[1]), cos(x[1]), x[2]], [0, -1, 0], atol=5e-3)

Finally, we can run the benchmark function. For demonstration purposes, we'll use EnsembleSerial(), which simulates each trajectory without any parallelism when evaluating the trained Lyapunov function and controller. The default ensemble_alg is EnsembleThreads(), which uses multithreading (local parallelism only); see the DifferentialEquations.jl docs for more information and other options. Another option is EnsembleGPUArray, which parallelizes the ODE solves on the GPU. Note that this option is imported from DiffEqGPU and has certain restrictions on the dynamics. For example, the dynamics may not allocate memory (build arrays), so in-place dynamics must be defined in addition to the out-of-place dynamics that NeuralLyapunov usually requires. (Providing both can be achieved by defining both methods for the same function and passing in an ODEFunction made from that function.) For this reason, the default value of EnsembleThreads() is recommended, even when training occurs on GPU.

using OrdinaryDiffEq: EnsembleSerial

benchmarking_results = benchmark(
    open_loop_pendulum_dynamics,
    lb,
    ub,
    spec,
    chain,
    strategy,
    opt;
    simulation_time = 200,
    n = 1000,
    fixed_point = upright_equilibrium,
    p,
    optimization_args,
    state_syms,
    parameter_syms,
    policy_search = true,
    ensemble_alg = EnsembleSerial(),
    endpoint_check,
    init_params = ps,
    init_states = st
);

We can observe the confusion matrix and training time:

benchmarking_results.confusion_matrix
┌─────────────┬───────────┬───────────┐
│ Tot = 1000  │ Actual │ Actual │
│             │ positives │ negatives │
├─────────────┼───────────┼───────────┤
│ Prediced │ 1000 │ 0 │
│ positives │           │           │
├─────────────┼───────────┼───────────┤
│ Prediced │ 0 │ 0 │
│ negatives │           │           │
└─────────────┴───────────┴───────────┘
benchmarking_results.training_time
229.214753039

The benchmark function also outputs a DataFrame, data, with the simulation results. The first three rows are shown below.

benchmarking_results.data[1:3, :]
3×7 DataFrame
RowInitial StateFinal StateVdVdtPredicted in RoAActually in RoAClassification
SubArray…Array…Float32Float64BoolBoolString
1Float32[5.82765, -1.63]Float32[3.14217, 3.13712f-7]45.7105-44.8276truetrueTP
2Float32[3.57199, 1.782]Float32[3.14217, 5.63853f-7]13.1541-36.7215truetrueTP
3Float32[1.86296, -1.206]Float32[3.14217, 2.69142f-8]23.9957-35.8321truetrueTP

The benchmark function also outputs the Lyapunov function $V$ and its time-derivative $V̇$.

states = benchmarking_results.data[!, "Initial State"]
V_samples = benchmarking_results.data[!, "V"]
all(benchmarking_results.V.(states) .== V_samples)
true
V̇_samples = benchmarking_results.data[!, "dVdt"]
all(benchmarking_results.V̇.(states) .== V̇_samples)
true

The "Actually in RoA" column is just the result of applying endpoint_check applied to the "End State" column. The "End State" column is the final state of the simulation starting at that "Initial State".

endpoints = benchmarking_results.data[!, "Final State"]
actual = benchmarking_results.data[!, "Actually in RoA"]
all(endpoint_check.(endpoints) .== actual)
true

Similarly, the labels in the "Predicted in RoA" column are the results of the neural Lyapunov classifier.

classifier = (V, V̇, x) -> V̇ < zero(V̇) || endpoint_check(x)
predicted = benchmarking_results.data[!, "Predicted in RoA"]
all(classifier.(V_samples, V̇_samples, states) .== predicted)
true