Policy Search on the Driven Inverted Pendulum
In this demonstration, we'll search for a neural network policy to stabilize the upright equilibrium of the inverted pendulum.
The governing differential equation for the driven pendulum is:
\[\frac{d^2 \theta}{dt^2} + 2 \zeta \omega_0 \frac{d \theta}{dt} + \omega_0^2 \sin(\theta) = \tau,\]
where $\theta$ is the counterclockwise angle from the downward equilibrium, $\zeta$ is the damping ratio, $\omega_0$ is the natural frequency, and $\tau$ is our control input: the torque.
We'll jointly train a neural controller $\tau = u \left( \theta, \frac{d\theta}{dt} \right)$ and neural Lyapunov function $V$ which will certify the stability of the closed-loop system.
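For the analysis below, it helps to rewrite this second-order equation as a first-order system in the state $x = \left( \theta, \frac{d\theta}{dt} \right)$, since the Lyapunov conditions are stated in terms of the state:
\[\frac{d}{dt} \begin{bmatrix} \theta \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} \dot{\theta} \\ \tau - 2 \zeta \omega_0 \dot{\theta} - \omega_0^2 \sin(\theta) \end{bmatrix}.\]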
Copy-Pastable Code
using NeuralPDE, Lux, ModelingToolkit, NeuralLyapunov
using ModelingToolkit: inputs
using NeuralLyapunovProblemLibrary
import Boltz.Layers: PeriodicEmbedding
import Optimization, OptimizationOptimisers, OptimizationOptimJL
using Random, StableRNGs
rng = StableRNG(0)
Random.seed!(200)
######################### Define dynamics and domain ##########################
@named pendulum = Pendulum(; defaults = [0.5, 1.0])
t, = independent_variables(pendulum)
Dt = Differential(t)
θ, = setdiff(unknowns(pendulum), inputs(pendulum))
bounds = [
    θ ∈ (0, 2π),
    Dt(θ) ∈ (-2.0, 2.0)
]
upright_equilibrium = [π, 0.0]
####################### Specify neural Lyapunov problem #######################
# Define neural network discretization
# We use an input layer that is periodic with period 2π with respect to θ
dim_state = length(bounds)
dim_hidden = 20
dim_phi = 2
dim_u = 1
dim_output = dim_phi + dim_u
chain = [Chain(
    PeriodicEmbedding([1], [2π]),
    Dense(3, dim_hidden, tanh),
    Dense(dim_hidden, dim_hidden, tanh),
    Dense(dim_hidden, 1)
) for _ in 1:dim_output]
ps = Lux.initialparameters(rng, chain)
# Define training strategy
strategy = QuasiRandomTraining(5000)
discretization = PhysicsInformedNN(chain, strategy; init_params = ps)
# Define neural Lyapunov structure
periodic_pos_def = function (state, fixed_point)
    θ, ω = state
    θ_eq, ω_eq = fixed_point
    return (sin(θ) - sin(θ_eq))^2 + (cos(θ) - cos(θ_eq))^2 + 0.1 * (ω - ω_eq)^2
end
structure = PositiveSemiDefiniteStructure(
    dim_phi;
    pos_def = (x, x0) -> log(1.0 + periodic_pos_def(x, x0))
)
structure = add_policy_search(structure, dim_u)
minimization_condition = DontCheckNonnegativity(check_fixed_point = false)
# Define Lyapunov decrease condition
decrease_condition = AsymptoticStability(strength = periodic_pos_def)
# Construct neural Lyapunov specification
spec = NeuralLyapunovSpecification(
    structure,
    minimization_condition,
    decrease_condition
)
############################# Construct PDESystem #############################
@named pde_system = NeuralLyapunovPDESystem(
    pendulum,
    bounds,
    spec;
    fixed_point = upright_equilibrium
)
######################## Construct OptimizationProblem ########################
prob = discretize(pde_system, discretization)
########################## Solve OptimizationProblem ##########################
res = Optimization.solve(prob, OptimizationOptimisers.Adam(0.01); maxiters = 300)
prob = Optimization.remake(prob, u0 = res.u)
res = Optimization.solve(prob, OptimizationOptimJL.BFGS(); maxiters = 300)
########################### Get numerical functions ###########################
net = discretization.phi
_θ = res.u.depvar
pendulum_io, _ = structural_simplify(pendulum, (inputs(pendulum), []); simplify = true, split = false)
open_loop_pendulum_dynamics = ODEInputFunction(pendulum_io)
state_order = unknowns(pendulum_io)
p = [defaults(pendulum)[param] for param in parameters(pendulum)]
V_func, V̇_func = get_numerical_lyapunov_function(
    net,
    _θ,
    structure,
    open_loop_pendulum_dynamics,
    upright_equilibrium;
    p
)
u = get_policy(net, _θ, dim_output, dim_u)
Detailed description
In this example, we'll use the Pendulum model in NeuralLyapunovProblemLibrary.jl.
Since the angle $\theta$ is periodic with period $2\pi$, our box domain will be one period in $\theta$ and an interval in $\frac{d\theta}{dt}$.
using ModelingToolkit, NeuralLyapunovProblemLibrary
using ModelingToolkit: inputs
@named pendulum = Pendulum(; defaults = [0.5, 1.0])
t, = independent_variables(pendulum)
Dt = Differential(t)
θ, = setdiff(unknowns(pendulum), inputs(pendulum))
bounds = [
    θ ∈ (0, 2π),
    Dt(θ) ∈ (-2.0, 2.0)
]
upright_equilibrium = [π, 0.0]
2-element Vector{Float64}:
3.141592653589793
0.0
We'll use an architecture that's $2\pi$-periodic in $\theta$ so that we can train on just one period of $\theta$ and don't need to add any periodic boundary conditions. To achieve that, we use Boltz.Layers.PeriodicEmbedding([1], [2π]), which enforces $2\pi$-periodicity in input number 1. Additionally, we include output dimensions for both the neural Lyapunov function and the neural controller.
Other than that, setting up the neural network using Lux and NeuralPDE training strategy is no different from any other physics-informed neural network problem. For more on that aspect, see the NeuralPDE documentation.
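As a quick, self-contained check (illustrative only, assuming Lux and Boltz are installed), we can confirm that the embedding maps $\theta$ and $\theta + 2\pi$ to identical features, and that it expands the 2-dimensional state to 3 features, which is why the first Dense layer below takes 3 inputs:
using Lux, Random
import Boltz.Layers: PeriodicEmbedding
# Embed input 1 (θ) with period 2π; input 2 (ω) passes through unchanged
emb = PeriodicEmbedding([1], [2π])
ps_emb, st_emb = Lux.setup(Random.default_rng(), emb)
x1 = reshape([0.5, -1.0], 2, 1)       # a state, as one column
x2 = reshape([0.5 + 2π, -1.0], 2, 1)  # the same state, shifted one period in θ
y1, _ = emb(x1, ps_emb, st_emb)
y2, _ = emb(x2, ps_emb, st_emb)
y1 ≈ y2          # true: the features are identical one period apart
size(y1, 1) == 3 # true: θ becomes a (cos, sin) pair and ω passes through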
using Lux
import Boltz.Layers: PeriodicEmbedding
using StableRNGs
# Stable random number generator for doc stability
rng = StableRNG(0)
# Define neural network discretization
# We use an input layer that is periodic with period 2π with respect to θ
dim_state = length(bounds)
dim_hidden = 20
dim_phi = 2
dim_u = 1
dim_output = dim_phi + dim_u
chain = [Chain(
    PeriodicEmbedding([1], [2π]),
    Dense(3, dim_hidden, tanh),
    Dense(dim_hidden, dim_hidden, tanh),
    Dense(dim_hidden, 1)
) for _ in 1:dim_output]
ps = Lux.initialparameters(rng, chain)
3-element Vector of Lux parameter NamedTuples, one per network in the chain (weight matrices and bias vectors for layers 2 through 4; full printout omitted)
using NeuralPDE
# Define training strategy
strategy = QuasiRandomTraining(5000)
discretization = PhysicsInformedNN(chain, strategy; init_params = ps)
We now define our Lyapunov candidate structure along with the form of the Lyapunov conditions we'll be using.
The default Lyapunov candidate from PositiveSemiDefiniteStructure is:
\[V(x) = \left( 1 + \lVert \phi(x) \rVert^2 \right) \log \left( 1 + \lVert x \rVert^2 \right),\]
which structurally enforces positive definiteness: the first factor is at least one, and the second factor vanishes only at the fixed point and is positive elsewhere. We'll modify the second factor to be $2\pi$-periodic in $\theta$:
using NeuralLyapunov
# Define neural Lyapunov structure
periodic_pos_def = function (state, fixed_point)
    θ, ω = state
    θ_eq, ω_eq = fixed_point
    return (sin(θ) - sin(θ_eq))^2 + (cos(θ) - cos(θ_eq))^2 + 0.1 * (ω - ω_eq)^2
end
structure = PositiveSemiDefiniteStructure(
    dim_phi;
    pos_def = (x, x0) -> log(1.0 + periodic_pos_def(x, x0))
)
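With this substitution, the candidate takes the form
\[V(x) = \left( 1 + \lVert \phi(x) \rVert^2 \right) \log \left( 1 + (\sin\theta - \sin\theta_{eq})^2 + (\cos\theta - \cos\theta_{eq})^2 + 0.1 (\omega - \omega_{eq})^2 \right),\]
which is $2\pi$-periodic in $\theta$ and vanishes only at the fixed point (modulo $2\pi$ in $\theta$).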
In addition to representing the neural Lyapunov function, our neural network must also represent the controller. For this, we use the add_policy_search function, which tells NeuralLyapunov to expect dynamics with a control input and to treat the last dim_u dimensions of the neural network as the output of our controller.
structure = add_policy_search(structure, dim_u)
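Concretely, with dim_phi = 2 and dim_u = 1, the three scalar networks defined above are split so that the first two feed the Lyapunov candidate and the last one is the torque:
\[\phi(x) = \left( \varphi_1(x), \varphi_2(x) \right), \qquad \tau = u(x) = \varphi_3(x),\]
and the decrease condition is evaluated along the closed-loop dynamics $\dot{x} = f(x, u(x))$. (This matches the $\varphi_3$ term appearing as the torque in the symbolic equation printed below.)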
Since our Lyapunov candidate structurally enforces positive definiteness, we use DontCheckNonnegativity.
minimization_condition = DontCheckNonnegativity(check_fixed_point = false)
# Define Lyapunov decrease condition
decrease_condition = AsymptoticStability(strength = periodic_pos_def)
# Construct neural Lyapunov specification
spec = NeuralLyapunovSpecification(
    structure,
    minimization_condition,
    decrease_condition
)
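With strength = periodic_pos_def, the AsymptoticStability condition asks the optimizer to enforce (up to the library's weighting of the loss terms)
\[\dot{V}(x) \leq -\left[ (\sin\theta - \sin\theta_{eq})^2 + (\cos\theta - \cos\theta_{eq})^2 + 0.1 (\omega - \omega_{eq})^2 \right],\]
so that $\dot{V}$ is strictly negative away from the equilibrium rather than merely nonpositive; this is the $\max(0, \cdot) = 0$ residual visible in the symbolic PDE below.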
# Construct PDESystem
@named pde_system = NeuralLyapunovPDESystem(
    pendulum,
    bounds,
    spec;
    fixed_point = upright_equilibrium
)
\[ \begin{align} max\left( 0, \left( 1 + \cos\left( \theta \right) \right)^{2} + \left( -1.2246 \cdot 10^{-16} + \sin\left( \theta \right) \right)^{2} + 0.1 \mathtt{{\theta}ˍt}^{2} + \left( \frac{\left( 2 \cos\left( \theta \right) \left( -1.2246 \cdot 10^{-16} + \sin\left( \theta \right) \right) - 2 \left( 1 + \cos\left( \theta \right) \right) \sin\left( \theta \right) \right) \mathtt{{\theta}ˍt}}{1 + \left( 1 + \cos\left( \theta \right) \right)^{2} + \left( -1.2246 \cdot 10^{-16} + \sin\left( \theta \right) \right)^{2} + 0.1 \mathtt{{\theta}ˍt}^{2}} + \frac{0.2 \left( \mathtt{{\varphi}3}\left( \theta, \mathtt{{\theta}ˍt} \right) - \mathtt{\omega\_0}^{2} \sin\left( \theta \right) - 2 \zeta \mathtt{{\theta}ˍt} \mathtt{\omega\_0} \right) \mathtt{{\theta}ˍt}}{1 + \left( 1 + \cos\left( \theta \right) \right)^{2} + \left( -1.2246 \cdot 10^{-16} + \sin\left( \theta \right) \right)^{2} + 0.1 \mathtt{{\theta}ˍt}^{2}} \right) \left( 1 + \left( \mathtt{{\varphi}1}\left( \theta, \mathtt{{\theta}ˍt} \right) \right)^{2} + \left( \mathtt{{\varphi}2}\left( \theta, \mathtt{{\theta}ˍt} \right) \right)^{2} \right) + \left( \left( 2 \mathtt{{\varphi}1}\left( \theta, \mathtt{{\theta}ˍt} \right) \frac{\mathrm{d}}{\mathrm{d}\theta} \mathtt{{\varphi}1}\left( \theta, \mathtt{{\theta}ˍt} \right) + 2 \mathtt{{\varphi}2}\left( \theta, \mathtt{{\theta}ˍt} \right) \frac{\mathrm{d}}{\mathrm{d}\theta} \mathtt{{\varphi}2}\left( \theta, \mathtt{{\theta}ˍt} \right) \right) \mathtt{{\theta}ˍt} + \left( 2 \mathtt{{\varphi}1}\left( \theta, \mathtt{{\theta}ˍt} \right) \frac{\mathrm{d}}{\mathrm{d}{\theta}ˍt} \mathtt{{\varphi}1}\left( \theta, \mathtt{{\theta}ˍt} \right) + 2 \mathtt{{\varphi}2}\left( \theta, \mathtt{{\theta}ˍt} \right) \frac{\mathrm{d}}{\mathrm{d}{\theta}ˍt} \mathtt{{\varphi}2}\left( \theta, \mathtt{{\theta}ˍt} \right) \right) \left( \mathtt{{\varphi}3}\left( \theta, \mathtt{{\theta}ˍt} \right) - \mathtt{\omega\_0}^{2} \sin\left( \theta \right) - 2 \zeta \mathtt{{\theta}ˍt} \mathtt{\omega\_0} \right) \right) \log\left( 1 + \left( 1 + \cos\left( \theta \right) \right)^{2} + \left( -1.2246 \cdot 10^{-16} + \sin\left( \theta \right) \right)^{2} + 0.1 \mathtt{{\theta}ˍt}^{2} \right) \right) &= 0 \end{align} \]
Now, we solve the PDESystem using NeuralPDE, the same way we would any other PINN problem.
prob = discretize(pde_system, discretization)
import Optimization, OptimizationOptimisers, OptimizationOptimJL
res = Optimization.solve(prob, OptimizationOptimisers.Adam(0.01); maxiters = 300)
prob = Optimization.remake(prob, u0 = res.u)
res = Optimization.solve(prob, OptimizationOptimJL.BFGS(); maxiters = 300)
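If you want to watch the optimization progress, Optimization.jl accepts a callback; a minimal sketch (assuming the OptimizationState fields of current Optimization.jl) is:
# Print the loss every 50 iterations; returning false continues optimization
callback = function (state, loss)
    state.iter % 50 == 0 && println("Iteration $(state.iter): loss = $loss")
    return false
end
res = Optimization.solve(prob, OptimizationOptimisers.Adam(0.01); callback, maxiters = 300)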
net = discretization.phi
_θ = res.u.depvar
ComponentVector{Float32} of trained parameters with sub-networks φ1, φ2, and φ3 (full printout omitted)
We can use the result of the optimization problem to build the Lyapunov function as a Julia function with get_numerical_lyapunov_function, as well as extract our controller using the get_policy function.
pendulum_io, _ = structural_simplify(pendulum, (inputs(pendulum), []); simplify = true, split = false)
open_loop_pendulum_dynamics = ODEInputFunction(pendulum_io)
state_order = unknowns(pendulum_io)
p = [defaults(pendulum)[param] for param in parameters(pendulum)]
V_func, V̇_func = get_numerical_lyapunov_function(
    net,
    _θ,
    structure,
    open_loop_pendulum_dynamics,
    upright_equilibrium;
    p = p
)
u = get_policy(net, _θ, dim_output, dim_u)
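As a quick sanity check (illustrative only; x_test is an arbitrary state inside the training domain), all three returned functions can be evaluated at a state vector:
x_test = [π / 2, 0.0]  # hypothetical test state: pendulum horizontal, at rest
println("V(x_test) = ", V_func(x_test))
println("V̇(x_test) = ", V̇_func(x_test))
println("u(x_test) = ", u(x_test))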
Now, let's evaluate our controller. First, we'll get the usual summary statistics on the Lyapunov function and plot $V$, $\dot{V}$, and the violations of the decrease condition.
lb = [0.0, -2.0];
ub = [2π, 2.0];
xs = (-2π):0.1:(2π)
ys = lb[2]:0.1:ub[2]
states = Iterators.map(collect, Iterators.product(xs, ys))
V_samples = vec(V_func(hcat(states...)))
V̇_samples = vec(V̇_func(hcat(states...)))
# Print statistics
println("V(π, 0) = ", V_func(upright_equilibrium))
println(
    "f([π, 0], u([π, 0])) = ",
    open_loop_pendulum_dynamics(upright_equilibrium, u(upright_equilibrium), p, 0.0)
)
println(
    "V ∋ [",
    min(V_func(upright_equilibrium), minimum(V_samples)),
    ", ",
    maximum(V_samples),
    "]"
)
println(
    "V̇ ∋ [",
    minimum(V̇_samples),
    ", ",
    max(V̇_func(upright_equilibrium), maximum(V̇_samples)),
    "]"
)
V(π, 0) = 0.0
f([π, 0], u([π, 0])) = [0.0, 0.0023310428558180937]
V ∋ [0.0, 69.01734616984959]
V̇ ∋ [-163.85617475813368, 0.07681333173334363]
using Plots
p1 = plot(xs / pi, ys, V_samples, linetype = :contourf,
    title = "V", xlabel = "θ/π", ylabel = "ω", c = :bone_1);
p1 = scatter!([-2 * pi, 0, 2 * pi] / pi, [0, 0, 0],
    label = "Downward Equilibria", color = :red, markershape = :x);
p1 = scatter!([-pi, pi] / pi, [0, 0],
    label = "Upward Equilibria", color = :green, markershape = :+);
p2 = plot(xs / pi, ys, V̇_samples, linetype = :contourf,
    title = "dV/dt", xlabel = "θ/π", ylabel = "ω", c = :binary);
p2 = scatter!([-2 * pi, 0, 2 * pi] / pi, [0, 0, 0],
    label = "Downward Equilibria", color = :red, markershape = :x);
p2 = scatter!([-pi, pi] / pi, [0, 0], label = "Upward Equilibria",
    color = :green, markershape = :+, legend = false);
p3 = plot(xs / pi, ys, V̇_samples .< 0, linetype = :contourf,
    title = "dV/dt < 0", xlabel = "θ/π", ylabel = "ω",
    colorbar = false, linewidth = 0);
p3 = scatter!([-2 * pi, 0, 2 * pi] / pi, [0, 0, 0],
    label = "Downward Equilibria", color = :green, markershape = :+);
p3 = scatter!([-pi, pi] / pi, [0, 0], label = "Upward Equilibria",
    color = :red, markershape = :x, legend = false);
plot(p1, p2, p3)
Now, let's simulate the closed-loop dynamics to verify that the controller can get our system to the upward equilibrium.
First, we'll start at the downward equilibrium:
using SymbolicUtils, SciMLBase  # for isterm and SymbolCache below
state_order = map(st -> SymbolicUtils.isterm(st) ? operation(st) : st, state_order)
state_syms = Symbol.(state_order)
closed_loop_dynamics = ODEFunction(
    (x, p, t) -> open_loop_pendulum_dynamics(x, u(x), p, t);
    sys = SciMLBase.SymbolCache(state_syms, Symbol.(parameters(pendulum)))
)
using OrdinaryDiffEq: Tsit5, solve
# Starting still at bottom ...
downward_equilibrium = zeros(2)
ode_prob = ODEProblem(closed_loop_dynamics, downward_equilibrium, [0.0, 120.0], p)
sol = solve(ode_prob, Tsit5())
plot(sol)
# ...the system should make it to the top
θ_end, ω_end = sol.u[end]
x_end, y_end = sin(θ_end), -cos(θ_end)
[x_end, y_end, ω_end] # Should be approximately [0.0, 1.0, 0.0]
3-element Vector{Float64}:
-0.003181410307668031
0.9999949393014218
-5.71762627906218e-8
Then, we'll start at a random state:
# Starting at a random point ...
x0 = lb .+ rand(2) .* (ub .- lb)
ode_prob = ODEProblem(closed_loop_dynamics, x0, [0.0, 150.0], p)
sol = solve(ode_prob, Tsit5())
plot(sol)
# ...the system should make it to the top
θ_end, ω_end = sol.u[end]
x_end, y_end = sin(θ_end), -cos(θ_end)
[x_end, y_end, ω_end] # Should be approximately [0.0, 1.0, 0.0]
3-element Vector{Float64}:
-0.0031814125121476873
0.9999949392944084
-6.289785571747594e-8