Structuring a Neural Lyapunov function
Users have two ways to specify the structure of their neural Lyapunov function: through Lux and through NeuralLyapunov. NeuralLyapunov.jl is intended for use with NeuralPDE.jl, which is itself intended for use with Lux.jl. Users therefore must provide a Lux model representing the neural network $\phi(x)$ which will be trained, regardless of the transformation they use to go from $\phi$ to $V$.
In some cases, users will find it simplest to make $\phi(x)$ a simple multilayer perceptron and specify their neural Lyapunov function structure as a transformation of the function $\phi$ to the function $V$ using the NeuralLyapunovStructure struct, detailed below.
In other cases (particularly when they find the NeuralPDE parser has trouble tracing their structure), users may wish to represent their neural Lyapunov function structure using Lux.jl/Boltz.jl layers, integrating them into $\phi(x)$ and letting $V(x) = \phi(x)$ (as in NoAdditionalStructure, detailed below).
Users may also combine the two methods, particularly if they find that their structure can be broken down into a component that the NeuralPDE parser has trouble tracing but that exists in Lux/Boltz, and another aspect that can be written easily using a NeuralLyapunovStructure but does not correspond to any existing Lux/Boltz layer. (Such an example will be provided below.)
NeuralLyapunov.jl supplies two Lux structures and two pooling layers for structuring $\phi(x)$, along with three NeuralLyapunovStructure transformations. Additionally, users can always specify a custom structure using the NeuralLyapunovStructure struct.
Pre-defined NeuralLyapunov transformations
The simplest structure is to train the neural network directly to be the Lyapunov function, which can be accomplished using NoAdditionalStructure. This is particularly useful with the pre-defined Lux structures detailed in the following section.
NeuralLyapunov.NoAdditionalStructure — Function

NoAdditionalStructure()

Create a NeuralLyapunovStructure where the Lyapunov function is the neural network evaluated at the state. This does not impose any additional structure to enforce any Lyapunov conditions.
Corresponds to $V(x) = ϕ(x)$, where $ϕ$ is the neural network.
Dynamics are assumed to be in f(state, p, t) form, as in an ODEFunction. For f(state, input, p, t), consider using add_policy_search.
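For instance, the following sketch pairs NoAdditionalStructure with a small multilayer perceptron; the two-dimensional state and the layer sizes are illustrative assumptions, not requirements.

```julia
using Lux, NeuralLyapunov

# Hypothetical two-dimensional state; the scalar network output is used directly as V(x) = ϕ(x)
ϕ = Chain(Dense(2 => 16, tanh), Dense(16 => 16, tanh), Dense(16 => 1))

structure = NoAdditionalStructure()
```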
The condition that the Lyapunov function $V(x)$ must be minimized uniquely at the fixed point $x_0$ is often represented as a requirement for $V(x)$ to be positive away from the fixed point and zero at the fixed point. Put mathematically, it is sufficient to require $V(x) > 0 \, \forall x \ne x_0$ and $V(x_0) = 0$. We call such functions positive definite (around the fixed point $x_0$).
Two structures are provided which partially or fully enforce the minimization condition: NonnegativeStructure, which structurally enforces $V(x) \ge 0$ everywhere, and PositiveSemiDefiniteStructure, which additionally enforces $V(x_0) = 0$.
NeuralLyapunov.NonnegativeStructure — Function

NonnegativeStructure(network_dim; <keyword_arguments>)

Create a NeuralLyapunovStructure where the Lyapunov function is the squared L2 norm of the neural network output plus a constant δ times a function pos_def.
Corresponds to $V(x) = \lVert ϕ(x) \rVert^2 + δ \, \texttt{pos\_def}(x, x_0)$, where $ϕ$ is the neural network and $x_0$ is the equilibrium point.
This structure ensures $V(x) ≥ 0 \, ∀ x$ when $δ ≥ 0$ and pos_def is always nonnegative. Further, if $δ > 0$ and pos_def is strictly positive definite around fixed_point, the structure ensures that $V(x)$ is strictly positive away from fixed_point. In such cases, the minimization condition reduces to ensuring $V(x_0) = 0$, and so DontCheckNonnegativity(true) should be used.
Arguments
- network_dim: output dimensionality of the neural network.
Keyword Arguments
- δ: weight of pos_def, as above; defaults to 0.
- pos_def(state, fixed_point): a function that is positive (semi-)definite in state around fixed_point; defaults to $\log(1 + \lVert x - x_0 \rVert^2)$.
- grad_pos_def(state, fixed_point): the gradient of pos_def with respect to state at state. If isnothing(grad_pos_def) (as is the default), the gradient of pos_def will be evaluated using grad.
- grad: a function for evaluating gradients to be used when isnothing(grad_pos_def); defaults to, and expects the same arguments as, ForwardDiff.gradient.
Dynamics are assumed to be in f(state, p, t) form, as in an ODEFunction. For f(state, input, p, t), consider using add_policy_search.
See also: DontCheckNonnegativity
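As a minimal sketch (the network output dimension and the value of δ here are arbitrary choices for illustration):

```julia
using NeuralLyapunov

# V(x) = ‖ϕ(x)‖² + δ log(1 + ‖x - x_0‖²) with the default pos_def;
# the output dimension (here 3) must match the Lux network used for ϕ
structure = NonnegativeStructure(3; δ = 1e-6)

# Since δ > 0 and the default pos_def is strictly positive definite,
# only V(x_0) = 0 remains to be checked, as discussed above
minimization_condition = DontCheckNonnegativity(true)
```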
NeuralLyapunov.PositiveSemiDefiniteStructure — Function

PositiveSemiDefiniteStructure(network_dim; <keyword_arguments>)

Create a NeuralLyapunovStructure where the Lyapunov function is the product of a positive (semi-)definite function pos_def, which does not depend on the network, and a nonnegative function non_neg, which does depend on the network.
Corresponds to $V(x) = \texttt{pos\_def}(x, x_0) * \texttt{non\_neg}(ϕ, x, x_0)$, where $ϕ$ is the neural network and $x_0$ is the equilibrium point.
This structure ensures $V(x) ≥ 0$. Further, if pos_def is strictly positive definite around fixed_point and non_neg is strictly positive (as is the case for the default values of pos_def and non_neg), then this structure ensures $V(x)$ is strictly positive definite around fixed_point. In such cases, the minimization condition is satisfied structurally, so DontCheckNonnegativity(false) should be used.
Arguments
- network_dim: output dimensionality of the neural network.
Keyword Arguments
- pos_def(state, fixed_point): a function that is positive (semi-)definite in state around fixed_point; defaults to $\log(1 + \lVert x - x_0 \rVert^2)$.
- non_neg(net, state, fixed_point): a nonnegative function of the neural network; note that net is the neural network $ϕ$, and net(state) is the value of the neural network at a point $ϕ(x)$; defaults to $1 + \lVert ϕ(x) \rVert^2$.
- grad_pos_def(state, fixed_point): the gradient of pos_def with respect to state at state. If isnothing(grad_pos_def) (as is the default), the gradient of pos_def will be evaluated using grad.
- grad_non_neg(net, J_net, state, fixed_point): the gradient of non_neg with respect to state at state; J_net is a function outputting the Jacobian of net at the input. If isnothing(grad_non_neg) (as is the default), the gradient of non_neg will be evaluated using grad.
- grad: a function for evaluating gradients to be used when isnothing(grad_pos_def) || isnothing(grad_non_neg); defaults to, and expects the same arguments as, ForwardDiff.gradient.
Dynamics are assumed to be in f(state, p, t) form, as in an ODEFunction. For f(state, input, p, t), consider using add_policy_search.
See also: DontCheckNonnegativity
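A minimal sketch using the default pos_def and non_neg (the network output dimension of 2 is an illustrative assumption):

```julia
using NeuralLyapunov

# V(x) = log(1 + ‖x - x_0‖²) * (1 + ‖ϕ(x)‖²) with the default pos_def and non_neg
structure = PositiveSemiDefiniteStructure(2)

# Positive definiteness is enforced structurally, so no extra minimization check is needed
minimization_condition = DontCheckNonnegativity(false)
```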
Pre-defined Lux structures
Regardless of what NeuralLyapunov transformation is used to transform $\phi$ into $V$, users should carefully consider their choice of $\phi$. Two options provided by NeuralLyapunov, intended to be used with NoAdditionalStructure, are AdditiveLyapunovNet and MultiplicativeLyapunovNet. These each wrap a different Lux model, effectively performing the transformation from $\phi$ to $V$ within the Lux ecosystem, rather than in the NeuralPDE/ModelingToolkit symbolic ecosystem.
AdditiveLyapunovNet is based on (Gaby et al., 2021), and MultiplicativeLyapunovNet is an analogous structure combining the neural term and the positive definite term via multiplication instead of addition.
NeuralLyapunov.AdditiveLyapunovNet — Function

AdditiveLyapunovNet(ϕ; ψ, m, r, dim_ϕ, dim_m, fixed_point)

Construct a Lyapunov-Net with the following structure:
\[ V(x) = ψ(ϕ(x) - ϕ(x_0)) + r(m(x) - m(x_0)),\]
where $x_0$ is fixed_point and the functions are defined as below. If the functions meet the conditions listed below, the resulting model will be positive definite (around fixed_point), as the $r$ term will be positive definite and the $ψ$ term will be positive semidefinite.
Arguments
- ϕ: The base neural network model; its output dimension should be dim_ϕ.
- ψ: A Lux layer representing a positive semidefinite function that maps the output of ϕ to a scalar value; defaults to SoSPooling() (i.e., $\lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- m: Optional pre-processing layer for use before r. This layer should output a vector of dimension dim_m, and $m(x) = m(x_0)$ should imply that $x$ is an equilibrium to be analyzed by the Lyapunov function. Defaults to Lux.NoOpLayer(), which is typically the right choice when analyzing a single equilibrium point. Consider using a Boltz.Layers.PeriodicEmbedding if any of the state variables are periodic. Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- r: A Lux layer representing a positive definite function that maps the output of m to a scalar value; defaults to SoSPooling() (i.e., $\lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- dim_ϕ: The dimension of the output of ϕ.
- dim_m: The dimension of the output of m; defaults to length(fixed_point) when fixed_point is provided and dim_m isn't. Users must provide at least one of dim_m and fixed_point.
- fixed_point: A vector of length dim_m representing the fixed point; defaults to zeros(dim_m) when dim_m is provided and fixed_point isn't. Users must provide at least one of dim_m and fixed_point.
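For example, the following sketch builds an AdditiveLyapunovNet around a small multilayer perceptron; the input and output dimensions are assumptions for illustration.

```julia
using Lux, NeuralLyapunov

# Base network ϕ: 2-dimensional state in, 4-dimensional output
ϕ = Chain(Dense(2 => 16, tanh), Dense(16 => 4))

# With the default ψ and r (both SoSPooling) and the default m (NoOpLayer), this is
#   V(x) = ‖ϕ(x) - ϕ(x_0)‖² + ‖x - x_0‖²
V_net = AdditiveLyapunovNet(ϕ; dim_ϕ = 4, fixed_point = zeros(2))

# V_net is itself a Lux model, so it can be paired with NoAdditionalStructure
structure = NoAdditionalStructure()
```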
NeuralLyapunov.MultiplicativeLyapunovNet — Function

MultiplicativeLyapunovNet(ϕ; ζ, m, r, dim_m, fixed_point)

Construct a Lyapunov-Net with the following structure:
\[ V(x) = ζ(ϕ(x)) \, r(m(x) - m(x_0)),\]
where $x_0$ is fixed_point and the functions are defined as below. If the functions meet the conditions listed below, the resulting model will be positive definite (around fixed_point), as the $r$ term will be positive definite and the $ζ$ term will be strictly positive.
Arguments
- ϕ: The base neural network model.
- ζ: A Lux layer representing a strictly positive function that maps the output of ϕ to a scalar value; defaults to StrictlyPositiveSoSPooling() (i.e., $1 + \lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- m: Optional pre-processing layer for use before r. This layer should output a vector of dimension dim_m, and $m(x) = m(x_0)$ should imply that $x$ is an equilibrium to be analyzed by the Lyapunov function. Defaults to Lux.NoOpLayer(), which is typically the right choice when analyzing a single equilibrium point. Consider using a Boltz.Layers.PeriodicEmbedding if any of the state variables are periodic. Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- r: A Lux layer representing a positive definite function that maps the output of m to a scalar value; defaults to SoSPooling() (i.e., $\lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- dim_m: The dimension of the output of m; defaults to length(fixed_point) when fixed_point is provided and dim_m isn't.
- fixed_point: A vector of length dim_m representing the fixed point; defaults to zeros(dim_m) when dim_m is provided and fixed_point isn't.
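A corresponding sketch for MultiplicativeLyapunovNet, again with illustrative dimensions:

```julia
using Lux, NeuralLyapunov

# Base network ϕ: 2-dimensional state in, 4-dimensional output
ϕ = Chain(Dense(2 => 16, tanh), Dense(16 => 4))

# With the default ζ, m, and r, this is
#   V(x) = (1 + ‖ϕ(x)‖²) * ‖x - x_0‖²
V_net = MultiplicativeLyapunovNet(ϕ; fixed_point = zeros(2))

structure = NoAdditionalStructure()
```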
Note that using NoAdditionalStructure with MultiplicativeLyapunovNet wrapping a Lux model $\phi$ is the same as using PositiveSemiDefiniteStructure with the same $\phi$, but in the former the transformation is handled in the Lux ecosystem and in the latter it is handled in the NeuralPDE/ModelingToolkit symbolic ecosystem. Similarly, using NonnegativeStructure with Boltz.Layers.ShiftTo is analogous to using NoAdditionalStructure with AdditiveLyapunovNet. Because the NeuralPDE parser cannot process $\phi$ being evaluated at two different points (in this case $x$ and $x_0$), we cannot represent this structure purely in the NeuralPDE/ModelingToolkit ecosystem.
Helper layers provided for the above structures are also exported:
NeuralLyapunov.SoSPooling — Function

SoSPooling(; dim = 1)

Construct a pooling function that computes the sum of squares along the dimension dim.
NeuralLyapunov.StrictlyPositiveSoSPooling — Function

StrictlyPositiveSoSPooling(; dim = 1)

Construct a pooling function that computes 1 + the sum of squares along the dimension dim.
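For instance, the defaults of AdditiveLyapunovNet above can be written out explicitly; this sketch is equivalent to omitting ψ and r, and the dimensions are again illustrative.

```julia
using Lux, NeuralLyapunov

ϕ = Chain(Dense(2 => 16, tanh), Dense(16 => 4))

# Passing the default pooling layers explicitly:
#   ψ = SoSPooling() gives ‖ϕ(x) - ϕ(x_0)‖² and r = SoSPooling() gives ‖x - x_0‖²
V_net = AdditiveLyapunovNet(ϕ; ψ = SoSPooling(), r = SoSPooling(),
    dim_ϕ = 4, fixed_point = zeros(2))
```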
Defining your own neural Lyapunov function structure with NeuralLyapunovStructure
To define a new structure for a neural Lyapunov function, one must specify the form of the Lyapunov candidate $V$ and its time derivative along a trajectory $\dot{V}$, as well as how to call the dynamics $f$. Additionally, the dimensionality of the output of the neural network must be known in advance.
NeuralLyapunov.NeuralLyapunovStructure — Type

NeuralLyapunovStructure(V, V̇, f_call, network_dim)

Specifies the structure of the neural Lyapunov function and its derivative.
Allows the user to define the Lyapunov function in terms of the neural network, potentially structurally enforcing some Lyapunov conditions.
Fields
- V(phi, state, fixed_point): outputs the value of the Lyapunov function at state.
- V̇(phi, J_phi, dynamics, state, params, t, fixed_point): outputs the time derivative of the Lyapunov function at state.
- f_call(dynamics, phi, state, params, t): outputs the derivative of the state; this is useful for making closed-loop dynamics which depend on the neural network, such as in the policy search case.
- network_dim: the dimension of the output of the neural network.
phi and J_phi above are both functions of state alone.
Calling the dynamics
Very generally, the dynamical system can be a system of ODEs $\dot{x} = f(x, u, p, t)$, where $u$ is a control input, $p$ contains parameters, and $f$ depends on the neural network in some way. To capture this variety, users must supply the function f_call(dynamics, phi, state, p, t).
The most common example is when dynamics takes the same form as an ODEFunction, i.e., $\dot{x} = \texttt{dynamics}(x, p, t)$. In that case, f_call(dynamics, phi, state, p, t) = dynamics(state, p, t).
Suppose instead that the dynamics were supplied as a function of the state alone: $\dot{x} = \texttt{dynamics}(x)$. Then, f_call(dynamics, phi, state, p, t) = dynamics(state).
Finally, suppose $\dot{x} = \texttt{dynamics}(x, u, p, t)$ has a unidimensional control input that is being trained (via policy search) to be the second output of the neural network. Then f_call(dynamics, phi, state, p, t) = dynamics(state, phi(state)[2], p, t).
Note that, despite the inclusion of the time variable $t$, NeuralLyapunov.jl currently only supports time-invariant systems, so only t = 0.0 is used.
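In code, the three cases above might look like the following; the dynamics functions themselves are assumed to be supplied by the user, and the function names here are purely illustrative.

```julia
# ODEFunction-style dynamics: ẋ = dynamics(x, p, t)
f_call_ode(dynamics, phi, state, p, t) = dynamics(state, p, t)

# Dynamics given as a function of the state alone: ẋ = dynamics(x)
f_call_state_only(dynamics, phi, state, p, t) = dynamics(state)

# Policy search with a scalar control input u = ϕ(x)[2]: ẋ = dynamics(x, u, p, t)
f_call_policy(dynamics, phi, state, p, t) = dynamics(state, phi(state)[2], p, t)
```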
Structuring $V$ and $\dot{V}$
The Lyapunov candidate function $V$ gets specified as a function V(phi, state, fixed_point), where phi is the neural network as a function phi(state). Note that this form allows $V(x)$ to depend on the neural network evaluated at points other than just the input $x$.
The time derivative $\dot{V}$ is similarly defined by a function V̇(phi, J_phi, dynamics, state, params, t, fixed_point). The function J_phi(state) gives the Jacobian of the neural network phi at state. The function dynamics is as above (with parameters params).
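Putting the pieces together, here is a minimal sketch of a hand-written NeuralLyapunovStructure implementing $V(x) = \lVert \phi(x) \rVert^2$ for ODEFunction-style dynamics; the network output dimension of 2 is an illustrative assumption, and the result is essentially NonnegativeStructure with δ = 0.

```julia
using NeuralLyapunov
using LinearAlgebra: dot

dim_ϕ = 2  # assumed output dimension of the neural network

structure = NeuralLyapunovStructure(
    # V(x) = ‖ϕ(x)‖²
    (phi, state, fixed_point) -> dot(phi(state), phi(state)),
    # V̇(x) = 2 ϕ(x) ⋅ (J_ϕ(x) * f(x, p, t)) by the chain rule
    (phi, J_phi, dynamics, state, params, t, fixed_point) ->
        2 * dot(phi(state), J_phi(state) * dynamics(state, params, t)),
    # Dynamics are called in ODEFunction form: ẋ = dynamics(x, p, t)
    (dynamics, phi, state, params, t) -> dynamics(state, params, t),
    dim_ϕ
)
```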
References
- Gaby, N.; Zhang, F. and Ye, X. (2021). Lyapunov-Net: A Deep Neural Network Architecture for Lyapunov Function Approximation. CoRR abs/2109.13359, arXiv:2109.13359.