Structuring a Neural Lyapunov function
Users have two ways to specify the structure of their neural Lyapunov function: through Lux and through NeuralLyapunov. NeuralLyapunov.jl is intended for use with NeuralPDE.jl, which is itself intended for use with Lux.jl. Users therefore must provide a Lux model representing the neural network $\phi(x)$ which will be trained, regardless of the transformation they use to go from $\phi$ to $V$.
In some cases, users will find it simplest to make $\phi(x)$ a simple multilayer perceptron and specify their neural Lyapunov function structure as a transformation of the function $\phi$ to the function $V$ using the NeuralLyapunovStructure
struct, detailed below.
In other cases (particularly when they find the NeuralPDE parser has trouble tracing their structure), users may wish to represent their neural Lyapunov function structure using Lux.jl/Boltz.jl layers, integrating them into $\phi(x)$ and letting $V(x) = \phi(x)$ (as in NoAdditionalStructure
, detailed below).
Users may also combine the two methods, particularly if they find that their structure can be broken down into a component that the NeuralPDE parser has trouble tracing but exists in Lux/Boltz, and another aspect that can be written easily using a NeuralLyapunovStructure
but does not correspond to any existing Lux/Boltz layer. (Such an example will be provided below.)
NeuralLyapunov.jl supplies two Lux structures and two pooling layers for structuring $\phi(x)$, along with three NeuralLyapunovStructure
transformations. Additionally, users can always specify a custom structure using the NeuralLyapunovStructure
struct.
Pre-defined NeuralLyapunov transformations
The simplest structure is to train the neural network directly to be the Lyapunov function, which can be accomplished using NoAdditionalStructure
. This is particularly useful with the pre-defined Lux structures detailed in the following section.
NeuralLyapunov.NoAdditionalStructure — Function
NoAdditionalStructure()
Create a NeuralLyapunovStructure
where the Lyapunov function is the neural network evaluated at the state. This does not impose any additional structure to enforce any Lyapunov conditions.
Corresponds to $V(x) = ϕ(x)$, where $ϕ$ is the neural network.
Dynamics are assumed to be in f(state, p, t)
form, as in an ODEFunction
. For f(state, input, p, t)
, consider using add_policy_search
.
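For instance, a minimal sketch of pairing NoAdditionalStructure with a small Lux multilayer perceptron might look like the following (the layer sizes and activation are illustrative choices, not requirements):

```julia
using Lux, NeuralLyapunov

# A small MLP ϕ(x) with scalar output for a 2-dimensional state;
# with NoAdditionalStructure, the Lyapunov candidate is simply V(x) = ϕ(x).
dim_state = 2
ϕ = Chain(Dense(dim_state => 16, tanh), Dense(16 => 1))

structure = NoAdditionalStructure()
```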
The condition that the Lyapunov function $V(x)$ must be minimized uniquely at the fixed point $x_0$ is often represented as a requirement for $V(x)$ to be positive away from the fixed point and zero at the fixed point. Put mathematically, it is sufficient to require $V(x) > 0 \, \forall x \ne x_0$ and $V(x_0) = 0$. We call such functions positive definite (around the fixed point $x_0$).
Two structures are provided which partially or fully enforce the minimization condition: NonnegativeStructure
, which structurally enforces $V(x) \ge 0$ everywhere, and PositiveSemiDefiniteStructure
, which additionally enforces $V(x_0) = 0$.
NeuralLyapunov.NonnegativeStructure — Function
NonnegativeStructure(network_dim; <keyword_arguments>)
Create a NeuralLyapunovStructure
where the Lyapunov function is the squared L2 norm of the neural network output plus a constant δ times a function pos_def
.
Corresponds to $V(x) = \lVert ϕ(x) \rVert^2 + δ \, \texttt{pos\_def}(x, x_0)$, where $ϕ$ is the neural network and $x_0$ is the equilibrium point.
This structure ensures $V(x) ≥ 0 \, ∀ x$ when $δ ≥ 0$ and pos_def
is always nonnegative. Further, if $δ > 0$ and pos_def
is strictly positive definite around fixed_point
, the structure ensures that $V(x)$ is strictly positive away from fixed_point
. In such cases, the minimization condition reduces to ensuring $V(x_0) = 0$, and so DontCheckNonnegativity(true)
should be used.
Arguments
- network_dim: output dimensionality of the neural network.
Keyword Arguments
- δ: weight of pos_def, as above; defaults to 0.
- pos_def(state, fixed_point): a function that is positive (semi-)definite in state around fixed_point; defaults to $\log(1 + \lVert x - x_0 \rVert^2)$.
- grad_pos_def(state, fixed_point): the gradient of pos_def with respect to state at state. If isnothing(grad_pos_def) (as is the default), the gradient of pos_def will be evaluated using grad.
- grad: a function for evaluating gradients to be used when isnothing(grad_pos_def); defaults to, and expects the same arguments as, ForwardDiff.gradient.
Dynamics are assumed to be in f(state, p, t)
form, as in an ODEFunction
. For f(state, input, p, t)
, consider using add_policy_search
.
See also: DontCheckNonnegativity
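As a hedged sketch, a NonnegativeStructure for a three-output network with a small positive δ (so that the default pos_def term makes V strictly positive away from the fixed point) could be set up as follows; the dimension and weight are illustrative:

```julia
using NeuralLyapunov

# Sketch: V(x) = ||ϕ(x)||² + 0.1 * log(1 + ||x - x₀||²) for a 3-output network.
# With δ > 0 and the default pos_def, only V(x₀) = 0 remains to be checked,
# so this pairs with DontCheckNonnegativity(true).
structure = NonnegativeStructure(3; δ = 0.1)
minimization_condition = DontCheckNonnegativity(true)
```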
NeuralLyapunov.PositiveSemiDefiniteStructure — Function
PositiveSemiDefiniteStructure(network_dim; <keyword_arguments>)
Create a NeuralLyapunovStructure
where the Lyapunov function is the product of a positive (semi-)definite function pos_def
which does not depend on the network and a nonnegative function non_neg
which does depend on the network.
Corresponds to $V(x) = \texttt{pos\_def}(x, x_0) * \texttt{non\_neg}(ϕ, x, x_0)$, where $ϕ$ is the neural network and $x_0$ is the equilibrium point.
This structure ensures $V(x) ≥ 0$. Further, if pos_def is strictly positive definite around fixed_point
and non_neg
is strictly positive (as is the case for the default values of pos_def
and non_neg
), then this structure ensures $V(x)$ is strictly positive definite around fixed_point
. In such cases, the minimization condition is satisfied structurally, so DontCheckNonnegativity(false)
should be used.
Arguments
- network_dim: output dimensionality of the neural network.
Keyword Arguments
- pos_def(state, fixed_point): a function that is positive (semi-)definite in state around fixed_point; defaults to $\log(1 + \lVert x - x_0 \rVert^2)$.
- non_neg(net, state, fixed_point): a nonnegative function of the neural network; note that net is the neural network $ϕ$, and net(state) is the value of the neural network at a point $ϕ(x)$; defaults to $1 + \lVert ϕ(x) \rVert^2$.
- grad_pos_def(state, fixed_point): the gradient of pos_def with respect to state at state. If isnothing(grad_pos_def) (as is the default), the gradient of pos_def will be evaluated using grad.
- grad_non_neg(net, J_net, state, fixed_point): the gradient of non_neg with respect to state at state; J_net is a function outputting the Jacobian of net at the input. If isnothing(grad_non_neg) (as is the default), the gradient of non_neg will be evaluated using grad.
- grad: a function for evaluating gradients to be used when isnothing(grad_pos_def) || isnothing(grad_non_neg); defaults to, and expects the same arguments as, ForwardDiff.gradient.
Dynamics are assumed to be in f(state, p, t)
form, as in an ODEFunction
. For f(state, input, p, t)
, consider using add_policy_search
.
See also: DontCheckNonnegativity
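As a sketch, the default PositiveSemiDefiniteStructure for a two-output network, paired with the corresponding minimization condition, might be set up as follows (the output dimension is an illustrative choice):

```julia
using NeuralLyapunov

# Sketch: V(x) = log(1 + ||x - x₀||²) * (1 + ||ϕ(x)||²) for a 2-output network.
# This structure is positive definite around x₀ by construction, so the
# minimization condition is handled structurally via DontCheckNonnegativity(false).
structure = PositiveSemiDefiniteStructure(2)
minimization_condition = DontCheckNonnegativity(false)
```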
Pre-defined Lux structures
Regardless of what NeuralLyapunov transformation is used to transform $\phi$ into $V$, users should carefully consider their choice of $\phi$. Two options provided by NeuralLyapunov, intended to be used with NoAdditionalStructure
, are AdditiveLyapunovNet
and MultiplicativeLyapunovNet
. Each of these wraps a user-provided Lux model, effectively performing the transformation from $\phi$ to $V$ within the Lux ecosystem, rather than in the NeuralPDE/ModelingToolkit symbolic ecosystem.
AdditiveLyapunovNet
is based on (Gaby et al., 2021), and MultiplicativeLyapunovNet
is an analogous structure combining the neural term and the positive definite term via multiplication instead of addition.
NeuralLyapunov.AdditiveLyapunovNet — Function
AdditiveLyapunovNet(ϕ; ψ, m, r, dim_ϕ, dim_m, fixed_point)
Construct a Lyapunov-Net with the following structure:
\[ V(x) = ψ(ϕ(x) - ϕ(x_0)) + r(m(x) - m(x_0)),\]
where $x_0$ is fixed_point
and the functions are defined as below. If the functions meet the conditions listed below, the resulting model will be positive definite (around fixed_point
), as the $r$ term will be positive definite and the $ψ$ term will be positive semidefinite.
Arguments
- ϕ: The base neural network model; its output dimension should be dim_ϕ.
- ψ: A Lux layer representing a positive semidefinite function that maps the output of ϕ to a scalar value; defaults to SoSPooling() (i.e., $\lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- m: Optional pre-processing layer for use before r. This layer should output a vector of dimension dim_m and $m(x) = m(x_0)$ should imply that $x$ is an equilibrium to be analyzed by the Lyapunov function. Defaults to Lux.NoOpLayer(), which is typically the right choice when analyzing a single equilibrium point. Consider using a Boltz.Layers.PeriodicEmbedding if any of the state variables are periodic. Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- r: A Lux layer representing a positive definite function that maps the output of m to a scalar value; defaults to SoSPooling() (i.e., $\lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- dim_ϕ: The dimension of the output of ϕ.
- dim_m: The dimension of the output of m; defaults to length(fixed_point) when fixed_point is provided and dim_m isn't. Users must provide at least one of dim_m and fixed_point.
- fixed_point: A vector of length dim_m representing the fixed point; defaults to zeros(dim_m) when dim_m is provided and fixed_point isn't. Users must provide at least one of dim_m and fixed_point.
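A minimal sketch of wrapping an MLP in an AdditiveLyapunovNet with the default ψ, m, and r (the layer sizes are illustrative assumptions):

```julia
using Lux, NeuralLyapunov

# Sketch: ϕ is an 8-output MLP on a 2-dimensional state; with the default
# ψ, m, and r, the wrapper computes
# V(x) = ||ϕ(x) - ϕ(x₀)||² + ||x - x₀||², positive definite around x₀.
dim_state = 2
ϕ = Chain(Dense(dim_state => 16, tanh), Dense(16 => 8))
V_net = AdditiveLyapunovNet(ϕ; dim_ϕ = 8, fixed_point = zeros(dim_state))

# The Lyapunov structure is built into V_net itself, so pair it with
# NoAdditionalStructure when setting up the training problem.
structure = NoAdditionalStructure()
```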
NeuralLyapunov.MultiplicativeLyapunovNet — Function
MultiplicativeLyapunovNet(ϕ; ζ, m, r, dim_m, fixed_point)
Construct a Lyapunov-Net with the following structure:
\[ V(x) = ζ(ϕ(x)) (r(m(x) - m(x_0))),\]
where $x_0$ is fixed_point
and the functions are defined as below. If the functions meet the conditions listed below, the resulting model will be positive definite (around fixed_point
), as the $r$ term will be positive definite and the $ζ$ term will be strictly positive.
Arguments
- ϕ: The base neural network model.
- ζ: A Lux layer representing a strictly positive function that maps the output of ϕ to a scalar value; defaults to StrictlyPositiveSoSPooling() (i.e., $1 + \lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- m: Optional pre-processing layer for use before r. This layer should output a vector of dimension dim_m and $m(x) = m(x_0)$ should imply that $x$ is an equilibrium to be analyzed by the Lyapunov function. Defaults to Lux.NoOpLayer(), which is typically the right choice when analyzing a single equilibrium point. Consider using a Boltz.Layers.PeriodicEmbedding if any of the state variables are periodic. Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- r: A Lux layer representing a positive definite function that maps the output of m to a scalar value; defaults to SoSPooling() (i.e., $\lVert ⋅ \rVert^2$). Users may provide a function instead of a Lux layer, in which case it will be wrapped into a layer via Lux.WrappedFunction.
- dim_m: The dimension of the output of m; defaults to length(fixed_point) when fixed_point is provided and dim_m isn't.
- fixed_point: A vector of length dim_m representing the fixed point; defaults to zeros(dim_m) when dim_m is provided and fixed_point isn't.
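A corresponding sketch with the default ζ, m, and r (again with illustrative layer sizes):

```julia
using Lux, NeuralLyapunov

# Sketch: with the default ζ, m, and r, the wrapper computes
# V(x) = (1 + ||ϕ(x)||²) * ||x - x₀||², positive definite around x₀.
dim_state = 2
ϕ = Chain(Dense(dim_state => 16, tanh), Dense(16 => 8))
V_net = MultiplicativeLyapunovNet(ϕ; fixed_point = zeros(dim_state))
structure = NoAdditionalStructure()
```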
Note that using NoAdditionalStructure
with MultiplicativeLyapunovNet
wrapping a Lux model $\phi$ is the same as using PositiveSemiDefiniteStructure
with the same $\phi$, but in the former the transformation is handled in the Lux ecosystem and in the latter it is handled in the NeuralPDE/ModelingToolkit ecosystem. Similarly, using NonnegativeStructure
with Boltz.Layers.ShiftTo
is analogous to using NoAdditionalStructure
with AdditiveLyapunovNet
. Because the NeuralPDE parser cannot process $\phi$ being evaluated at two different points (in this case $x$ and $x_0$), we cannot represent this structure purely in the NeuralPDE/ModelingToolkit ecosystem.
Helper layers provided for the above structures are also exported:
NeuralLyapunov.SoSPooling — Function
SoSPooling(; dim = 1)
Construct a pooling function that computes the sum of squares along the dimension dim
.
NeuralLyapunov.StrictlyPositiveSoSPooling — Function
StrictlyPositiveSoSPooling(; dim = 1)
Construct a pooling function that computes 1 + the sum of squares along the dimension dim
.
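As a hedged sketch, these pooling layers can be passed explicitly to the wrappers above; since they are also the defaults, this is equivalent to omitting the ψ and ζ keyword arguments (the network shape is an illustrative assumption):

```julia
using Lux, NeuralLyapunov

# Sketch: passing the pooling layers explicitly (these are also the defaults).
ϕ = Chain(Dense(2 => 16, tanh), Dense(16 => 8))
V_add = AdditiveLyapunovNet(ϕ; ψ = SoSPooling(), dim_ϕ = 8, fixed_point = zeros(2))
V_mul = MultiplicativeLyapunovNet(ϕ; ζ = StrictlyPositiveSoSPooling(),
    fixed_point = zeros(2))
```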
Defining your own neural Lyapunov function structure with NeuralLyapunovStructure
To define a new structure for a neural Lyapunov function, one must specify the form of the Lyapunov candidate $V$ and its time derivative along a trajectory $\dot{V}$, as well as how to call the dynamics $f$. Additionally, the dimensionality of the output of the neural network must be known in advance.
NeuralLyapunov.NeuralLyapunovStructure — Type
NeuralLyapunovStructure(V, V̇, f_call, network_dim)
Specifies the structure of the neural Lyapunov function and its derivative.
Allows the user to define the Lyapunov function in terms of the neural network, potentially structurally enforcing some Lyapunov conditions.
Fields
- V(phi::Function, state, fixed_point): outputs the value of the Lyapunov function at state.
- V̇(phi::Function, J_phi::Function, dynamics::Function, state, params, t, fixed_point): outputs the time derivative of the Lyapunov function at state.
- f_call(dynamics::Function, phi::Function, state, params, t): outputs the derivative of the state; this is useful for making closed-loop dynamics which depend on the neural network, such as in the policy search case.
- network_dim: the dimension of the output of the neural network.
phi
and J_phi
above are both functions of state
alone.
Calling the dynamics
Very generally, the dynamical system can be a system of ODEs $\dot{x} = f(x, u, p, t)$, where $u$ is a control input, $p$ contains parameters, and $f$ depends on the neural network in some way. To capture this variety, users must supply the function f_call(dynamics::Function, phi::Function, state, p, t)
.
The most common example is when dynamics
takes the same form as an ODEFunction
, i.e., $\dot{x} = \texttt{dynamics}(x, p, t)$. In that case, f_call(dynamics, phi, state, p, t) = dynamics(state, p, t)
.
Suppose instead that the dynamics were supplied as a function of the state alone: $\dot{x} = \texttt{dynamics}(x)$. Then, f_call(dynamics, phi, state, p, t) = dynamics(state)
.
Finally, suppose $\dot{x} = \texttt{dynamics}(x, u, p, t)$ has a unidimensional control input that is being trained (via policy search) to be the second output of the neural network. Then f_call(dynamics, phi, state, p, t) = dynamics(state, phi(state)[2], p, t)
.
Note that, despite the inclusion of the time variable $t$, NeuralLyapunov.jl currently only supports time-invariant systems, so only t = 0.0
is used.
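A hedged sketch of the three f_call variants described above (the function names are illustrative; only the signature matters):

```julia
# Dynamics in ODEFunction form: ẋ = dynamics(x, p, t)
f_call_ode(dynamics, phi, state, p, t) = dynamics(state, p, t)

# Dynamics as a function of the state alone: ẋ = dynamics(x)
f_call_state(dynamics, phi, state, p, t) = dynamics(state)

# Policy search: ẋ = dynamics(x, u, p, t) with the scalar control input
# u = ϕ(x)[2] taken from the second output of the neural network
f_call_policy(dynamics, phi, state, p, t) = dynamics(state, phi(state)[2], p, t)
```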
Structuring $V$ and $\dot{V}$
The Lyapunov candidate function $V$ gets specified as a function V(phi, state, fixed_point)
, where phi
is the neural network as a function phi(state)
. Note that this form allows $V(x)$ to depend on the neural network evaluated at points other than just the input $x$.
The time derivative $\dot{V}$ is similarly defined by a function V̇(phi, J_phi, dynamics, state, params, t, fixed_point)
. The function J_phi(state)
gives the Jacobian of the neural network phi
at state
. The function dynamics
is as above (with parameters params
).
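Putting the pieces together, here is a minimal sketch of a hand-written structure equivalent to NonnegativeStructure with δ = 0, i.e., $V(x) = \lVert ϕ(x) \rVert^2$, for dynamics in ODEFunction form; the chain rule gives $\dot{V}(x) = 2 ϕ(x)^{\top} J_ϕ(x) f(x, p, t)$, and the output dimension is an illustrative assumption:

```julia
using LinearAlgebra, NeuralLyapunov

dim_ϕ = 3  # illustrative output dimension of the neural network

# V(x) = ||ϕ(x)||²
V(phi, state, fixed_point) = sum(abs2, phi(state))

# V̇(x) = 2 ϕ(x) ⋅ (J_ϕ(x) ẋ) by the chain rule, with ẋ = dynamics(x, p, t)
function V̇(phi, J_phi, dynamics, state, params, t, fixed_point)
    return 2 * dot(phi(state), J_phi(state) * dynamics(state, params, t))
end

# Dynamics in ODEFunction form
f_call(dynamics, phi, state, params, t) = dynamics(state, params, t)

structure = NeuralLyapunovStructure(V, V̇, f_call, dim_ϕ)
```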
References
- Gaby, N.; Zhang, F. and Ye, X. (2021). Lyapunov-Net: A Deep Neural Network Architecture for Lyapunov Function Approximation. CoRR abs/2109.13359, arXiv:2109.13359.