Policy Search and Network-Dependent Dynamics
At times, we wish to model a component of the dynamics with a neural network. A common example is policy search, in which the closed-loop dynamics include a neural network controller. In such cases, we consider dynamics of the form $\frac{dx}{dt} = f(x, u, p, t)$, where $u$ is the control input or, more generally, the contribution of the neural network to the dynamics. We provide the add_policy_search function to transform a NeuralLyapunovStructure so that the neural network is trained to represent not just the Lyapunov function, but also the relevant part of the dynamics.
Similar to get_numerical_lyapunov_function, we provide the get_policy convenience function to construct $u(x)$, which can be combined with the open-loop dynamics $f(x, u, p, t)$ to create the closed-loop dynamics $f_{cl}(x, p, t) = f(x, u(x), p, t)$.
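The composition $f_{cl}(x, p, t) = f(x, u(x), p, t)$ can be sketched in plain Julia. This is only an illustration: the pendulum model, its parameter values, and the names open_loop_pendulum, pd_policy, and close_loop are assumptions made for this example, and the hand-written policy stands in for whatever get_policy would return after training.

```julia
# Open-loop dynamics f(x, u, p, t) of a damped pendulum with a torque input.
# The pendulum and its parameters are illustrative assumptions.
function open_loop_pendulum(x, u, p, t)
    damping, gravity_over_length = p
    angle, velocity = x
    acceleration = -gravity_over_length * sin(angle) - damping * velocity + u[1]
    return [velocity, acceleration]
end

# Hypothetical hand-written policy u(x), standing in for a trained network.
pd_policy(x) = [-2.0 * x[1] - 1.0 * x[2]]

# Close the loop: f_cl(x, p, t) = f(x, u(x), p, t).
close_loop(f, u) = (x, p, t) -> f(x, u(x), p, t)

closed_loop_pendulum = close_loop(open_loop_pendulum, pd_policy)

p = (0.5, 9.81)
dx = closed_loop_pendulum([0.1, 0.0], p, 0.0)
```

The resulting closed_loop_pendulum has the (x, p, t) signature expected of dynamics without an explicit control input, so it could be passed to an ODE solver or any other code written for autonomous dynamics.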
NeuralLyapunov.add_policy_search - Function
add_policy_search(lyapunov_structure, new_dims; control_structure)
Add dependence on the neural network to the dynamics in a NeuralLyapunovStructure.
Arguments
lyapunov_structure::NeuralLyapunovStructure: provides structure for $V$ and $\dot{V}$; should assume dynamics of the form f(x, p, t).
new_dims::Integer: number of outputs of the neural network to pass into the dynamics through control_structure.
Keyword Arguments
control_structure::Function: transforms the final new_dims outputs of the neural net before passing them into the dynamics; defaults to identity, passing in the neural network outputs unchanged.
The returned NeuralLyapunovStructure expects dynamics of the form f(x, u, p, t), where u captures the dependence of the dynamics on the neural network (e.g., through a control input). When evaluating the dynamics, it uses u = control_structure(phi_end(x)), where phi_end is a function that returns the final new_dims outputs of the neural network. The other lyapunov_structure.network_dim outputs are used for calculating $V$ and $\dot{V}$, as specified originally by lyapunov_structure.
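The way the network outputs are split can be illustrated with a toy example. This is a sketch in plain Julia, not the package's implementation: toy_network is a hypothetical stand-in for a trained neural network, the saturating control_structure is an arbitrary choice, and the sketch assumes the outputs reserved for $V$ and $\dot{V}$ are the leading ones, since the control uses the final new_dims outputs.

```julia
# Hypothetical stand-in for a neural network with three outputs.
network_dim = 2   # outputs used for the Lyapunov function
new_dims = 1      # outputs passed into the dynamics
toy_network(x) = [x[1] + x[2], x[1] - x[2], 3.0 * x[1]]

# phi_end returns only the final new_dims outputs of the network.
phi_end(x) = toy_network(x)[(end - new_dims + 1):end]

# Example control_structure (an assumption): saturate each input to [-1, 1].
control_structure(raw) = clamp.(raw, -1.0, 1.0)

# Control input seen by the dynamics f(x, u, p, t).
u(x) = control_structure(phi_end(x))

# Remaining outputs, left for the Lyapunov function structure.
lyapunov_outputs(x) = toy_network(x)[1:network_dim]
```

A non-identity control_structure such as this one is a convenient way to encode input constraints, since every value passed into the dynamics then satisfies the bound regardless of the raw network output.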
NeuralLyapunov.get_policy - Function
get_policy(phi, θ, network_dim, control_dim; control_structure)
Generate a Julia function representing the control policy/unmodeled portion of the dynamics as a function of the state.
The returned function can operate on a state vector or columnwise on a matrix of state vectors.
Arguments
phi: the neural network, represented as phi(state, θ) if the neural network has a single output, or a Vector of the same with one entry per neural network output.
θ: the parameters of the neural network; θ[:φ1] should be the parameters of the first neural network output (even if there is only one), θ[:φ2] the parameters of the second (if there are multiple), and so on.
network_dim: total number of neural network outputs.
control_dim: number of neural network outputs used in the control policy.
Keyword Arguments
control_structure: transforms the final control_dim outputs of the neural net before passing them into the dynamics; defaults to identity, passing in the neural network outputs unchanged.
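To make the argument layout concrete, the following plain-Julia sketch mimics the documented behavior for the case where phi is a Vector with one entry per output. It is not the package's implementation: sketch_policy and linear_output are hypothetical names, the "networks" are simple linear maps, and θ is assumed here to be a NamedTuple with keys φ1, φ2, and φ3.

```julia
# Hypothetical single-output "network": a linear map of the state.
linear_output(state, weights) = weights' * state

# One entry of phi and one parameter set per network output.
phi = [linear_output, linear_output, linear_output]
θ = (φ1 = [1.0, 0.0], φ2 = [0.0, 1.0], φ3 = [2.0, -1.0])

# Sketch of a policy builder with the same argument order as get_policy.
function sketch_policy(phi, θ, network_dim, control_dim; control_structure = identity)
    # The control uses the final control_dim outputs of the network.
    control_indices = (network_dim - control_dim + 1):network_dim

    # Policy evaluated on a single state vector.
    function policy(state::AbstractVector)
        raw = [phi[i](state, θ[Symbol(:φ, i)]) for i in control_indices]
        return control_structure(raw)
    end

    # Policy evaluated columnwise on a matrix of state vectors.
    policy(states::AbstractMatrix) =
        reduce(hcat, [policy(col) for col in eachcol(states)])

    return policy
end

policy = sketch_policy(phi, θ, 3, 1)
single = policy([1.0, 0.5])
batch = policy([1.0 0.0; 0.5 1.0])
```

With network_dim = 3 and control_dim = 1, only the third output (with parameters θ[:φ3]) contributes to the control, and the matrix call returns one column of control values per column of states.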