A1004
Title: Value-gradient based formulation of optimal control problem and machine learning algorithm
Authors: Xiang Zhou - City University of Hong Kong (Hong Kong) [presenting]
Abstract: The optimal control problem is typically solved by first finding the value function through the Hamilton-Jacobi equation (HJE) and then taking the minimizer of the Hamiltonian to obtain the control. Instead of focusing on the value function, a new formulation is proposed for the gradient of the value function (the value-gradient) as a decoupled system of partial differential equations, in the context of a continuous-time deterministic discounted optimal control problem. The formulation is similar to differential learning, but its derivation is simple and based on the Eulerian viewpoint rather than directly on the underlying dynamics. An efficient iterative scheme that solves the equations of this system in parallel is developed by exploiting the fact that they share the same characteristic curves as the HJE for the value function. On the theoretical side, this iterative scheme is proven to converge linearly. For the numerical method, the method of characteristics is combined with machine learning techniques. Experimental results demonstrate that the new method not only significantly increases the accuracy but also improves the efficiency and robustness of the numerical estimates, particularly with a smaller amount of characteristic data or fewer training steps.
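Illustrative sketch (not taken from the talk): one standard way to arrive at an equation for the value-gradient is to differentiate the discounted HJE with respect to the state. The notation below is assumed for illustration and does not appear in the abstract: state x, control u, dynamics f, running cost \ell, discount rate \gamma > 0, value function V, and value-gradient \lambda = \nabla V. The sketch further assumes that V is twice differentiable and that the minimizer u^* is interior and smooth, so that the envelope theorem applies. The author's exact formulation may differ.

```latex
% Assumed setting: discounted infinite-horizon deterministic control.
\begin{align*}
V(x) &= \min_{u(\cdot)} \int_0^\infty e^{-\gamma t}\,
        \ell\bigl(X(t),u(t)\bigr)\,dt,
        \qquad \dot X = f(X,u),\quad X(0)=x, \\[4pt]
% Hamilton-Jacobi equation for the value function.
\gamma V(x) &= \min_{u}\,\bigl\{\ell(x,u) + \nabla V(x)\cdot f(x,u)\bigr\}, \\[4pt]
% Differentiate in x_i; terms containing \partial u^*/\partial x_i vanish
% by the envelope theorem, and \partial_{x_i}\lambda_j = \partial_{x_j}\lambda_i
% because the Hessian of V is symmetric.
\gamma \lambda_i(x) &= \partial_{x_i}\ell(x,u^*)
   + \sum_{j} \partial_{x_i} f_j(x,u^*)\,\lambda_j(x)
   + \sum_{j} f_j(x,u^*)\,\partial_{x_j}\lambda_i(x),
   \qquad \lambda = \nabla V .
\end{align*}
```

In this sketch every component \lambda_i is transported by the same operator f(x,u^*) \cdot \nabla, so all components share the characteristic curves \dot x = f(x,u^*) of the HJE, which matches the property the abstract says the iterative scheme exploits.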