View Submission

A0789

Title: Efficient online reinforcement learning policies for continuous environments
Authors: Mohamad Kazem Shirani Faradonbeh - Southern Methodist University (United States) [presenting]
Abstract: Linear systems that evolve according to stochastic differential equations are among the most popular dynamical models for continuous environments. An interesting problem for this class of systems is learning to design control actions that minimize a quadratic cost function when the system matrices are unknown. Implementable online reinforcement learning policies that quickly learn the optimal control actions are discussed. The proposed policy efficiently balances exploration and exploitation by carefully randomizing the parameter estimates, so that the regret grows as the square root of time multiplied by the number of parameters. Theoretical performance analysis and simulations of learning to control an aeroplane will be presented to demonstrate this efficiency.
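To make the setting concrete, the following is a minimal sketch of the kind of model the abstract describes, not the policy proposed in the talk. It assumes a scalar linear system dx = (a x + b u) dt + dW with unit noise intensity and running cost q x^2 + r u^2. The parameters are treated as known, so it only illustrates the benchmark against which regret would be measured. The function names, parameter values, and the Euler-Maruyama discretization are all illustrative assumptions.

```python
import math
import random


def optimal_gain(a, b, q, r):
    """Scalar continuous-time LQR with known parameters (illustrative only).

    Returns (k, p), where p > 0 solves the scalar Riccati equation
    2*a*p - (b*p)**2 / r + q = 0 and u = -k*x is the optimal feedback.
    """
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return k, p


def average_cost(a, b, q, r, k, horizon=200.0, dt=0.01, seed=0):
    """Simulate dx = (a*x + b*u) dt + dW under u = -k*x (Euler-Maruyama).

    Returns the time-averaged quadratic cost over the simulated horizon.
    """
    rng = random.Random(seed)
    steps = int(horizon / dt)
    x, cost = 0.0, 0.0
    for _ in range(steps):
        u = -k * x
        cost += (q * x * x + r * u * u) * dt
        # Brownian increment over one step has standard deviation sqrt(dt).
        x += (a * x + b * u) * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return cost / (steps * dt)


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0  # hypothetical, open-loop unstable system
    k_opt, p = optimal_gain(a, b, q, r)
    print("optimal gain:", k_opt)
    print("simulated average cost, optimal gain:", average_cost(a, b, q, r, k_opt))
    print("simulated average cost, detuned gain:", average_cost(a, b, q, r, 2.0 * k_opt))
```

In this sketch, the regret of a learning policy over a horizon T would be its accumulated cost minus that of the known-parameter controller. A policy that must estimate a and b from the observed trajectory would replace the fixed gain with one recomputed from its current (possibly randomized) estimates.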