A reinforcement learning approach to hybrid control design

Paper title

A reinforcement learning approach to hybrid control design

Authors

Gandhi, Meet; Kundu, Atreyee; Bhatnagar, Shalabh

Abstract

In this paper we design hybrid control policies for hybrid systems whose mathematical models are unknown. Our contributions are threefold. First, we propose a framework for modelling the hybrid control design problem as a single Markov Decision Process (MDP). This result facilitates the application of off-the-shelf algorithms from the Reinforcement Learning (RL) literature towards designing optimal control policies. Second, we model a set of benchmark examples of hybrid control design problems in the proposed MDP framework. Third, we adapt the recently proposed Proximal Policy Optimisation (PPO) algorithm for the hybrid action space and apply it to the above set of problems. It is observed that in each case the algorithm converges and finds the optimal policy.
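
The abstract does not spell out how PPO is adapted to a hybrid action space, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes an action is a pair (discrete mode, continuous input), that the policy factorises into a categorical distribution over modes and a Gaussian over the continuous input, and that the joint log-probability (the sum of the two) feeds the usual PPO clipped surrogate. All names (`mode_probs`, `mean`, `std`, `clip_eps`) are hypothetical.

```python
import math
import random


def hybrid_log_prob(mode_probs, mean, std, mode, u):
    """Joint log-probability of a hybrid action (mode, u).

    Assumption: the policy factorises as p(mode) * N(u; mean[mode], std^2).
    """
    log_p_mode = math.log(mode_probs[mode])
    z = (u - mean[mode]) / std
    log_p_u = -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)
    return log_p_mode + log_p_u


def sample_hybrid_action(mode_probs, mean, std, rng):
    """Sample a discrete mode, then a continuous input conditioned on that mode."""
    mode = rng.choices(range(len(mode_probs)), weights=mode_probs)[0]
    u = rng.gauss(mean[mode], std)
    return mode, u, hybrid_log_prob(mode_probs, mean, std, mode, u)


def ppo_clipped_term(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """PPO clipped surrogate for one sample, using the joint log-probabilities."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

Because the discrete and continuous parts enter only through their summed log-probability, the clipped objective itself is unchanged from standard PPO; only the policy head and the sampling step need to know about the hybrid structure.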
