Paper Title
Safe Controller for Output Feedback Linear Systems using Model-Based Reinforcement Learning
Paper Authors
Paper Abstract
The objective of this research is to enable safety-critical systems to simultaneously learn and execute optimal control policies in a safe manner to achieve complex autonomy. Learning optimal policies via trial and error, i.e., traditional reinforcement learning, is difficult to implement in safety-critical systems, particularly when task restarts are unavailable. Safe model-based reinforcement learning techniques based on a barrier transformation have recently been developed to address this problem. However, these methods rely on full-state feedback, limiting their usability in real-world environments. In this work, an output-feedback safe model-based reinforcement learning technique based on a novel barrier-aware dynamic state estimator is designed to address this issue. The developed approach facilitates simultaneous learning and execution of safe control policies for safety-critical linear systems. Simulation results indicate that the barrier transformation is an effective approach to achieving online reinforcement learning in safety-critical systems using output feedback.
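For context, the barrier transformation referenced above maps each constrained state component into an unconstrained coordinate, so that learning and state estimation can proceed without the system ever leaving its safe set. As a rough sketch of the log-ratio form commonly used in this line of work (the exact transformation in the paper may differ; the symbols a, A, b, and s are assumptions introduced here for illustration), a scalar state x constrained to an interval (a, A) with a < 0 < A can be mapped to an unconstrained coordinate s by

\[
  s = b(x) = \log\!\left(\frac{A\,(a - x)}{a\,(A - x)}\right),
  \qquad
  x = b^{-1}(s) = \frac{a\,A\,\left(e^{s} - 1\right)}{a\,e^{s} - A},
\]

so that s ranges over all of the real line while x remains strictly inside (a, A). Expressing the dynamics, the learned value function, and the dynamic state estimator in the transformed coordinate s is what allows online reinforcement learning to run without proposing constraint-violating states.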