Interpretable Stochastic Model Predictive Control using Distributional Reinforced Estimation for Quadrotor Tracking Systems

Paper Title

Interpretable Stochastic Model Predictive Control using Distributional Reinforced Estimation for Quadrotor Tracking Systems

Authors

Yanran Wang, James O'Keeffe, Qiuchen Qian, David Boyle

Abstract

This paper presents a novel trajectory tracker for autonomous quadrotor navigation in dynamic and complex environments. The proposed framework integrates a distributional Reinforcement Learning (RL) estimator for unknown aerodynamic effects into a Stochastic Model Predictive Controller (SMPC) for trajectory tracking. Aerodynamic effects arising from drag forces and moment variations are difficult to model directly and accurately, so most current quadrotor tracking systems treat them as simple "disturbances" within conventional control approaches. We propose the Quantile-approximation-based Distributional Reinforced-disturbance-estimator, an aerodynamic disturbance estimator, to accurately identify disturbances, i.e., the uncertainties between the true and estimated values of aerodynamic effects. Simplified Affine Disturbance Feedback is employed for control parameterization to guarantee convexity, and we integrate it with an SMPC to obtain sufficient and non-conservative control signals. We demonstrate that our system reduces cumulative tracking error by at least 66% under unknown and diverse aerodynamic forces compared with recent state-of-the-art methods. To address the non-interpretability of traditional Reinforcement Learning, we provide convergence and stability guarantees for the Distributional RL estimator and the SMPC, respectively, under non-zero-mean disturbances.
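
To make the idea of a quantile approximation of a disturbance distribution more concrete, the following is a minimal, generic sketch and not the authors' estimator. It assumes scalar disturbance samples drawn from a hypothetical Gaussian, and tracks a few quantile levels by stochastic subgradient descent on the pinball (quantile-regression) loss. All function names, the step size, and the sample distribution are illustrative assumptions.

```python
import random


def pinball_loss(tau, target, pred):
    """Pinball (quantile-regression) loss for a quantile level tau in (0, 1)."""
    diff = target - pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def estimate_quantiles(samples, taus, lr=0.01):
    """Track one estimate per quantile level tau.

    Each sample moves every estimate one subgradient step on the pinball
    loss: up by lr * tau when the sample is at or above the estimate,
    down by lr * (1 - tau) when it is below.
    """
    estimates = [0.0 for _ in taus]
    for x in samples:
        for i, tau in enumerate(taus):
            below = 1.0 if x < estimates[i] else 0.0
            estimates[i] += lr * (tau - below)
    return estimates


def sample_disturbances(n, mean=0.5, std=0.2, seed=0):
    """Hypothetical non-zero-mean disturbance samples (assumed Gaussian)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]


if __name__ == "__main__":
    taus = [0.1, 0.5, 0.9]
    quantiles = estimate_quantiles(sample_disturbances(20000), taus)
    for tau, q in zip(taus, quantiles):
        print(f"tau={tau:.1f}  estimated quantile={q:.3f}")
```

A set of quantile estimates like this describes the spread of the disturbance rather than only its mean, which is the kind of information a stochastic controller can use when tightening constraints. How the paper's estimator is trained and coupled to the SMPC is not specified in the abstract.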

Paper Title