Survivable Hyper-Redundant Robotic Arm with Bayesian Policy Morphing

Paper Title

Survivable Hyper-Redundant Robotic Arm with Bayesian Policy Morphing

Authors

Raza, Sayyed Jaffar Ali; Dastider, Apan; Lin, Mingjie

Abstract

In this paper, we present a Bayesian reinforcement learning framework that allows robotic manipulators to recover autonomously and adaptively from random mechanical failures, making them survivable. To this end, we formulate Bayesian Policy Morphing (BPM), a framework that enables a robot agent to self-modify its learned policy after its maneuvering dimensionality has been reduced. We build upon the existing actor-critic framework and extend it to perform policy gradient updates as posterior learning, taking past policy updates as prior distributions. We show that policy search in the direction biased by prior experience significantly improves learning efficiency in terms of sampling requirements. We demonstrate our results on an 8-DOF robotic arm running our BPM algorithm while intentionally disabling random joints with different damage types, such as unresponsive joints, constant offset errors, and angular imprecision. Our results show that, even with physical damage, the robotic arm can still maintain its functionality and accurately locate and grasp a given target object.
