Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification

Paper Title

Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification

Authors

Prithvi Akella, Anushri Dixit, Mohamadreza Ahmadi, Joel W. Burdick, Aaron D. Ames

Abstract

The dramatic increase in autonomous systems subject to variable environments has given rise to the pressing need to consider risk in both the synthesis and verification of policies for these systems. This paper aims to address a few problems regarding risk-aware verification and policy synthesis, by first developing a sample-based method to bound the risk measure evaluation of a random variable whose distribution is unknown. These bounds permit us to generate high-confidence verification statements for a large class of robotic systems. Second, we develop a sample-based method to determine solutions to non-convex optimization problems that outperform a large fraction of the decision space of possible solutions. Both sample-based approaches then permit us to rapidly synthesize risk-aware policies that are guaranteed to achieve a minimum level of system performance. To showcase our approach in simulation, we verify a cooperative multi-agent system and develop a risk-aware controller that outperforms the system's baseline controller. We also mention how our approach can be extended to account for any $g$-entropic risk measure, the subset of coherent risk measures on which we focus.
