Mixing Probabilistic and non-Probabilistic Objectives in Markov Decision Processes

Paper title

Mixing Probabilistic and non-Probabilistic Objectives in Markov Decision Processes

Authors

Raphaël Berthon, Shibashis Guha, Jean-François Raskin

Abstract

In this paper, we consider algorithms to decide the existence of strategies in MDPs for Boolean combinations of objectives. These objectives are omega-regular properties that need to be enforced either surely, almost surely, existentially, or with non-zero probability. In this setting, relevant strategies are randomized infinite memory strategies: both infinite memory and randomization may be needed to play optimally. We provide algorithms to solve the general case of Boolean combinations and we also investigate relevant subcases. We further report on complexity bounds for these problems.
