Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits

Paper title

Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits

Authors

Liu, Xinming; Halpern, Joseph Y.

Abstract

While traditional economics assumes that humans are fully rational agents who always maximize their expected utility, in practice, we constantly observe apparently irrational behavior. One explanation is that people have limited computational power, so that they are, quite rationally, making the best decisions they can, given their computational limitations. To test this hypothesis, we consider the multi-armed bandit (MAB) problem. We examine a simple strategy for playing an MAB that can be implemented easily by a probabilistic finite automaton (PFA). Roughly speaking, the PFA sets certain expectations, and plays an arm as long as it meets them. If the PFA has sufficiently many states, it performs near-optimally. Its performance degrades gracefully as the number of states decreases. Moreover, the PFA acts in a "human-like" way, exhibiting a number of standard human biases, like an optimism bias and a negativity bias.
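To make the idea of "plays an arm as long as it meets expectations" concrete, here is a minimal sketch of a finite-state strategy on a Bernoulli bandit. It is not the automaton from the paper: the function name `run_counter_automaton`, the `num_levels` patience counter, and the switching rule are all assumptions chosen for illustration. The agent remembers only its current arm and a small counter, raises the counter on a reward, lowers it on a failure, and abandons the arm for a random other one when the counter reaches zero.

```python
import random


def run_counter_automaton(arm_probs, num_levels, horizon, seed=0):
    """Play a Bernoulli bandit with a small finite-state strategy.

    Illustrative only (hypothetical names and rules, not the paper's PFA).
    The state is (current arm, patience level in 1..num_levels), so the
    number of states is len(arm_probs) * num_levels. Needs at least 2 arms.
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    arm = rng.randrange(n_arms)   # start on a uniformly random arm
    level = num_levels            # start with full patience
    total = 0
    pulls = [0] * n_arms

    for _ in range(horizon):
        reward = 1 if rng.random() < arm_probs[arm] else 0
        total += reward
        pulls[arm] += 1

        if reward:
            # Expectation met: restore patience, capped at num_levels.
            level = min(num_levels, level + 1)
        else:
            # Expectation missed: lose one unit of patience.
            level -= 1

        if level == 0:
            # Patience exhausted: move to a different arm chosen
            # uniformly at random and reset the counter.
            arm = rng.choice([a for a in range(n_arms) if a != arm])
            level = num_levels

    return total, pulls


if __name__ == "__main__":
    total, pulls = run_counter_automaton([0.8, 0.5, 0.2], num_levels=4, horizon=10_000)
    print("total reward:", total)
    print("pulls per arm:", pulls)
```

In this toy version, a larger `num_levels` makes the agent slower to abandon a good arm after a run of bad luck, at the cost of more states. This only loosely mirrors the trade-off between state count and performance that the abstract describes.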
