Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games

Paper Title

Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games

Authors

Jing Tan, Ramin Khalili, Holger Karl

Abstract

We propose a multi-agent distributed reinforcement learning algorithm that balances between potentially conflicting short-term reward and sparse, delayed long-term reward, and learns with partial information in a dynamic environment. We compare different long-term rewards to incentivize the algorithm to maximize individual payoff and overall social welfare. We test the algorithm in two simulated auction games, and demonstrate that 1) our algorithm outperforms two benchmark algorithms in a direct competition, with cost to social welfare, and 2) our algorithm's aggressive competitive behavior can be guided with the long-term reward signal to maximize both individual payoff and overall social welfare.
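As a rough illustration of the trade-off the abstract describes, the sketch below blends a dense per-round auction payoff with a long-term signal that is paid out only once every few rounds. It is not the authors' algorithm: the class `RewardMixer`, its `weight` and `period` parameters, and the choice of a sealed-bid second-price auction are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RewardMixer:
    """Blend a dense short-term reward with a sparse, delayed long-term reward.

    Hypothetical helper for illustration; not taken from the paper.
    """
    weight: float = 0.5   # assumed trade-off between the two signals
    period: int = 4       # assumed delay: long-term part is paid every `period` rounds
    _pending: List[float] = field(default_factory=list)

    def step(self, short_term: float, long_term_signal: float) -> float:
        """Return the training reward for one auction round."""
        self._pending.append(long_term_signal)
        reward = (1.0 - self.weight) * short_term
        if len(self._pending) == self.period:
            # Sparse, delayed part: mean of the buffered signal, paid once per period.
            reward += self.weight * sum(self._pending) / self.period
            self._pending.clear()
        return reward


def second_price_payoff(bid: float, value: float, other_bids: List[float]) -> float:
    """Short-term payoff of one bidder in a sealed-bid second-price auction (assumed format)."""
    highest_other = max(other_bids)
    # The winner pays the highest competing bid; losers receive nothing.
    return value - highest_other if bid > highest_other else 0.0


if __name__ == "__main__":
    mixer = RewardMixer(weight=0.5, period=2)
    for bid, signal in [(5.0, 1.0), (2.0, 3.0)]:
        payoff = second_price_payoff(bid, value=6.0, other_bids=[3.0])
        print(mixer.step(payoff, signal))
```

In this sketch, raising `weight` shifts the learner's objective toward the delayed signal, which is one simple way a long-term reward could steer otherwise aggressive bidding.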
