Multi-Agent Reinforcement Learning for Cooperative Coded Caching via Homotopy Optimization

Paper Title

Multi-Agent Reinforcement Learning for Cooperative Coded Caching via Homotopy Optimization

Authors

Xiongwei Wu, Jun Li, Ming Xiao, P. C. Ching, H. Vincent Poor

Abstract

Introducing cooperative coded caching into small cell networks is a promising approach to reducing traffic loads. By encoding content via maximum distance separable (MDS) codes, coded fragments can be collectively cached at small-cell base stations (SBSs) to enhance caching efficiency. However, content popularity is usually time-varying and unknown in practice. As a result, cache contents are anticipated to be intelligently updated by taking into account limited caching storage and interactive impacts among SBSs. In response to these challenges, we propose a multi-agent deep reinforcement learning (DRL) framework to intelligently update cache contents in dynamic environments. With the goal of minimizing long-term expected fronthaul traffic loads, we first model dynamic coded caching as a cooperative multi-agent Markov decision process. Owing to MDS coding, the resulting decision-making falls into a class of constrained reinforcement learning problems with continuous decision variables. To deal with this difficulty, we custom-build a novel DRL algorithm by embedding homotopy optimization into a deep deterministic policy gradient formalism. Next, to empower the caching framework with an effective trade-off between complexity and performance, we propose centralized, partially and fully decentralized caching controls by applying the derived DRL approach. Simulation results demonstrate the superior performance of the proposed multi-agent framework.
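To make the role of MDS coding concrete, the following is a minimal sketch of how fronthaul load could be computed when fragments are cached as continuous fractions. It is not the paper's system model, which the abstract does not spell out. It assumes unit-size files, that a user can combine coded fragments from every SBS it can reach, and that, by the MDS property, a file is recovered once the collected fragments add up to one full file, with any shortfall fetched over the fronthaul. The names `fronthaul_load`, `within_capacity`, `cache_fractions`, `reachable_sbs`, and `demand` are hypothetical.

```python
from typing import List, Sequence


def fronthaul_load(
    cache_fractions: Sequence[Sequence[float]],
    reachable_sbs: Sequence[int],
    demand: Sequence[float],
) -> float:
    """Expected fronthaul traffic for one user (illustrative model only).

    cache_fractions[b][f]: fraction in [0, 1] of file f stored as coded
        fragments at SBS b (the continuous decision variable).
    reachable_sbs: indices of the SBSs this user can download from.
    demand[f]: request probability (or rate) of file f.
    """
    load = 0.0
    for f, rate in enumerate(demand):
        # Fragments from different SBSs are assumed distinct, so they add up.
        collected = sum(cache_fractions[b][f] for b in reachable_sbs)
        # Only the part of the file not covered by caches uses the fronthaul.
        missing = max(0.0, 1.0 - collected)
        load += rate * missing
    return load


def within_capacity(fractions_at_sbs: Sequence[float], capacity: float) -> bool:
    """Storage constraint at one SBS, with capacity measured in whole files."""
    return sum(fractions_at_sbs) <= capacity


# Two SBSs, two files; the user reaches both SBSs.
example_cache: List[List[float]] = [[0.5, 0.0], [0.25, 1.0]]
example_load = fronthaul_load(example_cache, [0, 1], [0.8, 0.2])
```

In this toy setting the two SBSs jointly hold three quarters of the first file and all of the second, so only the missing quarter of the first file contributes to the load. A storage constraint like `within_capacity` is what makes the caching decision a constrained problem over continuous variables, which is the setting the abstract describes.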
