Paper Title

Deep Reinforcement Learning Based Mode Selection and Resource Allocation for Cellular V2X Communications

Authors

Xinran Zhang, Mugen Peng, Shi Yan, Yaohua Sun

Abstract

Cellular vehicle-to-everything (V2X) communication is crucial to supporting future diverse vehicular applications. However, for safety-critical applications, unstable vehicle-to-vehicle (V2V) links and the high signalling overhead of centralized resource allocation approaches become bottlenecks. In this paper, we investigate the joint optimization of transmission mode selection and resource allocation for cellular V2X communications. In particular, the problem is formulated as a Markov decision process, and a deep reinforcement learning (DRL) based decentralized algorithm is proposed to maximize the sum capacity of vehicle-to-infrastructure users while meeting the latency and reliability requirements of V2V pairs. Moreover, considering the training limitations of local DRL models, a two-timescale federated DRL algorithm is developed to help obtain a robust model, wherein a graph-theory-based vehicle clustering algorithm is executed on the large timescale and the federated learning algorithm is conducted on the small timescale. Simulation results show that the proposed DRL-based algorithm outperforms other decentralized baselines, and validate the superiority of the two-timescale federated DRL algorithm for newly activated V2V pairs.
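The two-timescale federated scheme described in the abstract can be sketched in a few lines. The abstract does not give implementation details, so the FedAvg-style parameter averaging below, along with the layer shapes and variable names, are illustrative assumptions rather than the authors' exact method: vehicles in a cluster train local DRL models on the small timescale, and their parameters are periodically averaged into a shared global model.

```python
import numpy as np

def federated_average(local_models):
    """FedAvg-style aggregation: element-wise mean of per-layer parameters.

    local_models: list of models, each a list of numpy arrays (one per layer).
    Returns the aggregated global model with the same layer shapes.
    """
    return [np.mean([model[layer] for model in local_models], axis=0)
            for layer in range(len(local_models[0]))]

# Toy demo: three vehicles in one cluster, each with a two-layer model.
# On the small timescale, each vehicle would run a local DRL update and then
# the cluster would replace its weights with the averaged global model;
# on the large timescale, the clusters themselves would be recomputed.
rng = np.random.default_rng(0)
vehicles = [[rng.standard_normal((4, 2)), rng.standard_normal(2)]
            for _ in range(3)]
global_model = federated_average(vehicles)
```

Averaging parameters rather than sharing raw experience keeps the scheme decentralized, which matches the paper's stated goal of avoiding the signalling overhead of centralized resource allocation.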
