Paper title
Episodic Linear Quadratic Regulators with Low-rank Transitions
Paper authors
Paper abstract
Linear Quadratic Regulators (LQRs) have achieved enormous success in real-world applications. Recently, attention has focused on efficient learning algorithms for LQRs whose dynamics are unknown. Existing results learn to control the unknown system using a number of episodes that depends polynomially on the system parameters, including the ambient dimension of the states. These traditional approaches, however, become inefficient in common scenarios, e.g., when the states are high-resolution images. In this paper, we propose an algorithm that exploits the intrinsic low-rank structure of the system for efficient learning. For problems of rank $m$, our algorithm achieves a $K$-episode regret bound of order $\widetilde{O}(m^{3/2} K^{1/2})$. Consequently, the sample complexity of our algorithm depends only on the rank $m$ rather than the ambient dimension $d$, which can be orders of magnitude larger.
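As a rough reading of the final step, and not a statement taken from the paper: suppose the regret after $K$ episodes satisfies $\mathrm{Regret}(K) \le C\, m^{3/2} K^{1/2}$, where $C$ is a hypothetical factor, treated as constant for this sketch, that absorbs the logarithmic terms hidden by $\widetilde{O}$ and any dependence on other system parameters. Then the average per-episode regret falls below a target $\epsilon > 0$ as follows:

$$\frac{\mathrm{Regret}(K)}{K} \le C\, m^{3/2} K^{-1/2} \le \epsilon \quad \text{whenever} \quad K \ge \frac{C^2 m^3}{\epsilon^2}.$$

Under this assumption, the number of episodes needed scales as $m^3/\epsilon^2$ and carries no explicit factor of the ambient dimension $d$.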