Title
Overparameterization from Computational Constraints
Authors
Abstract
Overparameterized models with millions of parameters have been hugely successful. In this work, we ask: can the need for large models be, at least in part, due to the \emph{computational} limitations of the learner? Additionally, we ask whether this situation is exacerbated for \emph{robust} learning. We show that this could indeed be the case. We exhibit learning tasks for which computationally bounded learners need \emph{significantly more} model parameters than information-theoretic learners do. Furthermore, we show that even more model parameters could be necessary for robust learning. In particular, we extend the recent result of Bubeck and Sellke [NeurIPS'2021], which shows that robust models might need more parameters, to the computational regime, and show that computationally bounded learners could provably need an even larger number of parameters. We then address the following related question: can we hope to remedy the situation for robust computationally bounded learning by restricting the \emph{adversaries} to also be computationally bounded, for the sake of obtaining models with fewer parameters? Here again, we show that this could be possible. Specifically, building on the work of Garg, Jha, Mahloujifar, and Mahmoody [ALT'2020], we demonstrate a learning task that can be learned efficiently and robustly against a computationally bounded attacker, whereas robustness against an information-theoretic attacker requires the learner to use significantly more parameters.