Paper Title
ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation
Paper Authors
Paper Abstract
Albeit being a prevalent architecture search approach, differentiable architecture search (DARTS) is largely hindered by its substantial memory cost, since the entire supernet resides in memory. This is where single-path DARTS comes in: it selects only a single-path submodel at each step. Besides being memory-friendly, it also comes with low computational cost. Nonetheless, we discover a critical issue of single-path DARTS that has gone largely unnoticed: like DARTS, it suffers from severe performance collapse, in which too many parameter-free operations such as skip connections are derived. In this paper, we propose a new algorithm called RObustifying Memory-Efficient NAS (ROME) as a remedy. First, we disentangle the topology search from the operation search to make searching and evaluation consistent. We then adopt Gumbel-Top2 reparameterization and gradient accumulation to robustify the unwieldy bi-level optimization. We verify ROME extensively across 15 benchmarks to demonstrate its effectiveness and robustness.
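The Gumbel-Top2 reparameterization mentioned in the abstract can be illustrated with the standard Gumbel-Top-k trick: perturb each logit with i.i.d. Gumbel noise and take the two largest, which samples two distinct candidates in proportion to the softmax of the logits. Below is a minimal NumPy sketch of that trick under our own assumptions (the function name `gumbel_top2` and the example logits are ours), not the paper's actual implementation:

```python
import numpy as np

def gumbel_top2(logits, rng):
    """Sample two distinct indices via the Gumbel-Top-k trick (k=2)."""
    # Perturb each logit with i.i.d. Gumbel(0, 1) noise.
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    # The indices of the two largest perturbed logits form a sample of
    # two distinct items, drawn without replacement in proportion to
    # softmax(logits).
    return np.argsort(logits + gumbel)[-2:][::-1]

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, 0.1, -1.0])
picked = gumbel_top2(logits, rng)  # two distinct indices in {0,1,2,3}
```

In a single-path NAS setting, the two sampled indices would select which candidate operations participate in a given step, keeping only a small submodel in memory.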