Paper Title
Differentiable Architecture Search with Random Features
Paper Authors
Paper Abstract
Differentiable architecture search (DARTS) has significantly promoted the development of NAS techniques because of its high search efficiency and effectiveness, but it suffers from performance collapse. In this paper, we make efforts to alleviate the performance collapse problem of DARTS from two aspects. First, we investigate the expressive power of the supernet in DARTS and derive a new setup of the DARTS paradigm in which only BatchNorm is trained. Second, we theoretically find that random features dilute the auxiliary-connection role of skip-connections in supernet optimization and enable the search algorithm to focus on fairer operation selection, thereby solving the performance collapse problem. We instantiate DARTS and PC-DARTS with random features to build improved versions, named RF-DARTS and RF-PCDARTS respectively. Experimental results show that RF-DARTS obtains \textbf{94.36\%} test accuracy on CIFAR-10 (the closest result to the optimum in NAS-Bench-201), and achieves a new state-of-the-art top-1 test error of \textbf{24.0\%} on ImageNet when transferring from CIFAR-10. Moreover, RF-DARTS performs robustly across three datasets (CIFAR-10, CIFAR-100, and SVHN) and four search spaces (S1-S4). Besides, RF-PCDARTS achieves even better results on ImageNet, that is, \textbf{23.9\%} top-1 and \textbf{7.1\%} top-5 test error, surpassing representative methods such as single-path, training-free, and partial-channel paradigms searched directly on ImageNet.
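The "only training BatchNorm" setup described above can be illustrated with a short PyTorch sketch. This is not the paper's actual supernet: a small stand-in network and the freezing logic below are assumptions made purely for illustration. The idea is to keep all convolutional weights at their random initialization (the "random features") and leave only the BatchNorm affine parameters trainable.

```python
import torch.nn as nn

# Hypothetical stand-in for a DARTS supernet; the real supernet mixes
# candidate operations per edge, which is omitted here for brevity.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1),
    nn.BatchNorm2d(16),
)

# Freeze every parameter, then re-enable only the BatchNorm affine
# parameters (weight/gamma and bias/beta), so the convolutional
# weights remain fixed random features during supernet training.
for p in model.parameters():
    p.requires_grad = False
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        for p in m.parameters():
            p.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

An optimizer would then be built only over `(p for p in model.parameters() if p.requires_grad)`, leaving the random convolutional features untouched.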