Paper Title

Sparse Perturbations for Improved Convergence in Stochastic Zeroth-Order Optimization

Authors

Mayumi Ohta, Nathaniel Berger, Artem Sokolov, Stefan Riezler

Abstract

Interest in stochastic zeroth-order (SZO) methods has recently been revived in black-box optimization scenarios such as adversarial black-box attacks on deep neural networks. SZO methods only require the ability to evaluate the objective function at random input points; however, their weakness is the dependency of their convergence speed on the dimensionality of the function to be evaluated. We present a sparse SZO optimization method that reduces this factor to the expected dimensionality of the random perturbation during learning. We give a proof that justifies this reduction for sparse SZO optimization for non-convex functions without making any assumptions on sparsity of the objective function or gradient. Furthermore, we present experimental results for neural networks on MNIST and CIFAR that show faster convergence in training loss and test accuracy, and a smaller distance of the gradient approximation to the true gradient in sparse SZO compared to dense SZO.
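The core idea described in the abstract, replacing the dimensionality factor with the expected dimensionality of the random perturbation, can be illustrated with a minimal sketch of a two-point SZO gradient estimator that perturbs only a random subset of coordinates. This is not the authors' implementation: the function names, the uniform choice of k perturbed coordinates, and the toy quadratic objective are illustrative assumptions.

```python
import numpy as np

def sparse_szo_step(f, w, k, mu=1e-2, lr=1e-2, rng=None):
    """One illustrative sparse SZO update (sketch, not the paper's algorithm).

    f  : black-box objective, evaluated only at input points
    w  : current parameter vector
    k  : number of coordinates to perturb (expected perturbation dimension)
    mu : finite-difference / smoothing radius
    lr : step size
    """
    rng = rng or np.random.default_rng()
    d = w.shape[0]

    # Sparse Gaussian perturbation: only k of the d coordinates are non-zero.
    idx = rng.choice(d, size=k, replace=False)
    u = np.zeros(d)
    u[idx] = rng.standard_normal(k)

    # Two-point function-value difference approximates the directional derivative
    # along u; the resulting gradient estimate is itself k-sparse.
    g = (f(w + mu * u) - f(w)) / mu * u

    return w - lr * g

# Toy usage: minimize a quadratic as a stand-in black-box objective.
if __name__ == "__main__":
    f = lambda x: float(np.sum(x ** 2))
    w = np.ones(1000)
    for _ in range(2000):
        w = sparse_szo_step(f, w, k=50)
    print(f(w))
```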
