Title
ECCO: Equivalent Circuit Controlled Optimization
Authors
Abstract
We propose an adaptive optimization algorithm for solving unconstrained scaled gradient flow problems that achieves fast convergence by controlling the shape of the optimization trajectory and the discretization step sizes. For a broad class of scaling functions, we establish convergence of the proposed approach to critical points of smooth objective functions, while demonstrating its flexibility and robustness with respect to hyperparameter tuning. First, we prove convergence of component-wise scaled gradient flow to a critical point under regularity conditions. We then show that these controlled gradient flow dynamics are equivalent to the transient response of an electrical circuit, allowing circuit-theory concepts to be applied to the optimization problem. Based on this equivalence, we develop two optimization trajectory control schemes that minimize the charge stored in the circuit: a second-order method that uses the true Hessian, and an alternative first-order method that approximates the optimization trajectory using only gradient information. While the control schemes are derived from circuit concepts, no circuit knowledge is needed to implement the algorithms. To find the value of the critical point, we propose a time-step search routine for Forward Euler discretization that controls the local truncation error, a method adapted from circuit simulation ideas. In simulations, we find that the trajectory control outperforms uncontrolled gradient flow, and that the error-aware discretization outperforms line search with the Armijo condition. Our algorithms are evaluated on convex and non-convex test problems, including neural networks, with convergence speeds comparable to or exceeding Adam's.
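As a rough illustration of the error-aware discretization idea described above, the sketch below (our own construction, not the paper's algorithm) integrates a component-wise scaled gradient flow dx/dt = -S(x) ∇f(x) with Forward Euler, estimating the local truncation error from a Heun (second-order) trial step and shrinking or growing the step size to keep that estimate below a tolerance. The function name, tolerances, and the accept/shrink/grow heuristics are all illustrative assumptions.

```python
import numpy as np

def scaled_flow_euler_sketch(grad, x0, h0=0.1, h_max=1.0, tol=1e-3,
                             max_iter=500,
                             scale=lambda x: np.ones_like(x)):
    """Illustrative sketch (not the paper's routine): Forward Euler on
    dx/dt = -scale(x) * grad(x) with LTE-controlled step sizes."""
    x, h = np.asarray(x0, dtype=float), h0
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < 1e-8:      # crude stationarity check
            break
        f = -scale(x) * g                 # flow direction at x
        x_trial = x + h * f               # Forward Euler step
        f_trial = -scale(x_trial) * grad(x_trial)
        # Difference between Euler and a Heun trial step serves as an
        # O(h^2) proxy for the local truncation error.
        lte = 0.5 * h * np.linalg.norm(f_trial - f)
        if lte > tol:
            h = max(0.5 * h, 1e-12)       # reject: shrink and retry
            continue
        x = x_trial                       # accept the step
        if lte < 0.25 * tol:
            h = min(2.0 * h, h_max)       # comfortably accurate: grow
    return x

# Example: minimizing f(x) = ||x||^2 / 2, so grad(x) = x.
x_star = scaled_flow_euler_sketch(lambda x: x, np.array([3.0, -2.0]))
```

On this quadratic example the routine starts from a conservative step size, rejects steps whose error estimate exceeds the tolerance, and enlarges the step as the iterate approaches the minimizer, which is the qualitative behavior the abstract attributes to the LTE-controlled discretization.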