Paper title
Backpropagation through Combinatorial Algorithms: Identity with Projection Works
Paper authors
Paper abstract
Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities. The derivative of these solvers is zero or undefined, so a meaningful replacement is crucial for effective gradient-based learning. Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance. We propose a principled approach that exploits the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass, and we provide a theoretical justification. Our experiments demonstrate that this straightforward, hyper-parameter-free approach competes with previous, more complex methods on numerous tasks, such as backpropagation through discrete samplers, deep graph matching, and image retrieval. Furthermore, we replace the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.
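The central idea in the abstract, treating the solver as a negative identity on the backward pass, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy solver (`solve_argmin`, a one-hot arg-min over a cost vector), the squared-error loss, the learning rate, and the function names. The sketch covers only the negative-identity replacement; it does not attempt the projection or the regularization procedure that the abstract also mentions.

```python
def solve_argmin(costs):
    """Toy discrete solver (hypothetical): one-hot indicator of the cheapest item."""
    best = min(range(len(costs)), key=lambda i: costs[i])
    return [1.0 if i == best else 0.0 for i in range(len(costs))]


def backward_negative_identity(grad_wrt_solution):
    """Backward pass: act as if the solver's Jacobian were the negative identity.

    The true derivative of solve_argmin is zero almost everywhere, so the
    incoming gradient is simply negated and passed on to the costs.
    """
    return [-g for g in grad_wrt_solution]


def train(costs, target, lr=0.5, steps=5):
    """Gradient descent on the costs so that the solver output matches `target`."""
    costs = list(costs)
    for _ in range(steps):
        y = solve_argmin(costs)                              # forward pass
        grad_y = [yi - ti for yi, ti in zip(y, target)]      # d/dy of 0.5 * ||y - t||^2
        grad_costs = backward_negative_identity(grad_y)      # surrogate gradient
        costs = [c - lr * g for c, g in zip(costs, grad_costs)]
    return costs


if __name__ == "__main__":
    final_costs = train([0.2, 0.5, 0.9], target=[0.0, 0.0, 1.0])
    print(final_costs, solve_argmin(final_costs))
```

The sign flip matches the intuition for a cost-minimizing solver: if the loss wants an item selected more often, its cost should go down, so the gradient with respect to the cost is the negative of the gradient with respect to the solution.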