Paper Title
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
Paper Authors
Paper Abstract
Neural Processes (NPs) are a popular class of approaches for meta-learning. Similar to Gaussian Processes (GPs), NPs define distributions over functions and can estimate uncertainty in their predictions. However, unlike GPs, NPs and their variants suffer from underfitting and often have intractable likelihoods, which limits their applications in sequential decision making. We propose Transformer Neural Processes (TNPs), a new member of the NP family that casts uncertainty-aware meta-learning as a sequence modeling problem. We learn TNPs via an autoregressive likelihood-based objective and instantiate them with a novel transformer-based architecture. The model architecture respects the inductive biases inherent to the problem structure, such as invariance to the observed data points and equivariance to the unobserved points. We further investigate knobs within the TNP framework that trade off the expressivity of the decoding distribution against extra computation. Empirically, we show that TNPs achieve state-of-the-art performance on various benchmark problems, outperforming all previous NP variants on meta regression, image completion, contextual multi-armed bandits, and Bayesian optimization.
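To make the abstract's core idea concrete, below is a minimal PyTorch sketch of the TNP recipe: embed context (x, y) pairs and target queries as one sequence, run a transformer without positional encodings (so the model is permutation-invariant over context points), use an attention mask so each target query sees the context and only previously revealed targets (the autoregressive factorization), and train by maximizing a Gaussian predictive likelihood. This is not the authors' implementation; the class name `TNPSketch`, the zero-padded query embedding, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TNPSketch(nn.Module):
    """Illustrative TNP-style model (a sketch, not the paper's official code).

    Sequence layout: [context pairs | revealed target pairs | target queries].
    A masked transformer predicts each target y autoregressively, attending to
    all context points and to targets revealed earlier in the ordering.
    """

    def __init__(self, x_dim=1, y_dim=1, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        # Embed (x, y) pairs; queries use a zero placeholder for the unknown y.
        self.embed = nn.Linear(x_dim + y_dim, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=128, batch_first=True
        )
        # No positional encodings: predictions are invariant to the ordering
        # of context points, one of the inductive biases named in the abstract.
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 2 * y_dim)  # Gaussian mean and log-scale

    def forward(self, xc, yc, xt, yt):
        Nc, Nt = xc.size(1), xt.size(1)
        ctx = self.embed(torch.cat([xc, yc], dim=-1))
        revealed = self.embed(torch.cat([xt, yt], dim=-1))  # teacher forcing
        queries = self.embed(torch.cat([xt, torch.zeros_like(yt)], dim=-1))
        seq = torch.cat([ctx, revealed, queries], dim=1)

        # Additive attention mask: -inf blocks attention, 0 allows it.
        N = Nc + 2 * Nt
        mask = torch.full((N, N), float("-inf"))
        mask[:, :Nc] = 0.0          # every token attends to the context
        for j in range(Nt):         # query j attends to revealed targets < j
            mask[Nc + Nt + j, Nc:Nc + j] = 0.0
        mask.fill_diagonal_(0.0)    # each token attends to itself

        h = self.encoder(seq, mask=mask)
        mean, log_scale = self.head(h[:, Nc + Nt:]).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_scale.exp())

# One training step: maximize the autoregressive predictive likelihood
# of the target set given the context set (synthetic data for illustration).
model = TNPSketch()
xc, yc = torch.randn(8, 10, 1), torch.randn(8, 10, 1)  # context set
xt, yt = torch.randn(8, 5, 1), torch.randn(8, 5, 1)    # target set
dist = model(xc, yc, xt, yt)
loss = -dist.log_prob(yt).mean()
loss.backward()
```

The diagonal Gaussian head here corresponds to the cheapest point on the expressivity/computation trade-off the abstract mentions; richer decoding distributions would replace `self.head` with a more expressive (and more costly) predictive model.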