Paper title
Higher Order Linear Transformer
Authors
Abstract
This work follows up on the linear transformer of Katharopoulos et al., which builds on an idea from Shen et al. It reuses the trick that gives the attention mechanism linear complexity and extends it to a second-order approximation of the softmax normalization.
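The abstract's idea can be illustrated with a minimal NumPy sketch. It is one possible reading and not necessarily the paper's exact formulation. It assumes the second-order approximation is the Taylor expansion exp(s) ≈ 1 + s + s²/2 of the unnormalized softmax weight, with s = q·k, and that attention is non-causal with no 1/sqrt(d) scaling. The names phi, quadratic_attention, and linear_attention are hypothetical, chosen for illustration. Since 1 + q·k + (q·k)²/2 equals phi(q)·phi(k) for phi(x) = [1, x, vec(x xᵀ)/√2], the keys and values can be summarized once and reused for every query.

```python
import numpy as np


def phi(x):
    """Feature map with phi(q) @ phi(k) == 1 + q @ k + (q @ k) ** 2 / 2.

    x has shape (n, d); the result has shape (n, 1 + d + d * d).
    """
    n, d = x.shape
    ones = np.ones((n, 1))
    # Flattened outer product x x^T, scaled so the dot product yields (q @ k)^2 / 2.
    outer = np.einsum("ni,nj->nij", x, x).reshape(n, d * d) / np.sqrt(2.0)
    return np.concatenate([ones, x, outer], axis=1)


def quadratic_attention(Q, K, V):
    """Reference version: builds the full n x n weight matrix."""
    s = Q @ K.T
    w = 1.0 + s + 0.5 * s**2  # second-order Taylor expansion of exp(s)
    return (w @ V) / w.sum(axis=1, keepdims=True)


def linear_attention(Q, K, V):
    """Same output, with cost linear in the sequence length n."""
    fq, fk = phi(Q), phi(K)
    kv = fk.T @ V        # (D, d_v) summary of keys and values, computed once
    z = fk.sum(axis=0)   # (D,) summary used for the normalization
    return (fq @ kv) / (fq @ z)[:, None]


rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 4))
K = rng.normal(size=(6, 4))
V = rng.normal(size=(6, 3))

# Largest elementwise gap between the two implementations.
max_abs_diff = np.abs(linear_attention(Q, K, V) - quadratic_attention(Q, K, V)).max()
```

The weight 1 + s + s²/2 equals ((s + 1)² + 1)/2, so it is strictly positive and the normalization never divides by zero. The price of this sketch is a feature dimension that grows as d², which is only worthwhile when the sequence length is large relative to d².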