Paper Title

Beyond Fine-tuning: Few-Sample Sentence Embedding Transfer

Paper Authors

Siddhant Garg, Rohit Kumar Sharma, Yingyu Liang

Paper Abstract

Fine-tuning (FT) pre-trained sentence embedding models on small datasets has been shown to have limitations. In this paper we show that concatenating the embeddings from the pre-trained model with those from a simple sentence embedding model trained only on the target data, can improve over the performance of FT for few-sample tasks. To this end, a linear classifier is trained on the combined embeddings, either by freezing the embedding model weights or training the classifier and embedding models end-to-end. We perform evaluation on seven small datasets from NLP tasks and show that our approach with end-to-end training outperforms FT with negligible computational overhead. Further, we also show that sophisticated combination techniques like CCA and KCCA do not work as well in practice as concatenation. We provide theoretical analysis to explain this empirical observation.
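The sketch below illustrates the frozen-embedding variant described in the abstract: embeddings from a frozen pre-trained encoder are concatenated with embeddings from a simple model fit only on the target data, and a linear classifier is trained on the combined vectors. It is a minimal illustration, not the paper's implementation: the sentence-transformers model "all-MiniLM-L6-v2" standing in for the pre-trained encoder, TF-IDF plus truncated SVD standing in for the simple target-data embedding model, and the helper name combined_embeddings are all assumptions for the example.

```python
# Minimal sketch of the frozen-embedding variant (assumptions noted above):
# concatenate frozen pre-trained sentence embeddings with embeddings from a
# simple model trained only on the target data, then fit a linear classifier.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def combined_embeddings(train_texts, test_texts, dim=2):
    # Pre-trained sentence embeddings, kept frozen (no fine-tuning).
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    pre_train = encoder.encode(train_texts)
    pre_test = encoder.encode(test_texts)

    # Simple embedding model fit only on the target data
    # (here: TF-IDF followed by truncated SVD, as a stand-in).
    tfidf = TfidfVectorizer().fit(train_texts)
    svd = TruncatedSVD(n_components=dim).fit(tfidf.transform(train_texts))
    tgt_train = svd.transform(tfidf.transform(train_texts))
    tgt_test = svd.transform(tfidf.transform(test_texts))

    # Concatenate the two embedding spaces.
    return (np.hstack([pre_train, tgt_train]),
            np.hstack([pre_test, tgt_test]))

# Toy usage: few-sample classification with a linear classifier on top
# of the concatenated embeddings.
train_texts = ["great movie", "terrible plot", "loved it", "boring and slow"]
train_labels = [1, 0, 1, 0]
test_texts = ["really enjoyable", "dull film"]

X_train, X_test = combined_embeddings(train_texts, test_texts, dim=2)
clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
print(clf.predict(X_test))
```

The end-to-end variant reported as best in the abstract would instead backpropagate the classifier loss into both embedding models; the sketch above only covers the frozen-weight setting.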
