Emerging Relation Network and Task Embedding for Multi-Task Regression Problems

Paper title

Emerging Relation Network and Task Embedding for Multi-Task Regression Problems

Authors

Jens Schreiber, Bernhard Sick

Abstract

Multi-task learning (MTL) provides state-of-the-art results in many applications of computer vision and natural language processing. In contrast to single-task learning (STL), MTL allows for leveraging knowledge between related tasks, improving prediction results on the main task (as opposed to an auxiliary task) or on all tasks. However, few comparative studies apply MTL architectures to regression and time series problems while taking recent advances in MTL into account. An interesting non-linear problem is forecasting the expected power generation of renewable power plants. Therefore, this article provides a comparative study of the following recent and important MTL architectures: hard parameter sharing, cross-stitch network, and sluice network (SN). They are compared to a multi-layer perceptron model of similar size in an STL setting. Additionally, we provide a simple yet effective approach to model task-specific information through an embedding layer in a multi-layer perceptron, referred to as task embedding. Further, we introduce a new MTL architecture named emerging relation network (ERN), which can be considered an extension of the sluice network. For a solar power dataset, the task embedding achieves the best mean improvement with 14.9%. The mean improvement of the ERN and the SN on the solar dataset is of similar magnitude, with 14.7% and 14.8%, respectively. On a wind power dataset, only the ERN achieves a significant improvement of up to 7.7%. Results suggest that the ERN is beneficial when tasks are only loosely related and the prediction problem is more non-linear. In contrast, the proposed task embedding is advantageous when tasks are strongly correlated. Further, the task embedding provides an effective approach with reduced computational effort compared to other MTL architectures.
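
The task embedding idea mentioned in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the abstract only states that task-specific information is modeled through an embedding layer in a multi-layer perceptron. The sketch assumes that the task vector is concatenated with the input features, and the class name `TaskEmbeddingMLP`, the layer sizes, and the input values are all hypothetical. Training is omitted.

```python
import math
import random


def init_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(x, weights, bias):
    """Compute weights @ x + bias for a single input vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class TaskEmbeddingMLP:
    """One shared MLP; the task ID selects a vector appended to the features."""

    def __init__(self, n_tasks, n_features, emb_dim=4, hidden=8, seed=0):
        rng = random.Random(seed)
        # One embedding row per task (emb_dim is a hypothetical choice).
        self.embedding = init_matrix(n_tasks, emb_dim, rng)
        self.w1 = init_matrix(hidden, n_features + emb_dim, rng)
        self.b1 = [0.0] * hidden
        self.w2 = init_matrix(1, hidden, rng)
        self.b2 = [0.0]

    def forward(self, features, task_id):
        # Assumption: the task vector is concatenated with the input features.
        x = list(features) + self.embedding[task_id]
        h = [math.tanh(v) for v in linear(x, self.w1, self.b1)]
        # Single regression output, e.g. a normalized power forecast.
        return linear(h, self.w2, self.b2)[0]


model = TaskEmbeddingMLP(n_tasks=3, n_features=5)
weather = [0.2, -0.1, 0.7, 0.0, 0.3]  # made-up input features
pred_park_0 = model.forward(weather, task_id=0)
pred_park_1 = model.forward(weather, task_id=1)
```

All tasks share the same weights in this sketch. Only the looked-up embedding row differs per task, which is consistent with the abstract's remark that the approach needs less computational effort than the other MTL architectures.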
