Paper title
Emerging Relation Network and Task Embedding for Multi-Task Regression Problems
Paper authors
Paper abstract
Multi-task learning (MTL) provides state-of-the-art results in many applications of computer vision and natural language processing. In contrast to single-task learning (STL), MTL leverages knowledge shared between related tasks to improve prediction results on the main task (as opposed to an auxiliary task) or on all tasks. However, few comparative studies apply MTL architectures to regression and time series problems while taking recent advances in MTL into account. An interesting non-linear problem is forecasting the expected power generation of renewable power plants. Therefore, this article provides a comparative study of the following recent and important MTL architectures: hard parameter sharing, cross-stitch network, and sluice network (SN). They are compared to a multi-layer perceptron model of similar size in an STL setting. Additionally, we provide a simple yet effective approach to model task-specific information through an embedding layer in a multi-layer perceptron, referred to as task embedding. Further, we introduce a new MTL architecture named emerging relation network (ERN), which can be considered an extension of the sluice network. For a solar power dataset, the task embedding achieves the best mean improvement with 14.9%. The mean improvements of the ERN and the SN on the solar dataset are of similar magnitude, at 14.7% and 14.8%, respectively. On a wind power dataset, only the ERN achieves a significant improvement of up to 7.7%. Results suggest that the ERN is beneficial when tasks are only loosely related and the prediction problem is more non-linear. In contrast, the proposed task embedding is advantageous when tasks are strongly correlated. Further, the task embedding provides an effective approach with reduced computational effort compared to other MTL architectures.
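To make the task embedding idea more concrete, the following is a minimal sketch of how an embedding layer might feed task-specific information into a single shared multi-layer perceptron. It is not the authors' implementation: the class name `TaskEmbeddingMLP`, the layer sizes, the tanh activation, and the choice to concatenate the embedding with the input features are all assumptions made for illustration, and training is omitted entirely.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible


def init_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(x, weights, bias):
    """Compute weights @ x + bias for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class TaskEmbeddingMLP:
    """One MLP shared by all tasks; each task is identified by an embedding vector.

    Hypothetical sketch: sizes and structure are assumptions, not taken from the paper.
    """

    def __init__(self, n_tasks, n_features, emb_dim=4, hidden=8):
        # One embedding row per task (for example, one per power plant).
        self.embedding = init_matrix(n_tasks, emb_dim)
        # The first layer sees the input features plus the task embedding.
        self.w1 = init_matrix(hidden, n_features + emb_dim)
        self.b1 = [0.0] * hidden
        self.w2 = init_matrix(1, hidden)
        self.b2 = [0.0]

    def forward(self, features, task_id):
        # Look up the task vector and concatenate it with the input features.
        x = list(features) + self.embedding[task_id]
        h = [math.tanh(v) for v in linear(x, self.w1, self.b1)]
        return linear(h, self.w2, self.b2)[0]  # single regression output


# The same input features give task-dependent predictions via the embedding.
model = TaskEmbeddingMLP(n_tasks=3, n_features=5)
features = [0.2, -0.1, 0.5, 0.0, 0.3]
predictions = [model.forward(features, task_id) for task_id in range(3)]
```

In this sketch all weights except the embedding rows are shared across tasks, so the only per-task parameters are the `emb_dim` values in each embedding row. This is one plausible reading of why such an approach could need less computational effort than architectures with separate per-task subnetworks.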