Paper Title

Collaborative Group Learning

Paper Authors

Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, Xu Sun

Paper Abstract

Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima. However, previous approaches typically struggle with drastically aggravated student homogenization when the number of students rises. In this paper, we propose Collaborative Group Learning, an efficient framework that aims to diversify the feature representation and conduct an effective regularization. Intuitively, similar to the human group study mechanism, we induce students to learn and exchange different parts of course knowledge as collaborative groups. First, each student is established by randomly routing on a modular neural network, which facilitates flexible knowledge communication between students due to random levels of representation sharing and branching. Second, to resist the student homogenization, students first compose diverse feature sets by exploiting the inductive bias from sub-sets of training data, and then aggregate and distill different complementary knowledge by imitating a random sub-group of students at each time step. Overall, the above mechanisms are beneficial for maximizing the student population to further improve the model generalization without sacrificing computational efficiency. Empirical evaluations on both image and text tasks indicate that our method significantly outperforms various state-of-the-art collaborative approaches whilst enhancing computational efficiency.
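
To make the two mechanisms in the abstract concrete, here is a minimal sketch in PyTorch. It is not the authors' released code: `ModularNet`, `sample_routes`, and `group_learning_loss` are hypothetical names, random routing is realized as one module choice per layer over a shared backbone, and sub-group imitation is modeled as a softened KL distillation term against the averaged logits of a random peer sub-group. The training-data sub-set mechanism is omitted for brevity.

```python
import random

import torch
import torch.nn as nn
import torch.nn.functional as F


class ModularNet(nn.Module):
    """Shared modular backbone: each layer offers several candidate
    modules, and a student is one route (a module choice per layer)."""

    def __init__(self, in_dim, hidden_dim, out_dim, n_layers=3, n_modules=2):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.ModuleList(
                nn.Linear(in_dim if l == 0 else hidden_dim, hidden_dim)
                for _ in range(n_modules)
            )
            for l in range(n_layers)
        )
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x, route):
        # route: one module index per layer, e.g. [0, 1, 0]
        for layer, idx in zip(self.layers, route):
            x = F.relu(layer[idx](x))
        return self.head(x)


def sample_routes(n_students, n_layers=3, n_modules=2):
    # Random routing: students whose routes overlap share parameters,
    # giving the random levels of representation sharing and branching
    # that the abstract describes.
    return [
        [random.randrange(n_modules) for _ in range(n_layers)]
        for _ in range(n_students)
    ]


def group_learning_loss(net, routes, x, y, group_size=2, temp=2.0):
    logits = [net(x, r) for r in routes]
    total = 0.0
    for i, li in enumerate(logits):
        total = total + F.cross_entropy(li, y)  # supervised term
        # Imitate a random sub-group of peers at this time step:
        # distill from the averaged, detached peer logits.
        peers = random.sample(
            [j for j in range(len(logits)) if j != i],
            k=min(group_size, len(logits) - 1),
        )
        target = torch.stack([logits[j] for j in peers]).mean(0).detach()
        total = total + (temp ** 2) * F.kl_div(
            F.log_softmax(li / temp, dim=-1),
            F.softmax(target / temp, dim=-1),
            reduction="batchmean",
        )
    return total / len(logits)


# Hypothetical usage on toy data:
net = ModularNet(in_dim=32, hidden_dim=64, out_dim=10)
routes = sample_routes(n_students=4)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
group_learning_loss(net, routes, x, y).backward()
```

Because all students live on one shared backbone, adding a student mostly adds a route rather than new parameters, which is consistent with the abstract's claim that the student population can grow without sacrificing computational efficiency.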
