Paper Title
Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks
Paper Authors
Paper Abstract
Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNNs). Like ResNet and EfficientNet, many architectures have achieved outstanding results on at least one dataset at the time of their creation. A critical factor in training concerns the network's regularization, which prevents the structure from overfitting. This work analyzes several regularization methods developed in the last few years, showing significant improvements for different CNN models. The works are classified into three main areas: the first, called "data augmentation", comprises techniques that focus on performing changes in the input data. The second, named "internal changes", describes procedures to modify the feature maps generated by the neural network or the kernels. The last, called "label", concerns transforming the labels of a given input. This work presents two main differences compared to other available surveys on regularization: (i) the first concerns the papers gathered in this manuscript, which are no older than five years, and (ii) the second distinction is about reproducibility, i.e., all works referred to here have their code available in public repositories or have been directly implemented in some framework, such as TensorFlow or Torch.
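As a minimal sketch (not taken from the paper itself), the three categories can be illustrated in PyTorch with one common representative each: random flips and crops for "data augmentation", Dropout on feature maps for "internal changes", and label smoothing for "label" regularization. The specific layer sizes and hyperparameters below are illustrative assumptions.

```python
import torch.nn as nn
import torchvision.transforms as T

# (1) "Data augmentation": regularize by changing the input data.
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomCrop(32, padding=4),
    T.ToTensor(),
])

# (2) "Internal changes": regularize by modifying the feature maps
# inside the network, here via Dropout2d between convolutional blocks.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Dropout2d(p=0.25),   # randomly zeroes entire feature maps
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)

# (3) "Label": regularize by transforming the targets, here via the
# label-smoothing option built into the cross-entropy loss.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
```

Each piece slots into an ordinary training loop: `augment` is applied when loading images, `model` replaces the plain network, and `criterion` replaces the standard loss.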