Paper Title
Depth Selection for Deep ReLU Nets in Feature Extraction and Generalization
Paper Authors
Paper Abstract
Deep learning is recognized as capable of discovering deep features for representation learning and pattern recognition without requiring elegant feature engineering techniques that draw on human ingenuity and prior knowledge. Thus, it has triggered enormous research activity in machine learning and pattern recognition. One of the most important challenges of deep learning is to determine the relation between a feature and the depth of deep neural networks (deep nets for short), so as to reflect the necessity of depth. Our purpose is to quantify this feature-depth correspondence in feature extraction and generalization. We present the adaptivity of features to depths and vice versa by showing a depth-parameter trade-off in extracting both single features and composite features. Based on these results, we prove that implementing classical empirical risk minimization on deep nets can achieve the optimal generalization performance for numerous learning tasks. Our theoretical results are verified by a series of numerical experiments, including toy simulations and a real application to earthquake seismic intensity prediction.