Deep Learning vs. Gradient Boosting: Benchmarking state-of-the-art machine learning algorithms for credit scoring

Paper title

Deep Learning vs. Gradient Boosting: Benchmarking state-of-the-art machine learning algorithms for credit scoring

Author

Schmitt, Marc

Abstract

Artificial intelligence (AI) and machine learning (ML) have become vital for financial services companies around the globe to remain competitive. The two models currently competing for the pole position in credit risk management are deep learning (DL) and gradient boosting machines (GBM). This paper benchmarked those two algorithms in the context of credit scoring using three distinct datasets with different features to account for the reality that model choice/power is often dependent on the underlying characteristics of the dataset. The experiment has shown that GBM tends to be more powerful than DL and also has the advantage of speed due to lower computational requirements. This makes GBM the winner and the preferred choice for credit scoring. However, it was also shown that the outperformance of GBM is not always guaranteed, and ultimately the concrete problem scenario or dataset will determine the final model choice. Overall, based on this study, both algorithms can be considered state-of-the-art for binary classification tasks on structured datasets, while GBM should be the go-to solution for most problem scenarios due to easier use, significantly faster training time, and superior accuracy.
