Paper Title

Recipe for Fast Large-scale SVM Training: Polishing, Parallelism, and more RAM!

Paper Authors

Glasmachers, Tobias

Paper Abstract

Support vector machines (SVMs) are a standard method in the machine learning toolbox, in particular for tabular data. Non-linear kernel SVMs often deliver highly accurate predictors, however, at the cost of long training times. That problem is aggravated by the exponential growth of data volumes over time. It was tackled in the past mainly by two types of techniques: approximate solvers, and parallel GPU implementations. In this work, we combine both approaches to design an extremely fast dual SVM solver. We fully exploit the capabilities of modern compute servers: many-core architectures, multiple high-end GPUs, and large random access memory. On such a machine, we train a large-margin classifier on the ImageNet data set in 24 minutes.
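
For context, the "dual SVM solver" referred to in the abstract operates on the standard dual formulation of the kernel SVM. The following is a minimal sketch of that optimization problem as it is commonly stated, not a formulation quoted from the paper itself:

\[
\max_{\alpha \in \mathbb{R}^n} \;\; \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, k(x_i, x_j)
\qquad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0,
\]

where $k(\cdot,\cdot)$ is the kernel function, $y_i \in \{-1, +1\}$ are the class labels, and $C$ is the regularization parameter. The kernel matrix entries $k(x_i, x_j)$ are the main memory and compute bottleneck for large $n$, which is what approximate solvers, GPU parallelism, and large RAM are used to address.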
