Title
Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models
Authors
Abstract
The energy requirements of current natural language processing models continue to grow at a rapid, unsustainable pace. Recent works highlighting this problem conclude there is an urgent need for methods that reduce the energy needs of NLP and machine learning more broadly. In this article, we investigate techniques that can be used to reduce the energy consumption of common NLP applications. In particular, we focus on techniques to measure energy usage and on different hardware and datacenter-oriented settings that can be tuned to reduce energy consumption for training and inference with language models. We characterize the impact of these settings on metrics such as computational performance and energy consumption through experiments conducted on a high-performance computing system as well as popular cloud computing platforms. These techniques can lead to significant reductions in energy consumption when training language models or using them for inference. For example, power-capping, which limits the maximum power a GPU can consume, can enable a 15% decrease in energy usage with a marginal increase in overall computation time when training a transformer-based language model.
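As an illustration of the power-capping setting mentioned in the abstract, the minimal sketch below uses NVIDIA's NVML interface (through the pynvml Python bindings) to lower a GPU's power limit and to read its energy counter around a workload. The device index, the 75%-of-default cap value, and the placeholder workload are illustrative assumptions, not settings taken from the paper; changing the limit typically requires administrator privileges and is equivalent to running `nvidia-smi -i 0 -pl <watts>`.

```python
# Sketch: apply a GPU power cap and measure energy with NVML (pynvml bindings).
# Assumes an NVIDIA GPU (Volta or newer for the energy counter) and root access
# for setting the power limit. Values below are illustrative only.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU; index is an assumption

# Default power limit, reported in milliwatts.
default_mw = pynvml.nvmlDeviceGetPowerManagementDefaultLimit(handle)
print(f"default power limit: {default_mw / 1000:.0f} W")

# Apply a cap below the default (here ~75% of default, an arbitrary example).
pynvml.nvmlDeviceSetPowerManagementLimit(handle, int(default_mw * 0.75))

# Energy counter in millijoules since driver load: sample before and after
# the workload and take the difference.
start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
# ... run the training or inference workload of interest here ...
end_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
print(f"energy used: {(end_mj - start_mj) / 1e3:.1f} J")

pynvml.nvmlShutdown()
```

Sampling the energy counter before and after a training run under different power caps is one way to observe the kind of energy-versus-runtime trade-off the abstract describes, such as the 15% reduction cited above.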