Paper Title

ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT

Paper Authors

Rui Pan, Shizhe Diao, Jianlin Chen, Tong Zhang

Paper Abstract

In this paper, we present ExtremeBERT, a toolkit for accelerating and customizing BERT pretraining. Our goal is to provide an easy-to-use BERT pretraining toolkit for the research community and industry, so that pretraining popular language models on customized datasets becomes affordable with limited resources. Experiments show that, to achieve the same or better GLUE scores, the time cost of our toolkit is over $6\times$ less for BERT Base and $9\times$ less for BERT Large when compared with the original BERT paper. The documentation and code are released at https://github.com/extreme-bert/extreme-bert under the Apache-2.0 license.
