Paper Title
EZLDA: Efficient and Scalable LDA on GPUs
Paper Authors
Paper Abstract
LDA is a statistical approach for topic modeling with a wide range of applications. However, there exist very few attempts to accelerate LDA on GPUs, which come with exceptional computing and memory throughput capabilities. To this end, we introduce EZLDA, which achieves efficient and scalable LDA training on GPUs with the following three contributions: First, EZLDA introduces a three-branch sampling method that exploits the convergence heterogeneity of tokens to reduce redundant sampling work. Second, to enable sparsity-aware formats for both D and W on GPUs with fast sampling and updating, we introduce a hybrid format for W, along with a corresponding token partitioning for T and an inverted index design. Third, we design a hierarchical workload-balancing solution to address the extremely skewed workload imbalance on a single GPU and to scale EZLDA across multiple GPUs. Taken together, EZLDA achieves superior performance over state-of-the-art attempts with lower memory consumption.
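To make the first contribution concrete, here is a minimal CPU-side sketch in Python of the general idea behind convergence-aware sampling: tokens whose topic assignment has stopped changing are re-sampled only occasionally, so sampling effort concentrates on tokens that are still moving. All names and thresholds (`gibbs_pass`, `stable_cutoff`, the 10% re-check rate) are our own illustrative assumptions, not EZLDA's actual three-branch method, which the abstract does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_pass(tokens, topic_of, stable_iters, sample_topic, stable_cutoff=5):
    """One sweep over all (doc, word) tokens, skipping most 'converged' ones.

    sample_topic(doc, word, old_topic) -> new_topic is a stand-in for a
    full Gibbs conditional over D and W; it is assumed, not defined here.
    """
    for i, (doc, word) in enumerate(tokens):
        # Tokens whose assignment has been stable for `stable_cutoff`
        # consecutive sweeps are re-sampled only 10% of the time.
        if stable_iters[i] >= stable_cutoff and rng.random() > 0.1:
            continue  # skip redundant sampling work for a converged token
        new_topic = sample_topic(doc, word, topic_of[i])
        if new_topic == topic_of[i]:
            stable_iters[i] += 1
        else:
            stable_iters[i] = 0
            topic_of[i] = new_topic
```

Similarly, the second contribution rests on the observation that the word-topic matrix W is heterogeneous: a few frequent words have nearly dense topic rows, while the long tail of rare words has mostly empty rows. The sketch below illustrates a hybrid dense/sparse layout under that assumption; the class and parameter names are hypothetical, and a dict-of-dicts stands in for the CSR-style arrays a GPU implementation would use. The paper's inverted-index and token-partitioning designs are omitted.

```python
import numpy as np

class HybridWordTopic:
    """Hybrid layout for word-topic counts: dense rows for 'hot' words,
    a sparse map (CSR stand-in) for 'cold' words."""

    def __init__(self, word_freq, num_topics, hot_fraction=0.05):
        order = np.argsort(word_freq)[::-1]           # most frequent first
        num_hot = max(1, int(hot_fraction * len(word_freq)))
        self.hot_words = set(order[:num_hot].tolist())
        # Dense block: one full row of num_topics counts per hot word.
        self.dense = {w: np.zeros(num_topics, dtype=np.int32)
                      for w in self.hot_words}
        # Sparse block: {topic: count} per cold word.
        self.sparse = {}

    def increment(self, word, topic):
        if word in self.hot_words:
            self.dense[word][topic] += 1
        else:
            row = self.sparse.setdefault(word, {})
            row[topic] = row.get(topic, 0) + 1

    def count(self, word, topic):
        if word in self.hot_words:
            return int(self.dense[word][topic])
        return self.sparse.get(word, {}).get(topic, 0)
```

The design intuition is that dense rows give O(1) updates for the high-traffic words that dominate sampling, while the sparse block keeps memory proportional to the number of nonzero (word, topic) pairs for the long tail, which is what makes a sparsity-aware W affordable in GPU memory.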