Paper title
Distributed Sketching Methods for Privacy Preserving Regression
Paper authors
Paper abstract
In this work, we study distributed sketching methods for large-scale regression problems. We leverage multiple randomized sketches to reduce the problem dimensions, preserve privacy, and improve straggler resilience in asynchronous distributed systems. We derive novel approximation guarantees for classical sketching methods and analyze the accuracy of parameter averaging for distributed sketches. We consider random matrices including Gaussian, randomized Hadamard, uniform sampling, and leverage score sampling in the distributed setting. Moreover, we propose a hybrid approach that combines sampling and fast random projections for better computational efficiency. We illustrate the performance of distributed sketches on a serverless computing platform with large-scale experiments.
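The parameter-averaging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Gaussian sketching matrices, synthetic data, and a sequential loop standing in for independent workers, and every function name (`sketched_least_squares`, `averaged_sketch_solution`, `residual_cost`) is hypothetical. Each simulated worker solves a small sketched least-squares problem that only involves the products `S @ A` and `S @ b`, and the estimates are then averaged.

```python
import numpy as np


def sketched_least_squares(A, b, m, rng):
    """Solve one Gaussian-sketched problem: min_x ||S A x - S b||_2."""
    n = A.shape[0]
    # Gaussian sketch scaled so that E[S^T S] = I_n (assumed sketch type).
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    # The solver only touches the sketched data S A and S b.
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x


def averaged_sketch_solution(A, b, m, q, seed=0):
    """Average the estimates of q independent sketches (q simulated workers)."""
    rng = np.random.default_rng(seed)
    estimates = [sketched_least_squares(A, b, m, rng) for _ in range(q)]
    return np.mean(estimates, axis=0)


def residual_cost(A, b, x):
    """Least-squares objective ||A x - b||_2^2."""
    return float(np.sum((A @ x - b) ** 2))


# Synthetic regression problem (illustrative sizes only).
data_rng = np.random.default_rng(42)
n, d = 2000, 10
A = data_rng.standard_normal((n, d))
x_true = data_rng.standard_normal(d)
b = A @ x_true + 0.5 * data_rng.standard_normal(n)

# Reference solution of the full, unsketched problem.
x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)

# One sketch versus the average over 20 sketches, each of size m = 50.
x_single = averaged_sketch_solution(A, b, m=50, q=1)
x_avg = averaged_sketch_solution(A, b, m=50, q=20)

print("optimal cost:      ", residual_cost(A, b, x_opt))
print("single-sketch cost:", residual_cost(A, b, x_single))
print("averaged cost:     ", residual_cost(A, b, x_avg))
```

In a real deployment each call to `sketched_least_squares` would run on a separate worker. Averaging whichever estimates arrive first is one plausible way to obtain the straggler resilience mentioned above, since no single worker's result is required.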