论文标题
随机近似与样本的平均近似值,瓦斯泰因重中心
Stochastic Approximation versus Sample Average Approximation for population Wasserstein barycenters
论文作者
论文摘要
在机器学习和优化社区中,有两种主要方法是凸风险最小化问题,即随机近似(SA)和样本平均近似(SAA)。就甲骨文的复杂性(所需的随机梯度评估次数)而言,两种方法平均被认为是等效的(最多是对数因素)。总的复杂性取决于特定问题,但是,从工作开始\ cite {nemirovski2009Robust},通常可以接受SA比SAA更好。 %尽管如此,在大规模问题的情况下,SA可能会用尽内存,因为如果没有与其他机器的通信,将所有数据存储在一台计算机上并组织在线访问它是不可能的。与SA相反的SAA允许并行/分布式计算。我们表明,对于Wasserstein Barycenter问题,这种优势可以倒置。我们通过说明SA和SAA实现的复杂性界限来提供详细的比较,该实现计算了针对最佳运输距离和熵调节的最佳运输距离定义的重中心。作为副产品,我们还为$ \ ell_2 $ -norm中的熵调查的最佳传输距离定义的重中心构建置信区间。对于期望给出的一般凸优化问题而得出了初步结果,以便除了Wasserstein Barycenter问题以外的其他应用。
In the machine learning and optimization community, there are two main approaches for the convex risk minimization problem, namely, the Stochastic Approximation (SA) and the Sample Average Approximation (SAA). In terms of oracle complexity (required number of stochastic gradient evaluations), both approaches are considered equivalent on average (up to a logarithmic factor). The total complexity depends on the specific problem, however, starting from work \cite{nemirovski2009robust} it was generally accepted that the SA is better than the SAA. % Nevertheless, in case of large-scale problems SA may run out of memory as storing all data on one machine and organizing online access to it can be impossible without communications with other machines. SAA in contradistinction to SA allows parallel/distributed calculations. We show that for the Wasserstein barycenter problem this superiority can be inverted. We provide a detailed comparison by stating the complexity bounds for the SA and the SAA implementations calculating barycenters defined with respect to optimal transport distances and entropy-regularized optimal transport distances. As a byproduct, we also construct confidence intervals for the barycenter defined with respect to entropy-regularized optimal transport distances in the $\ell_2$-norm. The preliminary results are derived for a general convex optimization problem given by the expectation in order to have other applications besides the Wasserstein barycenter problem.