通过迭代多过滤列表可解码的平均值估计

论文标题

通过迭代多过滤列表可解码的平均值估计

List-Decodable Mean Estimation via Iterative Multi-Filtering

论文作者

Diakonikolas, Ilias, Kane, Daniel M., Kongsgaard, Daniel

论文摘要

我们研究有界协方差分布的{\ em列表可解码平均值}的问题。具体而言，我们将在$ \ mathbb {r}^d $中为我们提供了一组$ t $的积分，并承诺在$ t $中获得未知的$α$ - 分数，其中$ 0 <α<1/2 $是从未知的平均值和界限的协方差分配$ d $中汲取的，并且没有对剩余点进行假设。目的是输出一小部分假设向量列表，以便至少其中一个接近$ d $的平均值。我们为此问题提供了第一个实际可行的估计器。更详细地说，我们的算法是样本和计算上有效的，并在理论上实现了信息差不多的误差。虽然此设置固有的唯一先前算法固有地依赖于椭圆形方法，但我们的算法是迭代的，仅使用光谱技术。我们的主要技术创新是设计具有大多数离群值的高维重尾数据集的软拆卸程序。

We study the problem of {\em list-decodable mean estimation} for bounded covariance distributions. Specifically, we are given a set $T$ of points in $\mathbb{R}^d$ with the promise that an unknown $α$-fraction of points in $T$, where $0< α< 1/2$, are drawn from an unknown mean and bounded covariance distribution $D$, and no assumptions are made on the remaining points. The goal is to output a small list of hypothesis vectors such that at least one of them is close to the mean of $D$. We give the first practically viable estimator for this problem. In more detail, our algorithm is sample and computationally efficient, and achieves information-theoretically near-optimal error. While the only prior algorithm for this setting inherently relied on the ellipsoid method, our algorithm is iterative and only uses spectral techniques. Our main technical innovation is the design of a soft outlier removal procedure for high-dimensional heavy-tailed datasets with a majority of outliers.

下载PDF全文

下载文献需遵守相关版权规定

论文标题