论文标题

通过Huber统计数据进行均匀性测试的锋利常数

Sharp Constants in Uniformity Testing via the Huber Statistic

论文作者

Gupta, Shivam, Price, Eric

论文摘要

统一性测试是财产测试中最有研究的问题之一,其中许多已知的测试统计数据,包括基于计数碰撞,单例和经验电视距离的统计数据。 It is known that the optimal sample complexity to distinguish the uniform distribution on $m$ elements from any $ε$-far distribution with $1-δ$ probability is $n = Θ\left(\frac{\sqrt{m \log (1/δ)}}{ε^2} + \frac{\log (1/δ)} {ε^2} \ right)$,这是由经验电视测试器实现的。然而,在模拟中,这些理论分析具有误导性:在许多情况下,即使在所有参数的渐近制度中,它们也无法正确排序现有测试人员的性能,即$ 0 $或$ \ infty $。 我们通过研究算法所需的\ emph {常数因子}来解释这一差异。我们表明,碰撞测试仪在均匀输入和非均匀输入之间的标准分离偏差数量中达到了急剧的最大常数。然后,我们基于Huber损失引入了一个新的测试仪,并表明它不仅与此分离相匹配,而且还具有与该分离的高斯相对应的尾巴。这导致样本复杂性为$(1 + O(1))\ frac {\ sqrt {\ sqrt {m \ log(1/δ)}}} {ε^2} $在本术语中占主导地位的制度中,与所有其他现有测试者不同。

Uniformity testing is one of the most well-studied problems in property testing, with many known test statistics, including ones based on counting collisions, singletons, and the empirical TV distance. It is known that the optimal sample complexity to distinguish the uniform distribution on $m$ elements from any $ε$-far distribution with $1-δ$ probability is $n = Θ\left(\frac{\sqrt{m \log (1/δ)}}{ε^2} + \frac{\log (1/δ)}{ε^2}\right)$, which is achieved by the empirical TV tester. Yet in simulation, these theoretical analyses are misleading: in many cases, they do not correctly rank order the performance of existing testers, even in an asymptotic regime of all parameters tending to $0$ or $\infty$. We explain this discrepancy by studying the \emph{constant factors} required by the algorithms. We show that the collisions tester achieves a sharp maximal constant in the number of standard deviations of separation between uniform and non-uniform inputs. We then introduce a new tester based on the Huber loss, and show that it not only matches this separation, but also has tails corresponding to a Gaussian with this separation. This leads to a sample complexity of $(1 + o(1))\frac{\sqrt{m \log (1/δ)}}{ε^2}$ in the regime where this term is dominant, unlike all other existing testers.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源