顶部$ k $排名的部分恢复：MLE的最优性和光谱方法的子优先级

论文标题

顶部$ k $排名的部分恢复：MLE的最优性和光谱方法的子优先级

Partial Recovery for Top-$k$ Ranking: Optimality of MLE and Sub-Optimality of Spectral Method

论文作者

Chen, Pinhan, Gao, Chao, Zhang, Anderson Y.

论文摘要

鉴于Bradley-Terry-Luce（BTL）模型产生的部分观察到的成对比较数据，我们研究了顶部$ K $排名的问题。也就是说，可以最佳地识别顶级$ K $播放器的集合。我们在归一化锤损失方面得出了最小速率。这提供了文献中的第一个结果，该结果以$ k $排名的错误比例来表征部分恢复错误。我们还得出了最佳信号与噪声比条件的确切恢复，顶部$ k $设置的精确恢复。显示最大似然估计器（MLE）既可以实现最佳的部分恢复和最佳的精确恢复。另一方面，我们显示了另一种流行的算法，即光谱方法，通常是最佳的。我们的结果补充了Chen等人的最新工作。（2019年）显示了MLE和光谱方法获得最佳样品复杂性以进行精确恢复。事实证明，对于两种算法，样品复杂性的领先常数是不同的。可能具有独立关注的另一个贡献是对MLE的分析，而无需对BTL模型进行任何惩罚或正规化。这缩小了排名文献中理论与实践之间的重要差距。

Given partially observed pairwise comparison data generated by the Bradley-Terry-Luce (BTL) model, we study the problem of top-$k$ ranking. That is, to optimally identify the set of top-$k$ players. We derive the minimax rate with respect to a normalized Hamming loss. This provides the first result in the literature that characterizes the partial recovery error in terms of the proportion of mistakes for top-$k$ ranking. We also derive the optimal signal to noise ratio condition for the exact recovery of the top-$k$ set. The maximum likelihood estimator (MLE) is shown to achieve both optimal partial recovery and optimal exact recovery. On the other hand, we show another popular algorithm, the spectral method, is in general sub-optimal. Our results complement the recent work by Chen et al. (2019) that shows both the MLE and the spectral method achieve the optimal sample complexity for exact recovery. It turns out the leading constants of the sample complexity are different for the two algorithms. Another contribution that may be of independent interest is the analysis of the MLE without any penalty or regularization for the BTL model. This closes an important gap between theory and practice in the literature of ranking.

下载PDF全文

下载文献需遵守相关版权规定

论文标题