Paper title
Neutralizing Self-Selection Bias in Sampling for Sortition
Paper authors
Paper abstract
Sortition is a political system in which decisions are made by panels of randomly selected citizens. The process for selecting a sortition panel is traditionally thought of as uniform sampling without replacement, which has strong fairness properties. In practice, however, sampling without replacement is not possible since only a fraction of agents is willing to participate in a panel when invited, and different demographic groups participate at different rates. In order to still produce panels whose composition resembles that of the population, we develop a sampling algorithm that restores close-to-equal representation probabilities for all agents while satisfying meaningful demographic quotas. As part of its input, our algorithm requires probabilities indicating how likely each volunteer in the pool was to participate. Since these participation probabilities are not directly observable, we show how to learn them, and demonstrate our approach using data on a real sortition panel combined with information on the general population in the form of publicly available survey data.
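The intuition behind restoring close-to-equal representation can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it ignores demographic quotas, assumes the participation probabilities `q` are known exactly, and assumes every agent is invited. The names `select_panel` and `simulate` are hypothetical. The idea it shows is that if pool members are chosen with weights inversely proportional to their participation probability, agents who rarely volunteer end up on the panel about as often as agents who volunteer readily.

```python
import random


def select_panel(pool, q, k, rng):
    """Pick up to k distinct pool members with weights 1 / q[i].

    Sequential weighted sampling without replacement; an illustrative
    simplification that ignores quotas (assumption, not the paper's method).
    """
    candidates = list(pool)
    panel = []
    for _ in range(min(k, len(candidates))):
        weights = [1.0 / q[i] for i in candidates]
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        panel.append(candidates.pop(pick))
    return panel


def simulate(q, k, trials, weighted, seed=0):
    """Estimate each agent's end-to-end probability of sitting on the panel."""
    rng = random.Random(seed)
    counts = [0] * len(q)
    for _ in range(trials):
        # Self-selection: agent i joins the pool with probability q[i].
        pool = [i for i in range(len(q)) if rng.random() < q[i]]
        if weighted:
            panel = select_panel(pool, q, k, rng)
        else:
            # Baseline: uniform sampling from the pool, without replacement.
            panel = rng.sample(pool, min(k, len(pool)))
        for i in panel:
            counts[i] += 1
    return [c / trials for c in counts]


# Hypothetical population: 200 reluctant agents and 200 eager agents.
q = [0.1] * 200 + [0.4] * 200
for weighted in (False, True):
    probs = simulate(q, k=5, trials=2000, weighted=weighted)
    low = sum(probs[:200]) / 200
    high = sum(probs[200:]) / 200
    print(f"weighted={weighted}: reluctant {low:.4f}, eager {high:.4f}")
```

With uniform selection from the pool, the eager group should be represented several times more often than the reluctant group. With inverse-probability weights the two rates should be roughly equal, up to sampling noise and the distortion introduced by sampling without replacement.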