Paper Title
The Effective Sample Size in Bayesian Information Criterion for Level-Specific Fixed and Random Effects Selection in a Two-Level Nested Model
Paper Authors
Abstract
Popular statistical software provides the Bayesian information criterion (BIC) for multilevel models or linear mixed models. However, the statistical literature and software documentation, taken together, show discrepancies in the formulas of the BIC and leave uncertainty about its proper use in selecting a multilevel model with respect to level-specific fixed and random effects. These discrepancies and uncertainties result from different specifications of the sample size in the BIC's penalty term for multilevel models. In this study, we derive the BIC's penalty term for level-specific fixed and random effect selection in a two-level nested design. In this new version of the BIC, called BIC_E, the penalty term is decomposed into two parts if the random effect variance-covariance matrix has full rank: (a) a term with the log of the average sample size per cluster, whose multiplier involves the number of overlapping dimensions between the column spaces of the random and fixed effect design matrices, and (b) the total number of parameters times the log of the total number of clusters. Furthermore, we study the behavior of BIC_E in the presence of redundant random effects. The use of BIC_E is illustrated with a textbook example data set, and a numerical demonstration shows that the derived formulae adhere to empirical values.
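To make the sample-size ambiguity concrete, the sketch below computes the generic BIC, -2 log L + k log(n), under two common choices of n: the total number of observations and the number of clusters. It is not the BIC_E formula, whose multiplier on the log average cluster size term is not given in the abstract. All numeric values and variable names are hypothetical, and a balanced design (equal cluster sizes) is assumed.

```python
import math


def bic(log_likelihood: float, n_params: int, sample_size: float) -> float:
    """Generic BIC: -2 * log-likelihood + (number of parameters) * log(sample size)."""
    return -2.0 * log_likelihood + n_params * math.log(sample_size)


# Hypothetical fitted-model summary (illustrative values only).
log_lik = -1200.0          # maximized log-likelihood
n_params = 6               # fixed effects plus variance components
n_clusters = 50            # number of level-2 units
avg_cluster_size = 20      # level-1 units per cluster (balanced design assumed)
n_total = n_clusters * avg_cluster_size

# Two sample-size conventions for the penalty term.
bic_total = bic(log_lik, n_params, n_total)        # n = total observations
bic_clusters = bic(log_lik, n_params, n_clusters)  # n = number of clusters

# Because log(n_total) = log(n_clusters) + log(avg_cluster_size), the two
# conventions differ by exactly n_params * log(avg_cluster_size).
penalty_gap = bic_total - bic_clusters

print(f"BIC with n = total observations: {bic_total:.3f}")
print(f"BIC with n = number of clusters: {bic_clusters:.3f}")
print(f"Difference in penalty:           {penalty_gap:.3f}")
```

The identity log(N) = log(m) + log(average cluster size) shows why a penalty can naturally split into a term in the log of the number of clusters and a term in the log of the average cluster size. In this generic sketch every parameter carries both terms under the total-observations convention and only the first under the cluster-count convention; the abstract indicates that BIC_E instead applies the second term with a multiplier tied to the overlap between the random and fixed effect design matrices.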