Paper Title
Data Mining in Large Frequency Tables With Ontology, with an Application to the Vaccine Adverse Event Reporting System
Paper Authors
Paper Abstract
Vaccine safety is a public concern, and many signal detection methods have been developed to identify relative risks (RRs) between vaccines and adverse events (AEs). These methods usually focus on individual AEs, where the randomness of the data is high. The results often turn out to be inaccurate and to lack clinical meaning. The AE ontology contains information about the biological similarity of AEs. Based on this, we extend the concept of RRs to the AE group level, which allows more accurate and meaningful estimation by utilizing data from the whole group. In this paper, we propose the method zGPS.AO (Zero Inflated Gamma Poisson Shrinker with AE ontology), based on the zero-inflated negative binomial distribution. The model serves two purposes: a regression model estimates group-level RRs, and an empirical Bayes framework evaluates AE-level RRs. The regression part handles both excess zeros and overdispersion in the data, and the empirical Bayes method borrows information from both the group level and the AE level to reduce data noise and stabilize the AE-level results. We demonstrate the unbiasedness and low variance of our model with simulated data, and obtain meaningful results consistent with previous studies on the VAERS (Vaccine Adverse Event Reporting System) database. The proposed methods are implemented in the R package zGPS.AO, which can be installed from the Comprehensive R Archive Network (CRAN). The results on VAERS data are visualized using an interactive R Shiny web app.
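To make the idea of shrinking a noisy relative risk concrete, the following is a minimal Python sketch of a plain gamma-Poisson shrinkage estimate on a vaccine-by-AE frequency table. It is not the zGPS.AO model: it omits the zero-inflation component, the group-level regression, and the AE ontology, and it is unrelated to the R package's interface. The function names, the toy counts, and the fixed prior parameters `alpha` and `beta` are assumptions made purely for illustration. It relies only on the standard conjugacy result that, if the count is Poisson with mean RR × E and RR has a Gamma(alpha, beta) prior (rate parameterization), the posterior mean of RR is (alpha + N) / (beta + E).

```python
from typing import Dict, Tuple

Cell = Tuple[str, str]  # (vaccine, adverse event)


def expected_counts(counts: Dict[Cell, int]) -> Dict[Cell, float]:
    """Expected count per cell under independence of vaccine and AE:
    E = row_total * column_total / grand_total."""
    row_totals: Dict[str, int] = {}
    col_totals: Dict[str, int] = {}
    for (vaccine, ae), n in counts.items():
        row_totals[vaccine] = row_totals.get(vaccine, 0) + n
        col_totals[ae] = col_totals.get(ae, 0) + n
    total = sum(counts.values())
    return {
        (v, a): row_totals[v] * col_totals[a] / total
        for v in row_totals
        for a in col_totals
    }


def raw_rr(n: int, e: float) -> float:
    """Unshrunk relative risk: observed over expected."""
    return n / e


def shrunken_rr(n: int, e: float, alpha: float = 2.0, beta: float = 2.0) -> float:
    """Posterior mean of RR under a Gamma(alpha, beta) prior (rate form).
    alpha and beta are hypothetical fixed values here; an empirical Bayes
    procedure would estimate them from the data instead."""
    return (alpha + n) / (beta + e)


# Hypothetical toy table, not VAERS data.
toy_counts = {
    ("vacA", "fever"): 30,
    ("vacA", "rash"): 10,
    ("vacB", "fever"): 20,
    ("vacB", "rash"): 40,
}
toy_expected = expected_counts(toy_counts)
toy_cell = ("vacA", "fever")
toy_raw = raw_rr(toy_counts[toy_cell], toy_expected[toy_cell])
toy_shrunk = shrunken_rr(toy_counts[toy_cell], toy_expected[toy_cell])
```

With a prior mean of 1 (alpha equal to beta), the shrunken estimate is pulled toward 1, and the pull is strongest when the expected count is small. For example, a cell with one report and an expected count of 0.1 has a raw ratio of 10 but a much smaller shrunken value. This is the sense in which borrowing information stabilizes AE-level results.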