A shared spatial model for multivariate extreme-valued binary data with non-random missingness.

IF 0.7 Sankhya. Series B (2008) Pub Date: 2021-11-01 Epub Date: 2019-07-16 DOI: 10.1007/s13571-019-00198-7

Xiaoyue Zhao, Lin Zhang, Dipankar Bandyopadhyay

Sankhya. Series B (2008), vol. 83, no. 2, pp. 374-396. https://doi.org/10.1007/s13571-019-00198-7

Citations: 1

Abstract


A shared spatial model for multivariate extreme-valued binary data with non-random missingness.

Clinical studies and trials on periodontal disease (PD) generate a large volume of data collected at various tooth locations of a subject. However, they present a number of statistical complexities. When our focus is on understanding the extent of extreme PD progression, standard analysis under a generalized linear mixed model framework with a symmetric (logit) link may be inappropriate, as the binary split (extreme disease versus not) may be highly skewed. In addition, PD progression is often hypothesized to be spatially referenced, i.e., proximal teeth may have a more similar PD status than those that are distally located. Furthermore, a non-ignorable quantity of missing data is observed, and the missingness is non-random, as it is informative about the periodontal health status of the subject. In this paper, we address all the above concerns through a shared (spatial) latent factor model, where the latent factor jointly models the extreme binary responses via a generalized extreme value regression, and the non-randomly missing teeth via a probit regression. Our approach is Bayesian, and the inferential framework is powered by within-Gibbs Hamiltonian Monte Carlo techniques. Through simulation studies and application to a real dataset on PD, we demonstrate the potential advantages of our model in terms of model fit and of obtaining precise parameter estimates, compared with alternatives that do not consider the aforementioned complexities.
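To make the structure described in the abstract more concrete, the following Python sketch simulates one subject under a shared latent factor that drives both the extreme-disease indicator and the missingness indicator. It is only an illustration under stated assumptions, not the authors' implementation: the GEV link is written in the commonly used form p = 1 - G(-eta; xi), an AR(1) process along the tooth index stands in for the spatial dependence, and every function name, parameter name, and default value (`gev_link_prob`, `simulate_subject`, `xi`, `beta0`, `alpha0`, `alpha1`, `rho`) is hypothetical.

```python
import math
import random


def gev_link_prob(eta, xi):
    """P(y = 1) = 1 - G(-eta; xi), with G the standard GEV CDF (assumed form)."""
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family: complementary log-log link.
        return 1.0 - math.exp(-math.exp(eta))
    t = 1.0 - xi * eta
    if t <= 0.0:
        # Outside the GEV support: G(-eta) is 0 when xi > 0 and 1 when xi < 0.
        return 1.0 if xi > 0 else 0.0
    return 1.0 - math.exp(-(t ** (-1.0 / xi)))


def probit_prob(eta):
    """Standard normal CDF, used as the probit link for missingness."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def simulate_subject(n_teeth=28, xi=-0.3, beta0=-1.0,
                     alpha0=-1.0, alpha1=0.8, rho=0.7, seed=0):
    """Simulate tooth-level records for one subject (illustrative values only)."""
    rng = random.Random(seed)
    # AR(1) latent factor along the tooth index, with N(0, 1) marginals.
    # This is a simple stand-in for a spatial prior, not the paper's model.
    u = [rng.gauss(0.0, 1.0)]
    for _ in range(n_teeth - 1):
        noise = rng.gauss(0.0, 1.0)
        u.append(rho * u[-1] + math.sqrt(1.0 - rho ** 2) * noise)

    records = []
    for u_j in u:
        # The same latent value u_j enters both regressions (the "shared" part).
        missing = rng.random() < probit_prob(alpha0 + alpha1 * u_j)
        extreme = rng.random() < gev_link_prob(beta0 + u_j, xi)
        records.append({
            "u": u_j,
            "missing": missing,
            # The response is unobserved when the tooth is missing.
            "y": None if missing else int(extreme),
        })
    return records


if __name__ == "__main__":
    for tooth, rec in enumerate(simulate_subject(), start=1):
        print(tooth, rec["missing"], rec["y"])
```

With a positive `alpha1`, teeth with a larger latent value are both more likely to show extreme disease and more likely to be missing, which is the sense in which missingness is informative here. Unlike the logit link, the GEV link is asymmetric: in this sketch `gev_link_prob(eta, xi) + gev_link_prob(-eta, xi)` generally differs from 1.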


Source journal

Sankhya. Series B (2008)

Self-citation rate

0.00%

Number of articles published