International Journal of Biostatistics: Latest Articles

Testing the Relative Performance of Data Adaptive Prediction Algorithms: A Generalized Test of Conditional Risk Differences
IF 1.2 | Zone 4, Mathematics | Pub Date: 2016-05-01 | DOI: 10.1515/ijb-2015-0014
B. Goldstein, E. Polley, F. Briggs, M. J. van der Laan, A. Hubbard
Comparing the relative fit of competing models can be used to address many different scientific questions. In classical statistics one can, if appropriate, use likelihood ratio tests and information-based criteria, whereas clinical medicine has tended to rely on comparisons of fit metrics like C-statistics. However, for many data-adaptive modelling procedures such approaches are not suitable. In these cases, statisticians have used cross-validation, which can make inference challenging. In this paper we propose a general approach that focuses on the “conditional” risk difference (conditional on the model fits being fixed) for the improvement in prediction risk. Specifically, we derive a Wald-type test statistic and associated confidence intervals for cross-validated test sets utilizing the independent validation within cross-validation in conjunction with a test for multiple comparisons. We show that this test maintains proper Type I Error under the null fit, and can be used as a general test of relative fit for any semi-parametric model alternative. We apply the test to a candidate gene study to test for the association of a set of genes in a genetic pathway.
Volume: 12 1 | Pages: 117 - 129
Citations: 2
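The Wald-type test on held-out data mentioned in the Goldstein et al. abstract above can be illustrated with a minimal sketch. It covers only a single validation set with the two model fits treated as fixed; the combination across cross-validation folds and the multiple-comparison step described in the abstract are not reproduced here. The function name, the input loss vectors, and the normal approximation are assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import NormalDist, mean, stdev


def wald_risk_difference(loss_a, loss_b, alpha=0.05):
    """Wald-type test of the mean paired loss difference on one held-out set.

    loss_a, loss_b: per-observation losses of two already-fitted models,
    evaluated on the same validation observations (hypothetical inputs).
    """
    if len(loss_a) != len(loss_b) or len(loss_a) < 2:
        raise ValueError("need two equal-length loss vectors with n >= 2")
    # Paired differences: positive values mean model A has the larger loss.
    diffs = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(diffs)
    estimate = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)  # sample SD of differences / sqrt(n)
    if std_err == 0:
        raise ValueError("loss differences have zero variance")
    z = estimate / std_err
    normal = NormalDist()
    p_value = 2 * (1 - normal.cdf(abs(z)))  # two-sided normal p-value
    crit = normal.inv_cdf(1 - alpha / 2)
    ci = (estimate - crit * std_err, estimate + crit * std_err)
    return {"estimate": estimate, "std_err": std_err, "z": z,
            "p_value": p_value, "ci": ci}


if __name__ == "__main__":
    # Toy squared-error losses for two models on four validation points.
    result = wald_risk_difference([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 2.5, 2.0])
    print(result)
```

Because the fits are held fixed, the paired differences on the validation set can be treated as independent draws, which is what makes the simple standard error above reasonable in this conditional setting.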
A Semiparametric Bayesian Approach for Analyzing Longitudinal Data from Multiple Related Groups
IF 1.2 | Zone 4, Mathematics | Pub Date: 2015-11-01 | DOI: 10.1515/ijb-2015-0002
Kiranmoy Das, Prince Afriyie, Lauren Spirko
Biological and/or clinical experiments often result in longitudinal data from multiple related groups. The analysis of such data is quite challenging because groups might share information on the mean and/or covariance functions. In this article, we consider a Bayesian semiparametric approach to modeling the mean trajectories for longitudinal responses coming from multiple related groups. We consider matrix stick-breaking process priors on the group mean parameters, which allow information sharing on the mean trajectories across the groups. Simulation studies are performed to demonstrate the effectiveness of the proposed approach compared to more traditional approaches. We analyze data from a one-year follow-up of nutrition education for hypercholesterolemic children with three different treatments, where the children are from different age groups. Our analysis provides more clinically useful information than the previous analysis of the same dataset. The proposed approach will be a very powerful tool for analyzing data from clinical trials and other medical experiments.
Volume: 30 1 | Pages: 273 - 284
Citations: 3
Multiple-Objective Optimal Designs for Studying the Dose Response Function and Interesting Dose Levels
IF 1.2 | Zone 4, Mathematics | Pub Date: 2015-11-01 | DOI: 10.1515/ijb-2015-0044
Seung Won Hyun, W. Wong
We construct an optimal design to simultaneously estimate three common features of interest in a dose-finding trial, with possibly different emphasis on each feature. These features are (1) the shape of the dose-response curve, (2) the median effective dose and (3) the minimum effective dose level. A main difficulty of this task is that an optimal design for a single objective may not perform well for other objectives. There are optimal designs for dual objectives in the literature, but to date we were unable to find optimal designs for three or more objectives with a concrete application. A reason for this is that the approach for finding a dual-objective optimal design does not work well for design problems with three or more objectives. We propose a method for finding multiple-objective optimal designs that estimate the three features with user-specified higher efficiencies for the more important objectives. We use the flexible 4-parameter logistic model to illustrate the methodology, but our approach is applicable to finding multiple-objective optimal designs for other types of objectives and models. We also investigate robustness properties of multiple-objective optimal designs to mis-specification in the nominal parameter values and to a variation in the optimality criterion. We also provide computer code for generating tailor-made multiple-objective optimal designs.
Volume: 11 1 | Pages: 253 - 271
Citations: 10
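To make the dose-response quantities named in the Hyun and Wong abstract above concrete, here is a minimal sketch of a 4-parameter logistic curve together with its median and minimum effective doses. The parametrization (`bottom`, `top`, `ed50`, `hill`) is one common form and is an assumption; the paper may use a different one. The minimum effective dose is taken here, also as an assumption, to be the dose that raises the mean response by a chosen amount `delta` above the baseline. This sketch does not implement the optimal design search itself.

```python
def four_pl(dose, bottom, top, ed50, hill):
    """Mean response under an assumed 4-parameter logistic parametrization.

    Assumes dose > 0, ed50 > 0, hill > 0 and top > bottom, so the curve
    rises from `bottom` towards `top` and equals the midpoint at dose == ed50.
    """
    if dose <= 0 or ed50 <= 0 or hill <= 0:
        raise ValueError("dose, ed50 and hill must be positive")
    return bottom + (top - bottom) / (1.0 + (ed50 / dose) ** hill)


def minimum_effective_dose(delta, bottom, top, ed50, hill):
    """Dose at which the mean response exceeds `bottom` by exactly `delta`.

    Obtained by solving four_pl(d) = bottom + delta for d in closed form.
    """
    span = top - bottom
    if not 0 < delta < span:
        raise ValueError("delta must lie strictly between 0 and top - bottom")
    return ed50 / (span / delta - 1.0) ** (1.0 / hill)


if __name__ == "__main__":
    params = dict(bottom=0.0, top=1.0, ed50=10.0, hill=2.0)
    print(four_pl(10.0, **params))               # response at the ED50
    print(minimum_effective_dose(0.2, **params))  # dose giving a 0.2 increase
```

Under this parametrization the median effective dose is the `ed50` parameter itself, while the minimum effective dose depends on all four parameters, which hints at why a design that is efficient for one feature need not be efficient for another.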
Multiple Comparisons Using Composite Likelihood in Clustered Data
IF 1.2 | Zone 4, Mathematics | Pub Date: 2014-11-05 | DOI: 10.1515/ijb-2016-0004
M. Azadbakhsh, Xin Gao, H. Jankowski
We study the problem of multiple hypothesis testing for correlated clustered data. As the existing multiple comparison procedures based on maximum likelihood estimation can be computationally intensive, we propose to construct multiple comparison procedures based on the composite likelihood method. The new test statistics account for the correlation structure within the clusters and are computationally convenient. Simulation studies show that the composite likelihood based procedures maintain good control of the familywise type I error rate in the presence of intra-cluster correlation, whereas ignoring the correlation leads to erratic performance.
Volume: 12 1
Citations: 2
Functional and Parametric Estimation in a Semi- and Nonparametric Model with Application to Mass-Spectrometry Data
IF 1.2 | Zone 4, Mathematics | Pub Date: 2013-05-07 | DOI: 10.1515/ijb-2014-0066
Weiping Ma, Yang Feng, Kani Chen, Z. Ying
Motivated by modeling and analysis of mass-spectrometry data, a semi- and nonparametric model is proposed that consists of linear parametric components for individual location and scale and a nonparametric regression function for the common shape. A multi-step approach is developed that simultaneously estimates the parametric components and the nonparametric function. Under certain regularity conditions, it is shown that the resulting estimators are consistent and asymptotically normal for the parametric part and achieve the optimal rate of convergence for the nonparametric part when the bandwidth is suitably chosen. Simulation results are presented to demonstrate the effectiveness and finite-sample performance of the method. The method is also applied to a SELDI-TOF mass spectrometry data set from a study of liver cancer patients.
Volume: 11 1 | Pages: 285 - 303
Citations: 2
Relative Risk Estimation in Cluster Randomized Trials: A Comparison of Generalized Estimating Equation Methods
IF 1.2 | Zone 4, Mathematics | Pub Date: 2011-05-21 | DOI: 10.2202/1557-4679.1323
L. Yelland, A. Salter, Philip Ryan
Relative risks have become a popular measure of treatment effect for binary outcomes in randomized controlled trials (RCTs). Relative risks can be estimated directly using log binomial regression but the model may fail to converge. Alternative methods are available for estimating relative risks but these have generally only been evaluated for independent data. As some of these methods are now being applied in cluster RCTs, investigation of their performance in this context is needed. We compare log binomial regression and three alternative methods (expanded logistic regression, log Poisson regression and log normal regression) for estimating relative risks in cluster RCTs. Clustering is taken into account using generalized estimating equations (GEEs) with an independence or exchangeable working correlation structure. The results of our large simulation study show that the log binomial GEE generally performs well for clustered data but suffers from convergence problems, as expected. Both the log Poisson GEE and log normal GEE have advantages in certain settings in terms of type I error, bias and coverage. The expanded logistic GEE can perform poorly and is sensitive to the chosen working correlation structure. Conclusions about the effectiveness of treatment often differ depending on the method used, highlighting the need to pre-specify an analysis approach. We recommend pre-specifying that either the log Poisson GEE or log normal GEE will be used in the event that the log binomial GEE fails to converge.
Volume: 7 1
Citations: 19
A First Passage Time Model for Long-Term Survivors with Competing Risks
IF 1.2 | Zone 4, Mathematics | Pub Date: 2011-05-21 | DOI: 10.2202/1557-4679.1224
Ruimin Xu, P. McNicholas, A. Desmond, G. Darlington
We investigate a competing risks model, using the specification of the Gompertz distribution for failure times from competing causes and the inverse Gaussian distribution for failure times from the cause of interest. The expectation-maximization algorithm is used for parameter estimation and the model is applied to real data on breast cancer and melanoma. In these applications, our models compare favourably with existing techniques. The proposed method provides a useful technique that may be more broadly applicable than existing alternatives.
Volume: 7 1
Citations: 5
A Lower Bound Model for Multiple Record Systems Estimation with Heterogeneous Catchability
IF 1.2 | Zone 4, Mathematics | Pub Date: 2011-05-18 | DOI: 10.2202/1557-4679.1283
L. Rivest
This work considers the estimation of the size N of a closed population using incomplete lists of its members. Capture histories are constructed by establishing the presence or the absence of each individual in all the lists available. Models for data featuring a heterogeneous catchability and list dependencies are considered. A log-linear model leading to a lower bound for the population size is derived for a known set of list dependencies and a latent catchability variable with an arbitrary distribution. This generalizes Chao’s lower bound to models with interactions. The proposed model can be used to carry out a search for important list interactions. It also provides diagnostic information about the nature of the underlying heterogeneity. Indeed, it is shown that the Poisson maximum likelihood estimator of N under a dichotomous latent class model does not exist for a particular set of LB models. Several distributions for the heterogeneous catchability are considered; they allow investigation of the sensitivity of the population size estimate to the model for the heterogeneous catchability.
Volume: 41 1
Citations: 5
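The Rivest abstract above builds on Chao's lower bound for the size of a closed population. As a point of reference, the classical bound (not the generalized log-linear model proposed in the paper) can be sketched as follows. It uses the standard form N >= n + f1^2 / (2 f2), where n is the number of observed individuals, f1 the number appearing in exactly one list, and f2 the number appearing in exactly two. The function name and the 0/1 tuple encoding of capture histories are assumptions chosen for illustration.

```python
from collections import Counter


def chao_lower_bound(capture_histories):
    """Classical Chao lower bound for the size of a closed population.

    capture_histories: one tuple of 0/1 flags per observed individual,
    with one flag per list (hypothetical encoding).
    """
    # Number of lists each observed individual appears in.
    appearances = Counter(sum(history) for history in capture_histories)
    if appearances[0]:
        raise ValueError("every observed individual must be in at least one list")
    n_observed = len(capture_histories)
    f1 = appearances[1]  # individuals seen in exactly one list
    f2 = appearances[2]  # individuals seen in exactly two lists
    if f2 == 0:
        raise ValueError("bound is undefined when no individual is in exactly two lists")
    return n_observed + f1 ** 2 / (2 * f2)


if __name__ == "__main__":
    histories = [
        (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 0),  # seen once
        (1, 1, 0), (0, 1, 1),                        # seen twice
        (1, 1, 1),                                   # seen in all three lists
    ]
    print(chao_lower_bound(histories))
```

The bound uses only the counts of individuals seen once and twice, so it ignores which lists they appear in; list interactions of the kind discussed in the abstract are outside the scope of this sketch.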
Reaction to Pearl's Critique of Principal Stratification
IF 1.2 | Zone 4, Mathematics | Pub Date: 2011-04-13 | DOI: 10.2202/1557-4679.1324
Arvid Sjolander
This Reader’s Reaction contains some brief remarks on Pearl’s concerns about the value of principal stratification.
Volume: 7 1
Citations: 11
Bayesian Variable Selection in Semiparametric Proportional Hazards Model for High Dimensional Survival Data
IF 1.2 | Zone 4, Mathematics | Pub Date: 2011-04-07 | DOI: 10.2202/1557-4679.1301
Kyu Ha Lee, S. Chakraborty, Jianguo Sun
Variable selection for high dimensional data has recently received a great deal of attention. However, due to the complex structure of the likelihood, only limited developments have been made for time-to-event data where censoring is present. In this paper, we propose a Bayesian variable selection scheme for a Bayesian semiparametric survival model for right-censored survival data sets. A special shrinkage prior on the coefficients corresponding to the predictor variables is used to handle cases when the explanatory variables are of very high dimension. The shrinkage prior is obtained through a scale mixture representation of Normal and Gamma distributions. Our proposed variable selection prior corresponds to the well-known lasso penalty. The likelihood function is based on the Cox proportional hazards model framework, where the cumulative baseline hazard function is modeled a priori by a gamma process. We assign a prior on the tuning parameter of the shrinkage prior and adaptively control the sparsity of our model. The primary use of the proposed model is to identify the important covariates relating to the survival curves. To implement our methodology, we have developed a fast Markov chain Monte Carlo algorithm with an adaptive jumping rule. We have successfully applied our method to simulated data sets under two different settings and to real microarray data sets that contain right-censored survival times. The performance of our Bayesian variable selection model compared with other competing methods is also provided to demonstrate the superiority of our method. A short description of the biological relevance of the selected genes in the real data sets is provided, further strengthening our claims.
高维数据的变量选择问题近年来受到了广泛的关注。然而,由于可能性的复杂结构,对于存在审查的事件时间数据,只进行了有限的开发。本文针对右截尾生存数据集的贝叶斯半参数生存模型,提出了一个贝叶斯变量选择方案。当解释变量具有非常高的维度时,对预测变量对应的系数使用特殊的先验收缩来处理。收缩先验是通过正态分布和伽玛分布的比例混合表示获得的。我们提出的变量选择先验对应于众所周知的套索惩罚。似然函数基于Cox比例风险模型框架,其中累积基线风险函数通过gamma过程先验建模。我们对收缩先验的调整参数赋予一个先验,并自适应地控制模型的稀疏度。该模型的主要用途是识别与生存曲线相关的重要协变量。为了实现我们的方法,我们开发了一个具有自适应跳跃规则的快速马尔可夫链蒙特卡罗算法。我们成功地将我们的方法应用于两种不同设置下的模拟数据集和包含正确截短存活时间的真实微阵列数据集。最后,将贝叶斯变量选择模型的性能与其他竞争方法进行了比较,证明了该方法的优越性。在真实的数据集中提供了所选基因的生物学相关性的简短描述,进一步加强了我们的主张。
Citations: 25