Journal of Applied Statistics: Latest Articles

Integrative rank-based regression for multi-source high-dimensional data with multi-type responses.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2025-01-16 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2452964
Fuzhi Xu, Shuangge Ma, Qingzhao Zhang

Practical scenarios often present instances where the types of responses are different between multi-source different datasets, reflecting distinct attributes or characteristics. In this paper, an integrative rank-based regression is proposed to facilitate information sharing among varied datasets with multi-type responses. Taking advantage of the rank-based regression, our proposed approach adeptly tackles differences in the magnitude of loss functions. In addition, it can robustly handle outliers and data contamination, and effectively mitigate model misspecification. Extensive numerical simulations demonstrate the superior and competitive performance of the proposed approach in model estimation and variable selection. Analysis of genetic data on HNSC and LUAD yields results with biological explanations and confirms its practical usefulness.

Journal of Applied Statistics, vol. 52, no. 11, pp. 2011-2030. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12404076/pdf/
Citations: 0
The slashed Lomax distribution: new properties and Mellin-type statistical measures for inference.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2025-01-15 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2451977
Jaine de Moura Carvalho, Frank Gomes-Silva, Josimar M Vasconcelos, Gauss M Cordeiro

Several continuous distributions have been proposed recently to provide more flexibility in modeling lifetime data. Among these, the Slashed class of models, particularly the Slashed Lomax (SL) distribution, has gained special attention. This asymmetric model has positive support and it is notable for its stochastic representation and ability to fit heavy-tailed datasets. Despite the increasing number of new continuous models catering to specific samples, there have been few statistical tools introduced to evaluate their goodness of fit. To address this deficit, we employ the methodology outlined in J.M. Nicolas [Introduction aux statistiques de deuxième espèce: Applications des logs-moments et des logs-cumulants à l'analyse des lois d'images radar, TS, Trait. Signal 19 (2002), pp. 139-167] derived from the Mellin Transform (MT) to provide new goodness-of-fit measures for the SL distribution. These measures consider both qualitative and quantitative aspects. We derive the MT for the SL distribution, calculate the log-cumulants, and construct the log-cumulant diagram. Further, we introduce a test statistic using a combination of Hotelling's T² statistic and the multivariate Delta method to test hypotheses about the log-cumulants. We apply the new methodology to two real databases in the context of survival analysis to show its effectiveness in evaluating the fit criteria. We conduct bootstrap experiments to assess the power of the proposed test and to evaluate the performance of the estimators. The results revealed that the adjustment tools performed well and that the log-cumulant method proved to be an effective estimation criterion.
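
The stochastic representation and the log-cumulants mentioned in this abstract can be illustrated with a small sketch. It assumes the usual slash construction X = Y / U^(1/q), with Y Lomax-distributed and U uniform on (0, 1); the parameter names `alpha`, `sigma` and `q` and both function names are hypothetical, and the paper's own parametrization may differ. The sample log-cumulants are simply the mean, variance and third central moment of log X.

```python
import math
import random


def sample_slashed_lomax(n, alpha, sigma, q, rng):
    """Draw n values X = Y / U**(1/q) (assumed slash construction).

    Y ~ Lomax(alpha, sigma) via its inverse CDF, U ~ Uniform(0, 1].
    """
    draws = []
    for _ in range(n):
        v = rng.random()                                  # v in [0, 1)
        y = sigma * ((1.0 - v) ** (-1.0 / alpha) - 1.0)   # Lomax inverse CDF
        u = 1.0 - rng.random()                            # u in (0, 1], never zero
        draws.append(y / u ** (1.0 / q))
    return draws


def sample_log_cumulants(xs):
    """Return the first three sample log-cumulants of strictly positive data."""
    if any(x <= 0 for x in xs):
        raise ValueError("log-cumulants need strictly positive observations")
    logs = [math.log(x) for x in xs]
    n = len(logs)
    k1 = sum(logs) / n                           # mean of log X
    k2 = sum((v - k1) ** 2 for v in logs) / n    # variance of log X
    k3 = sum((v - k1) ** 3 for v in logs) / n    # third central moment of log X
    return k1, k2, k3


if __name__ == "__main__":
    rng = random.Random(2025)
    data = sample_slashed_lomax(1000, alpha=3.0, sigma=1.0, q=2.0, rng=rng)
    print(sample_log_cumulants(data))
```

Plotting the pair (k3, k2) for several samples against the theoretical curve of a candidate family is the usual idea behind a log-cumulant diagram; the theoretical curve for the SL distribution is derived in the paper and is not reproduced here.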

Journal of Applied Statistics, vol. 52, no. 10, pp. 1984-2006. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320265/pdf/
Citations: 0
Inference under multivariate size-biased sampling.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2025-01-15 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2025.2451972
A Batsidis, G Tzavelas, P Economou

The present research deals with statistical inference for the expectation of a function of a random vector based on biased samples. After highlighting with the help of a motivating example the need for conducting this study, using the concept of multivariate weighted distributions, a consistent and asymptotically normally distributed estimator is proposed and utilized for developing statistical inference. A Monte Carlo study is carried out to examine the performance of the estimator proposed. Finally, the analysis of a real-world data set illustrates the benefits of using the proposed methods for statistical inference.
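
As a rough illustration of the problem described in this abstract, the sketch below shows a generic ratio estimator for E[g(X)] when observations are drawn from a weighted distribution with known weight function w. It is a textbook-style construction, not necessarily the estimator proposed in the paper; the function name and the product weight used in the example are assumptions. It relies on the fact that, under the weighted density w(x)f(x)/E[w(X)], the mean of g(X)/w(X) equals E[g(X)]/E[w(X)] and the mean of 1/w(X) equals 1/E[w(X)].

```python
def weighted_mean_estimate(xs, g, w):
    """Estimate E[g(X)] from a sample drawn with selection weight w(x).

    xs: observations (tuples for multivariate data)
    g:  function whose expectation under the unbiased law is wanted
    w:  known, strictly positive weight function of the biased sampling
    """
    inverse_weights = [1.0 / w(x) for x in xs]
    numerator = sum(g(x) * iw for x, iw in zip(xs, inverse_weights))
    return numerator / sum(inverse_weights)


if __name__ == "__main__":
    # Bivariate size-biased sample with assumed weight w(x) = x1 * x2.
    sample = [(1.0, 2.0), (2.0, 1.0), (4.0, 4.0)]
    estimate = weighted_mean_estimate(
        sample,
        g=lambda x: x[0],          # target: mean of the first coordinate
        w=lambda x: x[0] * x[1],   # hypothetical size-bias weight
    )
    print(estimate)
```

Large observations are over-represented in a size-biased sample, so the inverse weights pull the estimate below the naive sample mean.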

Journal of Applied Statistics, vol. 52, no. 10, pp. 1968-1983. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320264/pdf/
Citations: 0
Bayesian doubly robust estimation of causal effects for clustered observational data.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2025-01-09 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2024.2449396
Qi Zhou, Haonan He, Jie Zhao, Joon Jin Song

Observational data often exhibit clustered structure, which leads to inaccurate estimates of exposure effect if such structure is ignored. To overcome the challenges of modelling the complex confounder effects in clustered data, we propose a Bayesian doubly robust estimator of causal effects with random intercept BART to enhance the robustness against model misspecification. The proposed approach incorporates the uncertainty in the estimation of the propensity score, potential outcomes and the distribution of individual-level and cluster-level confounders into the exposure effect estimation, thereby improving the coverage probability of interval estimation. We evaluate the proposed method in the simulation study compared with frequentist doubly robust estimators with parametric and nonparametric multilevel modelling strategies. The proposed method is applied to estimate the effect of limited food access on the mortality of cardiovascular disease in the senior population.

Journal of Applied Statistics, vol. 52, no. 10, pp. 1931-1949. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320258/pdf/
Citations: 0
Weighted portmanteau statistics for testing for zero autocorrelation in dependent data.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2025-01-08 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2024.2449413
N Muriel

Zero autocorrelation test statistics of the portmanteau type are studied under dependence. The asymptotic distribution of statistics formed with weighted averages of the autocorrelation and partial autocorrelation functions is theoretically obtained and its accuracy is then analyzed via simulation and in an empirical application. In the simulation study, we find that the proposed statistics provide tests with sizes quite close to their nominal, intended sizes and with power functions which show high sensitivity to deviations from the null. It also reveals, for all the lags studied, that the tests are increasingly precise as the sample size increases. An application to financial time series modeling is given where the importance of using robust portmanteau statistics is illustrated. Specifically, we show that traditional tests incur large deviations from their nominal size, whereas robust tests do not.
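
For readers unfamiliar with portmanteau statistics, the sketch below computes the classical Ljung-Box form Q = n(n+2) Σ r_k²/(n−k) and a weighted variant in which lag k receives weight (m−k+1)/m. The linear weights are an illustrative choice, and neither statistic is the robust version studied in the paper, which is built to remain valid under dependence; the function names are hypothetical.

```python
def autocorr(x, k):
    """Sample autocorrelation of the sequence x at lag k."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return ck / c0


def linear_weights(m):
    """Illustrative weights that decrease linearly from 1 to 1/m."""
    return [(m - k + 1) / m for k in range(1, m + 1)]


def portmanteau(x, m, weights=None):
    """Ljung-Box-type statistic over lags 1..m, optionally weighted."""
    n = len(x)
    if weights is None:
        weights = [1.0] * m          # unweighted Ljung-Box form
    return n * (n + 2) * sum(
        weights[k - 1] * autocorr(x, k) ** 2 / (n - k)
        for k in range(1, m + 1)
    )


if __name__ == "__main__":
    series = [1.0, -1.0] * 4         # strongly negatively autocorrelated toy series
    print(portmanteau(series, m=3))
    print(portmanteau(series, m=3, weights=linear_weights(3)))
```

Down-weighting higher lags concentrates the statistic on the low-order autocorrelations, which are estimated from more pairs of observations.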

Journal of Applied Statistics, vol. 52, no. 10, pp. 1950-1967. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320263/pdf/
Citations: 0
Statistical inference for dependent competing risks data under adaptive Type-II progressive hybrid censoring.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2025-01-04 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2024.2445237
Subhankar Dutta, Suchandan Kayal

In this article, we consider statistical inference based on dependent competing risks data from Marshall-Olkin bivariate Weibull distribution. The maximum likelihood estimates of the unknown model parameters have been computed by using Newton-Raphson method under adaptive Type II progressive hybrid censoring with partially observed failure causes. Existence and uniqueness of maximum likelihood estimates are derived. Approximate confidence intervals have been constructed via the observed Fisher information matrix using asymptotic normality property of the maximum likelihood estimates. Bayes estimates and highest posterior density credible intervals have been calculated under gamma-Dirichlet prior distribution by using Markov chain Monte Carlo technique. Convergence of Markov chain Monte Carlo samples is tested. In addition, a Monte Carlo simulation is carried out to compare the effectiveness of the proposed methods. Further, three different optimality criteria have been taken into account to obtain the most effective censoring plans. From these simulation study results it has been concluded that Bayesian technique produces superior outcomes. Finally, a real-life data set has been analyzed to illustrate the operability and applicability of the proposed methods.

Journal of Applied Statistics, vol. 52, no. 10, pp. 1871-1903. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320270/pdf/
Citations: 0
Semiparametric model averaging prediction in nested case-control studies.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2024-12-31 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2024.2447324
Mengyu Li, Xiaoguang Wang

Survival predictions for patients are becoming increasingly important in clinical practice as they play a crucial role in aiding healthcare professionals to make more informed diagnoses and treatment decisions. The nested case-control designs have been extensively utilized as a cost-effective solution in many large cohort studies across epidemiology and other research fields. To achieve accurate survival predictions of individuals from nested case-control studies, we propose a semiparametric model averaging approach based on the partly linear additive proportional hazards structure to avoid the curse of dimensionality. The inverse probability weighting method is considered to estimate the parameters of submodels used in model averaging. We choose the weights by maximizing the pseudo-likelihood function constructed for the aggregated model and discuss the asymptotic optimality of selected weights. Simulation studies are conducted to assess the performance of our proposed model averaging method in the nested case-control study. Furthermore, we apply the proposed approach to real data to demonstrate its superiority.

Journal of Applied Statistics, vol. 52, no. 10, pp. 1904-1930. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320267/pdf/
Citations: 0
Adapting and evaluating deep-pseudo neural network for survival data with time-varying covariates.
IF 1.1 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2024-12-24 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2024.2444649
Albert Whata, Justine B Nasejje, Najmeh Nakhaei Rad, Tshilidzi Mulaudzi, Ding-Geng Chen

The Extended Cox model provides an alternative to the proportional hazard Cox model for modelling data including time-varying covariates. Incorporating time-varying covariates is particularly beneficial when dealing with survival data, as it can improve the precision of survival function estimation. Deep learning methods, in particular, the Deep-pseudo survival neural network (DSNN) model have demonstrated a high potential for accurately predicting right-censored survival data when dealing with time-invariant variables. The DSNN's ability to discretise survival times makes it a natural choice for extending its application to scenarios involving time-varying covariates. This study adapts the DSNN to predict survival probabilities for data with time-varying covariates. To demonstrate this, we considered two scenarios: significant and non-significant time-varying covariates. For significant covariates, the Brier scores were below 0.25 at all considered specific time points, while, in the non-significant case, the Brier scores were above 0.25. The results illustrate that the DSNN performed comparably to the extended Cox, the Dynamic-DeepHit and multivariate joint models on the simulated data. A real-world data application further confirms the predictive potential of the DSNN model in modelling survival data with time-varying covariates.
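
The 0.25 threshold quoted in this abstract is easier to read with the Brier score written out. The sketch below is a simplified version that ignores censoring (no inverse-probability-of-censoring weights), so it is not the exact metric used in the paper; the function name is hypothetical. It shows why 0.25 is the natural benchmark: a model that always predicts a survival probability of 0.5 scores exactly 0.25 at every time point.

```python
def brier_score(event_times, predicted_survival, t):
    """Mean squared gap between 1{T > t} and the predicted S(t).

    Simplified: assumes every event time is observed (no censoring).
    """
    errors = [
        ((1.0 if event_time > t else 0.0) - s) ** 2
        for event_time, s in zip(event_times, predicted_survival)
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    times = [1.0, 2.0, 3.0, 4.0]
    # Uninformative prediction: survival probability 0.5 for everyone.
    print(brier_score(times, [0.5] * 4, t=2.5))
    # Perfect prediction at t = 2.5.
    print(brier_score(times, [0.0, 0.0, 1.0, 1.0], t=2.5))
```

Scores below 0.25 therefore indicate predictions that beat a coin flip at that time point, and scores above it indicate predictions that do worse.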

Journal of Applied Statistics, vol. 52, no. 10, pp. 1847-1870. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12320266/pdf/
Citations: 0
Latent class profile model with time-dependent covariates: a study on symptom patterning of patients for head and neck cancer.
IF 1.2 | Zone 4 (Mathematics) | Q2 Statistics & Probability | Pub Date: 2024-12-16 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2024.2435997
Jung Wun Lee, Hayley Dunnack Yackel

The latent class profile model (LCPM) is a widely used technique for identifying distinct subgroups within a sample based on observations' longitudinal responses to categorical items. This paper proposes an expanded version of LCPM by embedding time-specific structures. Such development allows analysts to investigate associations between latent class memberships and time-dependent predictors at specific time points. We suggest a simultaneous estimation of latent class measurement parameters via the expectation-maximization (EM) algorithm, which yields valid point and interval estimators of associations between latent class memberships and covariates. We illustrate the validity of our estimation strategy via numerical studies. In addition, we demonstrate the novelty of the proposed model by analyzing the head and neck cancer data set.

Journal of Applied Statistics, vol. 52, no. 8, pp. 1628-1648. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147489/pdf/
Citations: 0
A control chart for bivariate discrete data monitoring.
IF 1.1 Zone 4, Mathematics Q2 STATISTICS & PROBABILITY Pub Date: 2024-12-15 eCollection Date: 2025-01-01 DOI: 10.1080/02664763.2024.2438795
Ayesha Talib, Sajid Ali, Ismail Shah

Control charts are sophisticated graphical tools used to detect and control aberrant variations. Different control schemes are designed to continuously monitor and improve the process stability and performance. This study proposes a bivariate exponentially weighted moving average chart for joint monitoring of the mean vector of Gumbel's bivariate geometric (GBG) data. The performance of the proposed chart is compared with Hotelling's T² chart. The results of the study indicated that the proposed control chart performs uniformly and substantially better than Hotelling's T² chart. In addition to two real-life examples, an example based on simulated data is also considered and compared to existing charts to verify the superiority of the proposed chart. Based on the comparisons, it turns out that the MEWMA (GBG) chart outperforms Hotelling's T² chart and individual EWMA control chart.

Citations: 0