Stats: Latest Articles - Book学术

Bidirectional f-Divergence-Based Deep Generative Method for Imputing Missing Values in Time-Series Data.

IF 0.9 Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2025-03-01 Epub Date: 2025-01-14 DOI: 10.3390/stats8010007

Wen-Shan Liu, Tong Si, Aldas Kriauciunas, Marcus Snell, Haijun Gong

Imputing missing values in high-dimensional time-series data remains a significant challenge in statistics and machine learning. Although various methods have been proposed in recent years, many struggle with limitations and reduced accuracy, particularly when the missing rate is high. In this work, we present a novel f-divergence-based bidirectional generative adversarial imputation network, tf-BiGAIN, designed to address these challenges in time-series data imputation. Unlike traditional imputation methods, tf-BiGAIN employs a generative model to synthesize missing values without relying on distributional assumptions. The imputation process is achieved by training two neural networks, implemented using bidirectional modified gated recurrent units, with f-divergence serving as the objective function to guide optimization. Compared to existing deep learning-based methods, tf-BiGAIN introduces two key innovations. First, the use of f-divergence provides a flexible and adaptable framework for optimizing the model across diverse imputation tasks, enhancing its versatility. Second, the use of bidirectional gated recurrent units allows the model to leverage both forward and backward temporal information. This bidirectional approach enables the model to effectively capture dependencies from both past and future observations, enhancing its imputation accuracy and robustness. We applied tf-BiGAIN to analyze two real-world time-series datasets, demonstrating its superior performance in imputing missing values and outperforming existing methods in terms of accuracy and robustness.
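The abstract names f-divergence as the training objective without stating its form. As a minimal sketch, the standard definition for discrete distributions is D_f(P || Q) = sum over x of q(x) * f(p(x) / q(x)), where the choice of generator f selects the divergence. The function and generator names below are illustrative assumptions, not taken from tf-BiGAIN, and the sketch assumes Q has full support.

```python
import math


def f_divergence(p, q, f):
    """Discrete f-divergence D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)).

    Assumes p and q are probability vectors of equal length and that
    q(x) > 0 wherever it is evaluated (terms with q(x) == 0 are skipped).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q) if qi > 0)


def kl_generator(t):
    """f(t) = t * log(t) gives the Kullback-Leibler divergence."""
    return t * math.log(t) if t > 0 else 0.0


def total_variation_generator(t):
    """f(t) = |t - 1| / 2 gives the total variation distance."""
    return 0.5 * abs(t - 1.0)


def pearson_generator(t):
    """f(t) = (t - 1)^2 gives the Pearson chi-square divergence."""
    return (t - 1.0) ** 2


p = [0.5, 0.5]
q = [0.25, 0.75]
for name, gen in [("KL", kl_generator),
                  ("TV", total_variation_generator),
                  ("Pearson", pearson_generator)]:
    print(name, f_divergence(p, q, gen))
```

Swapping the generator changes the objective while the rest of the computation stays the same, which is the kind of flexibility the abstract attributes to the f-divergence framework.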

Volume: 8 Issue: 1 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11793919/pdf/

Citations: 0

Exact Inference for Random Effects Meta-Analyses for Small, Sparse Data.

IF 0.9 Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2025-03-01 Epub Date: 2025-01-07 DOI: 10.3390/stats8010005

Jessica Gronsbell, Zachary R McCaw, Timothy Regis, Lu Tian

Meta-analysis aggregates information across related studies to provide more reliable statistical inference and has been a vital tool for assessing the safety and efficacy of many high-profile pharmaceutical products. A key challenge in conducting a meta-analysis is that the number of related studies is typically small. Applying classical methods that are asymptotic in the number of studies can compromise the validity of inference, particularly when heterogeneity across studies is present. Moreover, serious adverse events are often rare and can result in one or more studies with no events in at least one study arm. Practitioners remove studies in which no events have occurred in one or both arms or apply arbitrary continuity corrections (e.g., adding one event to arms with zero events) to stabilize or define effect estimates in such settings, which can further invalidate subsequent inference. To address these significant practical issues, we introduce an exact inference method for random effects meta-analysis of a treatment effect in the two-sample setting with rare events, which we coin "XRRmeta". In contrast to existing methods, XRRmeta provides valid inference for meta-analysis in the presence of between-study heterogeneity and when the event rates, number of studies, and/or the within-study sample sizes are small. Extensive numerical studies indicate that XRRmeta does not yield overly conservative inference. We apply our proposed method to two real-data examples using our open-source R package.
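To make the continuity-correction problem concrete, here is a minimal sketch showing that a study with zero events in one arm has an undefined log odds ratio, and that the estimate depends on which arbitrary correction is chosen. This illustrates the problem the abstract describes, not the XRRmeta method itself. The function name is hypothetical, and adding the correction to every cell of the 2x2 table is one common variant assumed here.

```python
import math


def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl, correction=0.0):
    """Log odds ratio from a 2x2 table, adding `correction` to every cell.

    Returns None when the estimate is undefined (a zero cell remains).
    """
    a = events_trt + correction              # treatment arm, events
    b = n_trt - events_trt + correction      # treatment arm, non-events
    c = events_ctl + correction              # control arm, events
    d = n_ctl - events_ctl + correction      # control arm, non-events
    if min(a, b, c, d) <= 0:
        return None
    return math.log((a * d) / (b * c))


# A hypothetical sparse study: 0/100 events on treatment, 2/100 on control.
for corr in (0.0, 0.5, 1.0):
    print(corr, log_odds_ratio(0, 100, 2, 100, correction=corr))
```

The uncorrected estimate does not exist, and the two corrected estimates differ noticeably, which is why the choice of correction can affect downstream inference.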

Volume: 8 Issue: 1 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12456449/pdf/

Citations: 0

Investigating Risk Factors for Racial Disparity in E-Cigarette Use with PATH Study.

IF 0.9 Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2024-09-01 Epub Date: 2024-06-21 DOI: 10.3390/stats7030037

Amy Liu, Kennedy Dorsey, Almetra Granger, Ty-Runet Bryant, Tung-Sung Tseng, Michael Celestin, Qingzhao Yu

Background: Previous research has identified differences in e-cigarette use and socioeconomic factors between different racial groups. However, there is little research examining specific risk factors contributing to the racial differences.

Objective: This study sought to identify racial disparities in e-cigarette use and to determine risk factors that help explain these differences.

Methods: We used Wave 5 (2018-2019) of the Adult Population Assessment of Tobacco and Health (PATH) Study. First, we conducted descriptive statistics of e-smoking across our risk factor variables. Next, we used multiple logistic regression to check the risk effects by adjusting all covariates. Finally, we conducted a mediation analysis to determine whether identified factors showed evidence of influencing the association between race and e-cigarette use. All analyses were performed in R or SAS. The R package mma was used for the mediation analysis.

Results: Between Hispanic and non-Hispanic White populations, our potential risk factors collectively explain 17.5% of the racial difference; former cigarette smoking explains 7.6%, receiving e-cigarette advertising explains 2.6%, and perception of e-cigarette harm explains 27.8% of the racial difference. Between non-Hispanic Black and non-Hispanic White populations, former cigarette smoking, receiving e-cigarette advertising, and perception of e-cigarette harm explain 5.2%, 1.8%, and 6.8% of the racial difference, respectively. E-cigarette use is most prevalent in the non-Hispanic White population compared to non-Hispanic Black and Hispanic populations, which may be explained by former cigarette smoking, exposure to e-cigarette advertising, and e-cigarette harm perception.

Conclusions: These findings suggest that racial differences in e-cigarette use may be reduced by increasing knowledge of the dangers associated with e-cigarette use and reducing exposure to e-cigarette advertisements. This comprehensive analysis of risk factors can be used to significantly guide smoking cessation efforts and address potential health burden disparities arising from differences in e-cigarette usage.

Volume: 7 Issue: 3 Pages: 613-626 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11756910/pdf/

Citations: 0

Doubly Robust Estimation and Semiparametric Efficiency in Generalized Partially Linear Models with Missing Outcomes.

IF 0.9 Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2024-09-01 Epub Date: 2024-08-31 DOI: 10.3390/stats7030056

Lu Wang, Zhongzhe Ouyang, Xihong Lin

We investigate a semiparametric generalized partially linear regression model that accommodates missing outcomes, with some covariates modeled parametrically and others nonparametrically. We propose a class of augmented inverse probability weighted (AIPW) kernel-profile estimating equations. The nonparametric component is estimated using AIPW kernel estimating equations, while parametric regression coefficients are estimated using AIPW profile estimating equations. We demonstrate the doubly robust nature of the AIPW estimators for both nonparametric and parametric components. Specifically, these estimators remain consistent if either the assumed model for the probability of missing data or that for the conditional mean of the outcome, given covariates and auxiliary variables, is correctly specified, though not necessarily both simultaneously. Additionally, the AIPW profile estimator for parametric regression coefficients is consistent and asymptotically normal under the semiparametric model defined by the generalized partially linear model on complete data, assuming that the missing data mechanism is missing at random. When both working models are correctly specified, this estimator achieves semiparametric efficiency, with its asymptotic variance reaching the efficiency bound. We validate our approach through simulations to assess the finite sample performance of the proposed estimators and apply the method to a study that investigates risk factors associated with myocardial ischemia.
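The double robustness property can be seen in a much simpler setting than the paper's kernel-profile estimating equations. The sketch below, under that simplifying assumption, estimates only a population mean with missing outcomes using the textbook AIPW form: the average of R*Y/pi + (1 - R/pi)*m(X), where R is the observation indicator, pi the working missingness model, and m(X) the working outcome model. The function and argument names are illustrative, not the authors' code.

```python
def aipw_mean(y, observed, propensity, outcome_pred):
    """Augmented inverse probability weighted estimate of E[Y].

    y            : outcomes (entries may be None where the outcome is missing)
    observed     : 1 if the outcome was observed, 0 otherwise
    propensity   : working-model probabilities that the outcome is observed
    outcome_pred : working-model predictions of E[Y | covariates]
    """
    n = len(y)
    total = 0.0
    for yi, ri, pi, mi in zip(y, observed, propensity, outcome_pred):
        ipw_term = yi / pi if ri else 0.0       # weighted observed outcome
        augmentation = (1.0 - ri / pi) * mi     # correction from outcome model
        total += ipw_term + augmentation
    return total / n


# Outcome model exactly right, propensity model deliberately wrong:
# the estimate still equals the mean of the true outcomes (2.5).
y = [1.0, None, 3.0, None]
observed = [1, 0, 1, 0]
wrong_propensity = [0.9, 0.2, 0.3, 0.7]
exact_outcome_model = [1.0, 2.0, 3.0, 4.0]
print(aipw_mean(y, observed, wrong_propensity, exact_outcome_model))
```

When the outcome model is exact, each summand collapses to m(X) whatever propensity is supplied. The mirror case, a correct propensity model with a wrong outcome model, holds in expectation rather than sample by sample, so it is not shown as a deterministic check.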

Volume: 7 Issue: 3 Pages: 924-943 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12478555/pdf/

Citations: 0

Integrating Proteomic Analysis and Machine Learning to Predict Prostate Cancer Aggressiveness.

IF 0.9 Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2024-09-01 Epub Date: 2024-08-21 DOI: 10.3390/stats7030053

Sheila M Valle Cortés, Jaileene Pérez Morales, Mariely Nieves Plaza, Darielys Maldonado, Swizel M Tevenal Baez, Marc A Negrón Blas, Cayetana Lazcano Etchebarne, José Feliciano, Gilberto Ruiz Deyá, Juan C Santa Rosario, Pedro Santiago Cardona

Prostate cancer (PCa) poses a significant challenge because of the difficulty in identifying aggressive tumors, leading to overtreatment and missed personalized therapies. Although only 8% of cases progress beyond the prostate, the accurate prediction of aggressiveness remains crucial. Thus, this study focused on studying retinoblastoma phosphorylated at Serine 249 (Phospho-Rb S249), N-cadherin, β-catenin, and E-cadherin as biomarkers for identifying aggressive PCa using a logistic regression model and a classification and regression tree (CART). Using immunohistochemistry (IHC), we targeted the expression of these biomarkers in PCa tissues and correlated their expression with clinicopathological data of the tumor. The results showed a negative correlation between E-cadherin and β-catenin with aggressive tumor behavior, whereas Phospho-Rb S249 and N-cadherin positively correlated with increased tumor aggressiveness. Furthermore, patients were stratified based on Gleason scores and E-cadherin staining patterns to evaluate their capability for early identification of aggressive PCa. Our findings suggest that the classification tree is the most effective method for measuring the utility of these biomarkers in clinical practice, incorporating β-catenin, tumor grade, and Gleason grade as relevant determinants for identifying patients with Gleason scores ≥ 4 + 3. This study could potentially benefit patients with aggressive PCa by enabling early disease detection and closer monitoring.

Volume: 7 Issue: 3 Pages: 875-893 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12494234/pdf/

Citations: 0

Assessing Spillover Effects of Medications for Opioid Use Disorder on HIV Risk Behaviors among a Network of People Who Inject Drugs.

IF 0.9 Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2024-06-01 Epub Date: 2024-06-19 DOI: 10.3390/stats7020034

Joseph Puleo, Ashley Buchanan, Natallia Katenka, M Elizabeth Halloran, Samuel R Friedman, Georgios Nikolopoulos

People who inject drugs (PWID) have an increased risk of HIV infection partly due to injection behaviors often related to opioid use. Medications for opioid use disorder (MOUD) have been shown to reduce HIV infection risk, possibly by reducing injection risk behaviors. MOUD may benefit individuals who do not receive it themselves but are connected through social, sexual, or drug use networks with individuals who are treated. This is known as spillover. Valid estimation of spillover in network studies requires considering the network's community structure. Communities are groups of densely connected individuals with sparse connections to other groups. We analyzed a network of 277 PWID and their contacts from the Transmission Reduction Intervention Project. We assessed the effect of MOUD on reductions in injection risk behaviors and the possible benefit for network contacts of participants treated with MOUD. We identified communities using modularity-based methods and employed inverse probability weighting with community-level propensity scores to adjust for measured confounding. We found that MOUD may have beneficial spillover effects on reducing injection risk behaviors. The magnitudes of estimated effects were sensitive to the community detection method. Careful consideration should be paid to the significance of community structure in network studies evaluating spillover.

Volume: 7 Issue: 2 Pages: 549-575 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12165006/pdf/

Citations: 0

Precise Tensor Product Smoothing via Spectral Splines

Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2024-01-10 DOI: 10.3390/stats7010003

Nathaniel E. Helwig

Tensor product smoothers are frequently used to include interaction effects in multiple nonparametric regression models. Current implementations of tensor product smoothers either require using approximate penalties, such as those typically used in generalized additive models, or costly parameterizations, such as those used in smoothing spline analysis of variance models. In this paper, I propose a computationally efficient and theoretically precise approach for tensor product smoothing. Specifically, I propose a spectral representation of a univariate smoothing spline basis, and I develop an efficient approach for building tensor product smooths from marginal spectral spline representations. The developed theory suggests that current tensor product smoothing methods could be improved by incorporating the proposed tensor product spectral smoothers. Simulation results demonstrate that the proposed approach can outperform popular tensor product smoothing implementations, which supports the theoretical results developed in the paper.

Citations: 0

Predicting Random Walks and a Data-Splitting Prediction Region

Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2024-01-08 DOI: 10.3390/stats7010002

Mulubrhan G. Haile, Lingling Zhang, David J. Olive

Perhaps the first nonparametric, asymptotically optimal prediction intervals are provided for univariate random walks, with applications to renewal processes. Perhaps the first nonparametric prediction regions are introduced for vector-valued random walks. This paper further derives nonparametric data-splitting prediction regions, which are underpinned by very simple theory. Some of the prediction regions can be used when the data distribution does not have first moments, and some can be used for high-dimensional data, where the number of predictors is larger than the sample size. The prediction regions can make use of many estimators of multivariate location and dispersion.
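One simple way to picture a nonparametric prediction interval for a univariate random walk is to treat the increments as exchangeable and shift their empirical quantiles by the last observed value. The sketch below does exactly that. It is an illustrative assumption, not the construction derived in the paper, whose asymptotically optimal intervals may be built differently; the function name and the order-statistic indices are hypothetical choices.

```python
import math


def random_walk_interval(walk, alpha=0.1):
    """Nominal (1 - alpha) prediction interval for the next value of a walk.

    Assumes the walk is Y_t = Y_{t-1} + e_t with i.i.d. increments e_t, so the
    next value is the last value plus a new increment. Uses order statistics
    of the observed increments; no distributional form is assumed.
    """
    if len(walk) < 3:
        raise ValueError("need at least three observations")
    increments = sorted(b - a for a, b in zip(walk, walk[1:]))
    n = len(increments)
    lo_idx = max(0, math.floor(n * alpha / 2))
    hi_idx = min(n - 1, math.ceil(n * (1 - alpha / 2)) - 1)
    last = walk[-1]
    return last + increments[lo_idx], last + increments[hi_idx]


walk = [0, 1, 3, 2, 4, 7]
print(random_walk_interval(walk, alpha=0.5))
print(random_walk_interval(walk, alpha=0.2))
```

Because only order statistics of the increments are used, the interval is defined even when the increment distribution has no finite mean, which is in the spirit of the moment-free regions the abstract mentions.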


Citations: 0

The Mediating Impact of Innovation Types in the Relationship between Innovation Use Theory and Market Performance

Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2023-12-30 DOI: 10.3390/stats7010001

Shieh-Liang Chen, Kuo-Liang Chen

The ultimate goal of innovation is to improve performance. But if people’s needs and uses are ignored, innovation will only be a formality. In the past, research on innovation mostly focused on technology, processes, business models, services, and organizations. The measurement of innovation focuses on capabilities, processes, results, and methods, but there has always been a lack of pre-innovation measurements and tools. This study is the first to use the innovation use theory proposed by Christensen et al. combined with innovation types, and it uses the measurement focus on the early stage of innovation as a post-innovation performance prediction. This study collected 590 valid samples and used SPSS and the four-step BK method to conduct regression analysis and mediation tests. The empirical results obtained the following: (1) a confirmed model and scale of the innovation use theory; (2) that three constructs of innovation use theory have an impact on market performance; and (3) that innovation types acting as mediators will improve market performance. This study establishes an academic model of the innovation use theory to provide a clear scale tool for subsequent research. In practice, it can first measure the direction of innovation and performance prediction, providing managers with a reference when developing new products and applying market strategies.
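Assuming "four-step BK method" refers to the Baron and Kenny mediation procedure, the regressions involved can be sketched as follows. The data are simulated, the variable names (`x` for an innovation use construct, `m` for innovation type, `y` for market performance) and the helper `ols` are hypothetical, and the significance tests that accompany each step in practice are omitted.

```python
import numpy as np

def ols(y, *predictors):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(2)
n = 590  # same size as the sample in the abstract; the values are simulated
x = rng.normal(size=n)                       # predictor
m = 0.6 * x + rng.normal(size=n)             # mediator
y = 0.5 * m + 0.2 * x + rng.normal(size=n)   # outcome

c = ols(y, x)[1]               # step 1: total effect of x on y
a = ols(m, x)[1]               # step 2: effect of x on the mediator
_, b, c_prime = ols(y, m, x)   # step 3: b, mediator effect controlling for x
                               # step 4: c_prime, direct effect controlling for m
indirect = a * b               # mediated (indirect) effect
```

With ordinary least squares fitted on the same sample, the total effect decomposes as `c = c_prime + a * b`, so a direct effect `c_prime` that is smaller than `c` points to at least partial mediation.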



Citations: 0

Jump-Robust Realized-GARCH-MIDAS-X Estimators for Bitcoin and Ethereum Volatility Indices

Q4 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Stats

Pub Date : 2023-12-12 DOI: 10.3390/stats6040082

Julien Chevallier, Bilel Sanhaji

In this paper, we conducted an empirical investigation of the realized volatility of cryptocurrencies using an econometric approach. This work’s two main characteristics are: (i) the realized volatility to be forecast filters jumps, and (ii) the benefit of using various historical/implied volatility indices from brokers as exogenous variables was explicitly considered. We feature a jump-robust extension of the REGARCH-MIDAS-X model incorporating realized beta GARCH processes and MIDAS filters with monthly, daily, and hourly components. First, we estimated six jump-robust estimators of realized volatility for Bitcoin and Ethereum that were retained as the dependent variable. Second, we inserted ten Bitcoin and Ethereum volatility indices gathered from various exchanges as an exogenous variable, each at a time. Third, we explored their forecasting ability based on the MSE and QLIKE statistics. Our sample spanned the period from May 2018 to January 2023. The main result featured the best predictors among the volatility indices for Bitcoin and Ethereum derived from 30-day implied volatility. The significance of the findings could mostly be attributable to the ability of our new model to incorporate financial and technological variables directly into the specification of the Bitcoin and Ethereum volatility dynamics.
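The two forecast-evaluation statistics named above can be sketched as below. The abstract does not state which QLIKE variant the authors use; this sketch assumes the common normalized form, ratio minus log ratio minus one, where the ratio is realized over forecast variance. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def mse(realized, forecast):
    """Mean squared error between realized and forecast variance."""
    realized, forecast = np.asarray(realized), np.asarray(forecast)
    return float(np.mean((realized - forecast) ** 2))

def qlike(realized, forecast):
    """QLIKE loss in the normalized form r - log(r) - 1, with r = realized / forecast."""
    ratio = np.asarray(realized) / np.asarray(forecast)
    return float(np.mean(ratio - np.log(ratio) - 1.0))

realized = np.array([1.0, 2.0, 4.0])   # toy realized variances
over = 2.0 * realized                  # forecasts that are twice too high
under = 0.5 * realized                 # forecasts that are half the realized level

loss_over = qlike(realized, over)
loss_under = qlike(realized, under)
```

Under this form, QLIKE is zero for a perfect forecast and penalizes under-prediction of variance more heavily than over-prediction by the same factor, whereas MSE depends on the absolute size of the error.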



Citations: 0