Lara Maleyeff, Shirin Golchi, Erica E. M. Moodie, R. John Kimoff
Precision medicine tailors treatments to individual patient characteristics, which is especially valuable for obstructive sleep apnea (OSA), where treatment responses vary widely. Traditional trials often overlook subgroup differences, leading to suboptimal recommendations. Current approaches rely on prespecified thresholds, which may be incorrectly specified. This case study compares prespecified thresholds to two Bayesian methods: the established FK-BMA (free-knot Bayesian model averaging) method, and its novel variant, FK. The FK approach retains the flexibility of free-knot splines but omits variable selection, providing stable, interpretable models. Using biomarker data from large studies, this design identifies subgroups dynamically, allowing early trial termination or enrollment adjustments. Simulation results—motivated by real-world biomarker distributions and clinical constraints—show that under conditions of limited signal-to-noise ratio and limited candidate biomarkers, FK improves efficiency and subgroup detection.
Title: A precision trial case study for heterogeneous treatment effects in obstructive sleep apnea. Journal: Canadian Journal of Statistics-Revue Canadienne De Statistique, volume 53, issue 4. Published 2025-11-18. DOI: https://doi.org/10.1002/cjs.70028
This article considers the receiver operating characteristic (ROC) curve analysis for medical data with non-ignorable missingness in the disease status. In the framework of the logistic regression models for both the disease status and the verification status, we first establish the identifiability of model parameters, and then propose a likelihood method to estimate the model parameters, the ROC curve, and the area under the ROC curve for the biomarker. The asymptotic distributions of these estimators are established. Via extensive simulation studies, we compare our method with competing methods of point estimation and assess the accuracy of confidence interval estimation under various scenarios. To illustrate the use of our proposed approach in a practical setting, we apply our method to the Alzheimer's disease dataset from the National Alzheimer's Coordinating Center.
Title: Receiver operating characteristic curve analysis with non-ignorable missing disease status. Authors: Dingding Hu, Tao Yu, Pengfei Li. Journal: Canadian Journal of Statistics-Revue Canadienne De Statistique, volume 53, issue 4. Published 2025-11-13. DOI: https://doi.org/10.1002/cjs.70025
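As a point of reference for the quantities estimated in the ROC article, the following is a minimal sketch of the empirical ROC curve and the area under it (AUC) in the simpler setting where every subject's disease status is observed. It is not the likelihood method proposed by the authors and does not account for non-ignorable missingness. The function names and toy data are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def empirical_roc(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points of the empirical ROC curve.

    Assumes labels are 1 (diseased) or 0 (non-diseased) and that a larger
    biomarker score indicates disease. Thresholds sweep from high to low.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one diseased and one non-diseased subject")
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        # Classify as positive every subject whose score is at least the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auc_mann_whitney(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the proportion of (diseased, non-diseased) pairs ranked correctly.

    Ties count as one half, which matches the trapezoidal area above.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): four subjects, two diseased.
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    toy_labels = [0, 0, 1, 1]
    roc_points = empirical_roc(toy_scores, toy_labels)
    print(roc_points)
    print(auc_trapezoid(roc_points), auc_mann_whitney(toy_scores, toy_labels))
```

The two AUC functions agree on complete data. When disease status is missing for some subjects, as in the setting the article studies, applying them to verified subjects alone can be biased, which is why a model for the verification status is needed.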
When data from multiple tasks have outlier contamination, existing multitask learning methods perform less efficiently. To address this issue, we propose a robust multitask feature learning method by combining the adaptive Huber regression tasks with mixed regularization. The robustification parameters can be chosen to adapt to the sample size, model dimension, and moments of the error distribution while striking a balance between unbiasedness and robustness. We consider heavy-tailed distributions for multiple datasets that have bounded