Federated double machine learning for high-dimensional semiparametric models
Kai Kang, Zhihao Wu, Xinjie Qian, Xinyuan Song, Hongtu Zhu
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf150

Federated learning enables the training of a global model while keeping data localized; however, current methods face challenges with high-dimensional semiparametric models that involve complex nuisance parameters. This paper proposes a federated double machine learning framework designed to address high-dimensional nuisance parameters of semiparametric models in multicenter studies. Our approach leverages double machine learning (Chernozhukov et al., 2018a) to estimate center-specific parameters, extends the surrogate efficient score method within a Neyman-orthogonal framework, and applies density ratio tilting to create a federated estimator that combines local individual-level data with summary statistics from other centers. This methodology mitigates regularization bias and overfitting in high-dimensional nuisance parameter estimation. We establish the estimator's limiting distribution under minimal assumptions, validate its performance through extensive simulations, and demonstrate its effectiveness in analyzing multiphase data from the Alzheimer's Disease Neuroimaging Initiative study.
Bridging the gap between design and analysis: randomization inference and sensitivity analysis for matched observational studies with treatment doses
Jeffrey Zhang, Siyu Heng
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf156

Matching is a commonly used causal inference study design in observational studies. By matching on measured confounders between different treatment groups, valid randomization inferences can be conducted under the no unmeasured confounding assumption, and sensitivity analysis can then be performed to assess the robustness of results to potential unmeasured confounding. However, for many common matched designs, valid downstream randomization inference and sensitivity analysis methods are still lacking. Specifically, in matched observational studies with treatment doses (e.g., continuous or ordinal treatments), with the exception of some special cases such as pair matching, there is no existing randomization inference or sensitivity analysis method for studying analogs of the sample average treatment effect (i.e., Neyman-type weak nulls), and no existing valid sensitivity analysis approach for testing the sharp null of no treatment effect for any subject (i.e., Fisher's sharp null) when the outcome is nonbinary. To fill these important gaps, we propose new methods for randomization inference and sensitivity analysis that work for general matched designs with treatment doses, apply to general types of outcome variables (e.g., binary, ordinal, or continuous), and cover both Fisher's sharp null and Neyman-type weak nulls. We illustrate our methods via comprehensive simulation studies and a real data application. All the proposed methods have been incorporated into the R package doseSens.
Estimating heterogeneous treatment effects for general responses
Zijun Gao, Trevor Hastie
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf162

Heterogeneous treatment effect models allow us to compare treatments at subgroup levels and are becoming increasingly popular in applications such as personalized medicine, advertising, and education. Regardless of the response type (continuous, binary, count, survival), most causal estimands focus on differences between the treatment and control conditional means. In this paper, we propose an alternative estimand, DINA (the DIfference in NAtural parameters), to quantify heterogeneous treatment effects, motivated by exponential families and the Cox model. Whatever the response type, DINA is convenient and often more practical for modeling the influence of covariates on the treatment effect. Additionally, we introduce a meta-algorithm for DINA, enabling practitioners to utilize powerful off-the-shelf machine learning tools for the estimation of nuisance functions. This meta-algorithm is also statistically robust to errors in the nuisance function estimation. We demonstrate the efficacy of our method in combination with various machine learning base-learners on both simulated and real datasets.
Structuring, sequencing, staging, selecting: the 4S method for the longitudinal analysis of multidimensional questionnaires in chronic diseases
Tiphaine Saulnier, Wassilios G Meissner, Margherita Fabbri, Alexandra Foubert-Samier, Cécile Proust-Lima
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf163

In clinical studies, questionnaires are often used to report disease-related manifestations from clinician and/or patient perspectives. Their analysis can help identify relevant manifestations throughout the disease course, enhancing knowledge of disease progression and guiding clinicians in appropriate care provision. However, the analysis of questionnaires in health studies is not straightforward, as the data consist of repeated, ordinal, and potentially multidimensional items. Sum-score summaries may considerably reduce information and hamper interpretation; items' changes over time occur along the clinical progression; and, as with many other longitudinal processes, observations may be truncated by events. This work establishes a comprehensive strategy in four consecutive steps to leverage repeated ordinal data from multidimensional questionnaires. The 4S method successively (1) identifies the questionnaire structure into dimensions satisfying three calibration assumptions (unidimensionality, conditional independence, increasing monotonicity), (2) describes the progression of each dimension using a joint latent process model that includes a continuous-time item response theory model for the longitudinal subpart, (3) aligns each dimension's progression with disease stages through a projection approach, and (4) identifies the most informative items across disease stages using Fisher information. The method is applied to multiple system atrophy (MSA), a rare neurodegenerative disease, analyzing daily activity and motor impairments over disease progression. The 4S method provides an effective and complete analytical strategy for questionnaires repeatedly collected in health studies.
Super learner for survival prediction in case-cohort and generalized case-cohort studies
Haolin Li, Haibo Zhou, David Couper, Jianwen Cai
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf155

The case-cohort study design is often used in modern epidemiological studies of rare diseases, as it can achieve efficiency similar to that of a much larger cohort study at a fraction of the cost. Previous work has focused on parameter estimation for case-cohort studies under particular statistical models, but the survival prediction problem under this type of design has received little attention. In this article, we propose a super learner algorithm for survival prediction in case-cohort studies and further extend it to generalized case-cohort studies. The proposed super learner algorithm is shown to have asymptotic model selection consistency as well as uniform consistency, and we demonstrate that it has satisfactory finite-sample performance. Simulation studies suggest that super learners trained on data from case-cohort and generalized case-cohort designs have better prediction accuracy than those trained on data from a simple random sampling design with the same sample sizes. Finally, we apply the proposed method to analyze a generalized case-cohort study conducted as part of the Atherosclerosis Risk in Communities Study.
A semiparametric method for addressing underdiagnosis using electronic health record data
Weidong Ma, Jordana B Cohen, Jinbo Chen
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf157

Effective treatment of medical conditions begins with an accurate diagnosis. However, many conditions are often underdiagnosed, either being overlooked or diagnosed after significant delays. Electronic health records (EHRs) contain extensive patient health information, offering an opportunity to probabilistically identify underdiagnosed individuals. The rationale is that both diagnosed and underdiagnosed patients may display similar health profiles in EHR data, distinguishing them from condition-free patients. Thus, EHR data can be leveraged to develop models that assess an individual's risk of having a condition. To date, this opportunity has largely remained unexploited, partly due to the lack of suitable statistical methods. The key challenge is the positive-unlabeled EHR data structure, which consists of data for diagnosed ("positive") patients and the remaining ("unlabeled") patients, a mix of underdiagnosed patients and many condition-free patients. Therefore, data for patients who are unambiguously condition-free, essential for developing risk assessment models, are unavailable. To overcome this challenge, we propose ascertaining condition statuses for a small subset of unlabeled patients. We develop a novel statistical method for building accurate models using this supplemented EHR data to estimate the probability that a patient has the condition of interest. We study the asymptotic properties of our method and assess its finite-sample performance through simulation studies. Finally, we apply our method to develop a preliminary model for identifying potentially underdiagnosed non-alcoholic steatohepatitis patients using data from Penn Medicine EHRs.
{"title":"Correction to: Covariate-Adjusted Response-Adaptive Randomization for Multi-Arm Clinical Trials Using a Modified Forward Looking Gittins Index Rule.","authors":"","doi":"10.1093/biomtc/ujaf139","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf139","url":null,"abstract":"","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145372147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large row-constrained supersaturated designs for high-throughput screening
Byran J Smucker, Stephen E Wright, Isaac Williams, Richard C Page, Andor J Kiss, Surendra Bikram Silwal, Maria Weese, David J Edwards
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf160

High-throughput screening, in which large numbers of compounds are traditionally studied one-at-a-time in multiwell plates against specific targets, is widely used across many areas of the biological sciences, including drug discovery. To improve the effectiveness of these screens, we propose a new class of supersaturated designs that guide the construction of pools of compounds in each well. Because the size of the pools is typically limited by the particular application, the new designs accommodate this constraint and are part of a larger procedure that we call Constrained Row Screening (CRowS). We develop an efficient computational procedure to construct the CRowS designs, provide some initial lower bounds on the average squared off-diagonal values of their main-effects information matrix, and study the impact of the constraint on design quality. We also show via simulation that CRowS is statistically superior to the traditional one-compound-one-well approach as well as an existing pooling method, and demonstrate the use of the new methodology on a Verona Integron-encoded Metallo-β-lactamase-2 assay.
Statistical inference on high-dimensional covariate-dependent Gaussian graphical regressions
Xuran Meng, Jingfei Zhang, Yi Li
Biometrics 81(4), 2025. DOI: 10.1093/biomtc/ujaf165

In many genomic studies, gene co-expression graphs are influenced by subject-level covariates such as single nucleotide polymorphisms. Traditional Gaussian graphical models ignore these covariates and estimate only population-level networks, potentially masking important heterogeneity. Covariate-dependent Gaussian graphical regressions address this limitation by regressing the precision matrix on covariates, thereby modeling how graph structures vary with high-dimensional subject-specific covariates. To fit the model, we adopt a multi-task learning approach that achieves lower error rates than node-wise regressions. Yet, the important problem of statistical inference in this setting remains largely unexplored. We propose a class of debiased estimators based on multi-task learners, which can be computed quickly and separately. In a key step, we introduce a novel projection technique for estimating the inverse covariance matrix, reducing optimization costs so that they scale with the sample size n. Our debiased estimators achieve fast convergence and asymptotic normality, enabling valid inference. Simulations demonstrate the utility of the method, and an application to a brain cancer gene-expression dataset reveals meaningful biological relationships.
{"title":"Rejoinder to Letter to the Editors \"Comments on 'Statistical inference on change points in generalized semiparametric segmented models' by Yang et al. (2025)\" by Vito M.R. Muggeo.","authors":"Guangyu Yang, Min Zhang","doi":"10.1093/biomtc/ujaf148","DOIUrl":"10.1093/biomtc/ujaf148","url":null,"abstract":"","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":" ","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145602083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}