Subgroup learning for multiple mixed-type outcomes with block-structured covariates
Authors: Xun Zhao, Lu Tang, Weijia Zhang, Ling Zhou
Journal: Computational Statistics & Data Analysis, volume 204, Article 108105
Publication date: 2024-12-02
Publication type: Journal Article
DOI: 10.1016/j.csda.2024.108105
URL: https://www.sciencedirect.com/science/article/pii/S0167947324001890
Impact factor: 1.5 (JCR Q3, Computer Science, Interdisciplinary Applications)
Citations: 0
Abstract
Survey research increasingly focuses on inferring grouped association patterns between risk factors and questionnaire responses, with the grouping shared across multiple response variables that jointly capture one's underlying status. To identify important risk factors that are simultaneously associated with the health and well-being of senior adults, a study based on the China Health and Retirement Survey (CHRS) is conducted. Previous studies have identified several known risk factors, yet the association between outcomes and risk factors is heterogeneous, prompting the use of subgroup analysis. A subgroup analysis procedure is devised to model multiple mixed-type outcomes that describe one's general health and well-being, while tackling additional challenges including collinearity and weak signals within block-structured covariates. Computationally, an efficient algorithm that alternately updates a set of estimating equations and likelihood functions is proposed. Theoretical results establish the asymptotic consistency and normality of the proposed estimators. The validity of the proposed method is corroborated by simulation experiments. An application of the proposed method to the CHRS data identifies caring for grandchildren as a new risk factor for poor physical and mental health.
About the journal:
Computational Statistics and Data Analysis (CSDA), an Official Publication of the network Computational and Methodological Statistics (CMStatistics) and of the International Association for Statistical Computing (IASC), is an international journal dedicated to the dissemination of methodological research and applications in the areas of computational statistics and data analysis. The journal consists of four refereed sections which are divided into the following subject areas:
I) Computational Statistics - Manuscripts dealing with: 1) the explicit impact of computers on statistical methodology (e.g., Bayesian computing, bioinformatics, computer graphics, computer-intensive inferential methods, data exploration, data mining, expert systems, heuristics, knowledge-based systems, machine learning, neural networks, numerical and optimization methods, parallel computing, statistical databases, statistical systems), and 2) the development, evaluation and validation of statistical software and algorithms. Software and algorithms can be submitted with manuscripts and will be stored together with the online article.
II) Statistical Methodology for Data Analysis - Manuscripts dealing with novel and original data analytical strategies and methodologies applied in biostatistics (design and analytic methods for clinical trials, epidemiological studies, statistical genetics, or genetic/environmental interactions), chemometrics, classification, data exploration, density estimation, design of experiments, environmetrics, education, image analysis, marketing, model free data exploration, pattern recognition, psychometrics, statistical physics, image processing, robust procedures.
[...]
III) Special Applications - [...]
IV) Annals of Statistical Data Science [...]