Enhancing Socioeconomic Status Prediction for Cavities: A Hybrid Method
A T M Dao, L G Do, N Stormon, H V Nguyen, D H Ha
Journal of Dental Research. Published 2025-03-18.
DOI: 10.1177/00220345251324494 (https://doi.org/10.1177/00220345251324494)
Citations: 0
Abstract
Socioeconomic status (SES) measures one's access to social resources across various dimensions. Studies on SES commonly use principal component analysis (PCA), a data-driven method, to condense these dimensions into components, typically selecting the first component to represent SES. However, PCA may lack specificity for particular outcomes. Decision tree analysis (DTA), a knowledge-driven approach that identifies outcome-specific dimensions, may address PCA's weaknesses but might not comprehensively capture SES. This study hypothesized that combining DTA and PCA to create SES predictors would improve predictive accuracy over PCA alone. It also explored whether the DTA-PCA combination, incorporating only significant loading indicators (SLIs) of the first component, could simplify SES predictors without compromising predictive accuracy. The study analyzed 12 SES indicators from the Study of Mothers' and Infants' Life Events Affecting Oral Health (SMILE) birth cohort study, involving 2,182 children. Five SES composites were created: 1 solely from DTA-identified indicators and 2 pairs, one based on the entire first PCA component and one on its SLIs, each built with and without DTA. Each composite served as the predictor of dental caries in 1 of 5 predictive models. Model accuracy was evaluated using root mean squared error with 5-fold cross-validation. SES composites derived from the DTA-PCA combination demonstrated superior predictive accuracy compared with those from the PCA-only approach. By incorporating only SLIs, this hybrid method generated SES predictors that not only outperformed those using the entire first component but also demonstrated noninferiority relative to the DTA-only method. This approach offers a promising framework for developing SES composites to predict dental caries, potentially improving the precision of predictive models. 
In addition, this method offers a practical framework for creating composite predictors from multi-item measurements across various outcomes. For future research using this method, a 3-step process is recommended: (1) identify relevant items using DTA, (2) determine their weights through PCA, and (3) generate a composite using the SLIs.
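
The recommended 3-step process, together with the 5-fold cross-validated root mean squared error used for evaluation, can be sketched roughly as follows. This is an illustrative sketch and not the authors' implementation. It assumes scikit-learn, synthetic data in place of the SMILE indicators, a nonzero tree feature importance as the DTA selection rule, an absolute loading cutoff of 0.3 for defining SLIs, and a linear regression as the predictive model. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor


def hybrid_ses_composite(X, y, loading_cutoff=0.3, max_depth=3, seed=0):
    """Return (composite scores, column indices of the indicators kept)."""
    # Step 1 (DTA): keep the indicators the fitted tree actually splits on.
    # Assumption: "identified by DTA" is read as nonzero feature importance.
    tree = DecisionTreeRegressor(max_depth=max_depth, random_state=seed)
    tree.fit(X, y)
    selected = np.flatnonzero(tree.feature_importances_ > 0)

    # Step 2 (PCA): first-component loadings of the standardized indicators.
    Z = StandardScaler().fit_transform(X[:, selected])
    loadings = PCA(n_components=1).fit(Z).components_[0]

    # Step 3 (SLIs): keep indicators whose absolute loading reaches the
    # cutoff (assumed value) and weight them by their loadings.
    sli = np.abs(loadings) >= loading_cutoff
    composite = Z[:, sli] @ loadings[sli]
    return composite, selected[sli]


def cv_rmse(composite, y, folds=5):
    """Mean root mean squared error over k-fold cross-validation."""
    scores = cross_val_score(
        LinearRegression(),
        composite.reshape(-1, 1),
        y,
        scoring="neg_root_mean_squared_error",
        cv=folds,
    )
    return -scores.mean()


# Synthetic stand-in for 12 indicators; the outcome depends on the first three.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 12))
X[:, 1] += X[:, 0]  # make indicators 1 and 2 correlate with indicator 0
X[:, 2] += X[:, 0]
y = X[:, 0] + X[:, 1] + X[:, 2] + rng.normal(scale=0.5, size=n)

composite, kept = hybrid_ses_composite(X, y)
rmse = cv_rmse(composite, y)
print("indicators kept:", kept, "5-fold RMSE:", round(float(rmse), 3))
```

In this sketch the indicators are selected and weighted on the full data set before cross-validation, which keeps the example short but can make the error estimate optimistic. A stricter evaluation would repeat steps 1 to 3 inside each training fold.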