Subtyping Social Determinants of Health in the "All of Us" Program: Network Analysis and Visualization Study
Suresh K Bhavnani, Weibin Zhang, Daniel Bao, Mukaila Raji, Veronica Ajewole, Rodney Hunter, Yong-Fang Kuo, Susanne Schmidt, Monique R Pappadis, Elise Smith, Alex Bokov, Timothy Reistetter, Shyam Visweswaran, Brian Downer
Journal of Medical Internet Research, volume 27, e48775. Published 2025-02-11. DOI: 10.2196/48775 (https://doi.org/10.2196/48775). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11862773/pdf/
Abstract
Background: Social determinants of health (SDoH), such as financial resources and housing stability, account for between 30% and 55% of people's health outcomes. While many studies have identified strong associations between specific SDoH and health outcomes, little is known about how SDoH co-occur to form subtypes critical for designing targeted interventions. Such analysis has only now become possible through the All of Us program.
Objective: This study aims to analyze the All of Us dataset to address two research questions: (1) What is the range of survey questions related to SDoH, and what are the responses to them? and (2) How do SDoH co-occur to form subtypes, and what are their risks for adverse health outcomes?
Methods: For question 1, an expert panel analyzed the range of and responses to SDoH questions across 6 surveys in the full All of Us dataset (N=372,397; version 6). For question 2, due to systematic missingness and uneven granularity of questions across the surveys, we selected all participants with valid and complete SDoH data and used inverse probability weighting to adjust their imbalance in demographics. Next, an expert panel grouped the SDoH questions into SDoH factors to enable more consistent granularity. To identify the subtypes, we used bipartite modularity maximization for identifying SDoH biclusters and measured their significance and replicability. Next, we measured their association with 3 outcomes (depression, delayed medical care, and emergency room visits in the last year). Finally, the expert panel inferred the subtype labels, potential mechanisms, and targeted interventions.
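As a rough illustration of the quantity that bipartite modularity maximization optimizes, the sketch below scores one given assignment of participants and SDoH factors to biclusters. It assumes Barber's formulation of bipartite modularity, which the abstract does not confirm, and the function name, the toy matrix, and the labels are all hypothetical. It only evaluates Q for a fixed assignment; the search for the assignment that maximizes Q is not shown.

```python
def bipartite_modularity(biadjacency, row_labels, col_labels):
    """Barber-style bipartite modularity Q for a fixed bicluster assignment.

    biadjacency[i][j] is the edge weight between participant i and SDoH
    factor j (1 if the factor is present, 0 otherwise, in this toy case).
    row_labels[i] and col_labels[j] are bicluster ids (assumed inputs).
    """
    m = sum(sum(row) for row in biadjacency)  # total edge weight
    if m == 0:
        raise ValueError("network has no edges")
    row_deg = [sum(row) for row in biadjacency]
    col_deg = [sum(col) for col in zip(*biadjacency)]

    q = 0.0
    for i, row in enumerate(biadjacency):
        for j, a_ij in enumerate(row):
            if row_labels[i] == col_labels[j]:
                # observed weight minus the weight expected by chance
                q += a_ij - row_deg[i] * col_deg[j] / m
    return q / m


# Toy network (hypothetical): two clean blocks of participants x factors.
toy = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(bipartite_modularity(toy, [0, 0, 1, 1], [0, 0, 1, 1]))  # 0.5
```

A significance test of the kind reported in the results would compare the observed Q with the distribution of Q obtained from randomized versions of the same network, for example as a z-score.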
Results: The question 1 analysis identified 110 SDoH questions across 4 surveys covering all 5 domains in Healthy People 2030. As the SDoH questions varied in granularity, they were categorized by an expert panel into 18 SDoH factors. The question 2 analysis (n=12,913; d=18) identified 4 biclusters with significant biclusteredness (Q=0.13; random-Q=0.11; z=7.5; P<.001) and significant replication (real Rand index=0.88; random Rand index=0.62; P<.001). Each subtype had significant associations with specific outcomes and had meaningful interpretations and potential targeted interventions. For example, the Socioeconomic barriers subtype included 6 SDoH factors (eg, not employed and food insecurity) and had a significantly higher odds ratio (4.2, 95% CI 3.5-5.1; P<.001) for depression when compared to other subtypes. The expert panel inferred implications of the results for designing interventions and health care policies based on SDoH subtypes.
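For readers unfamiliar with how an odds ratio and its 95% CI are derived, the following is a minimal sketch of an unadjusted odds ratio with a Wald interval computed from a 2x2 table. It is not the study's analysis, which used inverse probability weighting and may have adjusted for covariates; the function name and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: in subtype, with outcome      b: in subtype, without outcome
    c: other subtypes, with outcome  d: other subtypes, without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not data from the study).
or_, lo, hi = odds_ratio_wald_ci(a=40, b=60, c=20, d=80)
print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as in the reported 3.5-5.1, indicates that the odds of the outcome differ significantly between the subtype and the comparison group.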
Conclusions: This study identified SDoH subtypes that had statistically significant biclusteredness and replicability, each of which had significant associations with specific adverse health outcomes and translational implications for targeted SDoH interventions and health care policies. However, given the high degree of systematic missingness, the analysis should be repeated as the data become more complete, using our generalizable and scalable machine learning code available on the All of Us workbench.
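The replicability referred to here was reported as a Rand index. The sketch below shows how a plain (unadjusted) Rand index between two cluster assignments of the same items can be computed. How the study produced the two assignments being compared is not described in the abstract, so the function name and the example labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two items
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")

    agreements = 0
    pairs = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agreements += int(same_a == same_b)
        pairs += 1
    return agreements / pairs


# Cluster ids are arbitrary, so a relabeled but identical partition scores 1.0.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
print(rand_index([0, 0, 1, 1], [0, 1, 1, 1]))  # 0.5
```

This pairwise loop is quadratic in the number of items, which is fine for a sketch but slow for cohorts in the tens of thousands; a contingency-table formulation would scale better.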
About the journal:
The Journal of Medical Internet Research (JMIR) is a highly respected publication in the field of health informatics and health services. Founded in 1999, JMIR has been a pioneer in the field for over two decades.
As a leader in the industry, the journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It is recognized as a top publication in these disciplines, ranking in the first quartile (Q1) by Impact Factor.
Notably, JMIR holds the prestigious position of being ranked #1 on Google Scholar within the "Medical Informatics" discipline.