Pub Date: 2026-01-01 | Epub Date: 2025-10-25 | DOI: 10.1007/s10742-025-00358-5
Gary Hettinger, Youjin Lee, Nandita Mitra
Policymakers and researchers often seek to understand how a policy differentially affects a population and the pathways driving this heterogeneity. For example, when studying an excise tax on sweetened beverages, researchers might assess the roles of cross-border shopping, economic competition, and store-level price changes on beverage sales trends. However, traditional policy evaluation tools, like the difference-in-differences (DiD) approach, primarily target average effects of the observed intervention rather than the underlying drivers of effect heterogeneity. Common approaches to evaluate sources of heterogeneity often lack a causal framework, making it difficult to determine whether observed outcome differences are truly driven by the proposed source of heterogeneity or by other confounding factors. In this paper, we present a framework for evaluating such policy drivers by representing questions of effect heterogeneity under hypothetical interventions and use it to evaluate drivers of the Philadelphia sweetened beverage tax policy effects. Building on recent advancements in estimating causal effect curves under DiD designs, we provide tools to assess policy effect heterogeneity while addressing practical challenges including confounding and neighborhood dynamics.
Supplementary information: The online version contains supplementary material available at 10.1007/s10742-025-00358-5.
Title: A causal framework for evaluating drivers of policy effect heterogeneity using difference-in-differences | Health Services and Outcomes Research Methodology 26(1): 26-47 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12967569/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-11-17 | DOI: 10.1007/s10742-025-00369-2
Elizabeth M Stone, Megan S Schuler, Elizabeth A Stuart, Max Rubinstein, Max Griswold, Bradley D Stein, Beth Ann Griffin
This paper reviews and details methods for state policy evaluation to guide selection of a research approach, based on an evaluation's setting and available data. We highlight key design considerations for an analysis, including treatment and control group selection, timing of policy adoption, expected effect heterogeneity, and data considerations. We then provide an overview of analytic approaches and differentiate between methods based on an evaluation's context, such as settings with no control units, a single treated unit, multiple treated units, or multiple treatment cohorts. Methods discussed include interrupted time series models, difference-in-differences estimators, autoregressive models, and synthetic control methods, along with extensions that address issues such as staggered policy adoption and heterogeneous treatment effects. We end with an illustrative example, applying the developed framework to evaluate the impacts of state-level naloxone standing order policies on overdose rates. Overall, we provide researchers with an approach for deciding on methods for state policy evaluations, which can be used to select study designs and inform methodological choices.
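One of the simplest designs surveyed above, the interrupted time series, can be illustrated with a segmented-regression sketch: a level shift and a slope change at the policy adoption time. The simulated data and all parameter values below are hypothetical, not taken from the paper:

```python
import numpy as np

# Illustrative interrupted time series (ITS): segmented regression with a
# level shift and a slope change at the policy adoption time.
# Hypothetical simulated data -- not from the paper.

rng = np.random.default_rng(0)
T = 48                       # monthly observations
t = np.arange(T)
policy_time = 24             # policy adopted at month 24
post = (t >= policy_time).astype(float)

# True data-generating process: baseline trend 0.3/month, a level drop of 5
# at adoption, and an additional post-period slope of -0.2/month.
y = 50 + 0.3 * t - 5.0 * post - 0.2 * post * (t - policy_time) \
    + rng.normal(0, 1, T)

# Design matrix: intercept, secular trend, level change, post-period trend change.
X = np.column_stack([np.ones(T), t, post, post * (t - policy_time)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("level change estimate:", round(beta[2], 2))
print("slope change estimate:", round(beta[3], 2))
```

With enough pre- and post-period observations, the fitted level- and slope-change coefficients recover the simulated policy impact; real applications would add autocorrelation-robust standard errors.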
Supplementary information: The online version contains supplementary material available at 10.1007/s10742-025-00369-2.
Title: Choosing an analytic approach: key study design considerations in state policy evaluation | Health Services and Outcomes Research Methodology 26(1): 3-25 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12967469/pdf/
Pub Date: 2025-12-01 | Epub Date: 2025-06-07 | DOI: 10.1007/s10742-025-00344-x
Lihua Li, Chen Yang, Wei Zhang, Yulei He, John R Pleis, Lauren M Rossen, Bian Liu, Morgan Earp, Madhu Mazumdar
Propensity score weighting (PSW) is a valuable tool for estimating treatment effects on survival outcomes in observational studies. However, there is no clear best practice for applying PSW to complex survey data with survival outcomes. This paper addresses this gap by exploring how to integrate PSW with complex survey design features (strata, clusters, sampling weights) to obtain unbiased population-level estimates. We evaluate three PSW methods: in Method I, neither the propensity score (PS) model nor the outcome model accounts for the survey design; in Method II, the PS model does not account for the survey design, but the outcome model does; in Method III, both the PS model and the outcome model account for the survey design. Through extensive simulations, we compare performance in estimating absolute treatment effects, measured by population survival quantile effects, and relative treatment effects, measured by population marginal hazard ratios. Mean relative bias, mean absolute bias, and coverage probability are estimated for model evaluation under various scenarios, including varying treatment effect magnitude, censoring type and rate, level of PS overlap, and the presence of outliers and nonresponse. Findings reveal that both survey-weighted Methods II and III outperform the unweighted Method I under most scenarios for both measures of treatment effects, especially when there is a true treatment effect. The two weighted methods perform similarly, including in the presence of informative censoring, influential outliers, or nonresponse. We recommend that when applying PSW to complex survey data to estimate population-level treatment effects on survival outcomes, both modelling stages should incorporate the survey design, though this is most critical for the outcome modelling.
For illustration, all methods are applied to the public-use 2000-2018 National Health Interview Survey (NHIS) Linked to Mortality Files with mortality information through 2019 to estimate the effect of smoking cessation after a cancer diagnosis on subsequent overall survival.
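To make the Method I-III distinction concrete, here is a minimal sketch in the spirit of Method III, where survey weights enter both the propensity score model and the outcome stage. For brevity it uses a simple continuous outcome in place of the paper's survival setting, a hand-rolled survey-weighted logistic fit, and entirely hypothetical simulated data:

```python
import numpy as np

# Sketch of survey-weighted PSW (in the spirit of "Method III": the survey
# design enters both the PS model and the outcome stage). Hypothetical data;
# a real analysis would also handle censoring, strata, and clustering.

rng = np.random.default_rng(1)
n = 4000
x = rng.normal(size=n)                        # confounder
sw = np.exp(0.3 * x); sw *= n / sw.sum()      # normalized survey weights
a = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x)))   # treatment depends on x
y = 10 + 2.0 * a + 1.5 * x + rng.normal(0, 1, n)  # true effect = 2.0

def weighted_logistic(X, a, w, iters=25):
    """Survey-weighted logistic regression via Newton/IRLS updates."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        H = X.T @ (X * (w * p * (1 - p))[:, None])
        beta += np.linalg.solve(H, X.T @ (w * (a - p)))
    return beta

X = np.column_stack([np.ones(n), x])
ps = 1 / (1 + np.exp(-X @ weighted_logistic(X, a, sw)))

# ATT-type weights: treated keep their survey weight; controls get sw * ps/(1-ps).
w_att = np.where(a == 1, sw, sw * ps / (1 - ps))
att = np.average(y[a == 1], weights=w_att[a == 1]) \
    - np.average(y[a == 0], weights=w_att[a == 0])
print("survey-weighted ATT estimate:", round(att, 2))
```

The combined weights reweight controls so their confounder distribution matches that of the survey-weighted treated population, recovering the simulated effect.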
Title: Propensity score weighting analysis with complex survey data for estimating population-level treatment effects on survival: a simulation study | Health Services and Outcomes Research Methodology 25(4): 382-409 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12700616/pdf/
Pub Date: 2025-10-11 | DOI: 10.1007/s10742-025-00364-7
Kerry Ye, Alyssa Bilinski, Youjin Lee
The difference-in-differences (DiD) approach is one of the most widely used methods for evaluating policy effects. However, traditional DiD methods may not recover the population-level average treatment effect on the treated (ATT) in the absence of population-level panel data, particularly when the composition of units in the treatment group changes over time. In this work, we address two challenges in applying DiD methods to repeated cross-sectional (RCS) survey data: (1) heterogeneous compositions of study samples across time points, and (2) availability of data for only a sample of the population. We introduce a policy-relevant target estimand and establish its identification conditions. We then propose a new weighting approach that incorporates both estimated propensity scores and given survey weights. We establish the theoretical properties of the proposed method and examine its finite-sample performance through simulations. Finally, we apply the proposed method to a real-world application, estimating the effect of a beverage tax on adolescent soda consumption in Philadelphia.
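A toy survey-weighted 2x2 DiD on repeated cross-sections is sketched below. It applies survey weights only; the paper's proposal additionally reweights by estimated propensity scores to handle compositional change across samples. All parameter values are hypothetical:

```python
import numpy as np

# Toy 2x2 weighted difference-in-differences on repeated cross-sections.
# Each group-by-period cell is a fresh sample with its own survey weights.
# This sketch uses survey weights only; the paper's method additionally
# incorporates estimated propensity scores. All numbers are hypothetical.

rng = np.random.default_rng(2)
TRUE_ATT = 3.0

def draw_cell(n, treated, post):
    x = rng.normal(loc=0.5 * post, size=n)    # composition drifts over time
    sw = np.exp(0.2 * x)
    sw *= n / sw.sum()                        # normalized survey weights
    y = (1.0 * x + 2.0 * treated + 0.5 * post
         + TRUE_ATT * treated * post + rng.normal(0, 1, n))
    return y, sw

cells = {(g, p): draw_cell(2000, g, p) for g in (0, 1) for p in (0, 1)}
m = {k: np.average(y, weights=w) for k, (y, w) in cells.items()}

# (treated post - treated pre) - (control post - control pre)
did = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])
print("weighted DiD estimate:", round(did, 2))
```

Because the compositional drift here is common to both groups, the control contrast differences it out; differential drift is exactly the case that motivates the paper's propensity score weighting.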
Title: Difference-in-differences analysis with repeated cross-sectional survey data | Health Services and Outcomes Research Methodology (advance online publication) | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12674181/pdf/
Pub Date: 2025-09-01 | Epub Date: 2025-09-16 | DOI: 10.1007/s10742-025-00356-7
Eva Murphy, David Kline, Staci A Hepler
Understanding the interactions and spatio-temporal variations of public health outcomes is crucial for gaining insight into interrelated epidemics across different locations and time periods. Dynamic spatial factor models provide a flexible framework for capturing shared variability among multiple outcomes through a latent factor and its corresponding loadings. A common assumption in these models is that factor loadings are spatially constant, implying uniform relationships between outcomes across the study region. However, this assumption may overlook important regional differences in how outcomes relate to the underlying latent factor. In this study, we derive the covariance structure of the outcome vector to highlight how spatially varying versus constant loadings influence the overall correlation structure. We find that when loadings vary across space, the spatial covariance of the outcomes is shaped by both the spatial covariance of the loadings and the latent factors. In contrast, when loadings are spatially constant, the spatial covariance of the outcomes is determined primarily by the latent factors, leading to uniform variation across the spatial domain. To assess these differences in practice, we apply a Bayesian hierarchical spatial dynamic factor model to analyze the opioid syndemic in North Carolina. Our results suggest that incorporating spatially varying loadings provides a more detailed, county-specific understanding of the epidemic. This added flexibility enables a localized interpretation of opioid-related interactions and offers insights that could inform targeted public health interventions.
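The covariance argument in the abstract can be checked numerically. The toy one-factor setup below, on a hypothetical 1-D grid of locations, shows that constant loadings yield distance-only (stationary) correlations while spatially varying loadings make correlations location-dependent:

```python
import numpy as np

# Numerical check of the covariance point: with y_s = lambda_s * f_s + eps_s,
# spatially varying loadings reshape the outcome covariance, while constant
# loadings merely scale the factor's spatial covariance. Toy one-factor
# setup on a 1-D grid; all values hypothetical.

S = 30
d = np.abs(np.subtract.outer(np.arange(S), np.arange(S)))
Sigma_f = np.exp(-d / 5.0)       # spatial covariance of the latent factor
tau2 = 0.1                       # idiosyncratic (nugget) variance

lam_const = np.full(S, 1.0)
lam_vary = 1.0 + 0.8 * np.sin(np.arange(S) / 4.0)   # location-specific loadings

# With fixed loadings: Cov(y) = diag(lam) @ Sigma_f @ diag(lam) + tau2 * I
def outcome_corr(lam):
    C = np.outer(lam, lam) * Sigma_f + tau2 * np.eye(S)
    sd = np.sqrt(np.diag(C))
    return C / np.outer(sd, sd)

c1, c2 = outcome_corr(lam_const), outcome_corr(lam_vary)
print("constant loadings, corr at lag 1 (two places):",
      round(c1[0, 1], 3), round(c1[10, 11], 3))
print("varying loadings,  corr at lag 1 (two places):",
      round(c2[0, 1], 3), round(c2[10, 11], 3))
```

With constant loadings the two lag-1 correlations coincide; with varying loadings they differ, mirroring the county-specific structure the paper exploits.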
Title: The role of spatially varying loadings in dynamic spatial factor models for modeling the opioid syndemic | Health Services and Outcomes Research Methodology 25(3) | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490277/pdf/
Pub Date: 2025-06-13 | DOI: 10.1007/s10742-025-00348-7
Chuanling Qin, Curtis Peterson, William B Weeks, A James O'Malley
Despite the rapid advancement of machine learning algorithms, distinguishing patients by their likelihood of mortality remains an important challenge. In this paper, we investigated the degree to which incorporating a time-varying factor, length of hospitalization, could contribute to modeling mortality. A two-part modeling approach was proposed to capture potential heterogeneity over follow-up time and to evaluate the extent to which allowing a predictor based on a fixed-time event like hospitalization to have a time-varying coefficient enhanced mortality prediction. A test was then conducted to assess whether the association between hospitalization and mortality diminished with a patient's continued survival. Leveraging logistic regression models and the XGBoost procedure, the findings supported the claim that baseline hospitalization is a risk factor whose importance diminishes the longer the patient survives. While simulation studies and theoretical considerations indicate that the two-part model provides deeper insight into the evolving dynamics of regression coefficients and enhances the prediction accuracy of the marginal probability of mortality, its application to the empirical data that motivated this research yielded less compelling results, a finding that aligns with previous work. Factors such as class imbalance and the magnitude of heterogeneous effects can significantly impact the performance of the two-part model in empirical datasets.
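The diminishing-association claim can be illustrated with a small simulation. In the sketch below (purely hypothetical parameters, not the paper's model), the hospitalization log-odds effect on discrete-time mortality decays each period, and the empirical odds ratio within each period's risk set shrinks accordingly:

```python
import numpy as np

# Toy illustration: baseline hospitalization as a mortality risk factor
# whose association fades with continued survival. Discrete-time mortality
# with a decaying hospitalization log-odds effect; hypothetical parameters.

rng = np.random.default_rng(3)
n = 200_000
hosp = rng.binomial(1, 0.3, n).astype(bool)   # hospitalized at baseline
alive = np.ones(n, bool)
ors = []

for period in range(3):
    idx = np.flatnonzero(alive)               # current risk set
    log_odds = -3.0 + (1.5 - 0.6 * period) * hosp[idx]  # effect shrinks
    die = rng.random(idx.size) < 1 / (1 + np.exp(-log_odds))
    h = hosp[idx]
    p1, p0 = die[h].mean(), die[~h].mean()
    ors.append((p1 / (1 - p1)) / (p0 / (1 - p0)))
    print(f"period {period}: empirical odds ratio = {ors[-1]:.2f}")
    alive[idx[die]] = False                   # remove deaths from risk set
```

The per-period odds ratios decline toward the null, which is the pattern a time-varying coefficient on baseline hospitalization is designed to capture.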
Title: The impact of knowledge of hospitalization on mortality predictions | Health Services and Outcomes Research Methodology (advance online publication) | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12788376/pdf/
Pub Date: 2025-03-20 | DOI: 10.1007/s10742-025-00343-y
Yifan Zhao, Carly A Bobak, Megan A Murphy, Olivia Sacks, Lili Liu, Natasha Ray, Amber E Barnato, A James O'Malley
Patient-sharing physician networks are increasingly recognized as valuable tools for examining physician relationships in healthcare research. However, very few studies have examined the reliability of such networks, and of summary measures derived from them, in relation to directly measured physician relationships. In this paper, we evaluate the level of congruence between a survey-based network derived from responses to specific name-generator questions and a patient-sharing network derived from claims data. We also examine the association of summary measures derived from either network with physicians' beliefs about peer influence in medical practice. Statistical models with hierarchical and multiple-membership structures were used to estimate the strength of the associations. We found that a survey measure indicating whether a physician was nominated by others was statistically significantly associated with their survey-reported beliefs about peer influence. We also observed notable associations between physicians' structural importance in the network, reflected in their eigenvector and betweenness centrality in the patient-sharing network, and their beliefs about peer influence. This study of multi-source network relational information advances our understanding of physician survey responses and yields more precise predictions of physician beliefs about peer influence than either data source alone. Overall, we found that patient-sharing networks are an important alternative to directly measured survey-based name-generator questions in health services research and other applications. While patient-sharing networks recover some of the information in directly measured peer physician nominations, they also contain distinct information that is helpful for drawing healthcare insights.
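The centrality measures named above are straightforward to compute. As a small illustration (a hypothetical five-physician network, not the study's data), the sketch below obtains eigenvector centrality from a weighted patient-sharing adjacency matrix via power iteration:

```python
import numpy as np

# Eigenvector centrality of a toy patient-sharing network via power
# iteration. Entry (i, j) counts patients shared by physicians i and j.
# Entirely hypothetical data.

A = np.array([
    [0, 4, 1, 0, 0],
    [4, 0, 3, 2, 0],
    [1, 3, 0, 5, 1],
    [0, 2, 5, 0, 2],
    [0, 0, 1, 2, 0],
], float)

v = np.ones(A.shape[0])
for _ in range(200):          # power iteration converges to the leading
    v = A @ v                 # eigenvector of the (symmetric) adjacency
    v /= np.linalg.norm(v)

centrality = v / v.sum()      # normalize scores to sum to 1
print("most central physician:", int(np.argmax(centrality)))
```

Betweenness centrality, also used in the paper, requires shortest-path computations and is typically taken from a graph library rather than written by hand.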
Title: Combining multiple sources of relationships in a network to advance understanding of physicians' beliefs regarding peer-effects | Health Services and Outcomes Research Methodology (advance online publication) | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12385539/pdf/
Pub Date: 2025-01-01 | Epub Date: 2024-04-12 | DOI: 10.1007/s10742-024-00327-4
Megan S Schuler, Donna L Coffman, Elizabeth A Stuart, Trang Q Nguyen, Brian Vegetabile, Daniel F McCaffrey
Mediation analysis is a statistical approach that can provide insights into the intermediary processes by which an intervention or exposure affects a given outcome. Mediation analysis rose to prominence, particularly in social science research, with the publication of Baron and Kenny's seminal paper and is now commonly applied in many research disciplines, including health services research. Despite this growth in popularity, applied researchers may still encounter challenges in conducting mediation analyses in practice. In this paper, we provide an overview of the conceptual and methodological challenges that researchers face when conducting mediation analyses. Specifically, we discuss the following key challenges: (1) conceptually differentiating mediators from other "third variables," (2) extending beyond the single-mediator context, (3) identifying appropriate datasets in which measurement and temporal ordering support the hypothesized mediation model, (4) selecting mediation effects that reflect the scientific question of interest, (5) assessing the validity of the underlying assumptions of no omitted confounders, (6) addressing measurement error in the mediator, and (7) clearly reporting results from mediation analyses. We discuss each challenge and highlight ways in which the applied researcher can approach them.
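The single-mediator baseline that these challenges extend can be sketched with the classic product-of-coefficients estimator in the Baron and Kenny lineage. The data and path coefficients below are simulated for illustration; a real analysis must also defend the no-omitted-confounders assumptions the authors highlight:

```python
import numpy as np

# Product-of-coefficients mediation sketch: regress mediator on exposure
# (path a), and outcome on exposure + mediator (path b); the indirect
# effect is a*b. Simulated data with known paths -- hypothetical values.

rng = np.random.default_rng(4)
n = 5000
x = rng.binomial(1, 0.5, n).astype(float)      # exposure
m = 0.8 * x + rng.normal(0, 1, n)              # mediator: true a = 0.8
y = 0.5 * x + 1.2 * m + rng.normal(0, 1, n)    # outcome: true b = 1.2

def ols(X, y):
    """Ordinary least squares via numpy's least-squares solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

a = ols(np.column_stack([np.ones(n), x]), m)[1]        # exposure -> mediator
b = ols(np.column_stack([np.ones(n), x, m]), y)[2]     # mediator -> outcome
print("indirect effect estimate (a*b):", round(a * b, 2))  # true value 0.96
```

In this linear, no-interaction setting a*b equals the natural indirect effect; the counterfactual definitions discussed in the paper generalize this to nonlinear models and multiple mediators.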
{"title":"Practical challenges in mediation analysis: a guide for applied researchers.","authors":"Megan S Schuler, Donna L Coffman, Elizabeth A Stuart, Trang Q Nguyen, Brian Vegetabile, Daniel F McCaffrey","doi":"10.1007/s10742-024-00327-4","journal":"Health Services and Outcomes Research Methodology","volume":"25 1","pages":"57-84","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11821701/pdf/"}
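The single-mediator setting the paper starts from is often estimated with the product-of-coefficients approach that traces back to Baron and Kenny. The sketch below is illustrative only: the simulated data, effect sizes, and linear no-interaction model are assumptions, not the authors' analysis, and it presumes no omitted confounding (the paper's challenge 5).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.binomial(1, 0.5, n).astype(float)   # exposure
m = 0.8 * x + rng.normal(size=n)            # mediator model: a = 0.8
y = 0.5 * m + 0.3 * x + rng.normal(size=n)  # outcome model: b = 0.5, direct = 0.3

def ols(resp, covs):
    """Least-squares coefficients with an intercept prepended."""
    X = np.column_stack([np.ones(len(resp))] + list(covs))
    return np.linalg.lstsq(X, resp, rcond=None)[0]

a = ols(m, [x])[1]            # exposure -> mediator path
b = ols(y, [m, x])[1]         # mediator -> outcome path, adjusting for exposure
direct = ols(y, [m, x])[2]    # direct effect of exposure
indirect = a * b              # mediated (indirect) effect
```

With this data-generating process the estimates should land near the true values (indirect ≈ 0.4, direct ≈ 0.3); under nonlinearity or exposure-mediator interaction, this simple product decomposition no longer applies.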
Pub Date : 2024-10-23 DOI: 10.1007/s10742-025-00363-8
Carly L Brantner, Wenshan Yu, Congwen Zhao, Kyungeun Jeon, Grace V Ringlein, Qiao Wang, Elaona Lemoto, Trang Quynh Nguyen, Jane P Gagliardi, Peter P Zandi, Benjamin A Goldstein, Elizabeth A Stuart, Hwanhee Hong
Combining data from diverse sources, including randomized controlled trials (RCTs) and observational datasets, holds the potential to increase sample size, improve external validity, and give a well-rounded view of the question under study. However, the practical implementation of integrating different data sources can be complicated, particularly when data are collected across sites and institutions. In this paper, we use a case study of data from four RCTs and two electronic health record (EHR) systems to illustrate some of the challenges that can arise when combining these various sources of data. We group the challenges into cohort- and variable-related challenges, and for each challenge we provide descriptive statistics and visuals from our case study to show the decisions that must be made and their implications. We provide guidance for researchers on the most important considerations and emphasize the necessity of careful, documented decision-making carried out by an interdisciplinary team. Through this case study and associated reflections, we highlight the dangers of naively combining data and advocate for clear communication of the decisions made at each step of the data combination process, as well as the limitations and implications of those decisions.
{"title":"The challenges of integrating diverse data sources: A case study in major depression.","authors":"Carly L Brantner, Wenshan Yu, Congwen Zhao, Kyungeun Jeon, Grace V Ringlein, Qiao Wang, Elaona Lemoto, Trang Quynh Nguyen, Jane P Gagliardi, Peter P Zandi, Benjamin A Goldstein, Elizabeth A Stuart, Hwanhee Hong","doi":"10.1007/s10742-025-00363-8","journal":"Health Services and Outcomes Research Methodology","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12885567/pdf/"}
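The variable-related challenges described above typically reduce to mapping each source's names and codings into one documented schema before pooling. A minimal sketch of that step is below; the toy frames, column names, and codings are hypothetical stand-ins, not the RCT or EHR variables from the case study.

```python
import pandas as pd

# Hypothetical extracts: an RCT table and an EHR table with mismatched
# identifiers, variable names, and sex coding (values are illustrative).
rct = pd.DataFrame({"id": [1, 2], "sex": ["F", "M"], "phq9": [14, 7]})
ehr = pd.DataFrame({"pat_id": [10, 11], "gender": [2, 1], "phq9_total": [16, 5]})

# Harmonize into one schema and tag provenance so source differences
# remain visible in downstream analyses.
rct_h = rct.rename(columns={"id": "pid", "phq9": "phq9_total"}).assign(source="rct")
ehr_h = (ehr.rename(columns={"pat_id": "pid"})
            .assign(sex=ehr["gender"].map({1: "M", 2: "F"}), source="ehr")
            .drop(columns="gender"))

combined = pd.concat([rct_h, ehr_h], ignore_index=True)
```

Keeping the `source` column is one way to implement the paper's advice that decisions stay documented: every pooled record can be traced back to the dataset, and the coding choice, that produced it.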
Pub Date : 2024-06-01 Epub Date: 2023-08-09 DOI: 10.1007/s10742-023-00310-5
Hyojung Kang, Min-Woong Sohn, Soyoun Kim, Siyao Zhang, Rajesh Balkrishnan, Roger Anderson, Anthony McCall, Timothy McMurry, Jennifer Mason Lobo
Annual preventive care is essential for patients with diabetes to reduce the risk of complications, including hypoglycemic events and blindness. Our aim was to examine the relative efficiency of Diabetes Belt (DB) and non-Diabetes Belt (NDB) counties in providing recommended preventive care to Medicare beneficiaries with diabetes using available health professional resources, and to identify county-level socioeconomic factors associated with inefficient provision of preventive care. A data envelopment analysis (DEA) model was developed to assess the relative efficiency of counties in providing diabetes preventive care, and logistic regression was performed to identify socioeconomic characteristics associated with inefficiency. We used Medicare claims data to extract individual-level information on diabetes preventive service use and obtained county-level estimates of health resources from the Area Health Resources File. More than 80% of counties were, on average, more than 10% inefficient. The odds of being inefficient were 2.44 times higher for DB counties than for NDB counties (OR 2.44, CI 1.67-3.58). Counties with lower median income, a smaller proportion of non-Hispanic Black residents, and rural location had higher odds of being inefficient in providing preventive care. Our DEA results showed that counties in both the DB and NDB were mostly inefficient. The availability of care providers may therefore be less of a problem than how efficiently those resources are used in providing preventive care. Identifying sources of inefficiency within each community with low resource utilization and developing targeted strategies is needed to improve uptake of preventive care cost-effectively.
{"title":"Diabetes Belt has lower efficiency in providing diabetes preventive care than surrounding counties.","authors":"Hyojung Kang, Min-Woong Sohn, Soyoun Kim, Siyao Zhang, Rajesh Balkrishnan, Roger Anderson, Anthony McCall, Timothy McMurry, Jennifer Mason Lobo","doi":"10.1007/s10742-023-00310-5","journal":"Health Services and Outcomes Research Methodology","pages":"200-210","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12392157/pdf/"}
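A county-level DEA of the kind described in the abstract can be posed as one small linear program per county (the input-oriented CCR envelopment model). The sketch below uses a single input and a single output with hypothetical county values, standing in for the paper's richer resource and service-use measures; it is not the authors' model specification.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical counties: one input (providers per 1k beneficiaries) and
# one output (preventive-care visits per 1k beneficiaries).
inputs = np.array([[2.0, 4.0, 4.0, 5.0]])   # shape (n_inputs, n_counties)
outputs = np.array([[1.0, 2.0, 1.0, 2.5]])  # shape (n_outputs, n_counties)

def ccr_efficiency(o, X, Y):
    """Input-oriented CCR efficiency of unit o (1.0 = efficient)."""
    n = X.shape[1]
    c = np.zeros(n + 1)
    c[0] = 1.0                                   # minimize theta
    # Inputs: sum_j lam_j * X[:, j] <= theta * X[:, o]
    A_in = np.hstack([-X[:, [o]], X])
    b_in = np.zeros(X.shape[0])
    # Outputs: sum_j lam_j * Y[:, j] >= Y[:, o]
    A_out = np.hstack([np.zeros((Y.shape[0], 1)), -Y])
    b_out = -Y[:, o]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([b_in, b_out]),
                  bounds=[(0, None)] * (n + 1))
    return res.fun

eff = [ccr_efficiency(o, inputs, outputs) for o in range(inputs.shape[1])]
```

Here the third county (4 units of input for 1 unit of output) scores 0.5, meaning a convex combination of peer counties could deliver its output with half its input; the resulting efficiency scores are the kind of quantity the paper then relates to county socioeconomic characteristics via logistic regression.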