Pub Date: 2025-03-18 DOI: 10.1186/s12874-025-02521-5
Florence J Breslin, Erin L Ratliff, Zsofia P Cohen, Julie M Croff, Kara L Kerr
Background: Early life adversity (ELA) has substantial, lifelong impacts on mental and physical health and development. Data from the ABCD® Study will provide essential insights into these effects. Because the study lacks a unified adversity assessment, our objective was to use a critical, human-driven approach to identify variables that fit ELA domains measured in this study.
Methods: We clarify best practices in the measurement of adversity in the ABCD Study through the creation of two adversity scores: one based on the well-established Adverse Childhood Experiences (ACEs) questionnaire and another inclusive of broader ELA. Variables previously used to measure adversity in the ABCD dataset were identified via literature review. We assessed each variable's utility in measuring domains of adversity at baseline and follow-up time points and by the individual completing the assessment (i.e., youth or caregiver). Variables were selected that align with decades of ELA measurement and can therefore be used by research teams as measures of ELA.
Results: The literature review and critical analysis of items led to the development of three measures of ELA: an ACEs-proxy score, a youth-reported ACEs-proxy score, and a broader ELA score (ELA+). We provide R code to calculate these scores and their constituent domains for use in future ABCD adversity-related research.
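The paper's scoring code is in R; purely as an illustration of the general domain-count pattern (the item and domain names below are invented, not actual ABCD variables), the approach can be sketched in Python:

```python
# Hypothetical sketch of a domain-based adversity score: a domain counts as
# endorsed (1) if any of its constituent indicators is endorsed, and the
# total score is the number of endorsed domains. All names are invented
# for illustration; they are not ABCD field names.
DOMAINS = {
    "emotional_abuse": ["item_a1", "item_a2"],
    "household_substance_use": ["item_b1"],
    "parental_separation": ["item_c1", "item_c2"],
}

def ace_proxy_score(responses: dict) -> int:
    """Count domains with at least one endorsed (value == 1) indicator."""
    score = 0
    for items in DOMAINS.values():
        if any(responses.get(item) == 1 for item in items):
            score += 1
    return score
```

A respondent endorsing items from two domains would score 2 under this scheme, regardless of how many individual items within each domain were endorsed.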
Conclusions: The ABCD Study is one of the largest longitudinal studies of youth development, with data available for secondary analysis. Our review of existing measures and development of a coding schema will allow examination of ELA using this dataset, informing our understanding of risk, resilience, and prevention.
Title: Measuring adversity in the ABCD® Study: systematic review and recommendations for best practices.
BMC Medical Research Methodology, 25(1): 77.
Pub Date: 2025-03-18 DOI: 10.1186/s12874-025-02528-y
Nathan Bernard, Yoshimasa Sagawa, Nathalie Bier, Thomas Lihoreau, Lionel Pazart, Thomas Tannou
Background: Artificial intelligence (AI) tools are increasingly being used to assist researchers with various research tasks, particularly in the systematic review process. Elicit is one such tool that can generate a summary of the question asked, setting it apart from other AI tools. The aim of this study is to determine whether AI-assisted research using Elicit adds value to the systematic review process compared to traditional screening methods.
Methods: We compare the results of an umbrella review conducted independently of AI with the results of AI-based searching using the same criteria. Elicit's contribution was assessed on three criteria: repeatability, reliability, and accuracy. For repeatability, the search process was repeated three times in Elicit (trials 1, 2, and 3). For accuracy, the articles retrieved by Elicit were reviewed using the same inclusion criteria as the umbrella review. Reliability was assessed by comparing the publications identified with and without AI-based searching.
Results: The repeatability test found 246, 169, and 172 results for trials 1, 2, and 3, respectively. Concerning accuracy, 6 articles were included at the conclusion of the selection process. Regarding reliability, the comparison revealed 3 articles common to both approaches, 3 identified exclusively by Elicit, and 17 identified exclusively by the AI-independent umbrella review search.
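The overlap comparison reduces to simple set operations on the included article identifiers; a sketch with placeholder IDs chosen to mirror the counts above:

```python
# Sketch of the overlap comparison between two screening approaches.
# The identifiers are placeholders, sized to reproduce the reported counts:
# 3 articles found by both, 3 by Elicit only, 17 by the manual review only.
elicit_included = {"a1", "a2", "a3", "e1", "e2", "e3"}
manual_included = {"a1", "a2", "a3"} | {f"m{i}" for i in range(17)}

common = elicit_included & manual_included       # identified by both
elicit_only = elicit_included - manual_included  # AI-only inclusions
manual_only = manual_included - elicit_included  # review-only inclusions
```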
Conclusion: Our findings suggest that AI research assistants, like Elicit, can serve as valuable complementary tools for researchers when designing or writing systematic reviews. However, AI tools have several limitations and should be used with caution. When using AI tools, certain principles must be followed to maintain methodological rigour and integrity. Improving the performance of AI tools such as Elicit and contributing to the development of guidelines for their use during the systematic review process will enhance their effectiveness.
Title: Using artificial intelligence for systematic review: the example of Elicit.
BMC Medical Research Methodology, 25(1): 75.
Pub Date: 2025-03-18 DOI: 10.1186/s12874-025-02531-3
Danielle K Nagy, Lauren C Bresee, Dean T Eurich, Scot H Simpson
Purpose: An individual's location of residence may affect health; however, health services and outcomes research generally uses a single point in time to define where an individual resides. While this estimate of residence becomes inaccurate when the study subject moves, the impact on observed associations is not known. This study quantifies the impact of different methods of defining residence (rural, urban, metropolitan) on the association with all-cause mortality.
Methods: A cohort of new metformin users with diabetes was identified from administrative data in Alberta, Canada, between 2008 and 2019. An individual's residence (rural/urban/metropolitan) was defined from postal codes using 4 different methods: residence defined at 1 year before first metformin (the reference model); comparison 1, stable residence for 3 years before first metformin; comparison 2, residence as time-varying during the outcome observation window; and comparison 3, a nested case-control design (residence closest to the index date after identifying cases and controls). Multivariable Cox proportional hazards and logistic regression models were constructed to examine the association between residence definitions and all-cause mortality.
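The reference and stable-residence definitions can be illustrated with a small sketch; the history format and helper function below are hypothetical, not the study's actual code:

```python
# Sketch of deriving a person's residence class under two of the definitions
# compared in the study, from a chronologically sorted (date, class) history.
# All data and names are invented for illustration.
from datetime import date

history = [                      # one person's classified residence over time
    (date(2005, 1, 1), "rural"),
    (date(2010, 6, 1), "urban"),
]
index_date = date(2009, 3, 1)    # date of first metformin dispensation

def residence_at(history, when):
    """Most recent residence class on or before `when` (history is sorted)."""
    current = None
    for changed_on, res in history:
        if changed_on <= when:
            current = res
    return current

# Reference model: residence 1 year before the index date.
reference = residence_at(history, index_date.replace(year=index_date.year - 1))

# Comparison 1 (simplified): residence unchanged over the 3 years before index,
# checked here at yearly anniversaries of the index date.
stable = all(
    residence_at(history, index_date.replace(year=index_date.year - k)) == reference
    for k in range(4)
)
```

The time-varying definition (comparison 2) would instead split each person's follow-up into intervals at every move, and the nested case-control definition (comparison 3) would call `residence_at` with the date closest to the index date after case/control selection.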
Results: We identified 157,146 new metformin users (mean age 55 years; 57% male), and 8,444 (5%) deaths occurred during a mean follow-up of 4.7 (SD 2.3) years. There were few instances of moving after first metformin: 2.6% of individuals moved to a smaller centre (metropolitan to urban or rural, or urban to rural) and 3.1% moved to a larger centre (rural to urban or metropolitan, or urban to metropolitan). The association between rural residence and all-cause mortality was consistent (aHR 1.18; 95% CI 1.12-1.24), regardless of the method used to define residence.
Conclusions: The method used to define residence in a population of adults newly treated with metformin for type 2 diabetes has minimal impact on measures of all-cause mortality, possibly due to infrequent migration. The observed association between residence and mortality is compelling but requires further investigation and more robust analysis.
Title: Evaluating methods to define place of residence in Canadian administrative data and the impact on observed associations with all-cause mortality in type 2 diabetes.
BMC Medical Research Methodology, 25(1): 76.
Background: Randomized test-treatment studies are performed to evaluate the clinical effectiveness of diagnostic tests by assessing patient-relevant outcomes. The assumptions for a sample size calculation for such studies are often uncertain.
Methods: An adaptive design with a blinded sample size recalculation based on the overall success rate, for a randomized test-treatment trial with randomization restricted to discordant pairs, is proposed and evaluated in a simulation study. The results of the adaptive design are compared to those of the fixed design.
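As a rough illustration of blinded recalculation in general (not the paper's exact procedure for the discordant-pair design), a two-proportion version keeps the assumed treatment difference fixed and re-centres it on the pooled success rate observed, blinded to arm, at an interim look:

```python
# Generic sketch of a blinded sample-size recalculation for two proportions.
# The pooled (blinded) interim success rate replaces the planning assumption,
# while the assumed difference `delta` is kept fixed. Illustrative numbers;
# not the paper's method for discordant-pair randomization.
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-group n for comparing two independent proportions."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_a + z_b) ** 2 * var / (p1 - p2) ** 2

delta = 0.15              # assumed treatment difference, kept fixed (blinded)
pooled_interim = 0.55     # overall success rate observed at the interim look
n_new = n_per_group(pooled_interim + delta / 2, pooled_interim - delta / 2)
```

Because only the pooled rate is used, the recalculation does not unblind the trial, which is what allows the type I error rate to stay controlled.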
Results: The empirical type I error rate is sufficiently controlled in the adaptive design as well as in the fixed design and the estimates are unbiased. The adaptive design achieves the desired theoretical power, whereas the fixed design tends to be over- or under-powered.
Conclusions: It may be advisable to consider blinded sample size recalculation in a randomized test-treatment study with randomization restricted to discordant pairs in order to improve the conduct of the study. However, a number of study-related limitations affect the implementation of the method and need to be considered.
Title: Sample size recalculation based on the overall success rate in a randomized test-treatment trial with restricting randomization to discordant pairs.
Authors: Caroline Elzner, Amra Pepić, Oke Gerke, Antonia Zapf
DOI: 10.1186/s12874-024-02410-3
BMC Medical Research Methodology, 25(1): 74.
Pub Date: 2025-03-18 DOI: 10.1186/s12874-025-02525-1
Md Sakhawat Hossain, Ravi Goyal, Natasha K Martin, Victor DeGruttola, Mohammad Mihrab Chowdhury, Christopher McMahan, Lior Rennert
Background: Our research focuses on local-level estimation of the effective reproductive number (R_t), which describes the transmissibility of an infectious disease and represents the average number of individuals one infectious person infects at a given time. The ability to accurately estimate the infectious disease reproductive number in geographically granular regions is critical for disaster planning and resource allocation. However, not all regions have sufficient infectious disease outcome data; this lack of data presents a significant challenge for accurate estimation.
Methods: To overcome this challenge, we propose a two-step approach that incorporates existing R_t estimation procedures (EpiEstim, EpiFilter, EpiNow2) using data from geographic regions with sufficient data (step 1) into a covariate-adjusted Bayesian Integrated Nested Laplace Approximation (INLA) spatial model to predict R_t in regions with sparse or missing data (step 2). Our flexible framework effectively allows us to implement any existing estimation procedure for R_t in regions with coarse or entirely missing data. We perform external validation and a simulation study to evaluate the proposed method and assess its predictive performance.
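The two-step logic can be sketched as follows; ordinary least squares stands in for the covariate-adjusted Bayesian INLA spatial model, and all numbers are invented:

```python
# Sketch of the two-step idea: (1) take R_t estimates from regions with
# sufficient data (placeholder values standing in for EpiEstim / EpiNow2
# output); (2) fit a covariate model on those regions and predict R_t where
# case data are missing. OLS is a stand-in for the paper's INLA model.
import numpy as np

# Covariates for 5 data-rich regions (e.g. density, mobility); invented values.
X = np.array([[1.0, 0.2], [1.5, 0.4], [2.0, 0.5], [2.5, 0.7], [3.0, 0.9]])
rt_observed = np.array([0.8, 1.1, 1.3, 1.6, 1.9])  # step-1 R_t estimates

A = np.column_stack([np.ones(len(X)), X])           # design matrix + intercept
beta, *_ = np.linalg.lstsq(A, rt_observed, rcond=None)

# Step 2: predict R_t for a region with no usable case data.
x_missing = np.array([1.0, 2.2, 0.6])               # intercept + covariates
rt_predicted = x_missing @ beta
```

In the actual framework the second step also carries spatial structure and posterior uncertainty, which a plain regression does not capture.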
Results: We applied our method to estimate R_t using data from South Carolina (SC) counties and ZIP codes during the first COVID-19 wave ('Wave 1', June 16, 2020 - August 31, 2020) and the second wave ('Wave 2', December 16, 2020 - March 02, 2021). Among the three methods used in the first step, EpiNow2 yielded the highest accuracy of R_t prediction in the regions with entirely missing data. Median county-level percentage agreement (PA) was 90.9% (interquartile range, IQR: 89.9-92.0%) and 92.5% (IQR: 91.6-93.4%) for Waves 1 and 2, respectively. Median ZIP code-level PA was 95.2% (IQR: 94.4-95.7%) and 96.5% (IQR: 95.8-97.1%) for Waves 1 and 2, respectively. Using EpiEstim, EpiFilter, and an ensemble-based approach yielded median PA ranging from 81.9 to 90.0%, 87.2 to 92.1%, and 88.4 to 90.9%, respectively, across both waves and geographic granularities.
Conclusion: These findings demonstrate that the proposed methodology is a useful tool for small-area estimation of R_t, as our flexible framework yields high prediction accuracy for regions with coarse or missing data.
Title: A flexible framework for local-level estimation of the effective reproductive number in geographic regions with sparse data.
BMC Medical Research Methodology, 25(1): 73.
Background: Patient stratification is the cornerstone of numerous health investigations, serving to enhance the estimation of treatment efficacy and facilitating patient matching. To stratify patients, similarity measures between patients can be computed from clinical variables contained in medical health records. These variables have both values and labels structured in ontologies or other classification systems. The relevance of considering variable label relationships in the computation of patient similarity measures has been poorly studied.
Objective: We adapt and evaluate several weighted versions of the Cosine similarity that take structured label relationships into account when computing patient similarities from a medico-administrative database.
Materials and methods: As a use case, we clustered patients aged 60 years based on their annual medicine reimbursements recorded in the Échantillon Généraliste des Bénéficiaires, a random sample of a French medico-administrative database. We used four patient similarity measures: the standard Cosine similarity, a weighted Cosine similarity measure that includes variable frequencies, and two weighted Cosine similarity measures that consider variable label relationships. We constructed patient networks from each similarity measure and identified clusters of patients using the Markov Cluster algorithm. We evaluated the performance of the different similarity measures with enrichment tests based on patient diagnoses.
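A minimal sketch of the core idea, a weighted Cosine similarity in which an ontology-derived weight matrix links related variable labels (illustrative numbers only, not the authors' implementation):

```python
# Weighted Cosine similarity between two patients' variable vectors.
# W encodes relationships between variable labels: related labels (e.g.
# sharing an ontology parent) get a positive off-diagonal weight.
# With W = I this reduces to the standard Cosine similarity.
import numpy as np

def weighted_cosine(x, y, W):
    """Cosine similarity under the inner product induced by W."""
    num = x @ W @ y
    return num / (np.sqrt(x @ W @ x) * np.sqrt(y @ W @ y))

x = np.array([1.0, 0.0, 1.0])   # patient 1: drugs A and C reimbursed
y = np.array([0.0, 1.0, 1.0])   # patient 2: drugs B and C reimbursed
W = np.array([[1.0, 0.5, 0.0],  # A and B share an ontology parent -> 0.5
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
```

Here the standard Cosine similarity between the two patients is 0.5, while the label-aware weighting raises it, because drug A and drug B count as partially similar rather than entirely distinct.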
Results: The weighted similarity measures that include structured variable label relationships perform better at identifying similar patients. Indeed, using these weighted measures, we identify more clusters enriched for distinct diagnoses. Importantly, the enrichment tests provide clinically interpretable insights into these patient clusters.
Conclusion: Considering label relationships when computing patient similarities improves stratification of patients regarding their health status.
Title: Improving patient clustering by incorporating structured variable label relationships in similarity measures.
Authors: Judith Lambert, Anne-Louise Leutenegger, Anaïs Baudot, Anne-Sophie Jannot
Pub Date: 2025-03-15 DOI: 10.1186/s12874-025-02459-8
BMC Medical Research Methodology, 25(1): 72.
Pub Date: 2025-03-14 DOI: 10.1186/s12874-025-02518-0
Caroline Struthers, James Harwood, Jennifer Anne de Beyer, Patricia Logullo, Gary S Collins
Background: Although medical journals endorse reporting guidelines, authors often struggle to find and use the right one for their study type and topic. The UK EQUATOR Centre developed the GoodReports website to direct authors to appropriate guidance. Pilot data suggested that authors did not improve their manuscripts when advised at journal submission stage to use a particular reporting guideline by GoodReports.org. User feedback suggested that the checklist format of most reporting guidelines does not encourage use during manuscript writing. We tested whether providing customized reporting guidance within writing templates, for use throughout the writing process, resulted in clearer and more complete reporting than only giving advice on which reporting guideline to use.
Design and methods: GRReaT was a two-group parallel 1:1 randomized trial with a target sample size of 206. Participants were lead authors at an early stage of writing up a health-related study. Eligible study designs were cohort, cross-sectional, or case-control study, randomized trial, and systematic review. After randomization, the intervention group received an article template including items from the appropriate reporting guideline and links to explanations and examples. The control group received a reporting guideline recommendation and general advice on reporting. Participants sent their completed manuscripts to the GRReaT team before submitting for publication, and each manuscript was assessed for completeness of each item in the title, methods, and results sections of the corresponding reporting guideline. The primary outcome was reporting completeness against the corresponding reporting guideline. Participants were not blinded to allocation; assessors were blind to group allocation. As a recruitment incentive, all participants received a feedback report identifying missing or inadequately reported items in these three sections.
Results: Between 9 June 2021 and 30 June 2023, we randomized 130 participants, 65 to the intervention and 65 to the control group. We present findings from the assessment of reporting completeness for the 37 completed manuscripts we received, 18 in the intervention group and 19 in the control group. The mean (standard deviation) proportion of completely reported items from the title, methods, and results sections of the manuscripts (primary outcome) was 0.57 (0.18) in the intervention group and 0.50 (0.17) in the control group. The mean difference between the two groups was 0.069 (95% CI -0.046 to 0.184; p = 0.231). In the sensitivity analysis, when partially reported items were counted as completely reported, the mean (standard deviation) proportion of completely reported items was 0.75 (0.15) in the intervention group and 0.71 (0.11) in the control group. The mean difference between the two groups was 0.036 (95% CI -0.127 to 0.055; p = 0.423).
Conclusion: As the dropout rate was higher than expected
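The reported primary-outcome interval can be approximately reconstructed from the published summary statistics (a rough check, not the trial's exact analysis; 2.03 approximates the t quantile at roughly 35 degrees of freedom):

```python
# Reconstructing the reported between-group difference and its 95% CI from
# the published group means, SDs, and sample sizes (Welch-type interval).
from math import sqrt

m1, sd1, n1 = 0.57, 0.18, 18   # intervention group
m2, sd2, n2 = 0.50, 0.17, 19   # control group

diff = m1 - m2
se = sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the difference
t_crit = 2.03                           # approx. t quantile, ~35 df
ci = (diff - t_crit * se, diff + t_crit * se)
```

The result lands close to the published 0.069 (95% CI -0.046 to 0.184); small discrepancies reflect rounding of the reported means and the exact degrees of freedom used.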
{"title":"There is no reliable evidence that providing authors with customized article templates including items from reporting guidelines improves completeness of reporting: the GoodReports randomized trial (GRReaT).","authors":"Caroline Struthers, James Harwood, Jennifer Anne de Beyer, Patricia Logullo, Gary S Collins","doi":"10.1186/s12874-025-02518-0","DOIUrl":"10.1186/s12874-025-02518-0","url":null,"abstract":"<p><strong>Background: </strong>Although medical journals endorse reporting guidelines, authors often struggle to find and use the right one for their study type and topic. The UK EQUATOR Centre developed the GoodReports website to direct authors to appropriate guidance. Pilot data suggested that authors did not improve their manuscripts when advised to use a particular reporting guideline by GoodReports.org at journal submission stage. User feedback suggested the checklist format of most reporting guidelines does not encourage use during manuscript writing. We tested whether providing customized reporting guidance within writing templates for use throughout the writing process resulted in clearer and more complete reporting than only giving advice on which reporting guideline to use.</p><p><strong>Design and methods: </strong>GRReaT was a two-group parallel 1:1 randomized trial with a target sample size of 206. Participants were lead authors at an early stage of writing up a health-related study. Eligible study designs were cohort, cross-sectional, or case-control study, randomized trial, and systematic review. After randomization, the intervention group received an article template including items from the appropriate reporting guideline and links to explanations and examples. The control group received a reporting guideline recommendation and general advice on reporting. 
Participants sent their completed manuscripts to the GRReaT team before submitting for publication, for assessment of the completeness of each item in the title, methods, and results sections of the corresponding reporting guideline. The primary outcome was reporting completeness against the corresponding reporting guideline. Participants were not blinded to allocation. Assessors were blind to group allocation. As a recruitment incentive, all participants received a feedback report identifying missing or inadequately reported items in these three sections.</p><p><strong>Results: </strong>Between 9 June 2021 and 30 June 2023, we randomized 130 participants, 65 to the intervention and 65 to the control group. We present findings from the assessment of reporting completeness for the 37 completed manuscripts we received, 18 in the intervention group and 19 in the control group. The mean (standard deviation) proportion of completely reported items from the title, methods, and results sections of the manuscripts (primary outcome) was 0.57 (0.18) in the intervention group and 0.50 (0.17) in the control group. The mean difference between the two groups was 0.069 (95% CI -0.046 to 0.184; p = 0.231). In the sensitivity analysis, when partially reported items were counted as completely reported, the mean (standard deviation) proportion of completely reported items was 0.75 (0.15) in the intervention group and 0.71 (0.11) in the control group.
The mean difference between the two groups was 0.036 (95% CI -0.127 to 0.055; p = 0.423).</p><p><strong>Conclusion: </strong>As the dropout rate was higher than expec","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"25 1","pages":"71"},"PeriodicalIF":3.9,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11907807/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143633551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
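The primary analysis above reduces to a difference in mean per-manuscript completeness proportions with a 95% confidence interval. A minimal Python sketch, using hypothetical per-manuscript scores generated only to match the reported group sizes, means, and standard deviations (the real data are not given here), shows how such a difference and a normal-approximation CI are computed:

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical per-manuscript completeness proportions, generated to match the
# reported group sizes, means, and SDs (the real per-manuscript scores are not given)
interv = rng.normal(0.57, 0.18, 18).clip(0, 1)
control = rng.normal(0.50, 0.17, 19).clip(0, 1)

diff = interv.mean() - control.mean()
# Welch-style standard error of the difference in means
se = np.sqrt(interv.var(ddof=1) / interv.size + control.var(ddof=1) / control.size)
lo, hi = diff - 1.96 * se, diff + 1.96 * se  # normal-approximation 95% CI
print(f"mean difference {diff:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

With only 18 and 19 manuscripts per group, a t-based interval would be slightly wider than this normal approximation; the sketch uses 1.96 purely for brevity.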
Pub Date : 2025-03-12 DOI: 10.1186/s12874-025-02522-4
Shuo Yang, Huaan Su, Nanxiang Zhang, Yuduan Han, Yingfeng Ge, Yi Fei, Ying Liu, Abdullahi Hilowle, Peng Xu, Jinxin Zhang
Background: Assuming a linear relationship between continuous predictors and outcomes in clinical prediction models is often inappropriate, as true linear relationships are rare, potentially resulting in biased estimates and inaccurate conclusions. Our research group previously addressed the case of a single U-shaped independent variable. Multiple U-shaped predictors can improve predictive accuracy by capturing nuanced relationships, but they also introduce challenges such as increased complexity and potential overfitting. This study aims to extend the applicability of our previous results to more common scenarios, thereby facilitating more comprehensive and practical investigations.
Methods: In this study, we proposed a novel approach called the Recursive Gradient Scanning Method (RGS) for discretizing multiple continuous variables that exhibit U-shaped relationships with the natural logarithm of the odds ratio (lnOR). The RGS method involves a two-step approach: first, it conducts fine screening from the 2.5th to 97.5th percentiles of the lnOR. Then, it utilizes an iterative process that compares AIC metrics to identify optimal categorical variables. We conducted a Monte Carlo simulation study to investigate the performance of the RGS method. Different correlation levels, sample sizes, missing rates, and symmetry levels of U-shaped relationships were considered in the simulation process. To compare the RGS method with other common approaches (such as median, Q1-Q3, minimum P-value method), we assessed both the predictive ability (e.g., AUC) and goodness of fit (e.g., AIC) of logistic regression models with variables discretized at different cut-points using a real dataset.
Results: Both simulation and empirical studies have consistently demonstrated the effectiveness of the RGS method. In simulation studies, the RGS method showed superior performance compared with other common discretization methods in discrimination ability and overall performance for logistic regression models across various U-shaped scenarios (with varying correlation levels, sample sizes, missing rates, and symmetry levels of U-shaped relationships). Similarly, the empirical study showed that the optimal cut-points identified by RGS have superior clinical predictive power, as measured by metrics such as AUC, compared with traditional methods.
Conclusions: Both the simulation and empirical studies demonstrated that the RGS method outperformed other common discretization methods in terms of goodness of fit and predictive ability. Future work will focus on addressing challenges related to separation and missing binary responses, and more data will be required to validate the method.
{"title":"Discretizing multiple continuous predictors with U-shaped relationships with lnOR: introducing the recursive gradient scanning method in clinical and epidemiological research.","authors":"Shuo Yang, Huaan Su, Nanxiang Zhang, Yuduan Han, Yingfeng Ge, Yi Fei, Ying Liu, Abdullahi Hilowle, Peng Xu, Jinxin Zhang","doi":"10.1186/s12874-025-02522-4","DOIUrl":"10.1186/s12874-025-02522-4","url":null,"abstract":"<p><strong>Background: </strong>Assuming a linear relationship between continuous predictors and outcomes in clinical prediction models is often inappropriate, as true linear relationships are rare, potentially resulting in biased estimates and inaccurate conclusions. Our research group previously addressed the case of a single U-shaped independent variable. Multiple U-shaped predictors can improve predictive accuracy by capturing nuanced relationships, but they also introduce challenges such as increased complexity and potential overfitting. This study aims to extend the applicability of our previous results to more common scenarios, thereby facilitating more comprehensive and practical investigations.</p><p><strong>Methods: </strong>In this study, we proposed a novel approach called the Recursive Gradient Scanning Method (RGS) for discretizing multiple continuous variables that exhibit U-shaped relationships with the natural logarithm of the odds ratio (lnOR). The RGS method involves a two-step approach: first, it conducts fine screening from the 2.5th to 97.5th percentiles of the lnOR. Then, it utilizes an iterative process that compares AIC metrics to identify optimal categorical variables. We conducted a Monte Carlo simulation study to investigate the performance of the RGS method. Different correlation levels, sample sizes, missing rates, and symmetry levels of U-shaped relationships were considered in the simulation process. 
To compare the RGS method with other common approaches (such as median, Q<sub>1</sub>-Q<sub>3</sub>, minimum P-value method), we assessed both the predictive ability (e.g., AUC) and goodness of fit (e.g., AIC) of logistic regression models with variables discretized at different cut-points using a real dataset.</p><p><strong>Results: </strong>Both simulation and empirical studies have consistently demonstrated the effectiveness of the RGS method. In simulation studies, the RGS method showed superior performance compared with other common discretization methods in discrimination ability and overall performance for logistic regression models across various U-shaped scenarios (with varying correlation levels, sample sizes, missing rates, and symmetry levels of U-shaped relationships). Similarly, the empirical study showed that the optimal cut-points identified by RGS have superior clinical predictive power, as measured by metrics such as AUC, compared with traditional methods.</p><p><strong>Conclusions: </strong>Both the simulation and empirical studies demonstrated that the RGS method outperformed other common discretization methods in terms of goodness of fit and predictive ability. 
Future work will focus on addressing challenges related to separation and missing binary responses, and more data will be required to validate the method.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"25 1","pages":"70"},"PeriodicalIF":3.9,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11900475/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143613301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-03-12 DOI: 10.1186/s12874-025-02479-4
Jeewuan Kim, Seung-Ho Kang
Background: The planning and analysis of multi-regional clinical trials (MRCTs) have increased in the pharmaceutical industry to facilitate global research and development. The ICH E17 guideline emphasizes the importance of considering the potential for regional differences, which may arise from shared intrinsic and extrinsic factors among patients within the same region in MRCTs. These differences pose ongoing challenges in the design and analysis of MRCTs.
Methods: We introduce and investigate hierarchical linear models (HLMs) that account for regional differences by incorporating known factors as covariates and unknown factors as random effects. Extending previous studies, our HLMs incorporate random effects in both the intercept and slope, enhancing the model's flexibility. The proposed figures, which depict the observed distribution of the primary endpoint and covariates, facilitate understanding of the proposed models. Moreover, we investigate the test statistics for the overall treatment effect and derive the required sample size under the HLM, considering both a fixed number of regions and real-world budgetary constraints.
Results: Our simulation studies show that when the number of regions is sufficient, the HLM with random effects in the intercept and slope provides empirical type I error rates and power close to the nominal level. However, estimating the regional variabilities remains challenging when the number of regions is small. Budgetary constraints impact the required number of regions, while the required number of patients per region is influenced by the variability of treatment effects across regions.
Conclusions: We offer a comprehensive framework for understanding and addressing regional differences in the primary endpoint for MRCTs. With the proposed strategies, including the figures and the required sample size under budget constraints, MRCT designs can be made more efficient.
{"title":"Enhancing insight into regional differences: hierarchical linear models in multiregional clinical trials.","authors":"Jeewuan Kim, Seung-Ho Kang","doi":"10.1186/s12874-025-02479-4","DOIUrl":"10.1186/s12874-025-02479-4","url":null,"abstract":"<p><strong>Background: </strong>The planning and analysis of multi-regional clinical trials (MRCTs) have increased in the pharmaceutical industry to facilitate global research and development. The ICH E17 guideline emphasizes the importance of considering the potential for regional differences, which may arise from shared intrinsic and extrinsic factors among patients within the same region in MRCTs. These differences pose ongoing challenges in the design and analysis of MRCTs.</p><p><strong>Methods: </strong>We introduce and investigate hierarchical linear models (HLMs) that account for regional differences by incorporating known factors as covariates and unknown factors as random effects. Extending previous studies, our HLMs incorporate random effects in both the intercept and slope, enhancing the model's flexibility. The proposed figures, which depict the observed distribution of the primary endpoint and covariates, facilitate understanding of the proposed models. Moreover, we investigate the test statistics for the overall treatment effect and derive the required sample size under the HLM, considering both a fixed number of regions and real-world budgetary constraints.</p><p><strong>Results: </strong>Our simulation studies show that when the number of regions is sufficient, the HLM with random effects in the intercept and slope provides empirical type I error rates and power close to the nominal level. However, estimating the regional variabilities remains challenging when the number of regions is small. 
Budgetary constraints impact the required number of regions, while the required number of patients per region is influenced by the variability of treatment effects across regions.</p><p><strong>Conclusions: </strong>We offer a comprehensive framework for understanding and addressing regional differences in the primary endpoint for MRCTs. With the proposed strategies, including the figures and the required sample size under budget constraints, MRCT designs can be made more efficient.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"25 1","pages":"69"},"PeriodicalIF":3.9,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11900657/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143613281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
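The data structure the HLM targets, patients nested within regions with region-level random effects on both the intercept and the treatment slope, can be sketched with an illustrative simulation and a frequentist mixed-model fit via statsmodels. This is not the authors' sample-size derivation or test statistic; all parameter values (30 regions, 50 patients each, random-effect SDs 0.5 and 0.3) are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_regions, n_per = 30, 50
region = np.repeat(np.arange(n_regions), n_per)
treat = rng.binomial(1, 0.5, size=region.size)

# assumed region-level random effects: intercept u0 and treatment-slope u1
u0 = rng.normal(0.0, 0.5, n_regions)
u1 = rng.normal(0.0, 0.3, n_regions)
# true overall treatment effect is 1.0; each region deviates by u1
y = 2.0 + u0[region] + (1.0 + u1[region]) * treat + rng.normal(0.0, 1.0, region.size)

df = pd.DataFrame({"y": y, "treat": treat, "region": region})
# random intercept and random treatment slope by region
fit = smf.mixedlm("y ~ treat", df, groups=df["region"], re_formula="~treat").fit()
print("estimated overall treatment effect:", round(fit.params["treat"], 2))
```

With enough regions the fixed-effect estimate recovers the overall treatment effect, mirroring the paper's finding; with few regions the variance components (the random-effect SDs above) are the hard part to estimate.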
Pub Date : 2025-03-11 DOI: 10.1186/s12874-025-02523-3
Kosuke Kawai, Bryan K Ward, Joonas Toivonen, Dennis S Poe
Background: The nested frailty model, a random effects survival model that can accommodate data clustered at two hierarchical levels, has rarely been used in practice. We aimed to evaluate the utility of the Bayesian nested frailty modeling approach in the context of a study examining the effects of various surgical procedures for patients with patulous Eustachian tube dysfunction (PETD).
Methods: A nested frailty model was employed to account for the correlation between the two ears within each patient and the correlation between multiple event times within each ear. Some patients underwent multiple different surgical treatments in their affected ears. We incorporated two nested lognormal frailties into the Cox proportional hazards model. A Bayesian Markov chain Monte Carlo (MCMC) approach was used. We examined the consequences of ignoring the multilevel structure of the data.
Results: The variances of patient-level and ear-level random effects were both found to be significant in the nested frailty model. Shim insertion and patulous Eustachian tube reconstruction using Alloderm or cartilage were associated with a lower risk of recurrence of PETD symptoms than calcium hydroxyapatite injection.
Conclusions: Bayesian nested frailty models provide flexibility in modeling hierarchical survival data and effectively account for multiple levels of clustering. Our study highlights the importance of accounting for all levels of hierarchical clustering for valid inference.
{"title":"Bayesian nested frailty model for evaluating surgical management of patulous Eustachian tube dysfunction.","authors":"Kosuke Kawai, Bryan K Ward, Joonas Toivonen, Dennis S Poe","doi":"10.1186/s12874-025-02523-3","DOIUrl":"10.1186/s12874-025-02523-3","url":null,"abstract":"<p><strong>Background: </strong>The nested frailty model, a random effects survival model that can accommodate data clustered at two hierarchical levels, has rarely been used in practice. We aimed to evaluate the utility of the Bayesian nested frailty modeling approach in the context of a study examining the effects of various surgical procedures for patients with patulous Eustachian tube dysfunction (PETD).</p><p><strong>Methods: </strong>A nested frailty model was employed to account for the correlation between the two ears within each patient and the correlation between multiple event times within each ear. Some patients underwent multiple different surgical treatments in their affected ears. We incorporated two nested lognormal frailties into the Cox proportional hazards model. A Bayesian Markov chain Monte Carlo (MCMC) approach was used. We examined the consequences of ignoring the multilevel structure of the data.</p><p><strong>Results: </strong>The variances of patient-level and ear-level random effects were both found to be significant in the nested frailty model. Shim insertion and patulous Eustachian tube reconstruction using Alloderm or cartilage were associated with a lower risk of recurrence of PETD symptoms than calcium hydroxyapatite injection.</p><p><strong>Conclusions: </strong>Bayesian nested frailty models provide flexibility in modeling hierarchical survival data and effectively account for multiple levels of clustering. 
Our study highlights the importance of accounting for all levels of hierarchical clustering for valid inference.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"25 1","pages":"68"},"PeriodicalIF":3.9,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11895236/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143603819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
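The nested-frailty data-generating process, a patient-level frailty shared by both ears plus an ear-level frailty, can be illustrated with a small simulation. This sketch (all parameters hypothetical, exponential baseline hazard) shows how the shared patient-level log-frailty induces the within-patient correlation between ear-level event times that the model must account for; it does not reproduce the Bayesian MCMC fit.

```python
import numpy as np

rng = np.random.default_rng(7)
n_pat = 5000

# assumed nested lognormal frailties: one per patient, one per ear within patient
w_pat = rng.normal(0.0, 0.7, n_pat)       # patient-level log-frailty, shared by both ears
w_ear = rng.normal(0.0, 0.4, (n_pat, 2))  # ear-level log-frailty
log_hazard = np.log(0.1) + w_pat[:, None] + w_ear
t = rng.exponential(1.0 / np.exp(log_hazard))  # recurrence time for each ear

# the shared patient frailty correlates the two ears of one patient,
# while ears from different patients stay uncorrelated
r_within = np.corrcoef(np.log(t[:, 0]), np.log(t[:, 1]))[0, 1]
r_between = np.corrcoef(np.log(t[:-1, 0]), np.log(t[1:, 1]))[0, 1]
print(f"log-time correlation: same patient {r_within:.2f}, different patients {r_between:.2f}")
```

Ignoring the patient level would treat the within-patient correlation as zero, which is exactly the misspecification whose consequences the study examines.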