
BMC Medical Research Methodology: latest articles

Establishing a machine learning dementia progression prediction model with multiple integrated data.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-22 | DOI: 10.1186/s12874-024-02411-2
Yung-Chuan Huang, Tzu-Chi Liu, Chi-Jie Lu

Objective: Dementia is a significant medical and social issue in most developed countries. Practical tools for predicting the progression of degenerative dementia are highly valuable. Machine learning (ML) methods facilitate the construction of effective models using real-world data, which may include missing values and various integrated datasets.

Method: This retrospective study analyzed data from 679 patients diagnosed with degenerative dementia at Fu Jen Catholic University Hospital, who were evaluated by neurologists and psychologists and followed for over two years. Predictive variables were categorized into demographic (D), clinical dementia rating (CDR), mini-mental state examination (MMSE), and laboratory data value (LV) groups. These categories were further integrated into three subgroups (D-CDR, D-CDR-MMSE, and D-CDR-MMSE-LV). We used the extreme gradient boosting (XGB) model to rank the importance of variables and to identify the most effective feature combination via a step-wise approach.
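The step-wise search described in the Method paragraph can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the variables are assumed to be already ranked by importance (for example by an XGB model), and `evaluate` is a hypothetical callback that returns a validation score such as AUC for a given subset.

```python
from typing import Callable, List, Sequence, Tuple


def stepwise_select(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Add features in importance order and keep the best-scoring prefix.

    ranked_features: names sorted from most to least important (assumed given).
    evaluate: hypothetical callback returning a score (higher is better).
    """
    best_subset: List[str] = []
    best_score = float("-inf")
    current: List[str] = []
    for name in ranked_features:
        current.append(name)
        score = evaluate(list(current))
        # Strict improvement only, so ties keep the smaller subset.
        if score > best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score
```

In practice `evaluate` would wrap model fitting and cross-validation; the sketch only shows the order in which candidate subsets are tried.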

Result: The D-CDR-MMSE-LV model combination showed robust performance with an excellent area under the receiver operating characteristic curve (AUC) and the highest sensitivity value (84.66). Employing both demographic and neuropsychiatric variables, our prediction model achieved an AUC of 83.74. By incorporating additional clinical information from laboratory data and applying our proposed feature selection strategy, we constructed a model based on eight variables that achieved an AUC of 85.12 using the XGB technique.
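For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition; the function name is illustrative, and it returns a value between 0 and 1, whereas the figures in the abstract appear to be reported on a 0 to 100 scale.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based formulas give the same value more efficiently.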

Conclusion: We established a machine-learning model to monitor the progression of dementia using a limited, real-world clinical dataset. The XGB technique identified eight critical variables across our integrated datasets, potentially providing clinicians with valuable guidance.

BMC Medical Research Methodology 24(1), 288. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11583646/pdf/
Citations: 0
Cardinality matching versus propensity score matching for addressing cluster-level residual confounding in implantable medical device and surgical epidemiology: a parametric and plasmode simulation study.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-22 | DOI: 10.1186/s12874-024-02406-z
Mike Du, Stephen Johnston, Paul M Coplan, Victoria Y Strauss, Sara Khalid, Daniel Prieto-Alhambra

Background: Rapid innovation and new regulations lead to an increased need for post-marketing surveillance of implantable devices. However, complex multi-level confounding related not only to patient-level but also to surgeon or hospital covariates hampers observational studies of risks and benefits. We conducted parametric and plasmode simulations to compare the performance of cardinality matching (CM) vs propensity score matching (PSM) to reduce confounding bias in the presence of cluster-level confounding.

Methods: Two Monte Carlo simulation studies were carried out: 1) parametric simulations (1,000 iterations) with patients nested in clusters (ratios of 10:1, 50:1, 100:1, 200:1, and 500:1), a sample size of n = 10,000, and patient- and cluster-level confounders; 2) plasmode simulations generated from a cohort of 9,981 patients admitted for pancreatectomy between 2015 and 2019 in a US hospital database. CM, with a standardised mean difference (SMD) constraint threshold of 0.1 for confounders, and PSM were used to balance the confounders for within-cluster and cross-cluster matching. Treatment effects were then estimated on the matched sample using logistic regression as the outcome model.
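The balance criterion mentioned above can be illustrated with a small sketch. It assumes one common definition of the standardised mean difference for a continuous confounder (difference in group means divided by the square root of the average of the two sample variances); the abstract does not say which definition the authors used, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def standardized_mean_difference(
    treated: Sequence[float], control: Sequence[float]
) -> float:
    """SMD with the pooled SD taken as sqrt((var_t + var_c) / 2)."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0.0:
        raise ValueError("both groups are constant; SMD is undefined")
    return (mean(treated) - mean(control)) / pooled_sd


def is_balanced(
    treated: Sequence[float], control: Sequence[float], threshold: float = 0.1
) -> bool:
    """True when |SMD| does not exceed the constraint threshold."""
    return abs(standardized_mean_difference(treated, control)) <= threshold
```

In cardinality matching such a constraint is imposed on every confounder while the size of the matched sample is maximised; the sketch only covers the balance check itself, not the optimisation.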

Results: CM yielded higher sample retention but more bias than PSM for cross-cluster matching in most scenarios. For instance, with a ratio of 100:1, sample retention and relative bias were 97.1% and 26.5% for CM, compared to 82.5% and 12.2% for PSM. The results for the plasmode simulation were similar.

Conclusions: Compared to PSM, CM offered better sample retention but higher bias in most scenarios. More research is needed to guide the use of CM in medical device and surgical epidemiology, particularly the setting of balance constraints for confounders.

BMC Medical Research Methodology 24(1), 289. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11583411/pdf/
Citations: 0
Correction: Forced randomization: the what, why, and how.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-20 | DOI: 10.1186/s12874-024-02407-y
Kerstine Carter, Olga Kuznetsova, Volodymyr Anisimov, Johannes Krisam, Colin Scherer, Yevgen Ryeznik, Oleksandr Sverdlov
BMC Medical Research Methodology 24(1), 286. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11577646/pdf/
Citations: 0
Three new methodologies for calculating the effective sample size when performing population adjustment.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-20 | DOI: 10.1186/s12874-024-02412-1
Landan Zhang, Sylwia Bujkiewicz, Dan Jackson

Background: The concept of the population is of fundamental importance in epidemiology and statistics. In some instances, it is not possible to sample directly from the population of interest. Weighting is an established statistical approach for making inferences when the sample is not representative of this population.

Methods: The effective sample size (ESS) is a descriptive statistic that can be used to accompany this type of weighted statistical analysis. The ESS is an estimate of the sample size required by an unweighted sample that achieves the same level of precision as the weighted sample. The ESS therefore reflects the amount of information retained after weighting the data and is an intuitively appealing quantity to interpret, for example by those with little or no statistical training.

Results: The conventional formula for calculating the ESS is derived under strong assumptions, for example that outcome data are homoscedastic. This is not always true in practice, for example for survival data. We propose three new approaches to computing the ESS that are valid for any type of data and weighted statistical analysis, and so can be applied more generally.
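The abstract does not write out the conventional formula. In the population-adjustment literature it is usually taken to be the squared sum of the weights divided by the sum of the squared weights (often attributed to Kish); treating that as the formula meant here is an assumption. A minimal sketch, with an illustrative function name:

```python
from typing import Sequence


def effective_sample_size(weights: Sequence[float]) -> float:
    """Conventional ESS: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return (total * total) / total_sq
```

Equal weights return the original sample size, and the value shrinks as the weights become more unequal. The three new approaches proposed in the paper are not described in the abstract and are not sketched here.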

Conclusion: We illustrate all methods using an example and conclude that our proposals should accompany, and potentially replace, the existing approach for computing the ESS.

BMC Medical Research Methodology 24(1), 287. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11577712/pdf/
Citations: 0
Correction: Inclusion of unexposed clusters improves the precision of fixed effects analysis of stepped-wedge cluster randomized trials with binary and count outcomes.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-19 | DOI: 10.1186/s12874-024-02415-y
Kenneth Menglin Lee, Grace Meijuan Yang, Yin Bun Cheung
BMC Medical Research Methodology 24(1), 285. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11575030/pdf/
Citations: 0
FAIR data management: a framework for fostering data literacy in biomedical sciences education.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-16 | DOI: 10.1186/s12874-024-02404-1
Rocio Gonzalez Soltero, Debora Pino García, Alberto Bellido, Pablo Ryan, Ana I Rodríguez-Learte

Data literacy, the ability to understand and effectively communicate with data, is crucial for researchers to interpret and validate data. However, low reproducibility in biomedical research is now a significant issue, with major implications for scientific progress and the reliability of findings. Recognizing this, funding bodies such as the European Commission emphasize the importance of regular data management practices to enhance reproducibility. Establishing a standardized framework for statistical methods and data analysis is essential to minimize biases and inaccuracies. The FAIR principles (Findable, Accessible, Interoperable, Reusable) aim to enhance data interoperability and reusability, promoting transparent and ethical data practices. The study presented here aimed to train postgraduate students at the Universidad Europea de Madrid in data literacy skills and FAIR principles, and to assess their application in master's thesis projects. A total of 46 participants, including students and mentors, were involved in the study during the 2022-2023 academic year. Students were trained to prioritize FAIR data sources and to implement Data Management Plans (DMPs) in their master's thesis work. An 11-item questionnaire was developed to evaluate the FAIRness of research data and showed strong internal consistency. The study found that integrating FAIR principles into educational curricula is crucial for enhancing research reproducibility and transparency. This approach equips future researchers with essential skills for navigating a data-driven scientific environment and contributes to advancing scientific knowledge.

BMC Medical Research Methodology 24(1), 284. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568560/pdf/
Citations: 0
Multistate Markov chain modeling for child undernutrition transitions in Ethiopia: a longitudinal data analysis, 2002-2016.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-15 | DOI: 10.1186/s12874-024-02399-9
Getnet Bogale Begashaw, Temesgen Zewotir, Haile Mekonnen Fenta

Background: The use of the multistate Markov chain model is a valuable tool for studying child undernutrition. This allows us to examine the trends of children's transitions from one state to multiple states of undernutrition.

Objectives: We aimed to estimate, from longitudinal data, the median time to a child's first transition from one state of undernutrition to another, the median time to a first recurrence of undernutrition, and the typical duration of undernourishment.

Methods: We used a longitudinal dataset from the Young Lives cohort study (YLCS), which included approximately 1,997 Ethiopian children aged 1-15 years. These children were selected from five regions and followed through five survey rounds between 2002 and 2016. The surveys provide comprehensive health and nutrition data and are designed to assess childhood poverty. To analyze this dataset, we employed a Markov chain regression model. The dataset constitutes a cohort with repeated measurements, allowing us to track the transitions of individual children across different states of undernutrition over time.
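To make the idea of tracking transitions concrete, the sketch below estimates transition probabilities by counting consecutive state pairs in each child's sequence of survey rounds. It is a deliberately simplified, discrete-time illustration that assumes equally spaced rounds and no covariates; the study itself used a Markov chain regression model, and the function name and state labels here are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Sequence


def transition_probabilities(
    sequences: Iterable[Sequence[Hashable]],
) -> Dict[Hashable, Dict[Hashable, float]]:
    """Row-normalised counts of moves between consecutive observed states."""
    counts: Dict[Hashable, Counter] = defaultdict(Counter)
    for seq in sequences:
        # Pair each observation with the one from the next survey round.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probs: Dict[Hashable, Dict[Hashable, float]] = {}
    for state, row in counts.items():
        total = sum(row.values())
        probs[state] = {nxt: c / total for nxt, c in row.items()}
    return probs
```

For two toy sequences, `["normal", "stunted", "normal"]` and `["normal", "normal", "stunted"]`, the estimated probability of moving from `normal` to `stunted` is 2/3. States that never appear as a starting point get no row.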

Results: 46% of children experienced concurrent underweight, stunting, and wasting (USW). The prevalence of concurrent underweight and stunting (US) was 18.7% at baseline and was higher among males. The incidence density of undernutrition was 22.5% per year. On average, a child in a wasting state took 3.02 months to return to a normal state for the first time, compared with approximately 3.05 months for stunting and 3.89 months for underweight. The median duration of undernourishment was 48.8 months for children in the US state and 45.4 months for those concurrently underweight and wasted. Additionally, rural children (HR = 1.75; 95% CI: 1.53-1.97), those with illiterate fathers (HR = 1.50; 95% CI: 1.38-1.62) and mothers (HR = 1.45; 95% CI: 1.02-3.29), and those in households lacking safe drinking water (HR = 1.70; 95% CI: 1.26-2.14) or access to cooking fuel (HR = 1.95; 95% CI: 1.75-2.17) exhibited a higher risk of undernutrition and a slower recovery rate.

Conclusions: This study revealed that rural children, especially those with illiterate parents and those in households lacking safe drinking water or cooking fuel, face an increased risk of undernutrition and slower recovery.

BMC Medical Research Methodology 24(1), 283. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11566054/pdf/
Citations: 0
A Bayesian analysis integrating expert beliefs to better understand how new evidence ought to update what we believe: a use case of chiropractic care and acute lumbar disc herniation with early surgery.
IF 3.9 | Zone 3 (Medicine) | Q1 HEALTH CARE SCIENCES & SERVICES | Pub Date: 2024-11-15 | DOI: 10.1186/s12874-024-02359-3
Léonie Hofstetter, Michelle Fontana, George A Tomlinson, Cesar A Hincapié

Background: A Bayesian approach may be useful in the study of possible treatment-related rare serious adverse events, particularly when there are strongly held opinions in the absence of good quality previous data. We demonstrate the application of a Bayesian analysis by integrating expert opinions with population-based epidemiologic data to investigate the association between chiropractic care and acute lumbar disc herniation (LDH) with early surgery.

Methods: Experts' opinions were used to derive probability distributions of the incidence rate ratio (IRR) for acute LDH requiring early surgery associated with chiropractic care. A 'community of priors' (enthusiastic, neutral, and skeptical) was built by dividing the experts into three groups according to their perceived mean prior IRR. The likelihood was formed from the results of a population-based epidemiologic study comparing the relative incidence of acute LDH with early surgery after chiropractic care versus primary medical care, with sensitive and specific outcome case definitions and surgery occurring within 8- and 12-week time windows after acute LDH. The robustness of results to the community of priors and specific versus sensitive case definitions was assessed.
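The way a prior and a likelihood combine can be illustrated with a conjugate normal-normal update on the log IRR scale. This is only a sketch under assumptions the abstract does not confirm: both the prior and the study estimate are approximated as normal on the log scale, and the function names are illustrative. The authors' actual model may differ.

```python
from math import exp, sqrt
from typing import Tuple


def posterior_log_irr(
    prior_mean: float, prior_sd: float, estimate: float, std_error: float
) -> Tuple[float, float]:
    """Precision-weighted combination of a normal prior and a normal likelihood.

    All inputs are on the log IRR scale. Returns (posterior mean, posterior SD).
    """
    prior_precision = 1.0 / prior_sd**2
    data_precision = 1.0 / std_error**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * estimate)
    return post_mean, sqrt(post_var)


def irr_interval(
    mean: float, sd: float, z: float = 1.96
) -> Tuple[float, float, float]:
    """Back-transform to the IRR scale: (lower, median, upper) of a 95% interval."""
    return exp(mean - z * sd), exp(mean), exp(mean + z * sd)
```

The posterior mean is pulled toward whichever source is more precise, which is why a precise study estimate can shift a skeptical prior while an imprecise one cannot. Running the same update with an enthusiastic, a neutral, and a skeptical prior is one way to examine robustness across a community of priors.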

Results: The most enthusiastic 25% of experts had a prior IRR of 0.42 (95% credible interval [CrI], 0.03 to 1.27), while the most skeptical 25% of experts had a prior IRR of 1.66 (95% CrI, 0.55 to 4.25). The Bayesian posterior estimates across priors and outcome definitions ranged from an IRR of 0.39 (95% CrI, 0.21 to 0.68) to an IRR of 1.40 (95% CrI, 0.52 to 2.55). With a sensitive definition of the outcome, the analysis produced results that confirmed prior enthusiasts' beliefs and that were precise enough to shift prior beliefs of skeptics. With a specific definition of the outcome, the results were not strong enough to overcome prior skepticism.

Conclusion: A Bayesian analysis integrating expert beliefs highlighted the value of eliciting informative priors to better understand how new evidence ought to update prior existing beliefs. Clinical epidemiologists are encouraged to integrate informative and expert opinions representing the end-user community of priors in Bayesian analyses, particularly when there are strongly held opinions in the absence of definitive scientific evidence.
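
The updating step described in this abstract can be illustrated with a minimal sketch. It assumes that both the prior and the likelihood are approximately normal on the log-IRR scale, which the abstract does not state; the authors' actual model may differ. The skeptical prior interval (0.55 to 4.25) comes from the abstract, while the likelihood interval (0.30 to 1.20) and all function names are hypothetical and chosen only for illustration.

```python
import math

Z95 = 1.959963984540054  # 97.5th percentile of the standard normal


def lognormal_from_interval(lower, upper):
    """Return (mean, sd) on the log scale, treating [lower, upper] as a central 95% range."""
    mu = (math.log(lower) + math.log(upper)) / 2
    sd = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return mu, sd


def normal_update(prior, likelihood):
    """Precision-weighted combination of two normal distributions (log-IRR scale)."""
    (m0, s0), (m1, s1) = prior, likelihood
    w0, w1 = 1 / s0**2, 1 / s1**2
    post_var = 1 / (w0 + w1)
    post_mean = post_var * (w0 * m0 + w1 * m1)
    return post_mean, math.sqrt(post_var)


def summarize(mu, sd):
    """Back-transform to the IRR scale: (median, lower 95% limit, upper 95% limit)."""
    return math.exp(mu), math.exp(mu - Z95 * sd), math.exp(mu + Z95 * sd)


# Skeptical prior interval from the abstract; the lognormal shape is an assumption.
skeptical_prior = lognormal_from_interval(0.55, 4.25)
# HYPOTHETICAL likelihood interval, not taken from the study.
hypothetical_likelihood = lognormal_from_interval(0.30, 1.20)

posterior = normal_update(skeptical_prior, hypothetical_likelihood)
print("posterior IRR (median, 95%% limits): %.2f (%.2f to %.2f)" % summarize(*posterior))
```

Because the posterior precision is the sum of the prior and likelihood precisions, a precise likelihood pulls even a skeptical prior toward the data, whereas an imprecise one leaves the prior largely intact. This mirrors the contrast the abstract draws between the sensitive and specific outcome definitions.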

BMC Medical Research Methodology, 24(1):281. Publication date: 2024-11-15. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11566458/pdf/
Citations: 0
A study within a trial (SWAT) of clinical trial feasibility and barriers to recruitment in the United Kingdom - the CapaCiTY programme experience.
IF 3.9 Zone 3 (Medicine) Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2024-11-15 DOI: 10.1186/s12874-024-02395-z
Natasha Stevens, Shiva Taheri, Ugo Grossi, Chris Emmett, Sybil Bannister, Christine Norton, Yan Yiannakou, Charles Knowles

Background: The CapaCiTY programme includes three multi-centre randomised controlled trials aiming to develop an evidence-based treatment pathway for adult chronic constipation. The trials were conducted in the United Kingdom National Health Service and aimed to recruit 808 participants between 26 March 2015 and 31 January 2019. Sites were selected based on their responses to site feasibility questionnaires (2014-2015), a common tool employed by sponsors to assess a site's recruitment potential and ability to undertake the trial protocol. Failure to recruit the planned sample jeopardises the reliability of results and wastes significant time and resources. The purpose of this study was to investigate barriers to recruitment in 2017.

Methods: We conducted site feasibility assessments with thirty-nine sites prior to trial commencement. Twenty-seven were selected to participate in the CapaCiTY programme; twelve were deemed unsuitable. We compared sites' contracted recruitment rates with their actual recruitment rates and conducted a telephone survey and analysis from 5 July to 7 December 2017 (n = 24) to understand barriers to recruitment. Three sites declined to participate in the survey.

Results: At the time of the survey, 15% of sites in the CapaCiTY programme were meeting recruitment targets, while 85% were recruiting half or less of their target. Of these, 28% recruited no participants. The main barriers to recruitment were lack of resources, high workloads, lack of suitable participants and study design not being compatible with routine care. Despite multiple strategies employed to overcome these barriers, the trials were eventually stopped due to futility, recruiting only 34% of the programme sample size.

Conclusions: Improving the reliability of site feasibility assessments could potentially save a substantial amount in failed research investments and speed up the time to delivery of new treatments. We recommend 1) investment in training researchers in conducting and completing site feasibility; 2) funders to require pilot and feasibility data in grant applications, with an emphasis on patient and public involvement in trial design; 3) conducting site feasibility assessment at the pre-award stage; 4) development of a national database of sites' previous trial recruitment performance; 5) data-driven site level assessment of recruitment potential.

Trial registration: ISRCTN11791740; 16/07/2015, ISRCTN11093872; 11/11/2015, ISRCTN11747152; 30/09/2015.

BMC Medical Research Methodology, 24(1):282. Publication date: 2024-11-15. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11566598/pdf/
Citations: 0
Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study.
IF 3.9 Zone 3 (Medicine) Q1 HEALTH CARE SCIENCES & SERVICES Pub Date: 2024-11-13 DOI: 10.1186/s12874-024-02382-4
Emily Kawabata, Daniel Major-Smith, Gemma L Clayton, Chin Yang Shapland, Tim P Morris, Alice R Carter, Alba Fernández-Sanlés, Maria Carolina Borges, Kate Tilling, Gareth J Griffith, Louise A C Millard, George Davey Smith, Deborah A Lawlor, Rachael A Hughes

Background: Bias from data missing not at random (MNAR) is a persistent concern in health-related research. A bias analysis quantitatively assesses how conclusions change under different assumptions about missingness using bias parameters that govern the magnitude and direction of the bias. Probabilistic bias analysis specifies a prior distribution for these parameters, explicitly incorporating available information and uncertainty about their true values. A Bayesian bias analysis combines the prior distribution with the data's likelihood function whilst a Monte Carlo bias analysis samples the bias parameters directly from the prior distribution. No study has compared a Monte Carlo bias analysis to a Bayesian bias analysis in the context of MNAR missingness.

Methods: We illustrate an accessible probabilistic bias analysis using the Monte Carlo bias analysis approach and a well-known imputation method. We designed a simulation study based on a motivating example from the UK Biobank study, where a large proportion of the outcome was missing and missingness was suspected to be MNAR. We compared the performance of our Monte Carlo bias analysis to a principled Bayesian bias analysis, complete case analysis (CCA) and multiple imputation (MI) assuming missing at random.

Results: As expected, given the simulation study design, CCA and MI estimates were substantially biased, with 95% confidence interval coverages of 7-48%. Including auxiliary variables (i.e., variables not included in the substantive analysis that are predictive of missingness and the missing data) in MI's imputation model amplified the bias due to assuming missing at random. With reasonably accurate and precise information about the bias parameter, the Monte Carlo bias analysis performed as well as the Bayesian bias analysis. However, when very limited information was provided about the bias parameter, only the Bayesian bias analysis was able to eliminate most of the bias due to MNAR whilst the Monte Carlo bias analysis performed no better than the CCA and MI.

Conclusion: The Monte Carlo bias analysis we describe is easy to implement in standard software and, in the setting we explored, is a viable alternative to a Bayesian bias analysis. We advise careful consideration of the choice of auxiliary variables when applying imputation where data may be MNAR.
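
The general idea of a Monte Carlo bias analysis, sampling the bias parameter from its prior and imputing under each sampled value, can be sketched as follows. This is not the authors' procedure: it is a simplified delta-adjusted (pattern-mixture) illustration for the mean of a continuous outcome, and the function name, the normal prior on the shift `delta`, and the bootstrap step are all assumptions made for the example.

```python
import random
import statistics


def monte_carlo_bias_analysis(observed, n_missing, delta_mean, delta_sd,
                              n_draws=2000, seed=0):
    """Estimate the full-sample mean when missing outcomes are shifted by `delta`.

    Each draw (1) samples delta from an assumed normal prior, (2) bootstraps the
    observed outcomes to reflect sampling variability, (3) imputes each missing
    outcome as a resampled observed value plus delta, and (4) records the mean
    of the completed data. Returns (median, 2.5th, 97.5th percentile) of the means.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_draws):
        delta = rng.gauss(delta_mean, delta_sd)            # bias parameter draw
        boot = rng.choices(observed, k=len(observed))      # bootstrap resample
        imputed = [rng.choice(boot) + delta for _ in range(n_missing)]
        estimates.append(statistics.fmean(boot + imputed))
    estimates.sort()
    lower = estimates[int(0.025 * n_draws)]
    upper = estimates[int(0.975 * n_draws) - 1]
    return statistics.median(estimates), lower, upper


# Toy data (hypothetical): 100 observed outcomes and 100 missing ones.
toy_observed = [1.0, 2.0, 3.0, 4.0, 5.0] * 20
print(monte_carlo_bias_analysis(toy_observed, n_missing=100,
                                delta_mean=2.0, delta_sd=0.5))
```

In this sketch the prior on `delta` is never updated by the data, so the width of the resulting interval reflects the prior directly. That is consistent with the abstract's finding that the Monte Carlo approach performs well only when the information about the bias parameter is reasonably accurate and precise.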

BMC Medical Research Methodology, 24(1):278. Publication date: 2024-11-13. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11558901/pdf/
Citations: 0