Effects of missing data imputation methods on univariate blood pressure time series data analysis and forecasting with ARIMA and LSTM
Pub Date: 2024-12-26 | DOI: 10.1186/s12874-024-02448-3
Nicholas Niako, Jesus D Melgarejo, Gladys E Maestre, Kristina P Vatcheva
Background: Missing observations in univariate time series are common in real-life data and cause analytical problems in the flow of the analysis. Imputation of missing values is an inevitable step in the analysis of every incomplete univariate time series. Most existing studies focus on comparing the distributions of imputed data; there is a gap in knowledge about how different imputation methods for univariate time series affect the forecasting performance of time series models. We evaluated the prediction performance of autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) network models on time series data imputed with ten different imputation techniques.
Methods: Missing values were generated under the missing completely at random (MCAR) mechanism at 10%, 15%, 25%, and 35% rates of missingness using complete data of 24-h ambulatory diastolic blood pressure readings. The effects of mean imputation, Kalman filtering, linear, spline, and Stineman interpolation, exponentially weighted moving average (EWMA), simple moving average (SMA), k-nearest neighbors (KNN), and last-observation-carried-forward (LOCF) imputation on the time series structure were assessed, and the prediction performance of the LSTM and ARIMA models was compared on imputed and original data.
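As a rough illustration of this kind of set-up (not the authors' code), the sketch below generates MCAR gaps in a synthetic diastolic blood pressure series and applies a few of the listed techniques with pandas; the series, missingness rate, and window/span parameters are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic "complete" 24-h ambulatory diastolic blood pressure series (one reading per 15 min)
t = pd.date_range("2024-01-01", periods=96, freq="15min")
dbp = pd.Series(75 + 5 * np.sin(np.linspace(0, 4 * np.pi, 96)) + rng.normal(0, 2, 96), index=t)

# MCAR: delete ~25% of readings completely at random (endpoints kept so interpolation is defined)
mask = rng.random(len(dbp)) < 0.25
mask[[0, -1]] = False
incomplete = dbp.mask(mask)

# A few of the imputation techniques compared in the paper (Kalman, Stineman, and KNN
# imputation typically come from specialised packages rather than plain pandas)
imputed = {
    "mean":   incomplete.fillna(incomplete.mean()),
    "LOCF":   incomplete.ffill(),
    "linear": incomplete.interpolate(method="linear"),
    "spline": incomplete.interpolate(method="spline", order=3),   # requires SciPy
    "SMA":    incomplete.fillna(incomplete.rolling(5, min_periods=1, center=True).mean()),
    "EWMA":   incomplete.fillna(incomplete.ewm(span=5).mean()),
}

for name, series in imputed.items():
    err = (series - dbp)[mask].dropna()      # error at the imputed positions only
    print(f"{name:>6}: imputation RMSE = {np.sqrt((err ** 2).mean()):.2f}")
```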
Results: All imputation techniques either increased or decreased the data autocorrelation and thereby affected the forecasting performance of the ARIMA and LSTM algorithms. The best-performing imputation technique did not guarantee better predictions on the imputed data. Mean imputation, LOCF, KNN, Stineman, and cubic spline interpolation performed better at small rates of missingness. Imputation with EWMA and Kalman filtering yielded consistent performance across all missingness scenarios. Regardless of the imputation method, LSTM achieved slightly better predictive accuracy among the best-performing ARIMA and LSTM models; otherwise, the results varied. In our small sample, ARIMA tended to perform better on data with higher autocorrelation.
Conclusions: We recommend that researchers consider Kalman smoothing, interpolation (linear, spline, and Stineman), and moving average (SMA and EWMA) techniques for imputing univariate time series data, as they perform well with respect to both the data distribution and forecasting with ARIMA and LSTM models. LSTM slightly outperforms ARIMA; however, for small samples, ARIMA is simpler and faster to execute.
{"title":"Effects of missing data imputation methods on univariate blood pressure time series data analysis and forecasting with ARIMA and LSTM.","authors":"Nicholas Niako, Jesus D Melgarejo, Gladys E Maestre, Kristina P Vatcheva","doi":"10.1186/s12874-024-02448-3","DOIUrl":"10.1186/s12874-024-02448-3","url":null,"abstract":"<p><strong>Background: </strong>Missing observations within the univariate time series are common in real-life and cause analytical problems in the flow of the analysis. Imputation of missing values is an inevitable step in every incomplete univariate time series. Most of the existing studies focus on comparing the distributions of imputed data. There is a gap of knowledge on how different imputation methods for univariate time series affect the forecasting performance of time series models. We evaluated the prediction performance of autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) network models on imputed time series data using ten different imputation techniques.</p><p><strong>Methods: </strong>Missing values were generated under missing completely at random (MCAR) mechanism at 10%, 15%, 25%, and 35% rates of missingness using complete data of 24-h ambulatory diastolic blood pressure readings. The performance of the mean, Kalman filtering, linear, spline, and Stineman interpolations, exponentially weighted moving average (EWMA), simple moving average (SMA), k-nearest neighborhood (KNN), and last-observation-carried-forward (LOCF) imputation techniques on the time series structure and the prediction performance of the LSTM and ARIMA models were compared on imputed and original data.</p><p><strong>Results: </strong>All imputation techniques either increased or decreased the data autocorrelation and with this affected the forecasting performance of the ARIMA and LSTM algorithms. The best imputation technique did not guarantee better predictions obtained on the imputed data. The mean imputation, LOCF, KNN, Stineman, and cubic spline interpolations methods performed better for a small rate of missingness. Interpolation with EWMA and Kalman filtering yielded consistent performances across all scenarios of missingness. Disregarding the imputation methods, the LSTM resulted with a slightly better predictive accuracy among the best performing ARIMA and LSTM models; otherwise, the results varied. In our small sample, ARIMA tended to perform better on data with higher autocorrelation.</p><p><strong>Conclusions: </strong>We recommend to the researchers that they consider Kalman smoothing techniques, interpolation techniques (linear, spline, and Stineman), moving average techniques (SMA and EWMA) for imputing univariate time series data as they perform well on both data distribution and forecasting with ARIMA and LSTM models. 
The LSTM slightly outperforms ARIMA models, however, for small samples, ARIMA is simpler and faster to execute.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"320"},"PeriodicalIF":3.9,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11670515/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142892064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Identifying the presence of atrial fibrillation during sinus rhythm using a dual-input mixed neural network with ECG coloring technology
Pub Date: 2024-12-23 | DOI: 10.1186/s12874-024-02421-0
Wei-Wen Chen, Chih-Min Liu, Chien-Chao Tseng, Ching-Chun Huang, I-Chien Wu, Pei-Fen Chen, Shih-Lin Chang, Yenn-Jiang Lin, Li-Wei Lo, Fa-Po Chung, Tze-Fan Chao, Ta-Chuan Tuan, Jo-Nan Liao, Chin-Yu Lin, Ting-Yung Chang, Ling Kuo, Cheng-I Wu, Shin-Huei Liu, Jacky Chung-Hao Wu, Yu-Feng Hu, Shih-Ann Chen, Henry Horng-Shing Lu
Background: Undetected atrial fibrillation (AF) poses a significant risk of stroke and cardiovascular mortality. However, diagnosing AF in real time can be challenging, as the arrhythmia is often not captured instantly. To address this issue, a deep-learning model was developed to diagnose AF even during arrhythmia-free windows.
Methods: The proposed method introduces a novel approach that integrates clinical data and electrocardiograms (ECGs) using a colorization technique. This technique recolors ECG images based on patients' demographic information while preserving their original characteristics and incorporating color correlations from statistical data features. Our primary objective is to enhance AF detection by fusing ECG images with demographic data for colorization. To ensure the reliability of our dataset for training, validation, and testing, we rigorously maintained separation to prevent cross-contamination among these sets. We designed a Dual-input Mixed Neural Network (DMNN) that effectively handles different types of inputs, including demographic and image data, leveraging their mixed characteristics to optimize prediction performance. Unlike previous approaches, this method introduces demographic data through color transformation within ECG images, enriching the diversity of features for improved learning outcomes.
Results: The proposed approach yielded promising results on the independent test set, achieving an AUC of 83.4%. This outperformed the AUC of 75.8% obtained when using only the original signal values as input for the CNN. The evaluation of performance improvement revealed significant enhancements, including a 7.6% increase in AUC, an 11.3% boost in accuracy, a 9.4% improvement in sensitivity, an 11.6% enhancement in specificity, and a substantial 25.1% increase in the F1 score. Notably, AI diagnosis of AF was associated with future cardiovascular mortality. For clinical application, over a median follow-up of 71.6 ± 29.1 months, high-risk AI-predicted AF patients exhibited significantly higher cardiovascular mortality (AF vs. non-AF; 47 [18.7%] vs. 34 [4.8%]) and all-cause mortality (176 [52.9%] vs. 216 [26.3%]) compared to non-AF patients. In the low-risk group, AI-predicted AF patients showed slightly higher cardiovascular mortality (7 [0.7%] vs. 1 [0.3%]) and all-cause mortality (103 [9.0%] vs. 26 [6.4%]) than AI-predicted non-AF patients during six-year follow-up. These findings underscore the potential clinical utility of the AI model in predicting AF-related outcomes.
Conclusions: This study introduces an ECG colorization approach to enhance AF detection using deep learning and demographic data, improving performance compared to ECG-only methods. This method is effective in identifying high-risk and low-risk populations, providing valuable features for future AF research.
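To make the dual-input idea concrete, here is a minimal PyTorch sketch of a network with an image branch for (colorized) ECGs and a dense branch for demographic features, fused before a binary AF head; the layer sizes, feature count, and names are placeholders rather than the published DMNN architecture.

```python
import torch
import torch.nn as nn

class DualInputMixedNet(nn.Module):
    """Sketch of a dual-input network: a CNN branch for ECG images and an MLP branch
    for demographic features, fused before the AF classifier. Illustrative only."""

    def __init__(self, n_demographic: int = 4):
        super().__init__()
        self.cnn = nn.Sequential(                 # image branch (3-channel "colorized" ECG)
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.mlp = nn.Sequential(                 # demographic branch
            nn.Linear(n_demographic, 16), nn.ReLU(),
        )
        self.head = nn.Sequential(                # classifier on the fused representation
            nn.Linear(32 + 16, 32), nn.ReLU(), nn.Linear(32, 1),
        )

    def forward(self, ecg_img: torch.Tensor, demo: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.cnn(ecg_img), self.mlp(demo)], dim=1)
        return self.head(fused)                   # logit for AF vs. non-AF

# Toy forward pass: batch of 2 colorized ECG images (3x128x128) plus 4 demographic features
model = DualInputMixedNet()
logits = model(torch.randn(2, 3, 128, 128), torch.randn(2, 4))
print(logits.shape)  # torch.Size([2, 1])
```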
{"title":"Identifying the presence of atrial fibrillation during sinus rhythm using a dual-input mixed neural network with ECG coloring technology.","authors":"Wei-Wen Chen, Chih-Min Liu, Chien-Chao Tseng, Ching-Chun Huang, I-Chien Wu, Pei-Fen Chen, Shih-Lin Chang, Yenn-Jiang Lin, Li-Wei Lo, Fa-Po Chung, Tze-Fan Chao, Ta-Chuan Tuan, Jo-Nan Liao, Chin-Yu Lin, Ting-Yung Chang, Ling Kuo, Cheng-I Wu, Shin-Huei Liu, Jacky Chung-Hao Wu, Yu-Feng Hu, Shih-Ann Chen, Henry Horng-Shing Lu","doi":"10.1186/s12874-024-02421-0","DOIUrl":"10.1186/s12874-024-02421-0","url":null,"abstract":"<p><strong>Background: </strong>Undetected atrial fibrillation (AF) poses a significant risk of stroke and cardiovascular mortality. However, diagnosing AF in real-time can be challenging as the arrhythmia is often not captured instantly. To address this issue, a deep-learning model was developed to diagnose AF even during periods of arrhythmia-free windows.</p><p><strong>Methods: </strong>The proposed method introduces a novel approach that integrates clinical data and electrocardiograms (ECGs) using a colorization technique. This technique recolors ECG images based on patients' demographic information while preserving their original characteristics and incorporating color correlations from statistical data features. Our primary objective is to enhance atrial fibrillation (AF) detection by fusing ECG images with demographic data for colorization. To ensure the reliability of our dataset for training, validation, and testing, we rigorously maintained separation to prevent cross-contamination among these sets. We designed a Dual-input Mixed Neural Network (DMNN) that effectively handles different types of inputs, including demographic and image data, leveraging their mixed characteristics to optimize prediction performance. Unlike previous approaches, this method introduces demographic data through color transformation within ECG images, enriching the diversity of features for improved learning outcomes.</p><p><strong>Results: </strong>The proposed approach yielded promising results on the independent test set, achieving an impressive AUC of 83.4%. This outperformed the AUC of 75.8% obtained when using only the original signal values as input for the CNN. The evaluation of performance improvement revealed significant enhancements, including a 7.6% increase in AUC, an 11.3% boost in accuracy, a 9.4% improvement in sensitivity, an 11.6% enhancement in specificity, and a substantial 25.1% increase in the F1 score. Notably, AI diagnosis of AF was associated with future cardiovascular mortality. For clinical application, over a median follow-up of 71.6 ± 29.1 months, high-risk AI-predicted AF patients exhibited significantly higher cardiovascular mortality (AF vs. non-AF; 47 [18.7%] vs. 34 [4.8%]) and all-cause mortality (176 [52.9%] vs. 216 [26.3%]) compared to non-AF patients. In the low-risk group, AI-predicted AF patients showed slightly elevated cardiovascular (7 [0.7%] vs. 1 [0.3%]) and all-cause mortality (103 [9.0%] vs. 26 [6.4%]) than AI-predicted non-AF patients during six-year follow-up. These findings underscore the potential clinical utility of the AI model in predicting AF-related outcomes.</p><p><strong>Conclusions: </strong>This study introduces an ECG colorization approach to enhance atrial fibrillation (AF) detection using deep learning and demographic data, improving performance compared to ECG-only methods. 
This method is effective in identifying high-risk and low-risk populations, providing valuable features for future AF research ","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"318"},"PeriodicalIF":3.9,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11665121/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142881113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The analysis and reporting of multiple outcomes in mental health trials: a methodological systematic review
Pub Date: 2024-12-21 | DOI: 10.1186/s12874-024-02451-8
Dominic Stringer, Mollie Payne, Ben Carter, Richard Emsley
Background: The choice of a single primary outcome in randomised trials can be difficult, especially in mental health where interventions may be complex and target several outcomes simultaneously. We carried out a systematic review to assess the quality of the analysis and reporting of multiple outcomes in mental health RCTs, comparing approaches with current CONSORT and other regulatory guidance.
Methods: The review included all late-stage mental health trials published between 1st January 2019 and 31st December 2020 in 9 leading medical and mental health journals. Pilot and feasibility trials, non-randomised trials, and early-phase trials were excluded. The total number of primary, secondary, and other outcomes was recorded, as was any strategy used to incorporate multiple primary outcomes in the primary analysis.
Results: There were 147 included mental health trials. Most trials (101/147) followed CONSORT guidance by specifying a single primary outcome with other outcomes defined as secondary and analysed in separate statistical analyses, although a minority (10/147) did not specify any outcomes as primary. Where multiple primary outcomes were specified (33/147), most (26/33) did not correct for multiplicity, contradicting regulatory guidance. The median number of clinical outcomes reported across studies was 8 (IQR 5-11).
Conclusions: Most trials correctly follow CONSORT guidance. However, little consideration was given to multiplicity or correlation between outcomes, even where multiple primary outcomes were stated. Trials should correct for multiplicity when multiple primary outcomes are specified or describe another strategy to address the multiplicity. Overall, very few mental health trials take advantage of multiple-outcome strategies in the primary analysis, especially more complex strategies such as multivariate modelling. More work is required to show that such strategies exist, aid interpretation, increase efficiency, and are easy to implement.
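For instance, a Bonferroni or Holm adjustment of co-primary outcome p-values is a single call with statsmodels; the p-values below are purely illustrative.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Illustrative p-values for three co-primary outcomes of a hypothetical trial
p_values = np.array([0.012, 0.034, 0.210])

for method in ("bonferroni", "holm"):
    reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method=method)
    print(f"{method:>10}: adjusted p = {p_adjusted.round(3)}, reject null = {reject}")
```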
Registration: Our systematic review protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO) on 11th January 2023 (CRD42023382274).
{"title":"The analysis and reporting of multiple outcomes in mental health trials: a methodological systematic review.","authors":"Dominic Stringer, Mollie Payne, Ben Carter, Richard Emsley","doi":"10.1186/s12874-024-02451-8","DOIUrl":"10.1186/s12874-024-02451-8","url":null,"abstract":"<p><strong>Background: </strong>The choice of a single primary outcome in randomised trials can be difficult, especially in mental health where interventions may be complex and target several outcomes simultaneously. We carried out a systematic review to assess the quality of the analysis and reporting of multiple outcomes in mental health RCTs, comparing approaches with current CONSORT and other regulatory guidance.</p><p><strong>Methods: </strong>The review included all late-stage mental health trials published between 1st January 2019 to 31st December 2020 in 9 leading medical and mental health journals. Pilot and feasibility trials, non-randomised trials, and early phase trials were excluded. The total number of primary, secondary and other outcomes was recorded, as was any strategy used to incorporate multiple primary outcomes in the primary analysis.</p><p><strong>Results: </strong>There were 147 included mental health trials. Most trials (101/147) followed CONSORT guidance by specifying a single primary outcome with other outcomes defined as secondary and analysed in separate statistical analyses, although a minority (10/147) did not specify any outcomes as primary. Where multiple primary outcomes were specified (33/147), most (26/33) did not correct for multiplicity, contradicting regulatory guidance. The median number of clinical outcomes reported across studies was 8 (IQR 5-11 ).</p><p><strong>Conclusions: </strong>Most trials are correctly following CONSORT guidance. However, there was little consideration given to multiplicity or correlation between outcomes even where multiple primary outcomes were stated. Trials should correct for multiplicity when multiple primary outcomes are specified or describe some other strategy to address the multiplicity. Overall, very few mental health trials are taking advantage of multiple outcome strategies in the primary analysis, especially more complex strategies such as multivariate modelling. More work is required to show these exist, aid interpretation, increase efficiency and are easily implemented.</p><p><strong>Registration: </strong>Our systematic review protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO) on 11th January 2023 (CRD42023382274).</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"317"},"PeriodicalIF":3.9,"publicationDate":"2024-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11662570/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Built-in selection or confounder bias? Dynamic Landmarking in matched propensity score analyses
Pub Date: 2024-12-21 | DOI: 10.1186/s12874-024-02444-7
Alexandra Strobel, Andreas Wienke, Jan Gummert, Sabine Bleiziffer, Oliver Kuss
Background: Propensity score matching has become a popular method for estimating causal treatment effects in non-randomized studies. However, for time-to-event outcomes, the estimation of hazard ratios based on propensity scores can be challenging if omitted or unobserved covariates are present. Not accounting for such covariates can lead to treatment effect estimates that differ from the estimate of interest. However, researchers often do not know whether (and, if so, which) covariates will cause this divergence.
Methods: To address this issue, we extended a previously described method, Dynamic Landmarking, which was originally developed for randomized trials. The method is based on the successive deletion of sorted observations and the gradual fitting of univariable Cox models. In addition, the balance of observed but omitted covariates can be measured by the sum of squared z-differences.
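A simplified sketch of how such a deletion-and-refit loop might look, using lifelines on synthetic matched data; the data, the sorting by follow-up time, and the step size are assumptions made here for illustration, and the full procedure would also track the sum of squared z-differences of omitted covariates.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 400

# Synthetic matched cohort: treatment indicator, follow-up time, event indicator
df = pd.DataFrame({"treatment": rng.integers(0, 2, n)})
df["time"] = rng.exponential(scale=np.where(df["treatment"] == 1, 12.0, 10.0))
df["event"] = rng.integers(0, 2, n)

# Dynamic-Landmarking-style loop (one reading of the procedure): sort by follow-up time,
# successively drop the earliest observations, and refit the univariable Cox model for
# treatment to see how the hazard ratio estimate evolves.
df = df.sort_values("time").reset_index(drop=True)
for n_dropped in range(0, 301, 100):
    subset = df.iloc[n_dropped:]
    cph = CoxPHFitter().fit(subset, duration_col="time", event_col="event",
                            formula="treatment")
    hr = np.exp(cph.params_["treatment"])
    print(f"dropped {n_dropped:3d} earliest observations: HR = {hr:.2f}")
```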
Results: By simulation, we show that Dynamic Landmarking provides a good visual tool for detecting and distinguishing treatment effect estimates affected by built-in selection or confounding bias. We illustrate the approach with a data set from cardiac surgery and provide some recommendations on how to use and interpret Dynamic Landmarking in propensity score matched studies.
Conclusion: Dynamic Landmarking is a useful post-hoc diagnostic tool for visualizing whether an estimated hazard ratio could be distorted by confounding or built-in selection bias.
{"title":"Built-in selection or confounder bias? Dynamic Landmarking in matched propensity score analyses.","authors":"Alexandra Strobel, Andreas Wienke, Jan Gummert, Sabine Bleiziffer, Oliver Kuss","doi":"10.1186/s12874-024-02444-7","DOIUrl":"10.1186/s12874-024-02444-7","url":null,"abstract":"<p><strong>Background: </strong>Propensity score matching has become a popular method for estimating causal treatment effects in non-randomized studies. However, for time-to-event outcomes, the estimation of hazard ratios based on propensity scores can be challenging if omitted or unobserved covariates are present. Not accounting for such covariates could lead to treatment estimates, differing from the estimate of interest. However, researchers often do not know whether (and, if so, which) covariates will cause this divergence.</p><p><strong>Methods: </strong>To address this issue, we extended a previously described method, Dynamic Landmarking, which was originally developed for randomized trials. The method is based on successively deletion of sorted observations and gradually fitting univariable Cox models. In addition, the balance of observed, but omitted covariates can be measured by the sum of squared z-differences.</p><p><strong>Results: </strong>By simulation we show, that Dynamic Landmarking provides a good visual tool for detecting and distinguishing treatment effect estimates underlying built-in selection or confounding bias. We illustrate the approach with a data set from cardiac surgery and provide some recommendations on how to use and interpret Dynamic Landmarking in propensity score matched studies.</p><p><strong>Conclusion: </strong>Dynamic Landmarking is a useful post-hoc diagnosis tool for visualizing whether an estimated hazard ratio could be distorted by confounding or built-in selection bias.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"316"},"PeriodicalIF":3.9,"publicationDate":"2024-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11662801/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Addressing treatment switching in the ALTA-1L trial with g-methods: exploring the impact of model specification
Pub Date: 2024-12-20 | DOI: 10.1186/s12874-024-02437-6
Amani Al Tawil, Sean McGrath, Robin Ristl, Ulrich Mansmann
Background: Treatment switching in randomized clinical trials introduces challenges in performing causal inference. Intention To Treat (ITT) analyses often fail to fully capture the causal effect of treatment in the presence of treatment switching. Consequently, decision makers may instead be interested in causal effects of hypothetical treatment strategies that do not allow for treatment switching. For example, the phase 3 ALTA-1L trial showed that brigatinib may have improved Overall Survival (OS) compared to crizotinib if treatment switching had not occurred. Their sensitivity analysis using Inverse Probability of Censoring Weights (IPCW) reported a Hazard Ratio (HR) of 0.50 (95% CI, 0.28-0.87), while their initial ITT analysis estimated an HR of 0.81 (0.53-1.22).
Methods: We used a directed acyclic graph to depict the clinical setting of the ALTA-1L trial in the presence of treatment switching, illustrating the concept of treatment-confounder feedback and highlighting the need for g-methods. In a re-analysis of the ALTA-1L trial data, we used IPCW and the parametric g-formula to adjust for baseline and time-varying covariates to estimate the effect of two hypothetical treatment strategies on OS: "always treat with brigatinib" versus "always treat with crizotinib". We conducted various sensitivity analyses using different model specifications and weight truncation approaches.
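A deliberately simplified, single-time-point sketch of the IPCW idea (censor follow-up at switching, then reweight the remaining person-time by the inverse probability of staying unswitched); the actual re-analysis uses time-varying covariates, stabilized weights, and truncation, and every variable name and value below is invented.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n = 500

# Synthetic two-arm trial in which part of the control arm switches treatment
df = pd.DataFrame({"arm": rng.integers(0, 2, n), "baseline_risk": rng.normal(0, 1, n)})
df["switched"] = (df["arm"] == 0) & (rng.random(n) < 0.3 + 0.2 * (df["baseline_risk"] > 0))
df["time"] = rng.exponential(scale=np.where(df["arm"] == 1, 14.0, 10.0))
df["event"] = rng.integers(0, 2, n)

# Artificially censor switchers (in a real analysis, at their switch time) and weight the
# remaining patients by the inverse probability of remaining unswitched given covariates.
df.loc[df["switched"], "event"] = 0
stay = (~df["switched"]).astype(int)
model = LogisticRegression().fit(df[["baseline_risk", "arm"]], stay)
p_stay = model.predict_proba(df[["baseline_risk", "arm"]])[:, 1]
df["ipcw"] = np.where(df["switched"], 0.0, 1.0 / p_stay)

cph = CoxPHFitter().fit(df[df["ipcw"] > 0], duration_col="time", event_col="event",
                        weights_col="ipcw", formula="arm", robust=True)
print(cph.summary[["coef", "exp(coef)"]])
```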
Results: Applying the IPCW approach in a series of sensitivity analyses yielded Cumulative HRs (cHRs) ranging between 0.38 (0.12, 0.98) and 0.73 (0.45, 1.22) and Risk Ratios (RRs) ranging between 0.52 (0.32, 0.98) and 0.79 (0.54, 1.17). Applying the parametric g-formula resulted in cHRs ranging between 0.61 (0.38, 0.91) and 0.72 (0.43, 1.07) and RRs ranging between 0.71 (0.48, 0.94) and 0.79 (0.54, 1.05).
Conclusion: Our results consistently indicated that the ITT effect estimate (cHR 0.82; 0.51, 1.22) may have underestimated brigatinib's benefit by around 10-45 percentage points (using IPCW) and 10-20 percentage points (using the parametric g-formula) across a wide range of model choices. Our analyses underscore the importance of performing sensitivity analyses, as the result from a single analysis could potentially stand as an outlier in a whole range of sensitivity analyses.
Trial registration: ClinicalTrials.gov identifier NCT02737501, registered on April 14, 2016.
{"title":"Addressing treatment switching in the ALTA-1L trial with g-methods: exploring the impact of model specification.","authors":"Amani Al Tawil, Sean McGrath, Robin Ristl, Ulrich Mansmann","doi":"10.1186/s12874-024-02437-6","DOIUrl":"10.1186/s12874-024-02437-6","url":null,"abstract":"<p><strong>Background: </strong>Treatment switching in randomized clinical trials introduces challenges in performing causal inference. Intention To Treat (ITT) analyses often fail to fully capture the causal effect of treatment in the presence of treatment switching. Consequently, decision makers may instead be interested in causal effects of hypothetical treatment strategies that do not allow for treatment switching. For example, the phase 3 ALTA-1L trial showed that brigatinib may have improved Overall Survival (OS) compared to crizotinib if treatment switching had not occurred. Their sensitivity analysis using Inverse Probability of Censoring Weights (IPCW), reported a Hazard Ratio (HR) of 0.50 (95% CI, 0.28-0.87), while their initial ITT analysis estimated an HR of 0.81 (0.53-1.22).</p><p><strong>Methods: </strong>We used a directed acyclic graph to depict the clinical setting of the ALTA-1L trial in the presence of treatment switching, illustrating the concept of treatment-confounder feedback and highlighting the need for g-methods. In a re-analysis of the ALTA-1L trial data, we used IPCW and the parametric g-formula to adjust for baseline and time-varying covariates to estimate the effect of two hypothetical treatment strategies on OS: \"always treat with brigatinib\" versus \"always treat with crizotinib\". We conducted various sensitivity analyses using different model specifications and weight truncation approaches.</p><p><strong>Results: </strong>Applying the IPCW approach in a series of sensitivity analyses yielded Cumulative HRs (cHRs) ranging between 0.38 (0.12, 0.98) and 0.73 (0.45,1.22) and Risk Ratios (RRs) ranging between 0.52 (0.32, 0.98) and 0.79 (0.54,1.17). Applying the parametric g-formula resulted in cHRs ranging between 0.61 (0.38,0.91) and 0.72 (0.43,1.07) and RRs ranging between 0.71 (0.48,0.94) and 0.79 (0.54,1.05).</p><p><strong>Conclusion: </strong>Our results consistently indicated that our estimated ITT effect estimate (cHR: 0.82 (0.51,1.22) may have underestimated brigatinib's benefit by around 10-45 percentage points (using IPCW) and 10-20 percentage points (using the parametric g-formula) across a wide range of model choices. Our analyses underscore the importance of performing sensitivity analyses, as the result from a single analysis could potentially stand as an outlier in a whole range of sensitivity analyses.</p><p><strong>Trial registration: </strong>Clinicaltrials.gov Identifier: NCT02737501 on April 14, 2016.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"314"},"PeriodicalIF":3.9,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11660711/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimising research investment by simulating and evaluating monitoring strategies to inform a trial: a simulation of liver fibrosis monitoring
Pub Date: 2024-12-20 | DOI: 10.1186/s12874-024-02425-w
Alice J Sitch, Jacqueline Dinnes, Jenny Hewison, Walter Gregory, Julie Parkes, Jonathan J Deeks
Background: The aim of the study was to investigate the development of evidence-based monitoring strategies in a population with progressive or recurrent disease. A simulation study of monitoring strategies using a new biomarker (ELF) for the detection of liver cirrhosis in people with known liver fibrosis was undertaken alongside a randomised controlled trial (ELUCIDATE).
Methods: Existing data and expert opinion were used to estimate the progression of disease and the performance of repeat testing with ELF. Knowledge of the true disease status in addition to the observed test results for a cohort of simulated patients allowed various monitoring strategies to be implemented, evaluated and validated against trial data.
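The general recipe (simulate true progression, overlay an imperfect repeated test, and score each schedule on detection delay and predictive value) can be sketched in a few lines; the progression model, ELF-like threshold, and noise level below are invented for illustration and are not the ELUCIDATE parameters.

```python
import numpy as np

rng = np.random.default_rng(3)
n_patients = 2000

# Synthetic progression model: months until cirrhosis develops; many patients never progress
progresses = rng.random(n_patients) < 0.4
t_cirrhosis = np.where(progresses, rng.gamma(shape=2.0, scale=24.0, size=n_patients), np.inf)

def simulate_strategy(interval_months: float, threshold: float = 10.5,
                      horizon: float = 60.0, noise_sd: float = 0.6):
    """Monitor with a noisy ELF-like biomarker every `interval_months`; flag when the
    measured value crosses `threshold`. Returns mean detection delay and PPV."""
    visits = np.arange(interval_months, horizon + 1e-9, interval_months)
    delays, true_pos, false_pos = [], 0, 0
    for t_true in t_cirrhosis:
        for v in visits:
            elf = 9.5 + (1.5 if v >= t_true else 0.0) + rng.normal(0.0, noise_sd)
            if elf >= threshold:                   # positive monitoring test ends follow-up
                if v >= t_true:
                    true_pos += 1
                    delays.append(v - t_true)
                else:
                    false_pos += 1
                break
    return np.mean(delays), true_pos / (true_pos + false_pos)

for interval in (6, 12):
    delay, ppv = simulate_strategy(interval)
    print(f"test every {interval:>2} months: mean detection delay {delay:5.1f} months, PPV {ppv:.2f}")
```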
Results: Several monitoring strategies ranging in complexity were successfully modelled and compared regarding the timing of detection of disease, the duration of monitoring, and the predictive value of a positive test result. The results of sensitivity analysis showed the importance of accurate data to inform the simulation. Results of the simulation were similar to those from the trial.
Conclusion: Monitoring data can be simulated and strategies compared given adequate knowledge of disease progression, test performance, and test variability. Such exercises should be carried out to ensure that optimal strategies are evaluated in trials, thus reducing research waste. This work highlights the data necessary and a general method for evaluating the performance of monitoring strategies, allowing appropriate strategies to be selected for evaluation. Such modelling should be conducted prior to full-scale investigation of monitoring strategies, allowing optimal monitoring strategies to be assessed.
{"title":"Optimising research investment by simulating and evaluating monitoring strategies to inform a trial: a simulation of liver fibrosis monitoring.","authors":"Alice J Sitch, Jacqueline Dinnes, Jenny Hewison, Walter Gregory, Julie Parkes, Jonathan J Deeks","doi":"10.1186/s12874-024-02425-w","DOIUrl":"10.1186/s12874-024-02425-w","url":null,"abstract":"<p><strong>Background: </strong>The aim of the study was to investigate the development of evidence-based monitoring strategies in a population with progressive or recurrent disease. A simulation study of monitoring strategies using a new biomarker (ELF) for the detection of liver cirrhosis in people with known liver fibrosis was undertaken alongside a randomised controlled trial (ELUCIDATE).</p><p><strong>Methods: </strong>Existing data and expert opinion were used to estimate the progression of disease and the performance of repeat testing with ELF. Knowledge of the true disease status in addition to the observed test results for a cohort of simulated patients allowed various monitoring strategies to be implemented, evaluated and validated against trial data.</p><p><strong>Results: </strong>Several monitoring strategies ranging in complexity were successfully modelled and compared regarding the timing of detection of disease, the duration of monitoring, and the predictive value of a positive test result. The results of sensitivity analysis showed the importance of accurate data to inform the simulation. Results of the simulation were similar to those from the trial.</p><p><strong>Conclusion: </strong>Monitoring data can be simulated and strategies compared given adequate knowledge of disease progression and test performance. Such exercises should be carried out to ensure optimal strategies are evaluated in trials thus reducing research waste. Monitoring data can be generated and monitoring strategies can be assessed if data is available on the monitoring test performance and the test variability. This work highlights the data necessary and the general method for evaluating the performance of monitoring strategies, allowing appropriate strategies to be selected for evaluation. Modelling work should be conducted prior to full scale investigation of monitoring strategies, allowing optimal monitoring strategies to be assessed.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"315"},"PeriodicalIF":3.9,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11660973/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Partial-linear single-index Cox regression models with multiple time-dependent covariates
Pub Date: 2024-12-20 | DOI: 10.1186/s12874-024-02434-9
Myeonggyun Lee, Andrea B Troxel, Sophia Kwon, George Crowley, Theresa Schwartz, Rachel Zeig-Owens, David J Prezant, Anna Nolan, Mengling Liu
Background: In cohort studies with time-to-event outcomes, covariates of interest often have values that change over time. The classical Cox regression model can handle time-dependent covariates but assumes linear effects on the log hazard function, which can be limiting in practice. Furthermore, when multiple correlated covariates are studied, it is of great interest to model their joint effects by allowing a flexible functional form and to delineate their relative contributions to survival risk.
Methods: Motivated by the World Trade Center (WTC)-exposed Fire Department of New York cohort study, we proposed a partial-linear single-index Cox (PLSI-Cox) model to investigate the effects of repeatedly measured metabolic syndrome indicators on the risk of developing WTC lung injury associated with particulate matter exposure. The PLSI-Cox model reduces the dimensionality of covariates while providing interpretable estimates of their effects. The model's flexible link function accommodates nonlinear effects on the log hazard function. We developed an iterative estimation algorithm using spline techniques to model the nonparametric single-index component for potential nonlinear effects, followed by maximum partial likelihood estimation of the parameters.
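For intuition, the fragment below shows one step of an alternating scheme of the kind described: with the index direction held fixed, the single index is expanded in a spline basis and the Cox partial likelihood is maximised over the basis coefficients; the full algorithm would then update the index coefficients, iterate, and handle time-dependent covariates. The data, basis size, and starting values are illustrative, not the authors' implementation.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from patsy import dmatrix

rng = np.random.default_rng(5)
n = 600

# Synthetic covariates (loosely named after metabolic syndrome indicators) and survival
# times with a nonlinear effect of a single index on the log hazard
X = pd.DataFrame(rng.normal(size=(n, 3)), columns=["bmi", "triglycerides", "glucose"])
beta_true = np.array([0.8, 0.5, 0.3]) / np.linalg.norm([0.8, 0.5, 0.3])
hazard = np.exp(np.sin(1.5 * (X.values @ beta_true)))        # nonlinear link function
time = rng.exponential(1.0 / hazard)
event = (rng.random(n) < 0.8).astype(int)

# One step of a PLSI-type fit with the index direction held fixed: expand the current
# single index in a B-spline basis and maximise the Cox partial likelihood over it.
beta_current = np.array([1.0, 0.0, 0.0])                     # crude starting direction
u = X.values @ (beta_current / np.linalg.norm(beta_current))
basis = dmatrix("bs(u, df=4, degree=3) - 1", {"u": u}, return_type="dataframe")
basis.columns = [f"b{j}" for j in range(basis.shape[1])]
df = basis.assign(time=time, event=event)

cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
print(cph.summary["coef"])                                    # spline coefficients of the link
```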
Results: Extensive simulations showed that the proposed PLSI-Cox model outperformed the classical time-dependent Cox regression model when the true relationship was nonlinear. When the relationship was linear, both the PLSI-Cox model and classical time-dependent Cox regression model performed similarly. In the data application, we found a possible nonlinear joint effect of metabolic syndrome indicators on survival risk. Among the different indicators, BMI had the largest positive effect on the risk of developing lung injury, followed by triglycerides.
Conclusion: The PLSI-Cox models allow for the evaluation of nonlinear effects of covariates and offer insights into their relative importance and direction. These methods provide a powerful set of tools for analyzing data with multiple time-dependent covariates and survival outcomes, potentially offering valuable insights for both current and future studies.
{"title":"Partial-linear single-index Cox regression models with multiple time-dependent covariates.","authors":"Myeonggyun Lee, Andrea B Troxel, Sophia Kwon, George Crowley, Theresa Schwartz, Rachel Zeig-Owens, David J Prezant, Anna Nolan, Mengling Liu","doi":"10.1186/s12874-024-02434-9","DOIUrl":"10.1186/s12874-024-02434-9","url":null,"abstract":"<p><strong>Background: </strong>In cohort studies with time-to-event outcomes, covariates of interest often have values that change over time. The classical Cox regression model can handle time-dependent covariates but assumes linear effects on the log hazard function, which can be limiting in practice. Furthermore, when multiple correlated covariates are studied, it is of great interest to model their joint effects by allowing a flexible functional form and to delineate their relative contributions to survival risk.</p><p><strong>Methods: </strong>Motivated by the World Trade Center (WTC)-exposed Fire Department of New York cohort study, we proposed a partial-linear single-index Cox (PLSI-Cox) model to investigate the effects of repeatedly measured metabolic syndrome indicators on the risk of developing WTC lung injury associated with particulate matter exposure. The PLSI-Cox model reduces the dimensionality of covariates while providing interpretable estimates of their effects. The model's flexible link function accommodates nonlinear effects on the log hazard function. We developed an iterative estimation algorithm using spline techniques to model the nonparametric single-index component for potential nonlinear effects, followed by maximum partial likelihood estimation of the parameters.</p><p><strong>Results: </strong>Extensive simulations showed that the proposed PLSI-Cox model outperformed the classical time-dependent Cox regression model when the true relationship was nonlinear. When the relationship was linear, both the PLSI-Cox model and classical time-dependent Cox regression model performed similarly. In the data application, we found a possible nonlinear joint effect of metabolic syndrome indicators on survival risk. Among the different indicators, BMI had the largest positive effect on the risk of developing lung injury, followed by triglycerides.</p><p><strong>Conclusion: </strong>The PLSI-Cox models allow for the evaluation of nonlinear effects of covariates and offer insights into their relative importance and direction. These methods provide a powerful set of tools for analyzing data with multiple time-dependent covariates and survival outcomes, potentially offering valuable insights for both current and future studies.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"311"},"PeriodicalIF":3.9,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11661057/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mixed-effects neural network modelling to predict longitudinal trends in fasting plasma glucose
Pub Date: 2024-12-20 | DOI: 10.1186/s12874-024-02442-9
Qiong Zou, Borui Chen, Yang Zhang, Xi Wu, Yi Wan, Changsheng Chen
Background: Accurate fasting plasma glucose (FPG) trend prediction is important for the management and treatment of patients with type 2 diabetes mellitus (T2DM), a globally prevalent chronic disease. (Generalised) linear mixed-effects (LME) models and machine learning (ML) are commonly used to analyse longitudinal data; however, the former is insufficient for dealing with complex, nonlinear data, whereas the latter ignores random effects. The aim of this study was to develop LME, back-propagation neural network (BPNN), and mixed-effects NN models that combine the two to predict FPG levels.
Methods: Monitoring data from 779 patients with T2DM from a multicentre, prospective study, obtained from the shared Figshare repository, were divided 80/20 into training/test sets. The 10 most important features were selected via random forest (RF) screening. First, an LME model was built to model interindividual differences, analyse the factors affecting FPG levels, compare AIC and BIC values to screen for the optimal model, and predict FPG levels. Second, multiple BPNN models were constructed from different variable sets to screen for the optimal BPNN. Finally, an LME/BPNN combined model, named LMENN, was constructed via stacking integration. A 10-fold cross-validation cycle was performed on the training set to build the model and evaluate its performance, and the final model was then evaluated on the test set.
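One plausible form of such LME-plus-BPNN stacking, sketched with statsmodels and scikit-learn on synthetic longitudinal data: a random-intercept LME is fitted first and its predictions are fed, together with the covariates, into a small neural network. Variable names, network size, and the exact stacking recipe are assumptions for illustration, not the authors' LMENN specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(11)
n_patients, n_visits = 100, 6

# Synthetic longitudinal FPG data with a patient-specific random intercept
df = pd.DataFrame({
    "patient": np.repeat(np.arange(n_patients), n_visits),
    "visit": np.tile(np.arange(n_visits), n_patients),
    "hba1c": rng.normal(7.0, 1.0, n_patients * n_visits),
    "bmi": rng.normal(26.0, 3.0, n_patients * n_visits),
})
rand_intercept = rng.normal(0.0, 0.8, n_patients)[df["patient"]]
df["fpg"] = 4.0 + 0.5 * df["hba1c"] + 0.05 * df["bmi"] + rand_intercept + rng.normal(0, 0.4, len(df))

# Stage 1: random-intercept LME captures between-patient variation
lme = smf.mixedlm("fpg ~ hba1c + bmi", df, groups=df["patient"]).fit()
df["lme_pred"] = lme.fittedvalues

# Stage 2: a small back-propagation network is stacked on top, taking the LME prediction
# and the covariates as inputs (in-sample fit only; no train/test split in this sketch)
nn = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
nn.fit(df[["lme_pred", "hba1c", "bmi", "visit"]], df["fpg"])
df["lmenn_pred"] = nn.predict(df[["lme_pred", "hba1c", "bmi", "visit"]])

def rmse(y, yhat):
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(yhat)) ** 2)))

print("LME RMSE:  ", round(rmse(df["fpg"], df["lme_pred"]), 3))
print("LMENN RMSE:", round(rmse(df["fpg"], df["lmenn_pred"]), 3))
```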
Results: The top 10 variables screened by RF were HOMA-β, HbA1c, HOMA-IR, urinary sugar, insulin, BMI, waist circumference, weight, age, and group. The best-fitting random-intercept mixed-effects (lm22) model showed that each patient's baseline glucose level influenced subsequent glucose measurements, but the trend over time was consistent. The LMENN model combines the strengths of LME and BPNN and accounts for random effects. The RMSE ranges of the LMENN model were 0.447-0.471 (training set), 0.525-0.552 (validation set), and 0.511-0.565 (test set). It improves the prediction performance of the single LME and BPNN models and shows some advantages in predicting FPG levels.
Conclusions: The LMENN model built by integrating LME and BPNN has several potential applications in analysing longitudinal FPG monitoring data. This study provides new ideas and methods for further research in the field of blood glucose prediction.
{"title":"Mixed-effects neural network modelling to predict longitudinal trends in fasting plasma glucose.","authors":"Qiong Zou, Borui Chen, Yang Zhang, Xi Wu, Yi Wan, Changsheng Chen","doi":"10.1186/s12874-024-02442-9","DOIUrl":"10.1186/s12874-024-02442-9","url":null,"abstract":"<p><strong>Background: </strong>Accurate fasting plasma glucose (FPG) trend prediction is important for management and treatment of patients with type 2 diabetes mellitus (T2DM), a globally prevalent chronic disease. (Generalised) linear mixed-effects (LME) models and machine learning (ML) are commonly used to analyse longitudinal data; however, the former is insufficient for dealing with complex, nonlinear data, whereas with the latter, random effects are ignored. The aim of this study was to develop LME, back propagation neural network (BPNN), and mixed-effects NN models that combine the 2 to predict FPG levels.</p><p><strong>Methods: </strong>Monitoring data from 779 patients with T2DM from a multicentre, prospective study from the shared platform Figshare repository were divided 80/20 into training/test sets. The first 10 important features were modelled via random forest (RF) screening. First, an LME model was built to model interindividual differences, analyse the factors affecting FPG levels, compare the AIC and BIC values to screen the optimal model, and predict FPG levels. Second, multiple BPNN models were constructed via different variable sets to screen the optimal BPNN. Finally, an LME/BPNN combined model, named LMENN, was constructed via stacking integration. A 10-fold cross-validation cycle was performed using the training set to build the model and evaluate its performance, and then the final model was evaluated on the test set.</p><p><strong>Results: </strong>The top 10 variables screened by RF were HOMA-β, HbA1c, HOMA-IR, urinary sugar, insulin, BMI, waist circumference, weight, age, and group. The best-fitting random-intercept mixed-effects (lm22) model showed that each patient's baseline glucose levels influenced subsequent glucose measurements, but the trend over time was consistent. The LMENN model combines the strengths of LME and BPNN and accounts for random effects. The RMSE of the LMENN model ranges were 0.447-0.471 (training set), 0.525-0.552 (validation set), and 0.511-0.565 (test set). It improves the prediction performance of the single LME and BPNN models and shows some advantages in predicting FPG levels.</p><p><strong>Conclusions: </strong>The LMENN model built by integrating LME and BPNN has several potential applications in analysing longitudinal FPG monitoring data. This study provides new ideas and methods for further research in the field of blood glucose prediction.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"313"},"PeriodicalIF":3.9,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11660730/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient analysis of drug interactions in liver injury: a retrospective study leveraging natural language processing and machine learning
Pub Date: 2024-12-20 | DOI: 10.1186/s12874-024-02443-8
Junlong Ma, Heng Chen, Ji Sun, Juanjuan Huang, Gefei He, Guoping Yang
Background: Liver injury from drug-drug interactions (DDIs), notably with anti-tuberculosis drugs such as isoniazid, poses a significant safety concern. Electronic medical records contain comprehensive clinical information and have gained increasing attention as a potential resource for DDI detection. However, a substantial portion of adverse drug reaction (ADR) information is hidden in unstructured narrative text, which has yet to be efficiently harnessed, thereby introducing bias into research. There is a significant need for an efficient framework for DDI assessment.
Methods: Using a Chinese natural language processing (NLP) model, we extracted 25,130 ADR records, dividing them into sets for training an automated normalization model. The trained models, in conjunction with liver function laboratory tests, were used to thoroughly and efficiently identify liver injury cases. Finally, we applied a case-control study design to detect DDI signals that increase isoniazid's risk of liver injury.
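The case-control screening step reduces, for each candidate co-medication, to a 2x2 table of liver-injury cases versus controls by exposure. A minimal sketch with a Wald confidence interval for the odds ratio is shown below on synthetic data; the exposure and case frequencies are invented, not the study's figures.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(21)
n = 3209

# Synthetic isoniazid cohort: liver-injury status and exposure to one candidate co-medication
df = pd.DataFrame({
    "liver_injury": rng.random(n) < 0.04,
    "co_drug": rng.random(n) < 0.15,
})

# 2x2 case-control table for the candidate drug
a = (df["co_drug"] & df["liver_injury"]).sum()       # exposed cases
b = (df["co_drug"] & ~df["liver_injury"]).sum()      # exposed controls
c = (~df["co_drug"] & df["liver_injury"]).sum()      # unexposed cases
d = (~df["co_drug"] & ~df["liver_injury"]).sum()     # unexposed controls

odds_ratio = (a * d) / (b * c)
se_log_or = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # Wald standard error on the log scale
ci = np.exp(np.log(odds_ratio) + np.array([-1.96, 1.96]) * se_log_or)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```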
Results: The logistic regression model demonstrated stable and superior performance in the classification task. Based on laboratory criteria and NLP, we identified 128 liver injury cases in a cohort of 3,209 patients treated with isoniazid. Preliminary screening of 113 drug combinations with isoniazid highlighted 20 potential signal drugs, with antibacterials constituting 25%. Sensitivity analysis confirmed the robustness of the signal drugs, especially in cardiac therapy and antibacterials.
Conclusion: Our NLP and machine learning approach effectively identifies isoniazid-related DDIs that increase the risk of liver injury, identifying 20 signal drugs, mainly antibacterials. Further research is required to validate these DDI signals.
{"title":"Efficient analysis of drug interactions in liver injury: a retrospective study leveraging natural language processing and machine learning.","authors":"Junlong Ma, Heng Chen, Ji Sun, Juanjuan Huang, Gefei He, Guoping Yang","doi":"10.1186/s12874-024-02443-8","DOIUrl":"10.1186/s12874-024-02443-8","url":null,"abstract":"<p><strong>Background: </strong>Liver injury from drug-drug interactions (DDIs), notably with anti-tuberculosis drugs such as isoniazid, poses a significant safety concern. Electronic medical records contain comprehensive clinical information and have gained increasing attention as a potential resource for DDI detection. However, a substantial portion of adverse drug reaction (ADR) information is hidden in unstructured narrative text, which has yet to be efficiently harnessed, thereby introducing bias into the research. There is a significant need for an efficient framework for the DDI assessment.</p><p><strong>Methods: </strong>Using a Chinese natural language processing (NLP) model, we extracted 25,130 adverse drug reaction (ADR) records, dividing them into sets for training an automated normalization model. The trained models, in conjunction with liver function laboratory tests, were used to thoroughly and efficiently identify liver injury cases. Ultimately, we applied a case-control study design to detect DDI signals increasing isoniazid's liver injury risk.</p><p><strong>Results: </strong>The Logistic Regression model demonstrated stable and superior performance in classification task. Based on laboratory criteria and NLP, we identified 128 liver injury cases among a cohort of 3,209 patients treated with isoniazid. Preliminary screening of 113 drug combinations with isoniazid highlighted 20 potential signal drugs, with antibacterials constituting 25%. Sensitivity analysis confirmed the robustness of signal drugs, especially in cardiac therapy and antibacterials.</p><p><strong>Conclusion: </strong>Our NLP and machine learning approach effectively identifies isoniazid-related DDIs that increase the risk of liver injury, identifying 20 signal drugs, mainly antibacterials. Further research is required to validate these DDI signals.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"312"},"PeriodicalIF":3.9,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11660714/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A comprehensive guide to study the agreement and reliability of multi-observer ordinal data
Pub Date: 2024-12-20 | DOI: 10.1186/s12874-024-02431-y
Sophie Vanbelle, Christina Hernandez Engelhart, Ellen Blix
Background: A recent systematic review revealed issues in the conduct and reporting of agreement and reliability studies for ordinal scales, especially in the presence of more than two observers. This paper therefore aims to provide the information needed to choose among the most meaningful and most widely used measures and to plan agreement and reliability studies for ordinal outcomes.
Methods: This paper considers the generalisation of the proportion of (dis)agreement, the mean absolute deviation, the mean squared deviation and weighted kappa coefficients to more than two observers in the presence of an ordinal outcome.
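As a simple illustration of these quantities (not the paper's multi-observer estimators, for which the authors provide an R package), the sketch below computes pooled pairwise agreement summaries and an average pairwise linearly weighted kappa for four observers rating a 5-point ordinal scale on synthetic data; pairwise averaging is used here only as a stand-in for the generalisations the paper derives.

```python
import numpy as np
from itertools import combinations
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(8)
n_patients, n_raters, k_categories = 60, 4, 5

# Synthetic ordinal ratings (e.g., a 5-point clinical scale) from 4 observers
truth = rng.integers(0, k_categories, n_patients)
ratings = np.clip(truth[:, None] + rng.integers(-1, 2, (n_patients, n_raters)),
                  0, k_categories - 1)

# Agreement-type summaries pooled over all observer pairs
pairs = list(combinations(range(n_raters), 2))
diffs = np.concatenate([np.abs(ratings[:, i] - ratings[:, j]) for i, j in pairs])
print("proportion of exact agreement:", round(float(np.mean(diffs == 0)), 3))
print("mean absolute deviation:      ", round(float(np.mean(diffs)), 3))
print("mean squared deviation:       ", round(float(np.mean(diffs.astype(float) ** 2)), 3))

# Reliability-type summary: linearly weighted kappa averaged over observer pairs
kappas = [cohen_kappa_score(ratings[:, i], ratings[:, j], weights="linear") for i, j in pairs]
print("mean pairwise weighted kappa: ", round(float(np.mean(kappas)), 3))
```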
Results: After highlighting the difference between the concepts of agreement and reliability, a clear and simple interpretation of the agreement and reliability coefficients is provided. The large-sample variance of the various coefficients under the delta method is presented, or derived where not available in the literature, to construct Wald confidence intervals. Finally, a procedure is provided to determine the minimum number of raters and patients needed to limit the uncertainty associated with the sampling process. All the methods are available in an R package and a Shiny application to circumvent the limitations of current software.
Conclusions: The present paper completes existing guidelines, such as the Guidelines for Reporting Reliability and Agreement Studies (GRRAS), to improve the quality of reliability and agreement studies of clinical tests. Furthermore, we provide open source software to researchers with minimum programming skills.
{"title":"A comprehensive guide to study the agreement and reliability of multi-observer ordinal data.","authors":"Sophie Vanbelle, Christina Hernandez Engelhart, Ellen Blix","doi":"10.1186/s12874-024-02431-y","DOIUrl":"10.1186/s12874-024-02431-y","url":null,"abstract":"<p><strong>Background: </strong>A recent systematic review revealed issues in regard to performing and reporting agreement and reliability studies for ordinal scales, especially in the presence of more than two observers. This paper therefore aims to provide all necessary information in regard to the choice among the most meaningful and most used measures and the planning of agreement and reliability studies for ordinal outcomes.</p><p><strong>Methods: </strong>This paper considers the generalisation of the proportion of (dis)agreement, the mean absolute deviation, the mean squared deviation and weighted kappa coefficients to more than two observers in the presence of an ordinal outcome.</p><p><strong>Results: </strong>After highlighting the difference between the concepts of agreement and reliability, a clear and simple interpretation of the agreement and reliability coefficients is provided. The large sample variance of the various coefficients with the delta method is presented or derived if not available in the literature to construct Wald confidence intervals. Finally, a procedure to determine the minimum number of raters and patients needed to limit the uncertainty associated with the sampling process is provided. All the methods are available in an R package and a Shiny application to circumvent the limitations of current software.</p><p><strong>Conclusions: </strong>The present paper completes existing guidelines, such as the Guidelines for Reporting Reliability and Agreement Studies (GRRAS), to improve the quality of reliability and agreement studies of clinical tests. Furthermore, we provide open source software to researchers with minimum programming skills.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"310"},"PeriodicalIF":3.9,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11660713/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142871329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}