
BioMedInformatics: Latest Articles

Towards the Generation of Medical Imaging Classifiers Robust to Common Perturbations
Pub Date : 2024-04-01 DOI: 10.3390/biomedinformatics4020050
Joshua Chuah, Pingkun Yan, Ge Wang, Juergen Hahn
Background: Machine learning (ML) and artificial intelligence (AI)-based classifiers can be used to diagnose diseases from medical imaging data. However, few of the classifiers proposed in the literature translate to clinical use because of robustness concerns. Materials and methods: This study investigates how to improve the robustness of AI/ML imaging classifiers by simultaneously applying perturbations of common effects (Gaussian noise, contrast, blur, rotation, and tilt) to different amounts of training and test images. Furthermore, a comparison with classifiers trained with adversarial noise is also presented. This procedure is illustrated using two publicly available datasets, the PneumoniaMNIST dataset and the Breast Ultrasound Images dataset (BUSI dataset). Results: Classifiers trained with small amounts of perturbed training images showed similar performance on unperturbed test images compared to the classifier trained with no perturbations. Additionally, classifiers trained with perturbed data performed significantly better on test data both perturbed by a single perturbation (p-values: noise = 0.0186; contrast = 0.0420; rotation, tilt, and blur = 0.000977) and multiple perturbations (p-values: PneumoniaMNIST = 0.000977; BUSI = 0.00684) than the classifier trained with unperturbed data. Conclusions: Classifiers trained with perturbed data were found to be more robust to perturbed test data than the unperturbed classifier without exhibiting a performance decrease on unperturbed test images, indicating benefits to training with data that include some perturbed images and no significant downsides.
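The idea of perturbing only part of the training set can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the noise level `sigma`, the contrast factor, the box blur (a stand-in for a real blur kernel), and the random placeholder images are all assumptions, and rotation and tilt are omitted for brevity.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to [0, 1]."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def adjust_contrast(img, factor):
    """Scale deviations from the image mean by `factor`, then clip."""
    mean = img.mean()
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)

def box_blur(img, k=3):
    """Blur with a k x k mean filter on an edge-padded image."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def perturb_fraction(images, fraction, rng, sigma=0.05, contrast=0.7):
    """Return a copy of `images` with a random `fraction` perturbed,
    plus the sorted indices of the perturbed images."""
    images = images.astype(float).copy()
    n_perturbed = int(round(fraction * len(images)))
    idx = rng.choice(len(images), size=n_perturbed, replace=False)
    for i in idx:
        img = add_gaussian_noise(images[i], sigma, rng)
        img = adjust_contrast(img, contrast)
        images[i] = box_blur(img)
    return images, np.sort(idx)

rng = np.random.default_rng(0)
train = rng.random((20, 28, 28))  # hypothetical grayscale images in [0, 1]
augmented, perturbed_idx = perturb_fraction(train, fraction=0.25, rng=rng)
```

Varying `fraction` gives training sets with different amounts of perturbed data, which is the quantity the abstract compares across classifiers.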
Citations: 0
Hearables: In-Ear Multimodal Data Fusion for Robust Heart Rate Estimation
Pub Date : 2024-04-01 DOI: 10.3390/biomedinformatics4020051
Marek Żyliński, Amir Nassibi, Edoardo Occhipinti, Adil Malik, Matteo Bermond, H. Davies, Danilo P. Mandic
Background: Ambulatory heart rate (HR) monitors that acquire electrocardiogram (ECG) and/or photoplethysmogram (PPG) signals from the torso, wrists, or ears are notably less accurate than clinical measurements in tasks involving high levels of movement. However, a reliable estimation of HR can be obtained through data fusion from different sensors. These methods are especially suitable for multimodal hearable devices, where heart rate can be tracked from different modalities, including electrical ECG, optical PPG, and sounds (heart tones). Combined information from different modalities can compensate for single-source limitations. Methods: In this paper, we evaluate the possible application of data fusion methods in hearables. We assess data fusion for heart rate estimation from simultaneous in-ear ECG and in-ear PPG, recorded from ten subjects performing 5-min sitting and walking tasks. Results: Our findings show that data fusion methods provide a similar level of mean absolute error as the best single-source heart rate estimation but with much lower intra-subject variability, especially during walking activities. Conclusion: We conclude that data fusion methods provide more robust HR estimation than a single cardiovascular signal. These methods can enhance the performance of wearable devices, especially multimodal hearables, in heart rate tracking during physical activity.
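The abstract does not say which fusion methods were evaluated. As one common possibility, the sketch below uses inverse-variance weighting, in which the noisier source contributes less to the fused estimate. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def fuse_inverse_variance(estimates, variances):
    """Fuse per-source heart rate estimates by inverse-variance weighting.

    estimates, variances: arrays of shape (n_sources,) or (n_sources, n_windows).
    Returns the fused estimate and its variance.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances
    fused = (weights * estimates).sum(axis=0) / weights.sum(axis=0)
    fused_var = 1.0 / weights.sum(axis=0)
    return fused, fused_var

# Hypothetical single-window example: ECG says 72 bpm (variance 4),
# PPG says 80 bpm (variance 16, e.g. motion-corrupted during walking).
hr, hr_var = fuse_inverse_variance([72.0, 80.0], [4.0, 16.0])
print(round(float(hr), 2), round(float(hr_var), 2))  # 73.6 3.2
```

The fused variance is never larger than the smallest input variance, which is one way to read the abstract's finding of lower variability at a similar mean error.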
Citations: 0
Advancing Early Leukemia Diagnostics: A Comprehensive Study Incorporating Image Processing and Transfer Learning
Pub Date : 2024-04-01 DOI: 10.3390/biomedinformatics4020054
Rezaul Haque, Abdullah Al Sakib, Md Forhad Hossain, Fahadul Islam, Ferdaus Ibne Aziz, Md Redwan Ahmed, Somasundar Kannan, Ali Rohan, Md Junayed Hasan
Disease recognition has been revolutionized by autonomous systems in the rapidly developing field of medical technology. A crucial aspect of diagnosis involves the visual assessment and enumeration of white blood cells in microscopic peripheral blood smears. This practice yields invaluable insights into a patient’s health, enabling the identification of blood malignancies such as leukemia. Early identification of leukemia subtypes is paramount for tailoring appropriate therapeutic interventions and enhancing patient survival rates. However, traditional diagnostic techniques, which depend on visual assessment, are arbitrary, laborious, and prone to errors. The advent of ML technologies offers a promising avenue for more accurate and efficient leukemia classification. In this study, we introduced a novel approach to leukemia classification by integrating advanced image processing, diverse dataset utilization, and sophisticated feature extraction techniques, coupled with the development of transfer learning (TL) models. Focused on improving on the accuracy of previous studies, our approach utilized Kaggle datasets for binary and multiclass classifications. Extensive image processing involved a novel LoGMH method, complemented by diverse augmentation techniques. Feature extraction employed a DCNN, and the extracted features were then used to train various ML and TL models. Rigorous evaluation using traditional metrics revealed Inception-ResNet’s superior performance, surpassing other models with F1 scores of 96.07% and 95.89% for binary and multiclass classification, respectively. Our results notably surpass previous research, particularly in cases involving a higher number of classes. These findings promise to influence clinical decision support systems, guide future research, and potentially revolutionize cancer diagnostics beyond leukemia, impacting broader medical imaging and oncology domains.
Citations: 0
Exploring the Role of ChatGPT in Oncology: Providing Information and Support for Cancer Patients
Pub Date : 2024-03-25 DOI: 10.3390/biomedinformatics4020049
Maurizio Cè, Vittoria Chiarpenello, Alessandra Bubba, P. Felisaz, G. Oliva, Giovanni Irmici, M. Cellina
Introduction: Oncological patients face numerous challenges throughout their cancer journey while navigating complex medical information. The advent of AI-based conversational models like ChatGPT (San Francisco, OpenAI) represents an innovation in oncological patient management. Methods: We conducted a comprehensive review of the literature on the use of ChatGPT in providing tailored information and support to patients with various types of cancer, including head and neck, liver, prostate, breast, lung, pancreas, colon, and cervical cancer. Results and Discussion: Our findings indicate that, in most instances, ChatGPT responses were accurate, dependable, and aligned with the expertise of oncology professionals, especially for certain subtypes of cancers like head and neck and prostate cancers. Furthermore, the system demonstrated a remarkable ability to comprehend patients’ emotional responses and offer proactive solutions and advice. Nevertheless, these models have also shown notable limitations and cannot serve as a substitute for the role of a physician under any circumstances. Conclusions: Conversational models like ChatGPT can significantly enhance the overall well-being and empowerment of oncological patients. Both patients and healthcare providers must become well-versed in the advantages and limitations of these emerging technologies.
Citations: 0
Comparing ANOVA and PowerShap Feature Selection Methods via Shapley Additive Explanations of Models of Mental Workload Built with the Theta and Alpha EEG Band Ratios
Pub Date : 2024-03-19 DOI: 10.3390/biomedinformatics4010048
Bujar Raufi, Luca Longo
Background: Creating models to differentiate self-reported mental workload perceptions is challenging and requires machine learning to identify features from EEG signals. EEG band ratios quantify human activity, but limited research on mental workload assessment exists. This study evaluates the use of theta-to-alpha and alpha-to-theta EEG band ratio features to distinguish human self-reported perceptions of mental workload. Methods: In this study, EEG data from 48 participants were analyzed while engaged in resting and task-intensive activities. Multiple mental workload indices were developed using different EEG channel clusters and band ratios. ANOVA’s F-score and PowerSHAP were used to extract the statistical features. At the same time, models were built and tested using techniques such as Logistic Regression, Gradient Boosting, and Random Forest. These models were then explained using Shapley Additive Explanations. Results: Based on the results, using PowerSHAP to select features led to improved model performance, exhibiting an accuracy exceeding 90% across three mental workload indexes. In contrast, statistical techniques for model building indicated poorer results across all mental workload indexes. Moreover, using Shapley values to evaluate feature contributions to the model output, it was noted that features rated low in importance by both ANOVA F-score and PowerSHAP measures played the most substantial role in determining the model output. Conclusions: Using models with Shapley values can reduce data complexity and improve the training of better discriminative models for perceived human mental workload. However, the outcomes can sometimes be unclear due to variations in the significance of features during the selection process and their actual impact on the model output.
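The ANOVA F-score used for feature ranking is the ratio of between-class to within-class variance, computed separately for each feature. The following is a small NumPy sketch of that statistic on synthetic data. The function name and the data are illustrative assumptions, and this is not the study's code or its EEG features.

```python
import numpy as np

def anova_f_scores(X, y):
    """One-way ANOVA F statistic for each feature column of X, grouped by label y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        group = X[y == c]
        group_mean = group.mean(axis=0)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += ((group - group_mean) ** 2).sum(axis=0)
    df_between = len(classes) - 1
    df_within = len(X) - len(classes)
    return (ss_between / df_between) / (ss_within / df_within)

# Synthetic example: only feature 0 differs between the two classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 3))
X[:, 0] += 2.0 * y
scores = anova_f_scores(X, y)
ranking = np.argsort(scores)[::-1]  # feature indices, most discriminative first
```

A ranking like this scores each feature in isolation, whereas Shapley-based attributions depend on the fitted model. That difference is one plausible reason the two can disagree, as the abstract reports.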
Citations: 0
Generative Pre-Trained Transformer-Empowered Healthcare Conversations: Current Trends, Challenges, and Future Directions in Large Language Model-Enabled Medical Chatbots
Pub Date : 2024-03-14 DOI: 10.3390/biomedinformatics4010047
J. Chow, Valerie Wong, Kay Li
This review explores the transformative integration of artificial intelligence (AI) and healthcare through conversational AI leveraging Natural Language Processing (NLP). Focusing on Large Language Models (LLMs), this paper navigates through various sections, commencing with an overview of AI’s significance in healthcare and the role of conversational AI. It delves into fundamental NLP techniques, emphasizing their facilitation of seamless healthcare conversations. Examining the evolution of LLMs within NLP frameworks, the paper discusses key models used in healthcare, exploring their advantages and implementation challenges. Practical applications in healthcare conversations, from patient-centric utilities like diagnosis and treatment suggestions to healthcare provider support systems, are detailed. Ethical and legal considerations, including patient privacy, ethical implications, and regulatory compliance, are addressed. The review concludes by spotlighting current challenges, envisaging future trends, and highlighting the transformative potential of LLMs and NLP in reshaping healthcare interactions.
Citations: 0
Overall Survival Time Estimation for Epithelioid Peritoneal Mesothelioma Patients from Whole-Slide Images
Pub Date : 2024-03-13 DOI: 10.3390/biomedinformatics4010046
Kleanthis Marios Papadopoulos, P. Barmpoutis, Tania Stathaki, V. Kepenekian, Peggy Dartigues, S. Valmary-Degano, Claire Illac-Vauquelin, G. Avérous, A. Chevallier, M. Lavérriere, L. Villeneuve, Olivier Glehen, Sylvie Isaac, J. Hommell-Fontaine, Francois Ng Kee Kwong, N. Benzerdjeb
Background: The advent of Deep Learning initiated a new era in which neural networks relying solely on Whole-Slide Images can estimate the survival time of cancer patients. Remarkably, despite deep learning’s potential in this domain, no prior research has been conducted on image-based survival analysis specifically for peritoneal mesothelioma. Prior studies performed statistical analysis to identify disease factors impacting patients’ survival time. Methods: Therefore, we introduce MPeMSupervisedSurv, a Convolutional Neural Network designed to predict the survival time of patients diagnosed with this disease. We subsequently perform patient stratification based on factors such as their Peritoneal Cancer Index and on whether patients received chemotherapy treatment. Results: MPeMSupervisedSurv demonstrates improvements over comparable methods. Using our proposed model, we performed patient stratification to assess the impact of clinical variables on survival time. Notably, the inclusion of information regarding adjuvant chemotherapy significantly enhances the model’s predictive prowess. Conversely, repeating the process for other factors did not yield significant performance improvements. Conclusions: Overall, MPeMSupervisedSurv is an effective neural network which can predict the survival time of peritoneal mesothelioma patients. Our findings also indicate that treatment by adjuvant chemotherapy could be a factor affecting survival time.
Citations: 0
The Effect of Data Missingness on Machine Learning Predictions of Uncontrolled Diabetes Using All of Us Data
Pub Date : 2024-03-06 DOI: 10.3390/biomedinformatics4010043
Zain Jabbar, Peter Washington
Electronic Health Records (EHR) provide a vast amount of patient data relevant to predicting clinical outcomes. The inherent presence of missing values poses challenges to building performant machine learning models. This paper investigates the effect of various imputation methods on the National Institutes of Health’s All of Us dataset, which contains a high degree of data missingness. We apply several imputation techniques, such as mean substitution, constant filling, and multiple imputation, to the same dataset for the task of diabetes prediction. We find that imputing values causes heteroskedastic performance for machine learning models as data missingness increases. That is, the more test values a patient is missing, the higher the variance in the diabetes model’s AUROC, F1, precision, recall, and accuracy scores. This highlights a critical challenge in using EHR data for predictive modeling, and the need for future research on methodologies that mitigate the effects of missing data and heteroskedasticity in EHR-based predictive models.
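As a rough illustration of two of the imputation techniques named in this abstract, the sketch below applies mean substitution and constant filling to a single column of lab values. It is not the authors' pipeline: the function names, the `hba1c` column, and the fill constant are hypothetical, and multiple imputation is deliberately left out because it needs a fitted model rather than a single fill value.

```python
from statistics import mean
from typing import List, Optional


def mean_substitution(values: List[Optional[float]]) -> List[float]:
    """Replace each missing entry (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def constant_fill(values: List[Optional[float]], constant: float = 0.0) -> List[float]:
    """Replace each missing entry (None) with a fixed sentinel value."""
    return [constant if v is None else v for v in values]


# Hypothetical lab column with two missing measurements.
hba1c = [6.1, None, 7.4, None, 8.0]
print(mean_substitution(hba1c))
print(constant_fill(hba1c, constant=-1.0))
```

Both strategies give every patient a complete row, but the filled values carry no patient-specific information, which is one plausible reason model performance could become less stable as the number of missing tests grows.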
Citations: 0
Machine Learning Models and Technologies for Evidence-Based Telehealth and Smart Care: A Review
Pub Date : 2024-03-04 DOI: 10.3390/biomedinformatics4010042
Stella C. Christopoulou
Background: Over the past few years, clinical studies have utilized machine learning in telehealth and smart care for disease management, self-management, and managing health issues like pulmonary diseases, heart failure, diabetes screening, and intraoperative risks. However, a systematic review of machine learning’s use in evidence-based telehealth and smart care is lacking, as evidence-based practice aims to eliminate biases and subjective opinions. Methods: The author conducted a mixed methods review to explore machine learning applications in evidence-based telehealth and smart care. A systematic search of the literature was performed from 16 to 27 June 2023 in Google Scholar, PubMed, and the clinical registry platform ClinicalTrials.gov. Articles were included in the review if they were implemented through evidence-based health informatics and concerned telehealth and smart care technologies. Results: The author identifies 18 key studies (17 clinical trials) from 175 citations found in internet databases and categorizes them using problem-specific groupings, medical/health domains, machine learning models, algorithms, and techniques. Conclusions: Machine learning combined with the application of evidence-based practices in healthcare can enhance telehealth and smart care strategies by improving quality of personalized care, early detection of health-related problems, patient quality of life, patient-physician communication, resource efficiency, and cost-effectiveness. However, this requires interdisciplinary expertise and collaboration among stakeholders, including clinicians, informaticians, and policymakers. Therefore, further research using clinical studies, systematic reviews, analyses, and meta-analyses is required to fully exploit the potential of machine learning in this area.
Citations: 0
Forecasting Survival Rates in Metastatic Colorectal Cancer Patients Undergoing Bevacizumab-Based Chemotherapy: A Machine Learning Approach
Pub Date : 2024-03-02 DOI: 10.3390/biomedinformatics4010041
Sergio Sánchez-Herrero, Abtin Tondar, E. Pérez-Bernabeu, Laura Calvet, Angel A. Juan
Background: Antibiotics can play a pivotal role in the treatment of colorectal cancer (CRC) at various stages of the disease, both directly and indirectly. Identifying novel patterns of antibiotic effects or responses in CRC within extensive medical data poses a significant challenge that can be addressed through algorithmic approaches. Machine Learning (ML) emerges as a promising solution for predicting clinical outcomes using clinical and heterogeneous cancer data. In pursuit of this objective, we employed ML techniques for predicting CRC mortality and antibiotic influence. Methods: We utilized a dataset to examine the accuracy of death prediction in metastatic colorectal cancer. In addition, we analyzed the association between antibiotic exposure and mortality in metastatic colorectal cancer. The dataset comprised 147 patients, nineteen independent variables, and one dependent variable. Our analysis involved testing different supervised ML classification models, including an oversampling pool for classification models, Logistic Regression, Decision Trees, Naive Bayes, Support Vector Machine, Random Forest, XGBoost Classifier, a consensus of all models, and a consensus of top models (meta models). Results: The consensus of the top models’ classifier exhibited the highest accuracy among the algorithms tested (93%). This model met the standards for good accuracy, surpassing the 90% threshold considered useful in ML applications. Consistent with the accuracy results, the other metrics were also strong, including precision (0.96), recall (0.93), F-Beta (0.94), and AUC (0.93). Hazard ratio analysis suggests that there is no discernible difference between patients who received antibiotics and those who did not. Conclusions: Our modelling approach provides an alternative for analyzing and predicting the relationship between antibiotics and mortality in metastatic colorectal cancer patients treated with bevacizumab, complementing classic statistical methods. This methodology lays the groundwork for future use of datasets in cancer treatment research and highlights the advantages of meta models.
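The abstract does not say how the consensus of models was formed, so the sketch below assumes the simplest option, a strict majority vote over binary predictions, and then scores the result with precision and recall. The function names, the tie rule (ties go to class 0), and the toy predictions are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def majority_vote(predictions_per_model: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary predictions from several models into one label per patient.

    A patient is labeled 1 only if strictly more than half of the models vote 1
    (assumed tie rule: ties fall to 0).
    """
    n_models = len(predictions_per_model)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*predictions_per_model)]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy predictions from three hypothetical models for four patients.
model_predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
consensus = majority_vote(model_predictions)
print(consensus)
print(precision_recall([1, 0, 0, 1], consensus))
```

A weighted vote or a stacked meta-learner would be equally valid readings of "consensus of top models"; majority voting is used here only because it needs no extra training step.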
Citations: 0