A LLM-Based Hybrid-Transformer Diagnosis System in Healthcare
Pub Date: 2024-10-16 | DOI: 10.1109/JBHI.2024.3481412
Dongyuan Wu, Liming Nie, Rao Asad Mumtaz, Kadambri Agarwal
The application of computer vision-powered large language models (LLMs) to medical image diagnosis has significantly advanced healthcare systems. Recent progress in developing symmetrical architectures has greatly impacted various medical imaging tasks. While CNNs and RNNs have demonstrated excellent performance, these architectures often suffer substantial loss of detailed information: they struggle to capture global semantic information effectively and rely heavily on deep encoders and aggressive downsampling. This paper introduces a novel LLM-based Hybrid-Transformer Network (HybridTransNet) designed to encode tokenized Big Data patches with the transformer mechanism, which elegantly embeds multimodal data of varying sizes as token-sequence inputs of LLMs. The network then performs both inter-scale and intra-scale self-attention, processing data features through a transformer-based symmetric architecture with a refining module, which facilitates accurately recovering both local and global context information. Additionally, the output is refined using a novel fuzzy selector. Compared with other existing methods on two distinct datasets, the experimental findings and formal assessment demonstrate that our LLM-based HybridTransNet provides superior performance for brain tumor diagnosis in healthcare informatics.
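The abstract does not detail the attention layout; as a rough illustration, the sketch below applies self-attention within each scale's token sequence (intra-scale) and then across the concatenated scales (inter-scale). All module names, dimensions, and the residual wiring are assumptions, not the authors' implementation.

```python
# Hedged sketch of inter-/intra-scale self-attention over multi-scale tokens.
import torch
import torch.nn as nn

class TwoLevelAttention(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, scales):
        # scales: list of (B, N_i, dim) token sequences, one per image scale.
        # Intra-scale: attend within each scale's own tokens (residual added).
        refined = [self.intra(s, s, s)[0] + s for s in scales]
        # Inter-scale: attend jointly across all scales' tokens.
        joint = torch.cat(refined, dim=1)
        fused, _ = self.inter(joint, joint, joint)
        return fused + joint

tokens = [torch.randn(2, n, 256) for n in (64, 16, 4)]  # three assumed scales
out = TwoLevelAttention()(tokens)
print(out.shape)  # torch.Size([2, 84, 256])
```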
{"title":"A LLM-Based Hybrid-Transformer Diagnosis System in Healthcare.","authors":"Dongyuan Wu, Liming Nie, Rao Asad Mumtaz, Kadambri Agarwal","doi":"10.1109/JBHI.2024.3481412","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3481412","url":null,"abstract":"<p><p>The application of computer vision-powered large language models (LLMs) for medical image diagnosis has significantly advanced healthcare systems. Recent progress in developing symmetrical architectures has greatly impacted various medical imaging tasks. While CNNs or RNNs have demonstrated excellent performance, these architectures often face notable limitations of substantial losses in detailed information, such as requiring to capture global semantic information effectively and relying heavily on deep encoders and aggressive downsampling. This paper introduces a novel LLM-based Hybrid-Transformer Network (HybridTransNet) designed to encode tokenized Big Data patches with the transformer mechanism, which elegantly embeds multimodal data of varying sizes as token sequence inputs of LLMS. Subsequently, the network performs both inter-scale and intra-scale self-attention, processing data features through a transformer-based symmetric architecture with a refining module, which facilitates accurately recovering both local and global context information. Additionally, the output is refined using a novel fuzzy selector. Compared to other existing methods on two distinct datasets, the experimental findings and formal assessment demonstrate that our LLM-based HybridTransNet provides superior performance for brain tumor diagnosis in healthcare informatics.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SBTD: Secured Brain Tumor Detection in IoMT Enabled Smart Healthcare
Pub Date: 2024-10-16 | DOI: 10.1109/JBHI.2024.3482465
Nishtha Tomar, Parkala Vishnu Bharadwaj Bayari, Gaurav Bhatnagar
Brain tumors are fatal and severely disrupt brain function as they advance. Timely detection and precise monitoring are crucial for improving patient outcomes and survival. A smart healthcare system leveraging the Internet of Medical Things (IoMT) revolutionizes patient care by offering streamlined remote healthcare, especially for individuals with acute medical conditions like brain tumors. However, such systems face significant challenges: (1) the increasing prevalence of cyberattacks in the expanding digital healthcare landscape, and (2) the lack of reliability and accuracy in existing tumor detection methods. To address these issues, we propose Secured Brain Tumor Detection (SBTD), the first unified system integrating IoMT with secure tumor detection. SBTD features: (1) a robust security framework, grounded in chaos theory, to safeguard medical data; and (2) a reliable machine learning-based tumor detection framework that accurately localizes tumors using their anatomy. Comprehensive experimental evaluations on different multimodal MRI datasets demonstrate the system's suitability, clinical applicability and superior performance over state-of-the-art algorithms.
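The abstract grounds its security framework in chaos theory without specifying a construction. A common pattern in chaos-based image encryption is a logistic-map keystream XORed with the image bytes; the sketch below shows that pattern purely as an illustrative assumption, not the paper's scheme.

```python
# Illustrative chaos-based stream cipher: logistic map -> keystream -> XOR.
import numpy as np

def logistic_keystream(length, x0=0.7, r=3.99):
    """Generate a pseudo-random byte stream from the chaotic logistic map."""
    x, out = x0, np.empty(length, dtype=np.uint8)
    for i in range(length):
        x = r * x * (1.0 - x)          # chaotic iteration, x stays in (0, 1)
        out[i] = int(x * 256) % 256    # quantize state to one byte
    return out

def xor_cipher(img_bytes, key=(0.7, 3.99)):
    ks = logistic_keystream(img_bytes.size, *key)
    return img_bytes ^ ks              # the same call encrypts and decrypts

scan = np.random.randint(0, 256, (8, 8), dtype=np.uint8)  # toy MRI tile
enc = xor_cipher(scan.ravel()).reshape(scan.shape)
assert np.array_equal(xor_cipher(enc.ravel()).reshape(scan.shape), scan)
```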
{"title":"SBTD: Secured Brain Tumor Detection in IoMT Enabled Smart Healthcare.","authors":"Nishtha Tomar, Parkala Vishnu Bharadwaj Bayari, Gaurav Bhatnagar","doi":"10.1109/JBHI.2024.3482465","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3482465","url":null,"abstract":"<p><p>Brain tumors are fatal and severely disrupt brain function as they advance. Timely detection and precise monitoring are crucial for improving patient outcomes and survival. A smart healthcare system leveraging the Internet of Medical Things (IoMT) revolutionizes patient care by offering streamlined remote healthcare, especially for individuals with acute medical conditions like brain tumors. However, such systems face significant challenges, such as (1) the increasing prevalence of cyber attacks in the expanding digital healthcare landscape, and (2) the lack of reliability and accuracy in existing tumor detection methods. To address these issues, we propose Secured Brain Tumor Detection (SBTD), the first unified system integrating IoMT with secure tumor detection. SBTD features: (1) a robust security framework, grounded in chaos theory, to safeguard medical data; and (2) a reliable machine learning-based tumor detection framework that accurately localizes tumors using their anatomy. Comprehensive experimental evaluations on different multimodal MRI datasets demonstrate the system's suitability, clinical applicability and superior performance over state-of-the-art algorithms.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prior Visual-guided Self-supervised Learning Enables Color Vignetting Correction for High-throughput Microscopic Imaging
Pub Date: 2024-10-16 | DOI: 10.1109/JBHI.2024.3471907
Jianhang Wang, Tianyu Ma, Luhong Jin, Yunqi Zhu, Jiahui Yu, Feng Chen, Shujun Fu, Yingke Xu
Vignetting constitutes a prevalent optical degradation that significantly compromises the quality of biomedical microscopic imaging. However, a robust and efficient vignetting correction methodology for multi-channel microscopic images remains absent. In this paper, we take advantage of prior knowledge about the homogeneity of microscopic images and the radial attenuation property of vignetting to develop a self-supervised deep learning algorithm that achieves complex vignetting removal in color microscopic images. Our proposed method, the vignetting correction lookup table (VCLUT), is trainable on both single and multiple images; it employs adversarial learning to effectively transfer good imaging conditions from a user-defined central region of the light field to the entire image. To illustrate its effectiveness, we performed individual correction experiments on data from five distinct biological specimens. The results demonstrate that VCLUT exhibits enhanced performance compared to classical methods. We further examined its performance as a multi-image-based approach on a pathological dataset, revealing its advantage over other state-of-the-art approaches in both qualitative and quantitative measurements. Moreover, it uniquely possesses the capacity for generalization across various levels of vignetting intensity and ultra-fast model computation, rendering it well-suited for integration into high-throughput imaging pipelines of digital microscopy.
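To make the "lookup table over radius" idea concrete, here is a minimal non-learned sketch: estimate the mean intensity per radius bin and invert the attenuation relative to the center. The adversarially trained VCLUT is far richer; the bin count, normalization, and synthetic test image are assumptions.

```python
# Naive radial-gain LUT correction for a vignetted single-channel image.
import numpy as np

def radial_lut_correct(img, n_bins=64):
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    r_bin = (r / r.max() * (n_bins - 1)).astype(int)
    # LUT: mean intensity per radius bin, computed with bincount to stay
    # robust if a bin happens to be empty.
    sums = np.bincount(r_bin.ravel(), weights=img.ravel(), minlength=n_bins)
    counts = np.bincount(r_bin.ravel(), minlength=n_bins)
    lut = sums / np.maximum(counts, 1)
    gain = lut[0] / np.maximum(lut, 1e-6)   # brighten attenuated radii
    return img * gain[r_bin]

# Synthetic vignetted image: a flat field times a radial falloff.
h = w = 128
yy, xx = np.mgrid[0:h, 0:w]
fall = 1.0 - 0.5 * (np.hypot(yy - 64, xx - 64) / 91) ** 2
corrected = radial_lut_correct(100.0 * fall)
print(corrected.std())  # small: the radial falloff is largely removed
```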
{"title":"Prior Visual-guided Self-supervised Learning Enables Color Vignetting Correction for High-throughput Microscopic Imaging.","authors":"Jianhang Wang, Tianyu Ma, Luhong Jin, Yunqi Zhu, Jiahui Yu, Feng Chen, Shujun Fu, Yingke Xu","doi":"10.1109/JBHI.2024.3471907","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3471907","url":null,"abstract":"<p><p>Vignetting constitutes a prevalent optical degradation that significantly compromises the quality of biomedical microscopic imaging. However, a robust and efficient vignetting correction methodology in multi-channel microscopic images remains absent at present. In this paper, we take advantage of a prior knowledge about the homogeneity of microscopic images and radial attenuation property of vignetting to develop a self-supervised deep learning algorithm that achieves complex vignetting removal in color microscopic images. Our proposed method, vignetting correction lookup table (VCLUT), is trainable on both single and multiple images, which employs adversarial learning to effectively transfer good imaging conditions from the user visually defined central region of its own light field to the entire image. To illustrate its effectiveness, we performed individual correction experiments on data from five distinct biological specimens. The results demonstrate that VCLUT exhibits enhanced performance compared to classical methods. We further examined its performance as a multi-image-based approach on a pathological dataset, revealing its advantage over other stateof-the-art approaches in both qualitative and quantitative measurements. Moreover, it uniquely possesses the capacity for generalization across various levels of vignetting intensity and an ultra-fast model computation capability, rendering it well-suited for integration into high-throughput imaging pipelines of digital microscopy.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
mDARTS: Searching ML-Based ECG Classifiers against Membership Inference Attacks
Pub Date: 2024-10-16 | DOI: 10.1109/JBHI.2024.3481505
Eunbin Park, Youngjoo Lee
This paper addresses the critical need for electrocardiogram (ECG) classifier architectures that balance high classification performance with robust privacy protection against membership inference attacks (MIA). We introduce a comprehensive approach that innovates in both machine learning efficacy and privacy preservation. Key contributions include the development of a privacy estimator to quantify and mitigate privacy leakage in neural network architectures used for ECG classification. Utilizing this privacy estimator, we propose mDARTS (searching ML-based ECG classifiers against MIA), integrating the MIA attack loss into the architecture search process to identify architectures that are both accurate and resilient to MIA threats. Our method achieves significant improvements, with an ECG classification accuracy of 92.1% and a lower privacy score of 54.3%, indicating reduced potential for sensitive information leakage. Heuristic experiments refine architecture search parameters specifically for ECG classification, enhancing classifier performance and privacy scores by up to 3.0% and 1.0%, respectively. The framework's adaptability supports user customization, enabling the extraction of architectures that meet specific criteria such as optimal classification performance with minimal privacy risk. By focusing on the intersection of high-performance ECG classification and the mitigation of privacy risks associated with MIA, our study offers a pioneering solution addressing the limitations of previous approaches.
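The abstract describes folding an MIA attack loss into the architecture-search objective. A hedged sketch of what such a combined objective could look like follows; the confidence-gap attack signal, the weighting lam, and the class count are illustrative assumptions, not the paper's estimator.

```python
# Sketch: architecture search loss = classification loss + weighted MIA loss.
import torch
import torch.nn.functional as F

def mia_attack_loss(member_conf, nonmember_conf):
    # A simple threshold-style MIA signal: the gap between the model's
    # confidence on training members vs. non-members. Smaller gap = less leak.
    return (member_conf.mean() - nonmember_conf.mean()).clamp(min=0.0)

def search_objective(logits_val, y_val, member_conf, nonmember_conf, lam=0.5):
    cls_loss = F.cross_entropy(logits_val, y_val)
    return cls_loss + lam * mia_attack_loss(member_conf, nonmember_conf)

logits = torch.randn(32, 5)                  # 5 ECG rhythm classes (assumed)
y = torch.randint(0, 5, (32,))
m, n = torch.rand(32), torch.rand(32)        # softmax confidences
print(search_objective(logits, y, m, n))
```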
{"title":"mDARTS: Searching ML-Based ECG Classifiers against Membership Inference Attacks.","authors":"Eunbin Park, Youngjoo Lee","doi":"10.1109/JBHI.2024.3481505","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3481505","url":null,"abstract":"<p><p>This paper addresses the critical need for elctrocardiogram (ECG) classifier architectures that balance high classification performance with robust privacy protection against membership inference attacks (MIA). We introduce a comprehensive approach that innovates in both machine learning efficacy and privacy preservation. Key contributions include the development of a privacy estimator to quantify and mitigate privacy leakage in neural network architectures used for ECG classification. Utilizing this privacy estimator, we propose mDARTS (searching MLbased ECG classifier against MIA), integrating MIA's attack loss into the architecture search process to identify architectures that are both accurate and resilient to MIA threats. Our method achieves significant improvements, with an ECG classification accuracy of 92.1% and a lower privacy score of 54.3%, indicating reduced potential for sensitive information leakage. Heuristic experiments refine architecture search parameters specifically for ECG classification, enhancing classifier performance and privacy scores by up to 3.0% and 1.0%, respectively. The framework's adaptability supports user customization, enabling the extraction of architectures that meet specific criteria such as optimal classification performance with minimal privacy risk. By focusing on the intersection of high-performance ECG classification and the mitigation of privacy risks associated with MIA, our study offers a pioneering solution addressing the limitations of previous approaches.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Attention-guided 3D CNN With Lesion Feature Selection for Early Alzheimer's Disease Prediction Using Longitudinal sMRI
Pub Date: 2024-10-16 | DOI: 10.1109/JBHI.2024.3482001
Jinwei Liu, Yashu Xu, Yi Liu, Huating Luo, Wenxiang Huang, Lizhong Yao
Predicting the progression from mild cognitive impairment (MCI) to Alzheimer's disease (AD) is critical for early intervention. Toward this end, various deep learning models have been applied in this domain, typically relying on structural magnetic resonance imaging (sMRI) data from a single time point while neglecting the dynamic changes in brain structure over time. Current longitudinal studies inadequately explore disease evolution dynamics and are burdened by high computational complexity. This paper introduces a novel lightweight 3D convolutional neural network specifically designed to capture the evolution of brain diseases for modeling the progression of MCI. First, a longitudinal lesion feature selection strategy is proposed to extract core features from temporal data, facilitating the detection of subtle differences in brain structure between two time points. Next, to focus the model more strongly on lesion features, a disease trend attention mechanism is introduced to learn the dependencies between overall disease trends and local variation features. Finally, disease prediction visualization techniques are employed to improve the interpretability of the final predictions. Extensive experiments demonstrate that the proposed model achieves state-of-the-art performance in terms of area under the curve (AUC), accuracy, specificity, precision, and F1 score. This study confirms the efficacy of our early diagnostic method, utilizing only two follow-up sMRI scans to predict the disease status of MCI patients 24 months later with an AUC of 79.03%.
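One way to picture the two-time-point input the abstract relies on: stack the baseline scan, the follow-up scan, and their voxelwise difference as channels so a 3D CNN can attend to structural change. The channel layout and volume sizes are assumptions for illustration only, not the paper's pipeline.

```python
# Sketch of a longitudinal input tensor built from two sMRI time points.
import torch

def longitudinal_input(vol_t0, vol_t1):
    # vol_t*: (D, H, W) normalized sMRI volumes at baseline and follow-up.
    change = vol_t1 - vol_t0                  # subtle structural differences
    return torch.stack([vol_t0, vol_t1, change], dim=0)  # (3, D, H, W)

x = longitudinal_input(torch.randn(64, 96, 96), torch.randn(64, 96, 96))
print(x.shape)  # torch.Size([3, 64, 96, 96]); batches of these feed the 3D CNN
```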
{"title":"Attention-guided 3D CNN With Lesion Feature Selection for Early Alzheimer's Disease Prediction Using Longitudinal sMRI.","authors":"Jinwei Liu, Yashu Xu, Yi Liu, Huating Luo, Wenxiang Huang, Lizhong Yao","doi":"10.1109/JBHI.2024.3482001","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3482001","url":null,"abstract":"<p><p>Predicting the progression from mild cognitive impairment (MCI) to Alzheimer's disease (AD) is critical for early intervention. Towards this end, various deep learning models have been applied in this domain, typically relying on structural magnetic resonance imaging (sMRI) data from a single time point whereas neglecting the dynamic changes in brain structure over time. Current longitudinal studies inadequately explore disease evolution dynamics and are burdened by high computational complexity. This paper introduces a novel lightweight 3D convolutional neural network specifically designed to capture the evolution of brain diseases for modeling the progression of MCI. First, a longitudinal lesion feature selection strategy is proposed to extract core features from temporal data, facilitating the detection of subtle differences in brain structure between two time points. Next, to refine the model for a more concentrated emphasis on lesion features, a disease trend attention mechanism is introduced to learn the dependencies between overall disease trends and local variation features. Finally, disease prediction visualization techniques are employed to improve the interpretability of the final predictions. Extensive experiments demonstrate that the proposed model achieves state-of-the-art performance in terms of area under the curve (AUC), accuracy, specificity, precision, and F1 score. This study confirms the efficacy of our early diagnostic method, utilizing only two follow-up sMRI scans to predict the disease status of MCI patients 24 months later with an AUC of 79.03%.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpretable Multi-Branch Architecture for Spatiotemporal Neural Networks and Its Application in Seizure Prediction
Pub Date: 2024-10-15 | DOI: 10.1109/JBHI.2024.3481005
Baolian Shan, Haiqing Yu, Yongzhi Huang, Minpeng Xu, Dong Ming
Currently, spatiotemporal convolutional neural networks (CNNs) for electroencephalogram (EEG) signals have emerged as promising tools for seizure prediction (SP), exploring the spatiotemporal biomarkers in an epileptic brain. Generally, these CNNs capture spatiotemporal features at a single spectral resolution. However, epileptiform EEG signals contain irregular neural oscillations of different frequencies in different brain regions; CNNs that do not sufficiently capture these complex spectral properties may therefore underperform and lack interpretability. This study proposes a novel interpretable multi-branch architecture for spatiotemporal CNNs, namely MultiSincNet. On the one hand, MultiSincNet can directly show the frequency boundaries using interpretable sinc-convolution layers. On the other hand, it can extract and integrate multiple spatiotemporal features across varying spectral resolutions using parallel branches. Moreover, we construct a post-hoc explanation technique for multi-branch CNNs, using the first-order Taylor expansion and the chain rule for multivariate composite functions, which reveals the crucial spatiotemporal features learned by the proposed multi-branch spatiotemporal CNN. When combined with the optimal MultiSincNet, ShallowConvNet, DeepConvNet, and EEGWaveNet all showed significantly improved subject-specific performance on most metrics. Specifically, the optimal MultiSincNet significantly increased the average accuracy, sensitivity, specificity, binary F1-score, weighted F1-score, and AUC of EEGWaveNet by about 7%, 8%, 7%, 8%, 7%, and 7%, respectively. In addition, the visualization results show that the optimal model mainly extracts the spectral energy difference in the high-gamma band focalized to specific spatial areas as the dominant spatiotemporal EEG feature.
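The interpretable sinc-convolution layer has a standard form (popularized by SincNet): each kernel is a band-pass FIR filter parameterized only by its two cutoff frequencies, so the learned frequency boundaries can be read off directly. Below is a NumPy sketch of one such kernel; the sampling rate, kernel length, and band are assumed values, not those of MultiSincNet.

```python
# A sinc band-pass kernel: difference of two low-pass sinc filters, windowed.
import numpy as np

def sinc_bandpass(f_low, f_high, kernel_len=65, fs=250.0):
    """Band-pass FIR kernel between f_low and f_high Hz (fs-Hz EEG assumed)."""
    t = np.arange(kernel_len) - kernel_len // 2
    f1, f2 = f_low / fs, f_high / fs            # normalized cutoffs
    kernel = 2 * f2 * np.sinc(2 * f2 * t) - 2 * f1 * np.sinc(2 * f1 * t)
    return kernel * np.hamming(kernel_len)      # window to reduce ripple

gamma = sinc_bandpass(60.0, 100.0)              # a high-gamma band filter
eeg = np.random.randn(1000)                     # one synthetic EEG channel
filtered = np.convolve(eeg, gamma, mode="same")
print(filtered.shape)
```

In the learned version, f_low and f_high are trainable parameters, which is what makes the layer's frequency boundaries directly interpretable.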
{"title":"Interpretable Multi-Branch Architecture for Spatiotemporal Neural Networks and Its Application in Seizure Prediction.","authors":"Baolian Shan, Haiqing Yu, Yongzhi Huang, Minpeng Xu, Dong Ming","doi":"10.1109/JBHI.2024.3481005","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3481005","url":null,"abstract":"<p><p>Currently, spatiotemporal convolutional neural networks (CNNs) for electroencephalogram (EEG) signals have emerged as promising tools for seizure prediction (SP), which explore the spatiotemporal biomarkers in an epileptic brain. Generally, these CNNs capture spatiotemporal features at single spectral resolution. However, epileptiform EEG signals contain irregular neural oscillations of different frequencies in different brain regions. Therefore, it may be underperforming and uninterpretable for the CNNs without capturing complex spectral properties sufficiently. This study proposed a novel interpretable multi-branch architecture for spatiotemporal CNNs, namely MultiSincNet. On the one hand, the MultiSincNet could directly show the frequency boundaries using the interpretable sinc-convolution layers. On the other hand, it could extract and integrate multiple spatiotemporal features across varying spectral resolutions using parallel branches. Moreover, we also constructed a post-hoc explanation technique for multi-branch CNNs, using the first-order Taylor expansion and chain rule based on the multivariate composite function, which demonstrates the crucial spatiotemporal features learned by the proposed multi-branch spatiotemporal CNN. When combined with the optimal MultiSincNet, ShallowConvNet, DeepConvNet, and EEGWaveNet had significantly improved the subject-specific performance on most metrics. Specifically, the optimal MultiSincNet significantly increased the average accuracy, sensitivity, specificity, binary F1-score, weighted F1-score, and AUC of EEGWaveNet by about 7%, 8%, 7%, 8%, 7%, and 7%, respectively. Besides, the visualization results showed that the optimal model mainly extracts the spectral energy difference from the high gamma band focalized to specific spatial areas as the dominant spatiotemporal EEG feature.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feature Separation in Diffuse Lung Disease Image Classification by Using Evolutionary Algorithm-Based NAS
Pub Date: 2024-10-15 | DOI: 10.1109/JBHI.2024.3481012
Qing Zhang, Dan Shao, Lin Lin, Guoliang Gong, Rui Xu, Shoji Kido, HongWei Cui
In the field of diagnosing lung diseases, the application of neural networks (NNs) to image classification exhibits significant potential. However, NNs are considered "black boxes," making it difficult to discern their decision-making processes and thereby leading to skepticism and concern regarding NNs. This compromises model reliability and hampers the development of intelligent medicine. To tackle this issue, we introduce Evolutionary Neural Architecture Search (EvoNAS). In image classification tasks, EvoNAS initially utilizes an evolutionary algorithm to explore various convolutional neural networks, ultimately yielding an optimized network that excels at separating redundant texture features from the most discriminative ones. Retaining the most discriminative features improves classification accuracy, particularly in distinguishing similar features. This approach illuminates the intrinsic mechanics of classification, thereby enhancing the accuracy of the results. Subsequently, we incorporate a differential evolution algorithm based on distribution estimation, significantly enhancing search efficiency. Employing visualization techniques, we demonstrate the effectiveness of EvoNAS, endowing the model with interpretability. Finally, we conduct experiments on the diffuse lung disease texture dataset using EvoNAS. Compared to the original network, the classification accuracy increases by 0.56%. Moreover, our EvoNAS approach demonstrates significant advantages over existing methods on the same dataset.
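The paper's distribution-estimation variant is not spelled out in this abstract, but it builds on the classic DE/rand/1/bin step, sketched below with a placeholder fitness function standing in for the EvoNAS architecture evaluator.

```python
# Classic differential evolution step (DE/rand/1/bin) on a toy objective.
import numpy as np

rng = np.random.default_rng(0)

def de_step(pop, fitness, F=0.5, CR=0.9):
    n, d = pop.shape
    new_pop = pop.copy()
    for i in range(n):
        a, b, c = pop[rng.choice([j for j in range(n) if j != i], 3, replace=False)]
        mutant = a + F * (b - c)                  # differential mutation
        cross = rng.random(d) < CR                # binomial crossover mask
        cross[rng.integers(d)] = True             # guarantee one mutated gene
        trial = np.where(cross, mutant, pop[i])
        if fitness(trial) <= fitness(pop[i]):     # greedy one-to-one selection
            new_pop[i] = trial
    return new_pop

fit = lambda x: np.sum(x ** 2)                    # placeholder objective
pop = rng.random((20, 8))
for _ in range(50):
    pop = de_step(pop, fit)
print(min(fit(p) for p in pop))                   # shrinks toward 0
```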
{"title":"Feature Separation in Diffuse Lung Disease Image Classification by Using Evolutionary Algorithm-Based NAS.","authors":"Qing Zhang, Dan Shao, Lin Lin, Guoliang Gong, Rui Xu, Shoji Kido, HongWei Cui","doi":"10.1109/JBHI.2024.3481012","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3481012","url":null,"abstract":"<p><p>In the field of diagnosing lung diseases, the application of neural networks (NNs) in image classification exhibits significant potential. However, NNs are considered \"black boxes,\" making it difficult to discern their decision-making processes, thereby leading to skepticism and concern regarding NNs. This compromises model reliability and hampers intelligent medicine's development. To tackle this issue, we introduce the Evolutionary Neural Architecture Search (EvoNAS). In image classification tasks, EvoNAS initially utilizes an Evolutionary Algorithm to explore various Convolutional Neural Networks, ultimately yielding an optimized network that excels at separating between redundant texture features and the most discriminative ones. Retaining the most discriminative features improves classification accuracy, particularly in distinguishing similar features. This approach illuminates the intrinsic mechanics of classification, thereby enhancing the accuracy of the results. Subsequently, we incorporate a Differential Evolution algorithm based on distribution estimation, significantly enhancing search efficiency. Employing visualization techniques, we demonstrate the effectiveness of EvoNAS, endowing the model with interpretability. Finally, we conduct experiments on the diffuse lung disease texture dataset using EvoNAS. Compared to the original network, the classification accuracy increases by 0.56%. Moreover, our EvoNAS approach demonstrates significant advantages over existing methods in the same dataset.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Agnostic-Specific Modality Learning for Cancer Survival Prediction from Multiple Data
Pub Date: 2024-10-15 | DOI: 10.1109/JBHI.2024.3481310
Honglei Liu, Yi Shi, Ying Xu, Ao Li, Minghui Wang
Cancer is a pressing public health problem and one of the main causes of mortality worldwide. The development of advanced computational methods for predicting cancer survival is pivotal in aiding clinicians to formulate effective treatment strategies and improve patient quality of life. Recent advances in survival prediction methods show that integrating diverse information from various cancer-related data, such as pathological images and genomics, is crucial for improving prediction accuracy. Despite the promising results of existing approaches, the modality gap and semantic redundancy present in multimodal cancer data hinder comprehensive integration and pose substantial obstacles to further enhancing cancer survival prediction. In this study, we propose a novel agnostic-specific modality learning (ASML) framework for accurate cancer survival prediction. To bridge the modality gap and provide a comprehensive view of distinct data modalities, we employ an agnostic-specific learning strategy to learn the commonality across modalities and the uniqueness of each modality. Moreover, a cross-modal fusion network is employed to integrate multimodal information by modeling modality correlations and to diminish semantic redundancy in a divide-and-conquer manner. Extensive experimental results on three TCGA datasets demonstrate that ASML reaches better performance than other existing cancer survival prediction methods for multiple data.
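A minimal sketch of the agnostic-specific split described above: a shared-space ("agnostic") projection per modality captures cross-modality commonality while a "specific" projection keeps each modality's uniqueness, fused here by simple concatenation. The modality names, feature sizes, and fusion are assumptions; the paper's cross-modal fusion network is more elaborate.

```python
# Two encoders per modality: one into a shared space, one into a unique space.
import torch
import torch.nn as nn

class AgnosticSpecific(nn.Module):
    def __init__(self, dims, hid=64):
        super().__init__()
        self.agnostic = nn.ModuleDict({m: nn.Linear(d, hid) for m, d in dims.items()})
        self.specific = nn.ModuleDict({m: nn.Linear(d, hid) for m, d in dims.items()})

    def forward(self, inputs):
        common = [torch.relu(self.agnostic[m](x)) for m, x in inputs.items()]
        unique = [torch.relu(self.specific[m](x)) for m, x in inputs.items()]
        # Training would also push the `common` codes to align across
        # modalities (e.g., a similarity loss); here we only fuse by concat.
        return torch.cat(common + unique, dim=-1)

model = AgnosticSpecific({"path": 512, "gene": 128})  # assumed modalities
z = model({"path": torch.randn(4, 512), "gene": torch.randn(4, 128)})
print(z.shape)  # torch.Size([4, 256])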
{"title":"Agnostic-Specific Modality Learning for Cancer Survival Prediction from Multiple Data.","authors":"Honglei Liu, Yi Shi, Ying Xu, Ao Li, Minghui Wang","doi":"10.1109/JBHI.2024.3481310","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3481310","url":null,"abstract":"<p><p>Cancer is a pressing public health problem and one of the main causes of mortality worldwide. The development of advanced computational methods for predicting cancer survival is pivotal in aiding clinicians to formulate effective treatment strategies and improve patient quality of life. Recent advances in survival prediction methods show that integrating diverse information from various cancer-related data, such as pathological images and genomics, is crucial for improving prediction accuracy. Despite promising results of existing approaches, there are great challenges of modality gap and semantic redundancy presented in multiple cancer data, which could hinder the comprehensive integration and pose substantial obstacles to further enhancing cancer survival prediction. In this study, we propose a novel agnostic-specific modality learning (ASML) framework for accurate cancer survival prediction. To bridge the modality gap and provide a comprehensive view of distinct data modalities, we employ an agnostic-specific learning strategy to learn the commonality across modalities and the uniqueness of each modality. Moreover, a cross-modal fusion network is exerted to integrate multimodal information by modeling modality correlations and diminish semantic redundancy in a divide-and-conquer manner. Extensive experiment results on three TCGA datasets demonstrate that ASML reaches better performance than other existing cancer survival prediction methods for multiple data.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint Energy-based Model for Semi-supervised Respiratory Sound Classification: A Method of Insensitive to Distribution Mismatch
Pub Date: 2024-10-15 | DOI: 10.1109/JBHI.2024.3480999
Wenjie Song, Jiqing Han, Shiwen Deng, Tieran Zheng, Guibin Zheng, Yongjun He
Semi-supervised learning effectively mitigates the lack of labeled data by introducing extensive unlabeled data. Despite its success in respiratory sound classification, in practice it usually takes years to acquire a sufficiently sizeable unlabeled set, which consequently extends the research timeline. Considering that respiratory sounds are also available from related tasks, such as breath phase detection and COVID-19 detection, treating these external samples as unlabeled data for respiratory sound classification is an attractive alternative. However, since these external samples are collected in different scenarios with different devices, there inevitably exists a distribution mismatch between the labeled and external unlabeled data. Existing methods usually assume that the labeled and unlabeled data follow the same distribution and therefore cannot benefit from external samples. To utilize external unlabeled data, we propose a semi-supervised method based on a Joint Energy-based Model (JEM) in this paper. During training, the method attempts to use only the essential semantic components within the samples to model the data distribution. Because non-semantic components such as recording environments and devices have little impact on model training even when they vary, a relatively accurate distribution estimate is obtained. The method is therefore insensitive to the distribution mismatch, enabling the model to leverage external unlabeled data to mitigate the lack of labeled data. Taking ICBHI 2017 as the labeled set and HF_Lung_V1 and COVID-19 Sounds as the external unlabeled sets, the proposed method exceeds the baseline by 12.86.
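The JEM view this method builds on can be stated compactly: a classifier's logits also define an unnormalized density via E(x) = -logsumexp_y f(x)[y], which is what lets unlabeled (even external) samples shape p(x). A toy sketch under assumed feature sizes and a placeholder classifier:

```python
# Energy of a sample read off a classifier's logits, JEM-style.
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 4))

def energy(x):
    # Lower energy <=> higher modeled density, p(x) proportional to exp(-E(x)).
    return -torch.logsumexp(classifier(x), dim=-1)

labeled = torch.randn(8, 128)    # e.g., ICBHI 2017 feature vectors (assumed)
external = torch.randn(8, 128)   # e.g., HF_Lung_V1 samples used without labels
# Full JEM training balances this term against the energy of sampled
# negatives (e.g., via SGLD); shown alone, it only illustrates how
# unlabeled data enters the objective alongside cross-entropy on `labeled`.
loss_density = energy(labeled).mean() + energy(external).mean()
print(loss_density.item())
```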
{"title":"Joint Energy-based Model for Semi-supervised Respiratory Sound Classification: A Method of Insensitive to Distribution Mismatch.","authors":"Wenjie Song, Jiqing Han, Shiwen Deng, Tieran Zheng, Guibin Zheng, Yongjun He","doi":"10.1109/JBHI.2024.3480999","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3480999","url":null,"abstract":"<p><p>Semi-supervised learning effectively mitigates the lack of labeled data by introducing extensive unlabeled data. Despite achieving success in respiratory sound classification, in practice, it usually takes years to acquire a sufficiently sizeable unlabeled set, which consequently results in an extension of the research timeline. Considering that there are also respiratory sounds available in other related tasks, like breath phase detection and COVID-19 detection, it might be an alternative manner to treat these external samples as unlabeled data for respiratory sound classification. However, since these external samples are collected in different scenarios via different devices, there inevitably exists a distribution mismatch between the labeled and external unlabeled data. For existing methods, they usually assume that the labeled and unlabeled data follow the same data distribution. Therefore, they cannot benefit from external samples. To utilize external unlabeled data, we propose a semi-supervised method based on Joint Energy-based Model (JEM) in this paper. During training, the method attempts to use only the essential semantic components within the samples to model the data distribution. When non-semantic components like recording environments and devices vary, as these non-semantic components have a small impact on the model training, a relatively accurate distribution estimation is obtained. Therefore, the method exhibits insensitivity to the distribution mismatch, enabling the model to leverage external unlabeled data to mitigate the lack of labeled data. Taking ICBHI 2017 as the labeled set, HF_Lung_V1 and COVID-19 Sounds as the external unlabeled sets, the proposed method exceeds the baseline by 12.86.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aceso-DSAL: Discovering Clinical Evidences from Medical Literature Based on Distant Supervision and Active Learning
Pub Date: 2024-10-15 | DOI: 10.1109/JBHI.2024.3480998
Xiang Zhang, Jiaxin Hu, Qian Lu, Lu Niu, Xinqi Wang
Automatic extraction of valuable, structured evidence from the exponentially growing clinical trial literature can help physicians practice evidence-based medicine quickly and accurately. However, current research on evidence extraction has been limited by the lack of generalization across clinical topics and the high cost of manual annotation. In this work, we address these challenges by constructing a PICO-based evidence dataset, PICO-DS, covering five clinical topics. This dataset was automatically labeled via distant supervision based on our proposed textual-similarity algorithm, ROUGE-Hybrid. We then present the Aceso-DSAL model, an extension of our previous supervised evidence extraction model, Aceso. In Aceso-DSAL, the distantly labeled, multi-topic PICO-DS is exploited as the training corpus, which greatly enhances the generalization of the extraction model. To mitigate the influence of noise unavoidably introduced by distant supervision, we employ TextCNN and MW-Net models and an active learning paradigm to weight the value of each sample. We evaluate the effectiveness of our model on the PICO-DS dataset and find that it outperforms state-of-the-art studies in identifying evidential sentences.
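ROUGE-Hybrid itself is not specified in this abstract; as a hedged stand-in, the sketch below blends standard ROUGE-1 and ROUGE-2 recall to show how a textual-similarity threshold can distantly label evidence sentences. The blend weights and the 0.3 threshold are illustrative assumptions.

```python
# Distant labeling by n-gram overlap: candidate sentence vs. reference evidence.
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    c = ngrams(candidate.lower().split(), n)
    r = ngrams(reference.lower().split(), n)
    overlap = sum((c & r).values())          # clipped n-gram matches
    return overlap / max(sum(r.values()), 1)

def hybrid_similarity(candidate, reference, w1=0.5, w2=0.5):
    return (w1 * rouge_n_recall(candidate, reference, 1)
            + w2 * rouge_n_recall(candidate, reference, 2))

sent = "patients receiving metformin showed reduced HbA1c levels"
gold = "metformin reduced HbA1c levels in treated patients"
label = hybrid_similarity(sent, gold) > 0.3  # threshold is an assumption
print(hybrid_similarity(sent, gold), label)
```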
{"title":"Aceso-DSAL: Discovering Clinical Evidences from Medical Literature Based on Distant Supervision and Active Learning.","authors":"Xiang Zhang, Jiaxin Hu, Qian Lu, Lu Niu, Xinqi Wang","doi":"10.1109/JBHI.2024.3480998","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3480998","url":null,"abstract":"<p><p>Automatic extraction of valuable, structured evidence from the exponentially growing clinical trial literature can help physicians practice evidence-based medicine quickly and accurately. However, current research on evidence extraction has been limited by the lack of generalization ability on various clinical topics and the high cost of manual annotation. In this work, we address these challenges by constructing a PICO-based evidence dataset PICO-DS, covering five clinical topics. This dataset was automatically labeled by a distant supervision based on our proposed textual similarity algorithm called ROUGE-Hybrid. We then present an Aceso-DSAL model, an extension of our previous supervised evidence extraction model - Aceso. In Aceso-DSAL, distantly-labelled and multi-topic PICO-DS was exploited as training corpus, which greatly enhances the generalization of the extraction model. To mitigate the influence of noise unavoidably-introduced in distant supervision, we employ TextCNN and MW-Net models and a paradigm of active learning to weigh the value of each sample. We evaluate the effectiveness of our model on the PICO-DS dataset and find that it outperforms state-of-the-art studies in identifying evidential sentences.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}