IM-Diff: Implicit Multi-Contrast Diffusion Model for Arbitrary Scale MRI Super-Resolution.
Pub Date : 2025-03-05 | DOI: 10.1109/JBHI.2025.3544265
Lanqing Liu, Jing Zou, Cheng Xu, Kang Wang, Jun Lyu, Xuemiao Xu, Zhanli Hu, Jing Qin
Diffusion models have garnered significant attention for MRI Super-Resolution (SR) and have achieved promising results. However, existing diffusion-based SR models face two formidable challenges: 1) insufficient exploitation of complementary information from multi-contrast images, which hinders the faithful reconstruction of texture details and anatomical structures; and 2) reliance on fixed magnification factors, such as 2× or 4×, which is impractical for clinical scenarios that require arbitrary scale magnification. To circumvent these issues, this paper introduces IM-Diff, an implicit multi-contrast diffusion model for arbitrary-scale MRI SR, leveraging the merits of both multi-contrast information and the continuous nature of implicit neural representation (INR). Firstly, we propose an innovative hierarchical multi-contrast fusion (HMF) module with reference-aware cross Mamba (RCM) to effectively incorporate target-relevant information from the reference image into the target image, while ensuring a substantial receptive field with computational efficiency. Secondly, we introduce multiple wavelet INR magnification (WINRM) modules into the denoising process by integrating the wavelet implicit neural non-linearity, enabling effective learning of continuous representations of MR images. The involved wavelet activation enhances space-frequency concentration, further bolstering representation accuracy and robustness in INR. Extensive experiments on three public datasets demonstrate the superiority of our method over existing state-of-the-art SR models across various magnification factors.
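As a rough, self-contained illustration of the idea behind the wavelet implicit neural non-linearity, the sketch below builds a tiny coordinate-based INR whose hidden units use a real Gabor wavelet activation (in the spirit of WIRE-style INRs) and queries it on a denser grid; the layer sizes and the omega0/s0 constants are illustrative assumptions, not the authors' implementation.

```python
# Minimal wavelet-activated INR sketch (assumptions: sizes, constants, wiring).
import torch
import torch.nn as nn

class WaveletActivation(nn.Module):
    """Real Gabor wavelet: cos(omega0 * x) * exp(-(s0 * x)^2).
    The cosine gives frequency selectivity, the Gaussian envelope localizes
    the response in space -- the usual argument for the space-frequency
    concentration mentioned in the abstract."""
    def __init__(self, omega0: float = 10.0, s0: float = 5.0):
        super().__init__()
        self.omega0, self.s0 = omega0, s0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cos(self.omega0 * x) * torch.exp(-(self.s0 * x) ** 2)

class WaveletINR(nn.Module):
    """Maps continuous (y, x) coordinates to an intensity, so the learned
    representation can be queried at any grid density (arbitrary scale)."""
    def __init__(self, in_dim: int = 2, hidden: int = 256, depth: int = 3):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(d, hidden), WaveletActivation()]
            d = hidden
        layers.append(nn.Linear(d, 1))  # predicted intensity
        self.net = nn.Sequential(*layers)

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.net(coords)

# Query the INR on a 4x-denser grid than a 64x64 image (illustrative).
model = WaveletINR()
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 256),
                        torch.linspace(-1, 1, 256), indexing="ij")
coords = torch.stack([ys, xs], dim=-1).reshape(-1, 2)
intensities = model(coords).reshape(256, 256)
```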
{"title":"IM-Diff: Implicit Multi-Contrast Diffusion Model for Arbitrary Scale MRI Super-Resolution.","authors":"Lanqing Liu, Jing Zou, Cheng Xu, Kang Wang, Jun Lyu, Xuemiao Xu, Zhanli Hu, Jing Qin","doi":"10.1109/JBHI.2025.3544265","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3544265","url":null,"abstract":"<p><p>Diffusion models have garnered significant attention for MRI Super-Resolution (SR) and have achieved promising results. However, existing diffusion-based SR models face two formidable challenges: 1) insufficient exploitation of complementary information from multi-contrast images, which hinders the faithful reconstruction of texture details and anatomical structures; and 2) reliance on fixed magnification factors, such as 2× or 4×, which is impractical for clinical scenarios that require arbitrary scale magnification. To circumvent these issues, this paper introduces IM-Diff, an implicit multi-contrast diffusion model for arbitrary-scale MRI SR, leveraging the merits of both multi-contrast information and the continuous nature of implicit neural representation (INR). Firstly, we propose an innovative hierarchical multi-contrast fusion (HMF) module with reference-aware cross Mamba (RCM) to effectively incorporate target-relevant information from the reference image into the target image, while ensuring a substantial receptive field with computational efficiency. Secondly, we introduce multiple wavelet INR magnification (WINRM) modules into the denoising process by integrating the wavelet implicit neural non-linearity, enabling effective learning of continuous representations of MR images. The involved wavelet activation enhances space-frequency concentration, further bolstering representation accuracy and robustness in INR. Extensive experiments on three public datasets demonstrate the superiority of our method over existing state-of-the-art SR models across various magnification factors.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143566929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Counterfactual Bidirectional Co-Attention Transformer for Integrative Histology-Genomic Cancer Risk Stratification.
Pub Date : 2025-03-05 | DOI: 10.1109/JBHI.2025.3548048
Zheyi Ji, Yongxin Ge, Chijioke Chukwudi, Kaicheng U, Sophia Meixuan Zhang, Yulong Peng, Junyou Zhu, Hossam Zaki, Xueling Zhang, Sen Yang, Xiyue Wang, Yijiang Chen, Junhan Zhao
Applying deep learning to predict patient prognostic survival outcomes using histological whole-slide images (WSIs) and genomic data is challenging due to the morphological and transcriptomic heterogeneity present in the tumor microenvironment. Existing deep learning-enabled methods often exhibit learning biases, primarily because the genomic knowledge used to guide directional feature extraction from WSIs may be irrelevant or incomplete. This results in a suboptimal and sometimes myopic understanding of the overall pathological landscape, potentially overlooking crucial histological insights. To tackle these challenges, we propose the CounterFactual Bidirectional Co-Attention Transformer framework. By integrating a bidirectional co-attention layer, our framework fosters effective feature interactions between the genomic and histology modalities and ensures consistent identification of prognostic features from WSIs. Using counterfactual reasoning, our model utilizes causality to model unimodal and multimodal knowledge for cancer risk stratification. This approach directly addresses and reduces bias, enables the exploration of 'what-if' scenarios, and offers a deeper understanding of how different features influence survival outcomes. Our framework, validated across eight diverse cancer benchmark datasets from The Cancer Genome Atlas (TCGA), represents a major improvement over current histology-genomic model learning methods. It shows an average 2.5% improvement in c-index performance over 18 state-of-the-art models in predicting patient prognoses across eight cancer types. Our code is released at https://github.com/BusyJzy599/CFBCT-main.
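A minimal sketch of what one bidirectional co-attention layer between the two modalities could look like, using stock PyTorch attention; the token counts, dimensions, and residual wiring are assumptions — the authors' released code is the authoritative implementation.

```python
# Bidirectional co-attention sketch: each modality attends over the other.
import torch
import torch.nn as nn

class BidirectionalCoAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.h2g = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.g2h = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, histo: torch.Tensor, geno: torch.Tensor):
        # Histology queries attend over genomic keys/values and vice versa,
        # so each modality is re-expressed in terms of the other.
        histo_ctx, _ = self.g2h(query=histo, key=geno, value=geno)
        geno_ctx, _ = self.h2g(query=geno, key=histo, value=histo)
        return histo + histo_ctx, geno + geno_ctx  # residual fusion

patches = torch.randn(1, 500, 256)  # e.g. 500 WSI patch embeddings
genes = torch.randn(1, 6, 256)      # e.g. 6 genomic pathway embeddings
layer = BidirectionalCoAttention()
fused_patches, fused_genes = layer(patches, genes)
```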
{"title":"Counterfactual Bidirectional Co-Attention Transformer for Integrative Histology-Genomic Cancer Risk Stratification.","authors":"Zheyi Ji, Yongxin Ge, Chijioke Chukwudi, Kaicheng U, Sophia Meixuan Zhang, Yulong Peng, Junyou Zhu, Hossam Zaki, Xueling Zhang, Sen Yang, Xiyue Wang, Yijiang Chen, Junhan Zhao","doi":"10.1109/JBHI.2025.3548048","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3548048","url":null,"abstract":"<p><p>Applying deep learning to predict patient prognostic survival outcomes using histological whole-slide images (WSIs) and genomic data is challenging due to the morphological and transcriptomic heterogeneity present in the tumor microenvironment. Existing deep learning-enabled methods often exhibit learning biases, primarily because the genomic knowledge used to guide directional feature extraction from WSIs may be irrelevant or incomplete. This results in a suboptimal and sometimes myopic understanding of the overall pathological landscape, potentially overlooking crucial histological insights. To tackle these challenges, we propose the CounterFactual Bidirectional Co-Attention Transformer framework. By integrating a bidirectional co-attention layer, our framework fosters effective feature interactions between the genomic and histology modalities and ensures consistent identification of prognostic features from WSIs. Using counterfactual reasoning, our model utilizes causality to model unimodal and multimodal knowledge for cancer risk stratification. This approach directly addresses and reduces bias, enables the exploration of 'what-if' scenarios, and offers a deeper understanding of how different features influence survival outcomes. Our framework, validated across eight diverse cancer benchmark datasets from The Cancer Genome Atlas (TCGA), represents a major improvement over current histology-genomic model learning methods. It shows an average 2.5% improvement in c-index performance over 18 state-of-the-art models in predicting patient prognoses across eight cancer types. Our code is released at https://github.com/BusyJzy599/CFBCT-main.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143566928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SecProGNN: Predicting Bronchoalveolar Lavage Fluid Secreted Protein Using Graph Neural Network.
Pub Date : 2025-03-05 | DOI: 10.1109/JBHI.2025.3548263
Dan Shao, Guangzhao Zhang, Lin Lin, Yucong Xiong, Kai He, Liyan Sun
Bronchoalveolar lavage fluid (BALF) is a liquid obtained from the alveoli and bronchi, often used to study pulmonary diseases. So far, proteomic analyses have identified over three thousand proteins in BALF. However, the comprehensive characterization of these proteins remains challenging due to their complexity and technological limitations. This paper presents a novel deep learning framework called SecProGNN, designed to predict secretory proteins in BALF. Firstly, SecProGNN represents proteins as graph-structured data, with amino acids connected based on their interactions. Then, these graphs are processed through a graph neural network (GNN) model to extract graph features. Finally, the extracted feature vectors are fed into a multi-layer perceptron (MLP) module to predict BALF-secreted proteins. Additionally, by utilizing SecProGNN, we investigated potential biomarkers for lung adenocarcinoma and identified 16 promising candidates that may be secreted into BALF.
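A minimal sketch of the described pipeline — residues as graph nodes, message passing, then an MLP head — with a plain dense-adjacency GCN standing in for whatever GNN variant SecProGNN actually uses; the feature sizes and the random interaction graph are illustrative.

```python
# Residue-graph GCN + MLP head sketch (assumptions: dims, graph construction).
import torch
import torch.nn as nn

class TinyGCN(nn.Module):
    def __init__(self, in_dim: int = 20, hid: int = 64):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid)
        self.w2 = nn.Linear(hid, hid)
        self.head = nn.Sequential(nn.Linear(hid, 32), nn.ReLU(),
                                  nn.Linear(32, 1))  # P(secreted into BALF)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2.
        a = adj + torch.eye(adj.size(0))
        d = a.sum(1).rsqrt().diag()
        a_hat = d @ a @ d
        h = torch.relu(a_hat @ self.w1(x))
        h = torch.relu(a_hat @ self.w2(h))
        return self.head(h.mean(0))  # mean-pool residues -> protein logit

n = 120                               # residues in one protein
x = torch.randn(n, 20)                # e.g. one-hot amino-acid features
adj = (torch.rand(n, n) < 0.05).float()
adj = ((adj + adj.t()) > 0).float()   # symmetric residue-interaction graph
logit = TinyGCN()(x, adj)
```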
{"title":"SecProGNN: Predicting Bronchoalveolar Lavage Fluid Secreted Protein Using Graph Neural Network.","authors":"Dan Shao, Guangzhao Zhang, Lin Lin, Yucong Xiong, Kai He, Liyan Sun","doi":"10.1109/JBHI.2025.3548263","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3548263","url":null,"abstract":"<p><p>Bronchoalveolar lavage fluid (BALF) is a liquid obtained from the alveoli and bronchi, often used to study pulmonary diseases. So far, proteomic analyses have identified over three thousand proteins in BALF. However, the comprehensive characterization of these proteins remains challenging due to their complexity and technological limitations. This paper presented a novel deep learning framework called SecProGNN, designed to predict secretory proteins in BALF. Firstly, SecProGNN represented proteins as graph-structured data, with amino acids connected based on their interactions. Then, these graphs were processed through graph neural networks (GNNs) model to extract graph features. Finally, the extracted feature vectors were fed into a multi-layer perceptron (MLP) module to predict BALF secreted proteins. Additionally, by utilizing SecProGNN, we investigated potential biomarkers for lung adenocarcinoma and identified 16 promising candidates that may be secreted into BALF.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143566930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Hierarchical Graph Convolutional Network with Infomax-Guided Graph Embedding for Population-Based ASD Detection.
Pub Date : 2025-03-04 | DOI: 10.1109/JBHI.2025.3544302
Xiaoke Hao, Mingming Ma, Jiaqing Tao, Jiahui Cao, Jing Qin, Feng Liu, Daoqiang Zhang, Dong Ming
Recently, functional magnetic resonance imaging (fMRI)-based brain networks have been shown to be an effective diagnostic tool with great potential for accurately detecting autism spectrum disorders (ASD). Meanwhile, the successful use of graph convolutional network (GCN) methods based on fMRI information has improved the classification accuracy of ASD. However, many graph convolution-based methods do not fully utilize the topological information of the brain functional connectivity network (BFCN) or ignore the effect of non-imaging information. Therefore, we propose a hierarchical graph embedding model that leverages both the topological information of the BFCN and the non-imaging information of the subjects to improve the classification accuracy. Specifically, our model first uses the Infomax Module to automatically identify embedded features in regions of interest (ROIs) in the brain. Then, these features, along with non-imaging information, are used to construct a population graph model. Finally, we design a graph convolution framework to propagate and aggregate the node features and obtain the results for ASD detection. Our model takes into account both the significance of the BFCN to individual subjects and the relationships between subjects in the population graph. The model performed autism detection using the Autism Brain Imaging Data Exchange (ABIDE) dataset and obtained an average accuracy of 77.2% and an AUC of 87.2%. These results exceed those of the baseline approach. Through extensive experiments, we demonstrate the competitiveness, robustness and effectiveness of our model in aiding ASD diagnosis.
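The population-graph step can be sketched as follows: subjects become nodes, and edge weights combine imaging-feature similarity with agreement on non-imaging phenotypes — a common recipe for ABIDE population graphs. The exact weighting scheme below (Gaussian similarity scaled by phenotype agreement) is an assumption, not the paper's stated formula.

```python
# Population-graph adjacency sketch: imaging similarity x phenotype agreement.
import numpy as np

def population_adjacency(feats: np.ndarray, sex: np.ndarray,
                         site: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    n = len(feats)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Gaussian similarity between per-subject imaging embeddings.
            sim = np.exp(-np.sum((feats[i] - feats[j]) ** 2) / (2 * sigma ** 2))
            # Non-imaging agreement: 0, 1 or 2 shared phenotypes.
            pheno = (sex[i] == sex[j]) + (site[i] == site[j])
            adj[i, j] = adj[j, i] = sim * pheno
    return adj

feats = np.random.randn(8, 16)              # per-subject ROI embeddings (toy)
sex = np.random.randint(0, 2, 8)
site = np.random.randint(0, 3, 8)
A = population_adjacency(feats, sex, site)  # fed to the graph convolution
```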
{"title":"A Hierarchical Graph Convolutional Network with Infomax-Guided Graph Embedding for Population-Based ASD Detection.","authors":"Xiaoke Hao, Mingming Ma, Jiaqing Tao, Jiahui Cao, Jing Qin, Feng Liu, Daoqiang Zhang, Dong Ming","doi":"10.1109/JBHI.2025.3544302","DOIUrl":"10.1109/JBHI.2025.3544302","url":null,"abstract":"<p><p>Recently, functional magnetic resonance imaging (fMRI)-based brain networks have been shown to be an effective diagnostic tool with great potential for accurately detecting autism spectrum disorders (ASD). Meanwhile, the successful use of graph convolution networks (GCNs) methods based on fMRI information has improved the classification accuracy of ASD. However, many graph convolution-based methods do not fully utilize the topological information of the brain functional connectivity network (BFCN) or ignore the effect of non-imaging information. Therefore, we propose a hierarchical graph embedding model that leverage both the topological information of the BFCN and the non-imaging information of the subjects to improve the classification accuracy. Specifically, our model first use the Infomax Module to automatically identify embedded features in regions of interests (ROIs) in the brain. Then, these features, along with non-imaging information, is used to construct a population graph model. Finally, we design a graph convolution framework to propagate and aggregate the node features and obtain the results for ASD detection. Our model takes into account both the significance of the BFCN to individual subjects and relationships between subjects in the population graph. The model performed autism detection using the Autism Brain Imaging Data Exchange (ABIDE) dataset and obtained an average accuracy of 77.2% and an AUC of 87.2%. These results exceed those of the baseline approach. Through extensive experiments, we demonstrate the competitiveness, robustness and effectiveness of our model in aiding ASD diagnosis.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143556708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NciaNet: A Non-Covalent Interaction-Aware Graph Neural Network for the Prediction of Protein-Ligand Interaction in Drug Discovery.
Pub Date : 2025-03-04 | DOI: 10.1109/JBHI.2025.3547741
Guanyu Song, Meifeng Deng, Yunzhi Chen, Shijie Jia, Zhenguo Nie
Precise quantification of protein-ligand interaction is critical in early-stage drug discovery. Artificial intelligence (AI) has gained massive popularity in this area, with deep-learning models used to extract features from ligand and protein molecules. However, these models often fail to capture intermolecular non-covalent interactions, the primary factor influencing binding, leading to lower accuracy and interpretability. Moreover, such models overlook the spatial structure of protein-ligand complexes, resulting in weaker generalization. To address these issues, we propose Non-covalent Interaction-aware Graph Neural Network (NciaNet), a novel method that effectively utilizes intermolecular non-covalent interactions and 3D protein-ligand structure. Our approach achieves excellent predictive performance on multiple benchmark datasets and outperforms competitive baseline models in the binding affinity task, with the benchmark core set v.2016 achieving an RMSE of 1.208 and an R of 0.833, and the core set v.2013 achieving an RMSE of 1.409 and an R of 0.805, under the high-quality refined v.2016 training conditions. Importantly, NciaNet successfully learns vital features related to protein-ligand interactions, providing biochemical insights and demonstrating practical utility and reliability. However, despite these strengths, there may still be limitations in generalizability to unseen protein-ligand complexes, suggesting potential avenues for future work.
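One simple way to make a graph "non-covalent interaction-aware" is to add intermolecular edges between protein and ligand atoms that lie within a distance cutoff; the sketch below shows that construction under assumed coordinates and an assumed 4 Å cutoff — the paper's actual interaction typing is likely richer.

```python
# Distance-cutoff intermolecular edge sketch (cutoff and typing are assumptions).
import numpy as np

def intermolecular_edges(prot_xyz: np.ndarray, lig_xyz: np.ndarray,
                         cutoff: float = 4.0) -> np.ndarray:
    """Return (protein_atom_idx, ligand_atom_idx) pairs within `cutoff`
    angstroms, i.e. candidate non-covalent contacts linking the molecules."""
    d = np.linalg.norm(prot_xyz[:, None, :] - lig_xyz[None, :, :], axis=-1)
    return np.argwhere(d < cutoff)

prot = np.random.rand(200, 3) * 30        # fake protein atom coordinates (A)
lig = prot[:12] + np.random.randn(12, 3)  # ligand posed near the "pocket"
edges = intermolecular_edges(prot, lig)   # extra cross-molecule graph edges
```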
{"title":"NciaNet: A Non-Covalent Interaction-Aware Graph Neural Network for the Prediction of Protein-Ligand Interaction in Drug Discovery.","authors":"Guanyu Song, Meifeng Deng, Yunzhi Chen, Shijie Jia, Zhenguo Nie","doi":"10.1109/JBHI.2025.3547741","DOIUrl":"10.1109/JBHI.2025.3547741","url":null,"abstract":"<p><p>Precise quantification of protein-ligand interaction is critical in early-stage drug discovery. Artificial intelligence (AI) has gained massive popularity in this area, with deep-learning models used to extract features from ligand and protein molecules. However, these models often fail to capture intermolecular non-covalent interactions, the primary factor influencing binding, leading to lower accuracy and interpretability. Moreover, such models overlook the spatial structure of protein-ligand complexes, resulting in weaker generalization. To address these issues, we propose Non-covalent Interaction-aware Graph Neural Network (NciaNet), a novel method that effectively utilizes intermolecular non-covalent interactions and 3D protein-ligand structure. Our approach achieves excellent predictive performance on multiple benchmark datasets and outperforms competitive baseline models in the binding affinity task, with the benchmark core set v.2016 achieving an RMSE of 1.208 and an R of 0.833, and the core set v.2013 achieving an RMSE of 1.409 and an R of 0.805, under the high-quality refined v.2016 training conditions. Importantly, NciaNet successfully learns vital features related to protein-ligand interactions, providing biochemical insights and demonstrating practical utility and reliability. However, despite these strengths, there may still be limitations in generalizability to unseen protein-ligand complexes, suggesting potential avenues for future work.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143556725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic Local Conformal Reinforcement Network (DLCR) for Aortic Dissection Centerline Tracking.
Pub Date : 2025-03-04 | DOI: 10.1109/JBHI.2025.3547744
Jingliang Zhao, An Zeng, Jiayu Ye, Dan Pan
A pre-extracted aortic dissection (AD) centerline is very useful for the quantitative diagnosis and treatment of AD. However, centerline extraction is challenging because (i) the lumen of AD is very narrow and irregular, leading to failures in feature extraction and interrupted topology; and (ii) the acute nature of AD demands a fast algorithm, yet AD scans usually contain thousands of slices, making centerline extraction very time-consuming. In this paper, a fast AD centerline extraction algorithm based on a local conformal deep reinforced agent and a dynamic tracking framework is presented. The potential dependence of adjacent center points is utilized to form a novel 2.5D state that locally constrains the shape of the centerline, improving the overlap ratio and accuracy of the tracked path. Moreover, we dynamically modify the width and direction of the detection window to focus on vessel-relevant regions and improve the ability to track small vessels. On a public AD dataset of 100 CTA scans, the proposed method obtains an average overlap of 97.23% and a mean distance error of 1.28 voxels, outperforming four state-of-the-art AD centerline extraction methods. The proposed algorithm is also very fast, with an average processing time of 9.54 s, indicating that it is well suited for clinical practice.
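A toy version of such a dynamic tracking loop is sketched below: a policy consumes a local 2.5D state (an intensity patch around the newest point plus the heading implied by recent points) and returns the next step or a stop flag. The dummy policy, patch size, and stopping rule are all placeholders for the learned reinforced agent and its adaptive detection window.

```python
# Toy centerline-tracking loop sketch (policy and state are placeholders).
import numpy as np

def extract_state(volume, recent):
    """Toy 2.5D state: a small patch around the newest centre point plus the
    heading implied by the previous point (indices clipped to the volume)."""
    z, y, x = (int(round(c)) for c in recent[-1])
    zz = np.clip(np.arange(z - 1, z + 2), 0, volume.shape[0] - 1)
    yy = np.clip(np.arange(y - 2, y + 3), 0, volume.shape[1] - 1)
    xx = np.clip(np.arange(x - 2, x + 3), 0, volume.shape[2] - 1)
    patch = volume[np.ix_(zz, yy, xx)]
    heading = np.asarray(recent[-1], float) - np.asarray(recent[-2], float)
    return patch, heading

def dummy_policy(state):
    """Stand-in for the deep reinforced agent: keep the current heading and
    stop once the local intensity no longer looks like lumen."""
    patch, heading = state
    n = np.linalg.norm(heading)
    direction = heading / n if n > 0 else np.array([1.0, 0.0, 0.0])
    return direction, patch.mean() < 0.1

def track_centerline(volume, start, policy, step=1.0, max_steps=500):
    path = [np.asarray(start, dtype=float)]
    path.append(path[-1] + np.array([1.0, 0.0, 0.0]))  # seed an initial heading
    for _ in range(max_steps):
        direction, stop = policy(extract_state(volume, path[-2:]))
        if stop:
            break
        path.append(path[-1] + step * direction)
    return np.stack(path)

vol = np.zeros((64, 32, 32))
vol[:, 14:18, 14:18] = 1.0                         # a fake straight "lumen"
path = track_centerline(vol, (2, 16, 16), dummy_policy)
```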
{"title":"Dynamic Local Conformal Reinforcement Network (DLCR) for Aortic Dissection Centerline Tracking.","authors":"Jingliang Zhao, An Zeng, Jiayu Ye, Dan Pan","doi":"10.1109/JBHI.2025.3547744","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3547744","url":null,"abstract":"<p><p>Pre-extracted aortic dissection (AD) centerline is very useful for quantitative diagnosis and treatment of AD disease. However, centerline extraction is challenging because (i) the lumen of AD is very narrow and irregular, yielding failure in feature extraction and interrupted topology; and (ii) the acute nature of AD requires a quick algorithm, however, AD scans usually contain thousands of slices, centerline extraction is very time-consuming. In this paper, a fast AD centerline extraction algorithm, which is based on a local conformal deep reinforced agent and dynamic tracking framework, is presented. The potential dependence of adjacent center points is utilized to form the novel 2.5D state and locally constrains the shape of the centerline, which improves overlap ratio and accuracy of the tracked path. Moreover, we dynamically modify the width and direction of the detection window to focus on vessel-relevant regions and improve the ability in tracking small vessels. On a public AD dataset that involves 100 CTA scans, the proposed method obtains average overlap of 97.23% and mean distance error of 1.28 voxels, which outperforms four state-of-the-art AD centerline extraction methods. The proposed algorithm is very fast with average processing time of 9.54s, indicating that this method is very suitable for clinical practice.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143556720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Identification of Protein-nucleotide Binding Residues with Deep Multi-task and Multi-scale Learning.
Pub Date : 2025-03-04 | DOI: 10.1109/JBHI.2025.3547386
Jiashun Wu, Fang Ge, Shanruo Xu, Yan Liu, Jiangning Song, Dong-Jun Yu
Accurate identification of protein-nucleotide binding residues is essential for protein functional annotation and drug discovery. Advancements in computational methods for predicting binding residues from protein sequences have significantly improved predictive accuracy. However, it remains a challenge for current methodologies to extract discriminative features and assimilate heterogeneous data from different nucleotide-binding residues. To address this, we introduce NucMoMTL, a novel predictor specifically designed for identifying protein-nucleotide binding residues. Specifically, NucMoMTL leverages a pre-trained language model for robust sequence embedding and utilizes deep multi-task and multi-scale learning within parameter-based orthogonal constraints to extract shared representations, capitalizing on auxiliary information from diverse nucleotide-binding residues. Evaluation of NucMoMTL on the benchmark datasets demonstrates that it outperforms state-of-the-art methods, achieving an average AUROC and AUPRC of 0.961 and 0.566, respectively. NucMoMTL can be explored as a reliable computational tool for identifying protein-nucleotide binding residues and facilitating drug discovery. The dataset used and source code are freely available at: https://github.com/jerry1984Y/NucMoMTL.
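One standard reading of a "parameter-based orthogonal constraint" across task-specific heads is a squared Frobenius penalty on the products of their weight matrices, which pushes the heads toward non-overlapping directions; the sketch below implements that reading, and the paper's exact formulation may differ.

```python
# Orthogonality penalty sketch for multi-task heads (one reading, assumed dims).
import torch
import torch.nn as nn

heads = nn.ModuleList(nn.Linear(128, 64) for _ in range(4))  # one head per nucleotide type

def orthogonality_penalty(heads: nn.ModuleList) -> torch.Tensor:
    loss = torch.zeros(())
    for i in range(len(heads)):
        for j in range(i + 1, len(heads)):
            wi, wj = heads[i].weight, heads[j].weight  # (64, 128) each
            loss = loss + (wi @ wj.t()).pow(2).sum()   # ||W_i W_j^T||_F^2
    return loss

# total_loss = sum(task_losses) + lambda_orth * orthogonality_penalty(heads)
print(orthogonality_penalty(heads).item())
```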
{"title":"Identification of Protein-nucleotide Binding Residues with Deep Multi-task and Multi-scale Learning.","authors":"Jiashun Wu, Fang Ge, Shanruo Xu, Yan Liu, Jiangning Song, Dong-Jun Yu","doi":"10.1109/JBHI.2025.3547386","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3547386","url":null,"abstract":"<p><p>Accurate identification of protein-nucleotide binding residues is essential for protein functional annotation and drug discovery. Advancements in computational methods for predicting binding residues from protein sequences have significantly improved predictive accuracy. However, it remains a challenge for current methodologies to extract discriminative features and assimilate heterogeneous data from different nucleotide binding residues. To address this, we introduce NucMoMTL, a novel predictor specifically designed for identifying protein-nucleotide binding residues. Specifically, NucMoMTL leverages a pre-trained language model for robust sequence embedding and utilizes deep multi-task and multi-scale learning within parameter-based orthogonal constraints to extract shared representations, capitalizing on auxiliary information from diverse nucleotides binding residues. Evaluation of NucMoMTL on the benchmark datasets demonstrates that it outperforms state-of-the-art methods, achieving an average AUROC and AUPRC of 0.961 and 0.566, respectively. NucMoMTL can be explored as a reliable computational tool for identifying protein-nucleotide binding residues and facilitating drug discovery. The dataset used and source code are freely available at: https://github.com/jerry1984Y/NucMoMTL.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143556724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Feature Fusion Attention-based Deep Learning Algorithm for Mammographic Architectural Distortion Classification.
Pub Date : 2025-03-03 | DOI: 10.1109/JBHI.2025.3547263
Khalil Ur Rehman, Li Jianqiang, Anaa Yasin, Anas Bilal, Shakila Basheer, Inam Ullah, Muhammad Kashif Jabbar, Yibin Tian
Architectural Distortion (AD) is a common abnormality in digital mammograms, alongside masses and microcalcifications. Detecting AD in dense breast tissue is particularly challenging due to its heterogeneous asymmetries and subtle presentation. Factors such as location, size, shape, texture, and variability in patterns contribute to reduced sensitivity. To address these challenges, we propose a novel feature fusion-based Vision Transformer (ViT) attention network, combined with VGG-16, to improve accuracy and efficiency in AD detection. Our approach mitigates issues related to texture fixation, background boundaries, and deep neural network limitations, enhancing the robustness of AD classification in mammograms. Experimental results demonstrate that the proposed model achieves state-of-the-art performance, outperforming eight existing deep learning models. On the PINUM dataset, it attains 0.97 sensitivity, 0.92 F1-score, 0.93 precision, 0.94 specificity, and 0.96 accuracy. On the DDSM dataset, it records 0.93 sensitivity, 0.91 F1-score, 0.94 precision, 0.92 specificity, and 0.95 accuracy. These results highlight the potential of our method for computer-aided breast cancer diagnosis, particularly in low-resource settings where access to high-end imaging technology is limited. By enabling more accurate and timely AD detection, our approach could significantly improve breast cancer screening and early intervention worldwide.
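The overall fusion shape — a VGG-16 convolutional backbone whose feature map is tokenized and passed through transformer-style self-attention before classification — can be sketched as follows; the truncation point, token dimension, and head sizes are assumptions rather than the authors' configuration.

```python
# VGG-16 + attention fusion sketch (assumed dims; not the authors' config).
import torch
import torch.nn as nn
from torchvision.models import vgg16

backbone = vgg16(weights=None).features        # convolutional stack only
proj = nn.Conv2d(512, 256, kernel_size=1)      # channels -> token dimension
encoder = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
classifier = nn.Linear(256, 2)                 # AD vs. normal

x = torch.randn(1, 3, 224, 224)                # one mammogram patch (RGB-ized)
fmap = proj(backbone(x))                       # (1, 256, 7, 7)
tokens = fmap.flatten(2).transpose(1, 2)       # (1, 49, 256) spatial tokens
attended = encoder(tokens)                     # self-attention over tokens
logits = classifier(attended.mean(dim=1))      # pooled token -> class logits
```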
{"title":"A Feature Fusion Attention-based Deep Learning Algorithm for Mammographic Architectural Distortion Classification.","authors":"Khalil Ur Rehman, Li Jianqiang, Anaa Yasin, Anas Bilal, Shakila Basheer, Inam Ullah, Muhammad Kashif Jabbar, Yibin Tian","doi":"10.1109/JBHI.2025.3547263","DOIUrl":"10.1109/JBHI.2025.3547263","url":null,"abstract":"<p><p>Architectural Distortion (AD) is a common abnormality in digital mammograms, alongside masses and microcalcifications. Detecting AD in dense breast tissue is particularly challenging due to its heterogeneous asymmetries and subtle presentation. Factors such as location, size, shape, texture, and variability in patterns contribute to reduced sensitivity. To address these challenges, we propose a novel feature fusion-based Vision Transformer (ViT) attention network, combined with VGG-16, to improve accuracy and efficiency in AD detection. Our approach mitigates issues related to texture fixation, background boundaries, and deep neural network limitations, enhancing the robustness of AD classification in mammograms. Experimental results demonstrate that the proposed model achieves state-of-the-art performance, outperforming eight existing deep learning models. On the PINUM dataset, it attains 0.97 sensitivity, 0.92 F1-score, 0.93 precision, 0.94 specificity, and 0.96 accuracy. On the DDSM dataset, it records 0.93 sensitivity, 0.91 F1-score, 0.94 precision, 0.92 specificity, and 0.95 accuracy. These results highlight the potential of our method for computer-aided breast cancer diagnosis, particularly in low-resource settings where access to high-end imaging technology is limited. By enabling more accurate and timely AD detection, our approach could significantly improve breast cancer screening and early intervention worldwide.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Healthy Bio-Core: A Framework for Selection of Homogeneous Healthy Biomedical Multivariate Time Series Employing Classification Performance.
Pub Date : 2025-03-03 | DOI: 10.1109/JBHI.2025.3546844
Abhidnya Patharkar, Firas Al-Hindawi, Teresa Wu
In biomedical datasets pertaining to disease detection, data typically falls into two classes: healthy and diseased. The diseased cohort often exhibits inherent heterogeneity due to clinical subtyping. Although the healthy cohort is presumed to be homogeneous, it contains heterogeneities arising from inter-subject variation, which affects the effectiveness of classification. To address this issue, we propose a novel methodology for multivariate time series data that discerns a homogeneous sub-cohort of healthy samples, referred to as the 'Healthy Bio-Core' (HBC). Employing HBC augments the discriminative capacity of classification models. The selection process for HBC integrates dynamic time warping (DTW) and the accuracy of the ROCKET (RandOm Convolutional KErnel Transform) classifier, treating each entire time series as a single instance. Empirical results indicate that utilizing HBC enhances classification performance in comparison to utilizing the complete healthy dataset. We substantiate this approach with three classifiers: HIVE-COTE (Hierarchical Vote Collective of Transformation-based Ensembles), MUSE (Multivariate Unsupervised Symbols and Derivatives), and DTW-NN (DTW with Nearest Neighbor), assessing metrics such as accuracy, precision, recall, and F1-score. Although our approach relies on DTW, it is limited to cases where a DTW path can be identified; otherwise, another distance metric must be used. Currently, the efficiency depends on the classifier used. Future studies might investigate combining different classifiers for HBC sample selection and devise a method to synthesize their outcomes. Moreover, the assumption that the dataset is predominantly healthy may not hold in contexts with significant noise. Notwithstanding these limitations, our approach yields significant improvements in classification, with average accuracy increases of 5.49%, 14.28%, and 6.16% for the sepsis, gait, and EMO pain datasets, respectively.
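A bare-bones version of the DTW side of that selection is sketched below: a classic O(nm) dynamic-time-warping distance over multivariate series, followed by keeping the healthy samples most central (by average DTW distance) to their cohort. Folding ROCKET classification accuracy into the selection, as the paper does, is omitted here.

```python
# DTW distance + naive "core" selection sketch (ROCKET-accuracy step omitted).
import numpy as np

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    """Classic dynamic-programming DTW over multivariate frame sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

healthy = [np.random.randn(50, 3) for _ in range(10)]  # 10 series, 3 channels
dists = np.array([[dtw(a, b) for b in healthy] for a in healthy])
core_idx = np.argsort(dists.mean(axis=1))[:6]  # keep the 6 most central samples
```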
{"title":"Healthy Bio-Core: A Framework for Selection of Homogeneous Healthy Biomedical Multivariate Time Series Employing Classification Performance.","authors":"Abhidnya Patharkar, Firas Al-Hindawi, Teresa Wu","doi":"10.1109/JBHI.2025.3546844","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546844","url":null,"abstract":"<p><p>In biomedical datasets pertaining to disease detection, data typically falls into two classes: healthy and diseased. The diseased cohort often exhibits inherent heterogeneity due to clinical subtyping. Although the healthy cohort is presumed to be homogeneous, it contains heterogeneities arising from inter-subject variation, which affects the effectiveness of classification. To address this issue, we propose a novel methodology for multivariate time series data that discerns a homogeneous sub-cohort of healthy samples, referred to as the 'Healthy Bio-Core' (HBC). The employment of HBC augments the discriminative capacity of classification models. The selection process for HBC integrates dynamic time warping (DTW), and the accuracy of the ROCKET (RandOm Convolutional KErnel Transform) classifier, treating the entire time series as a single instance. Empirical results indicate that utilizing HBC enhances classification performance in comparison to utilizing the complete healthy dataset. We substantiate this approach with three classifiers: HIVE-COTE (Hierarchical Vote Collective of Transformation-based Ensembles), MUSE (Multi-variate Unsupervised Symbols and Derivatives), and DTW-NN (DTW with Nearest Neighbor), assessing metrics such as accuracy, precision, recall, and F1-score. Although our approach relies on DTW, it is limited to cases where a DTW path can be identified; otherwise, another distance metric must be used. Currently, the efficiency depends on the classifier used. Future studies might investigate combining different classifiers for HBC sample selection and devise a method to synthesize their outcomes. Moreover, assuming that the dataset is predominantly healthy may not hold true in contexts with significant noise. Notwithstanding these limitations, our approach results in significant improvements in classification, with average accuracy increases of 5.49%, 14.28%, and 6.16% for the sepsis, gait, and EMO pain datasets, respectively.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modality Imbalance? Dynamic Multi-Modal Knowledge Distillation in Automatic Alzheimer's Disease Recognition.
Pub Date : 2025-03-03 | DOI: 10.1109/JBHI.2025.3546950
Zhongren Dong, Weixiang Xu, Xinzhou Xu, Zixing Zhang
Alzheimer's disease (AD), as the most prevalent form of dementia, necessitates early identification and treatment for the critical enhancement of patients' quality of life. Recent studies strive to explore advanced machine learning approaches with multiple information cues, such as speech and text, to automatically and precisely detect this disease from conversations. However, these multi-modality-based approaches often suffer from a modality-imbalance challenge that leads to performance degradation: the multi-modal model performs worse than the best mono-modal model, although the former contains more information. To address this issue, we propose a Dynamic Multi-Modal Knowledge Distillation (DMMKD) approach, which dynamically identifies the dominant and weak modalities and opts to conduct inter-modal (cross-modal) or intra-modal knowledge distillation. The core idea is to balance the individual learning speeds in the multi-modal learning process by boosting the weak modality with the dominant modality. To evaluate the effectiveness of the introduced DMMKD algorithm, we conducted extensive experiments on two publicly available and widely used AD datasets, i.e., ADReSSo and ADReSS-M. Compared to multi-modal approaches that do not address the modality-imbalance issue, DMMKD yields substantial performance improvements of 15.4% and 10.9% in relative accuracy on the ADReSSo and ADReSS-M datasets, respectively. Moreover, compared to state-of-the-art models for automatic AD detection, DMMKD achieves the best performance, with accuracies of 91.5% and 87.0% on the two datasets, respectively.
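The distillation step can be sketched in miniature: per batch, the modality with the lower supervised loss is treated as dominant, and its softened logits teach the weaker one through a KL term. The temperature, dominance test, and loss mixing below are illustrative choices, not the paper's exact rule.

```python
# Dynamic cross-modal distillation sketch (dominance rule and mixing assumed).
import torch
import torch.nn.functional as F

def dmmkd_step(speech_logits, text_logits, labels, tau=2.0, alpha=0.5):
    ls = F.cross_entropy(speech_logits, labels)
    lt = F.cross_entropy(text_logits, labels)
    # Lower supervised loss -> dominant modality acts as the teacher.
    teacher, student = ((speech_logits, text_logits) if ls < lt
                        else (text_logits, speech_logits))
    kd = F.kl_div(F.log_softmax(student / tau, dim=1),
                  F.softmax(teacher.detach() / tau, dim=1),
                  reduction="batchmean") * tau * tau
    return ls + lt + alpha * kd  # joint objective for this batch

speech = torch.randn(8, 2, requires_grad=True)  # AD vs. non-AD logits
text = torch.randn(8, 2, requires_grad=True)
y = torch.randint(0, 2, (8,))
loss = dmmkd_step(speech, text, y)
loss.backward()
```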
{"title":"Modality Imbalance? Dynamic Multi-Modal Knowledge Distillation in Automatic Alzheimer's Disease Recognition.","authors":"Zhongren Dong, Weixiang Xu, Xinzhou Xu, Zixing Zhang","doi":"10.1109/JBHI.2025.3546950","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546950","url":null,"abstract":"<p><p>Alzheimer's disease (AD), as the most prevalent form of dementia, necessitates early identification and treatment for the critical enhancement of patients' quality of life. Recent studies strive to explore advanced machine learning approaches with multiple information cues, such as speech and text, to automatically and precisely detect this disease from conversations. However, these multi-modality-based approaches often suffer from a modality- imbalance challenge that leads to performance degradation. That is, the multi-modal model performs worse than the best mono-modal model, although the former contains more information. To address this issue, we propose a Dynamic Multi-Modal Knowledge Distillation (DMMKD) approach, which dynamically identify the dominant modality and the weak modality, and opt to conduct an inter(cross)-modal or intra-modal knowledge distillation. The core idea is to balance the individual learning speed in the multi-modal learning process by boosting the weak modality with the dominant modality. To evaluate the effectiveness of the introduced DMMKD algorithm, we conducted extensive experiments on two publicly available and widely used AD datasets, i.e. ADReSSo and ADReSS-M. Compared to the multi-modal approaches without dealing with the modality imbalance issue, the introduced DMMKD indicates substantial performance improvements by 15.4% and 10.9% in terms of relative accuracy on the ADReSSo and ADReSS-M datasets, respectively. Moreover, when compared to the state-of-the-art models for automatic AD detection, the DMMKD achieves the best performance of 91.5% and 87.0% accuracies on the two datasets, respectively.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}