IdenBAT: Disentangled representation learning for identity-preserved brain age transformation
Pub Date: 2025-03-28, DOI: 10.1016/j.artmed.2025.103115
Junyeong Maeng, Kwanseok Oh, Wonsik Jung, Heung-Il Suk
Brain age transformation aims to convert reference brain images into synthesized images that accurately reflect the age-specific features of a target age group. The primary objective of this task is to modify only the age-related attributes of the reference image while preserving all other age-irrelevant attributes. However, achieving this goal poses substantial challenges due to the inherent entanglement of various image attributes within the features extracted from a backbone encoder, which leads to simultaneous alterations during image generation. To address this challenge, we propose IdenBAT, a novel architecture that employs disentangled representation learning for identity-preserved brain age transformation. This approach decomposes image features, preserving individual traits while selectively transforming age-related characteristics to match those of the target age group. Comprehensive experiments on both 2D and full-size 3D brain datasets show that our method accurately converts input images to the target age while retaining individual characteristics. Furthermore, our approach demonstrates superiority over existing state-of-the-art methods in performance fidelity. The code is available at: https://github.com/ku-milab/IdenBAT.
{"title":"IdenBAT: Disentangled representation learning for identity-preserved brain age transformation","authors":"Junyeong Maeng , Kwanseok Oh , Wonsik Jung , Heung-Il Suk","doi":"10.1016/j.artmed.2025.103115","DOIUrl":"10.1016/j.artmed.2025.103115","url":null,"abstract":"<div><div>Brain age transformation aims to convert reference brain images into synthesized images that accurately reflect the age-specific features of a target age group. The primary objective of this task is to modify only the age-related attributes of the reference image while preserving all other age-irrelevant attributes. However, achieving this goal poses substantial challenges due to the inherent entanglement of various image attributes within features extracted from a backbone encoder, resulting in simultaneous alterations during image generation. To address this challenge, we propose a novel architecture that employs disentangled representation learning for identity-preserved brain age transformation, called IdenBAT. This approach facilitates the decomposition of image features, ensuring the preservation of individual traits while selectively transforming age-related characteristics to match those of the target age group. Through comprehensive experiments conducted on both 2D and full-size 3D brain datasets, our method adeptly converts input images to target age while retaining individual characteristics accurately. Furthermore, our approach demonstrates superiority over existing state-of-the-art regarding performance fidelity. The code is available at: <span><span>https://github.com/ku-milab/IdenBAT</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"164 ","pages":"Article 103115"},"PeriodicalIF":6.1,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143739419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weakly supervised nuclei segmentation based on pseudo label correction and uncertainty denoising
Pub Date: 2025-03-25, DOI: 10.1016/j.artmed.2025.103113
Xipeng Pan, Shilong Song, Zhenbing Liu, Huadeng Wang, Lingqiao Li, Haoxiang Lu, Rushi Lan, Xiaonan Luo
Nuclei segmentation plays a vital role in computer-aided histopathology image analysis. Numerous fully supervised learning approaches achieve impressive performance, but they rely on pathological images with precise annotations, and accurate manual labeling of pathological images is difficult and time-consuming. Hence, this paper presents a two-stage weakly supervised model comprising coarse and fine phases, which achieves nuclei segmentation on whole slide images using only point annotations. In the coarse segmentation step, Voronoi diagrams and K-means clustering results are generated from the point annotations to supervise the training network. To cope with differing imaging conditions, an image-adaptive clustering pseudo-label algorithm is proposed to adapt to the color distribution of different images. A Multi-scale Feature Fusion (MFF) module is designed in the decoder to better fuse the feature outputs. Additionally, to reduce the interference of erroneous cluster labels, an Exponential Moving Average for cluster label Correction (EMAC) strategy is proposed. After the first step, an uncertainty-estimation pseudo-label denoising strategy is introduced to denoise the Voronoi diagram and adaptive cluster labels. In the fine segmentation step, the optimized labels are used for training to obtain the final predicted probability map. Extensive experiments on the MoNuSeg and TNBC public benchmarks demonstrate that our proposed method is superior to existing point-label-based nuclei segmentation methods. Codes are available at: https://github.com/SSL-droid/WNS-PLCUD.
{"title":"Weakly supervised nuclei segmentation based on pseudo label correction and uncertainty denoising","authors":"Xipeng Pan , Shilong Song , Zhenbing Liu , Huadeng Wang , Lingqiao Li , Haoxiang Lu , Rushi Lan , Xiaonan Luo","doi":"10.1016/j.artmed.2025.103113","DOIUrl":"10.1016/j.artmed.2025.103113","url":null,"abstract":"<div><div>Nuclei segmentation plays a vital role in computer-aided histopathology image analysis. Numerous fully supervised learning approaches exhibit amazing performance relying on pathological image with precisely annotations. Whereas, it is difficult and time-consuming in accurate manual labeling on pathological images. Hence, this paper presents a two-stage weakly supervised model including coarse and fine phases, which can achieve nuclei segmentation on whole slide images using only point annotations. In the coarse segmentation step, Voronoi diagram and K-means cluster results are generated based on the point annotations to supervise the training network. In order to cope with the different imaging conditions, an image adaptive clustering pseudo label algorithm is proposed to adapt the color distribution of different images. A Multi-scale Feature Fusion (MFF) module is designed in the decoder to better fusion the feature outputs. Additionally, to reduce the interference of erroneous cluster label, an Exponential Moving Average for cluster label Correction (EMAC) strategy is proposed. After the first step, an uncertainty estimation pseudo label denoising strategy is introduced to denoise Voronoi diagram and adaptive cluster label. In the fine segmentation step, the optimized labels are used for training to obtain the final predicted probability map. Extensive experiments are performed on MoNuSeg and TNBC public benchmarks, which demonstrate our proposed method is superior to other existing nuclei segmentation methods based on point labels. Codes are available at: <span><span>https://github.com/SSL-droid/WNS-PLCUD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"164 ","pages":"Article 103113"},"PeriodicalIF":6.1,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143746835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Utilizing semantically enhanced self-supervised graph convolution and multi-head attention fusion for herb recommendation
Pub Date: 2025-03-24, DOI: 10.1016/j.artmed.2025.103112
Xianlun Tang, Yuze Tang, Xinran Liu, Haochuan Zhang, Xiaoyuan Dang, Ying Wang, Zihui Xu
Traditional Chinese herbal medicine has long been recognized as an effective natural therapy. Recently, the development of recommendation systems for herbs has garnered widespread academic attention, as these systems significantly impact the application of traditional Chinese medicine. However, existing herb recommendation systems are limited by data sparsity, insufficient correlation between prescriptions, and inadequate representation of symptom and herb characteristics. To address these issues, this paper introduces an approach to herb recommendation based on semantically enhanced self-supervised graph convolution and multi-head attention fusion (BSGAM). The method first embeds entities efficiently after fine-tuning BERT; it then leverages herb attributes to optimize feature representations through a residual graph convolution network and self-supervised learning; and it finally employs a multi-head attention mechanism for feature integration and recommendation. Experiments conducted on a publicly available traditional Chinese medicine prescription dataset demonstrate that our method achieves improvements of 6.80%, 7.46%, and 6.60% in F1-Score@5, F1-Score@10, and F1-Score@20, respectively, compared to baseline methods. These results confirm the effectiveness of our approach in enhancing the accuracy of herb recommendations.
{"title":"Utilizing semantically enhanced self-supervised graph convolution and multi-head attention fusion for herb recommendation","authors":"Xianlun Tang , Yuze Tang , Xinran Liu , Haochuan Zhang , Xiaoyuan Dang , Ying Wang , Zihui Xu","doi":"10.1016/j.artmed.2025.103112","DOIUrl":"10.1016/j.artmed.2025.103112","url":null,"abstract":"<div><div>Traditional Chinese herbal medicine has long been recognized as an effective natural therapy. Recently, the development of recommendation systems for herbs has garnered widespread academic attention, as these systems significantly impact the application of traditional Chinese medicine. However, existing herb recommendation systems are limited by data sparsity, insufficient correlation between prescriptions, and inadequate representation of symptoms and herb characteristics. To address these issues, this paper introduces an approach to herb recommendation based on semantically enhanced self-supervised graph convolution and multi-head attention fusion (BSGAM). This method involves efficient embedding of entities following fine-tuning of BERT; leveraging the attributes of herbs to optimize feature representation through a residual graph convolution network and self-supervised learning; and ultimately employing a multi-head attention mechanism for feature integration and recommendation. Experiments conducted on a publicly available traditional Chinese medicine prescription dataset demonstrate that our method achieves improvements of 6.80%, 7.46%, and 6.60% in F1-Score@5, F1-Score@10, and F1-Score@20, respectively, compared to baseline methods. These results confirm the effectiveness of our approach in enhancing the accuracy of herb recommendations.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"164 ","pages":"Article 103112"},"PeriodicalIF":6.1,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143739420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advances in kidney biopsy lesion assessment through dense instance segmentation
Pub Date: 2025-03-23, DOI: 10.1016/j.artmed.2025.103111
Zhan Xiong, Junling He, Pieter Valkema, Tri Q. Nguyen, Maarten Naesens, Jesper Kers, Fons J. Verbeek
Renal biopsies are the gold standard for the diagnosis of kidney diseases. Lesion scores made by renal pathologists are semi-quantitative and exhibit high inter-observer variability. Automating lesion classification within segmented anatomical structures can provide decision support in quantification analysis, thereby reducing inter-observer variability. Nevertheless, classifying lesions in regions-of-interest (ROIs) is clinically challenging due to (a) a large number of densely packed anatomical objects, (b) class imbalance across different compartments (at least 3), (c) significant variation in the size and shape of anatomical objects, and (d) the presence of multi-label lesions per anatomical structure. Existing models cannot address these complexities in an efficient and generic manner. This paper presents an analysis for a generalized solution to datasets from various sources (pathology departments) with different types of lesions. Our approach utilizes two sub-networks: dense instance segmentation and lesion classification. We introduce DiffRegFormer, an end-to-end dense instance segmentation sub-network designed for multi-class, multi-scale objects within ROIs. Combining diffusion models, transformers, and RCNNs, DiffRegFormer is a computationally friendly framework that can efficiently recognize over 500 objects across three anatomical classes, i.e., glomeruli, tubuli, and arteries, within ROIs. In a dataset of 303 ROIs from 148 Jones’ silver-stained renal Whole Slide Images (WSIs), our approach outperforms previous methods, achieving an Average Precision of 52.1% (detection) and 46.8% (segmentation). Moreover, our lesion classification sub-network achieves 89.2% precision and 64.6% recall on 21,889 object patches out of the 303 ROIs. Lastly, our model demonstrates direct domain transfer to PAS-stained renal WSIs without fine-tuning.
{"title":"Advances in kidney biopsy lesion assessment through dense instance segmentation","authors":"Zhan Xiong , Junling He , Pieter Valkema , Tri Q. Nguyen , Maarten Naesens , Jesper Kers , Fons J. Verbeek","doi":"10.1016/j.artmed.2025.103111","DOIUrl":"10.1016/j.artmed.2025.103111","url":null,"abstract":"<div><div>Renal biopsies are the gold standard for the diagnosis of kidney diseases. Lesion scores made by renal pathologists are semi-quantitative and exhibit high inter-observer variability. Automating lesion classification within segmented anatomical structures can provide decision support in quantification analysis, thereby reducing inter-observer variability. Nevertheless, classifying lesions in regions-of-interest (ROIs) is clinically challenging due to (a) a large amount of densely packed anatomical objects, (b) class imbalance across different compartments (at least 3), (c) significant variation in size and shape of anatomical objects and (d) the presence of multi-label lesions per anatomical structure. Existing models cannot address these complexities in an efficient and generic manner. This paper presents an analysis for a <strong>generalized solution</strong> to datasets from various sources (pathology departments) with different types of lesions. Our approach utilizes two sub-networks: dense instance segmentation and lesion classification. We introduce <strong>DiffRegFormer</strong>, an end-to-end dense instance segmentation sub-network designed for multi-class, multi-scale objects within ROIs. Combining diffusion models, transformers, and RCNNs, DiffRegFormer is a computational-friendly framework that can efficiently recognize over 500 objects across three anatomical classes, i.e., glomeruli, tubuli, and arteries, within ROIs. In a dataset of 303 ROIs from 148 Jones’ silver-stained renal Whole Slide Images (WSIs), our approach outperforms previous methods, achieving an Average Precision of 52.1% (detection) and 46.8% (segmentation). Moreover, our lesion classification sub-network achieves 89.2% precision and 64.6% recall on 21889 object patches out of the 303 ROIs. Lastly, our model demonstrates direct domain transfer to PAS-stained renal WSIs without fine-tuning.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"164 ","pages":"Article 103111"},"PeriodicalIF":6.1,"publicationDate":"2025-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143746871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Histopathology image classification based on semantic correlation clustering domain adaptation
Pub Date: 2025-03-17, DOI: 10.1016/j.artmed.2025.103110
Pin Wang, Jinhua Zhang, Yongming Li, Yurou Guo, Pufei Li, Rui Chen
Deep learning has been successfully applied to histopathology image classification tasks. However, the performance of deep models is data-driven, and the acquisition and annotation of pathological image samples are difficult, which limits model performance. Compared to whole slide images (WSI) of patients, histopathology image datasets of animal models are easier to acquire and annotate. Therefore, this paper proposes an unsupervised domain adaptation method based on semantic correlation clustering for histopathology image classification. The aim is to utilize a histopathology image dataset from the Minmice model to achieve classification and recognition of human WSIs. Firstly, the multi-scale fused features extracted from the source and target domains are normalized and mapped. In the new feature space, the cosine distance between class centers is used to measure the semantic correlation between categories. Then, the domain centers, class centers, and sample distributions are aligned in a self-constrained manner. Multi-granular information is applied to achieve cross-domain semantic correlation knowledge transfer between classes. Finally, a probabilistic heatmap is used to visualize the model's prediction results and annotate the cancerous regions in WSIs. Experimental results show that the proposed method achieves high classification accuracy for WSIs, and the annotated results are close to manual annotations, indicating its potential for clinical applications.
{"title":"Histopathology image classification based on semantic correlation clustering domain adaptation","authors":"Pin Wang , Jinhua Zhang , Yongming Li , Yurou Guo , Pufei Li , Rui Chen","doi":"10.1016/j.artmed.2025.103110","DOIUrl":"10.1016/j.artmed.2025.103110","url":null,"abstract":"<div><div>Deep learning has been successfully applied to histopathology image classification tasks. However, the performance of deep models is data-driven, and the acquisition and annotation of pathological image samples are difficult, which limit the model's performance. Compared to whole slide images (WSI) of patients, histopathology image datasets of animal models are easier to acquire and annotate. Therefore, this paper proposes an unsupervised domain adaptation method based on semantic correlation clustering for histopathology image classification. The aim is to utilize Minmice model histopathology image dataset to achieve the classification and recognition of human WSIs. Firstly, the multi-scale fused features extracted from the source and target domains are normalized and mapped. In the new feature space, the cosine distance between class centers is used to measure the semantic correlation between categories. Then, the domain centers, class centers, and sample distributions are self-constrainedly aligned. Multi-granular information is applied to achieve cross-domain semantic correlation knowledge transfer between classes. Finally, the probabilistic heatmap is used to visualize the model's prediction results and annotate the cancerous regions in WSIs. Experimental results show that the proposed method has high classification accuracy for WSI, and the annotated result is close to manual annotation, indicating its potential for clinical applications.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"163 ","pages":"Article 103110"},"PeriodicalIF":6.1,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143644539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Voice analysis in Parkinson’s disease - a systematic literature review
Pub Date: 2025-03-17, DOI: 10.1016/j.artmed.2025.103109
Daniela Xavier, Virginie Felizardo, Beatriz Ferreira, Henriques Zacarias, Mehran Pourvahab, Leonice Souza-Pereira, Nuno M. Garcia
Background and aim:
Parkinson’s disease is a neurodegenerative disease. It is often diagnosed at an advanced stage, which can hinder control of the illness. The ability to diagnose Parkinson’s disease earlier, and possibly to prognosticate it, would therefore be an advantage. Given this, a literature review covering current studies in the field is relevant.
Methods:
The aim of this study is to present a systematic literature review in which the models used for the diagnosis and prognosis of Parkinson’s disease through voice and speech assessment are elucidated. Three databases were consulted to obtain studies published between 2019 and 2023: ScienceDirect, IEEE Xplore, and the ACM Digital Library.
Results:
One hundred and six studies were considered eligible according to the defined inclusion and exclusion criteria. The vast majority of these studies (94.34%) focus on diagnosing the disease, while 11.32% address its prognosis.
Conclusion:
Voice analysis for the diagnosis and prognosis of Parkinson’s disease using machine learning techniques is achievable, with very satisfactory performance results, as demonstrated in this systematic literature review.
{"title":"Voice analysis in Parkinson’s disease - a systematic literature review","authors":"Daniela Xavier , Virginie Felizardo , Beatriz Ferreira , Henriques Zacarias , Mehran Pourvahab , Leonice Souza-Pereira , Nuno M. Garcia","doi":"10.1016/j.artmed.2025.103109","DOIUrl":"10.1016/j.artmed.2025.103109","url":null,"abstract":"<div><h3>Background and aim:</h3><div>Parkinson’s disease is a neurodegenerative disease. It is often diagnosed at an advanced stage, which can influence the control over the illness. Therefore, the possibility of diagnosing Parkinson’s disease at an earlier stage, and possibly prognosticate it, could be an advantage. Given this, a literature review that covers current studies in the field is relevant.</div></div><div><h3>Methods:</h3><div>The aim of this study is to present a systematic literature review in which the models used for the diagnosis and prognosis of Parkinson’s disease through voice and speech assessment are elucidated. Three databases were consulted to obtain the studies between 2019 and 2023: SienceDirect, IEEE Xplore and ACM Library .</div></div><div><h3>Results:</h3><div>One hundred and six studies were considered eligible, considering the definition of inclusion and exclusion criteria. The vast majority of these studies (94.34%) focus on diagnosing the disease, while the remainder (11.32%) focus on prognosis.</div></div><div><h3>Conclusion:</h3><div>Voice analysis for the diagnosis and prognosis of Parkinson’s disease using machine learning techniques can be achieved, with very satisfactory performance results, like is demonstrated in this systematic literature review.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"163 ","pages":"Article 103109"},"PeriodicalIF":6.1,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143686171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning method for malaria parasite evaluation from microscopic blood smear
Pub Date: 2025-03-15, DOI: 10.1016/j.artmed.2025.103114
Abhinav Dahiya, Devvrat Raghuvanshi, Chhaya Sharma, Kamaldeep Joshi, Ashima Nehra, Archana Sharma, Radha Jangra, Parul Badhwar, Renu Tuteja, Sarvajeet S. Gill, Ritu Gill
Objective
Malaria remains a leading cause of global morbidity and mortality, responsible for approximately 597,000 deaths according to the World Malaria Report 2024. The study aims to systematically review current methodologies for automated analysis of the Plasmodium genus in malaria diagnostics. Specifically, it focuses on computer-assisted methods, examining databases, blood smear types, staining techniques, and diagnostic models used for malaria characterization, while identifying the limitations and contributions of recent studies.
Methods
A systematic literature review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Peer-reviewed and published studies from 2020 to 2024 were retrieved from Web of Science and Scopus. Inclusion criteria focused on studies utilizing deep learning and machine learning models for automated malaria detection from microscopic blood smears. The review considered various blood smear types, staining techniques, and diagnostic models, providing a comprehensive evaluation of the automated diagnostic landscape for malaria.
Results
The NIH database is the most standardized and widely tested database for malaria diagnostics. The Giemsa-stained thin blood smear is the most efficient diagnostic method for detecting and observing the Plasmodium lifecycle. This study identifies three categories of ML models most suitable for digital diagnosis of malaria: most accurate, ResNet and VGG, with a peak accuracy of 99.12%; most popular, custom CNN-based models, used by 58% of studies; and least complex, CADx models. Pre- and post-processing techniques such as Gaussian filtering and autoencoders for noise reduction are also discussed for improving model accuracy (a minimal sketch of the Gaussian-filtering step follows this abstract).
Conclusion
Automated methods for malaria diagnostics show considerable promise in improving diagnostic accuracy and reducing human error. While deep learning models have demonstrated high performance, challenges remain in data standardization and real-world application. Addressing these gaps could lead to more reliable and scalable diagnostic tools, aiding global malaria control efforts.
{"title":"Deep learning method for malaria parasite evaluation from microscopic blood smear","authors":"Abhinav Dahiya , Devvrat Raghuvanshi , Chhaya Sharma , Kamaldeep Joshi , Ashima Nehra , Archana Sharma , Radha Jangra , Parul Badhwar , Renu Tuteja , Sarvajeet S. Gill , Ritu Gill","doi":"10.1016/j.artmed.2025.103114","DOIUrl":"10.1016/j.artmed.2025.103114","url":null,"abstract":"<div><h3>Objective</h3><div>Malaria remains a leading cause of global morbidity and mortality, responsible for approximately 5,97,000 deaths according to World Malaria Report 2024. The study aims to systematically review current methodologies for automated analysis of the <em>Plasmodium</em> genus in malaria diagnostics. Specifically, it focuses on computer-assisted methods, examining databases, blood smear types, staining techniques, and diagnostic models used for malaria characterization while identifying the limitations and contributions of recent studies.</div></div><div><h3>Methods</h3><div>A systematic literature review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Peer-reviewed and published studies from 2020 to 2024 were retrieved from Web of Science and Scopus. Inclusion criteria focused on studies utilizing deep learning and machine learning models for automated malaria detection from microscopic blood smears. The review considered various blood smear types, staining techniques, and diagnostic models, providing a comprehensive evaluation of the automated diagnostic landscape for malaria.</div></div><div><h3>Results</h3><div>The NIH database is the standardized and most widely tested database for malaria diagnostics. Giemsa stained-thin blood smear is the most efficient diagnostic method for the detection and observation of the <em>plasmodium</em> lifecycle. This study has been able to identify three categories of ML models most suitable for digital diagnostic of malaria, i.e., Most Accurate- ResNet and VGG with peak accuracy of 99.12 %, Most Popular- custom CNN-based models used by 58 % of studies, and least complex- CADx model. A few pre and post-processing techniques like Gaussian filter and auto encoder for noise reduction have also been discussed for improved accuracy of models.</div></div><div><h3>Conclusion</h3><div>Automated methods for malaria diagnostics show considerable promise in improving diagnostic accuracy and reducing human error. While deep learning models have demonstrated high performance, challenges remain in data standardization and real-world application. Addressing these gaps could lead to more reliable and scalable diagnostic tools, aiding global malaria control efforts.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"163 ","pages":"Article 103114"},"PeriodicalIF":6.1,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143644540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparing neural language models for medical concept representation and patient trajectory prediction
Pub Date: 2025-03-10, DOI: 10.1016/j.artmed.2025.103108
Alban Bornet, Dimitrios Proios, Anthony Yazdani, Fernando Jaume-Santero, Guy Haller, Edward Choi, Douglas Teodoro
Effective representation of medical concepts is crucial for secondary analyses of electronic health records. Neural language models have shown promise in automatically deriving medical concept representations from clinical data. However, the comparative performance of different language models for creating these empirical representations, and the extent to which they encode medical semantics, has not been extensively studied. This study aims to address this gap by evaluating the effectiveness of three popular language models - word2vec, fastText, and GloVe - in creating medical concept embeddings that capture their semantic meaning. By using a large dataset of digital health records, we created patient trajectories and used them to train the language models. We then assessed the ability of the learned embeddings to encode semantics through an explicit comparison with biomedical terminologies, and implicitly by predicting patient outcomes and trajectories with different levels of available information. Our qualitative analysis shows that empirical clusters of embeddings learned by fastText exhibit the highest similarity with theoretical clustering patterns obtained from biomedical terminologies, with a similarity score between empirical and theoretical clusters of 0.88, 0.80, and 0.92 for diagnosis, procedure, and medication codes, respectively. Conversely, for outcome prediction, word2vec and GloVe tend to outperform fastText, with the former achieving AUROC as high as 0.78, 0.62, and 0.85 for length-of-stay, readmission, and mortality prediction, respectively. In predicting medical codes in patient trajectories, GloVe achieves the highest performance for diagnosis and medication codes (AUPRC of 0.45 and of 0.81, respectively) at the highest level of the semantic hierarchy, while fastText outperforms the other models for procedure codes (AUPRC of 0.66). Our study demonstrates that subword information is crucial for learning medical concept representations, but global embedding vectors are better suited for more high-level downstream tasks, such as trajectory prediction. Thus, these models can be harnessed to learn representations that convey clinical meaning, and our insights highlight the potential of using machine learning techniques to semantically encode medical data.
{"title":"Comparing neural language models for medical concept representation and patient trajectory prediction","authors":"Alban Bornet , Dimitrios Proios , Anthony Yazdani , Fernando Jaume-Santero , Guy Haller , Edward Choi , Douglas Teodoro","doi":"10.1016/j.artmed.2025.103108","DOIUrl":"10.1016/j.artmed.2025.103108","url":null,"abstract":"<div><div>Effective representation of medical concepts is crucial for secondary analyses of electronic health records. Neural language models have shown promise in automatically deriving medical concept representations from clinical data. However, the comparative performance of different language models for creating these empirical representations, and the extent to which they encode medical semantics, has not been extensively studied. This study aims to address this gap by evaluating the effectiveness of three popular language models - word2vec, fastText, and GloVe - in creating medical concept embeddings that capture their semantic meaning. By using a large dataset of digital health records, we created patient trajectories and used them to train the language models. We then assessed the ability of the learned embeddings to encode semantics through an explicit comparison with biomedical terminologies, and implicitly by predicting patient outcomes and trajectories with different levels of available information. Our qualitative analysis shows that empirical clusters of embeddings learned by fastText exhibit the highest similarity with theoretical clustering patterns obtained from biomedical terminologies, with a similarity score between empirical and theoretical clusters of 0.88, 0.80, and 0.92 for diagnosis, procedure, and medication codes, respectively. Conversely, for outcome prediction, word2vec and GloVe tend to outperform fastText, with the former achieving AUROC as high as 0.78, 0.62, and 0.85 for length-of-stay, readmission, and mortality prediction, respectively. In predicting medical codes in patient trajectories, GloVe achieves the highest performance for diagnosis and medication codes (AUPRC of 0.45 and of 0.81, respectively) at the highest level of the semantic hierarchy, while fastText outperforms the other models for procedure codes (AUPRC of 0.66). Our study demonstrates that subword information is crucial for learning medical concept representations, but global embedding vectors are better suited for more high-level downstream tasks, such as trajectory prediction. Thus, these models can be harnessed to learn representations that convey clinical meaning, and our insights highlight the potential of using machine learning techniques to semantically encode medical data.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"163 ","pages":"Article 103108"},"PeriodicalIF":6.1,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143609348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning based estimation of heart surface potentials
Pub Date: 2025-03-05, DOI: 10.1016/j.artmed.2025.103093
Tiantian Wang, Joël M.H. Karel, Niels Osnabrugge, Kurt Driessens, Job Stoks, Matthijs J.M. Cluitmans, Paul G.A. Volders, Pietro Bonizzi, Ralf L.M. Peeters
Electrocardiographic imaging (ECGI) aims to noninvasively estimate heart surface potentials starting from body surface potentials. This is classically based on geometric information on the torso and the heart from imaging, which complicates clinical application. In this study, we aim to develop a deep learning framework to estimate heart surface potentials solely from body surface potentials, enabling wider clinical use. The framework introduces two main components: the transformation of 3D torso and heart geometries into standard 2D representations, and the development of a customized deep learning network model. The 2D torso and heart representations maintain a consistent layout across different subjects, making the proposed framework applicable to different torso-heart geometries. With spatial information incorporated in the 2D representations, the torso-heart physiological relationship can be learnt by the network. The deep learning model is based on a Pix2Pix network, adapted to work with 2.5D data in our task, i.e., 2D body surface potential maps (BSPMs) and 2D heart surface potential maps (HSPMs) with time-sequential information. We propose a new loss function tailored to this specific task, which uses a cosine similarity and different weights for different inputs. BSPMs and HSPMs from 11 healthy subjects (8 females and 3 males) and 29 idiopathic ventricular fibrillation (IVF) patients (11 females and 18 males) were used in this study. Performance was assessed on a test set by measuring the similarity and error between the output of the proposed model and the solution provided by mainstream ECGI, comparing HSPMs, the concatenated electrograms (EGMs), and the estimated activation time (AT) and recovery time (RT). The mean of the mean absolute error (MAE) for the HSPMs was 0.012 ± 0.011, and the mean of the corresponding structural similarity index measure (SSIM) was 0.984 ± 0.026. The mean of the MAE for the EGMs was 0.004 ± 0.004, and the mean of the corresponding Pearson correlation coefficient (PCC) was 0.643 ± 0.352. These results suggest that the model precisely captures the structural and temporal characteristics of the HSPMs. The mean of the absolute time differences between estimated and reference activation times was 6.048 ± 5.188 ms, and the mean of the absolute differences for recovery times was 18.768 ± 17.299 ms. Overall, the proposed model performs similarly to standard ECGI, exhibiting low error and consistent clinical patterns, without the need for CT/MRI. The model proves effective across diverse torso-heart geometries and successfully integrates temporal information in the input. This in turn suggests the possible use of this model in cost-effective clinical scenarios such as patient screening or post-operative follow-up.
{"title":"Deep learning based estimation of heart surface potentials","authors":"Tiantian Wang , Joël M.H. Karel , Niels Osnabrugge , Kurt Driessens , Job Stoks , Matthijs J.M. Cluitmans , Paul G.A. Volders , Pietro Bonizzi , Ralf L.M. Peeters","doi":"10.1016/j.artmed.2025.103093","DOIUrl":"10.1016/j.artmed.2025.103093","url":null,"abstract":"<div><div>Electrocardiographic imaging (ECGI) aims to noninvasively estimate heart surface potentials starting from body surface potentials. This is classically based on geometric information on the torso and the heart from imaging, which complicates clinical application. In this study, we aim to develop a deep learning framework to estimate heart surface potentials solely from body surface potentials, enabling wider clinical use. The framework introduces two main components: the transformation of 3D torso and heart geometries into standard 2D representations, and the development of a customized deep learning network model. The 2D torso and heart representations maintain a consistent layout across different subjects, making the proposed framework applicable to different torso-heart geometries. With spatial information incorporated in the 2D representations, the torso-heart physiological relationship can be learnt by the network. The deep learning model is based on a Pix2Pix network, adapted to work with 2.5D data in our task, i.e., 2D body surface potential maps (BSPMs) and 2D heart surface potential maps (HSPMs) with time sequential information. We propose a new loss function tailored to this specific task, which uses a cosine similarity and different weights for different inputs. BSPMs and HSPMs from 11 healthy subjects (8 females and 3 males) and 29 idiopathic ventricular fibrillation (IVF) patients (11 females and 18 males) were used in this study. Performance was assessed on a test set by measuring the similarity and error between the output of the proposed model and the solution provided by mainstream ECGI, by comparing HSPMs, the concatenated electrograms (EGMs), and the estimated activation time (AT) and recovery time (RT). The mean of the mean absolute error (MAE) for the HSPMs was 0.012 ± 0.011, and the mean of the corresponding structural similarity index measure (SSIM) was 0.984 ± 0.026. The mean of the MAE for the EGMs was 0.004 ± 0.004, and the mean of the corresponding Pearson correlation coefficient (PCC) was 0.643 ± 0.352. Results suggest that the model is able to precisely capture the structural and temporal characteristics of the HSPMs. The mean of the absolute time differences between estimated and reference activation times was 6.048 ± 5.188 ms, and the mean of the absolute differences for recovery times was 18.768 ± 17.299 ms. Overall, results show similar performance between the proposed model and standard ECGI, exhibiting low error and consistent clinical patterns, without the need for CT/MRI. The model shows to be effective across diverse torso-heart geometries, and it successfully integrates temporal information in the input. 
This in turn suggests the possible use of this model in cost effective clinical scenarios like patient screening or post-operative follow-up.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"163 ","pages":"Article 103093"},"PeriodicalIF":6.1,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143591454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
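The task-specific loss, cosine similarity combined with input-dependent weights, can be sketched as follows. This is a generic reconstruction of such a loss under assumed tensor shapes and a per-sample weighting scheme, not the authors' exact formulation.

```python
# Sketch: weighted cosine-similarity loss for predicted vs. reference HSPMs.
# Generic illustration; the paper's exact weighting scheme is not reproduced.
import torch
import torch.nn.functional as F

def weighted_cosine_loss(pred, target, weights):
    """pred, target: (batch, nodes, time) potential maps; weights: (batch,)."""
    cos = F.cosine_similarity(pred.flatten(1), target.flatten(1), dim=1)  # per sample
    return ((1.0 - cos) * weights).mean()   # 0 when signal shapes fully align

pred = torch.randn(4, 256, 100, requires_grad=True)   # estimated heart potentials
target = torch.randn(4, 256, 100)                     # ECGI reference solution
weights = torch.tensor([1.0, 1.0, 2.0, 0.5])          # per-input importance weights

loss = weighted_cosine_loss(pred, target, weights)
loss.backward()
print(float(loss))
```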
DRExplainer: Quantifiable interpretability in drug response prediction with directed graph convolutional network
Pub Date: 2025-03-04, DOI: 10.1016/j.artmed.2025.103101
Haoyuan Shi, Tao Xu, Xiaodi Li, Qian Gao, Zhiwei Xiong, Junfeng Xia, Zhenyu Yue
Predicting the response of a cancer cell line to a therapeutic drug is pivotal for personalized medicine. Despite numerous deep learning methods developed for drug response prediction, integrating diverse information about biological entities and predicting the directional response remain major challenges. Here, we propose a novel interpretable predictive model, DRExplainer, which leverages a directed graph convolutional network to enhance prediction in a directed bipartite network framework. DRExplainer constructs a directed bipartite network integrating multi-omics profiles of cell lines, the chemical structure of drugs, and known drug responses to achieve directed prediction. DRExplainer then identifies the most relevant subgraph for each prediction in this directed bipartite network by learning a mask, facilitating critical medical decision-making. Additionally, we introduce a quantifiable method for model interpretability that leverages a ground-truth benchmark dataset curated from biological features. In computational experiments, DRExplainer outperforms state-of-the-art predictive methods and another graph-based explanation method under the same experimental setting. Finally, case studies further validate the interpretability and effectiveness of DRExplainer in predicting novel drug responses. Our code is available at: https://github.com/vshy-dream/DRExplainer.
{"title":"DRExplainer: Quantifiable interpretability in drug response prediction with directed graph convolutional network","authors":"Haoyuan Shi , Tao Xu , Xiaodi Li , Qian Gao , Zhiwei Xiong , Junfeng Xia , Zhenyu Yue","doi":"10.1016/j.artmed.2025.103101","DOIUrl":"10.1016/j.artmed.2025.103101","url":null,"abstract":"<div><div>Predicting the response of a cancer cell line to a therapeutic drug is pivotal for personalized medicine. Despite numerous deep learning methods that have been developed for drug response prediction, integrating diverse information about biological entities and predicting the directional response remain major challenges. Here, we propose a novel interpretable predictive model, DRExplainer, which leverages a directed graph convolutional network to enhance the prediction in a directed bipartite network framework. DRExplainer constructs a directed bipartite network integrating multi-omics profiles of cell lines, the chemical structure of drugs and known drug response to achieve directed prediction. Then, DRExplainer identifies the most relevant subgraph to each prediction in this directed bipartite network by learning a mask, facilitating critical medical decision-making. Additionally, we introduce a quantifiable method for model interpretability that leverages a ground truth benchmark dataset curated from biological features. In computational experiments, DRExplainer outperforms state-of-the-art predictive methods and another graph-based explanation method under the same experimental setting. Finally, the case studies further validate the interpretability and the effectiveness of DRExplainer in predictive novel drug response. Our code is available at: <span><span>https://github.com/vshy-dream/DRExplainer</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"163 ","pages":"Article 103101"},"PeriodicalIF":6.1,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143563727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}